Abstract
Recent studies have identified various forms of
bias in language-based models, raising concerns
about the risk of propagating social biases against
certain groups based on sociodemographic factors
(e.g., gender, race, geography). In this study, we
analyze the treatment of age-related terms across
15 sentiment analysis models and 10 widely used GloVe word embeddings, and attempt to alleviate bias by processing the models' training data. Our results show that significant age bias is encoded in the outputs of many sentiment analysis algorithms and word embeddings, and that we can alleviate this bias by manipulating the training data.