Fairness in Recommender Systems

This page is a work in progress that aims to (1) maintain a non-exhaustive index of resources (e.g., events, projects, websites) related to fairness and algorithmic bias in recommender systems, and (2) present an ever-growing literature review of the recent and, in my admittedly biased view, most interesting works in this area.

Literature

Recommender systems are related to classification and ranking, so some seminal related work in those areas is presented first.

Fairness in Classification

Fairness, abstractly, means to not discriminate against individuals or groups. In classification, fairness is defined for the subjects receiving the classification outcome, e.g., the applicants seeking a loan. The following concepts, although introduced in the context of classification, can be generalized to other tasks. An overview of the various fairness concepts is presented in 2018_FATML On Formalizing Fairness in Prediction with Machine Learning.

A distinction is often made on whether fairness refers to parity (equality) in treatment or in impact. Fairness through unawareness, designed to ensure parity in treatment, is the concept that the algorithm should not consider sensitive (aka protected) attributes (e.g., gender, race, age), which can be used to discriminate. This can be ineffective, as other attributes may be correlated and thus be used to predict the sensitive ones.

Statistical parity (aka demographic parity) targets parity in impact, and mandates that classification outcomes should be balanced across groups. For a binary classifier, this means that the positive rate should be the same across groups. This forces the groups to receive positive outcomes at the same rate regardless of any differences between them, which may be undesirable.

Equal opportunity, introduced in 2016_NIPS Equality of Opportunity in Supervised Learning, is the concept that the true positive rate should be the same across groups.
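
To make the two notions concrete, here is a minimal sketch (my own code, assuming binary labels, binary predictions, and a single binary group attribute) that computes the per-group positive rate checked by statistical parity and the per-group true positive rate checked by equal opportunity.

```python
import numpy as np

def positive_rate(y_pred, group, g):
    """P(Y_hat = 1 | group = g): the quantity equalized by statistical parity."""
    mask = (group == g)
    return y_pred[mask].mean()

def true_positive_rate(y_true, y_pred, group, g):
    """P(Y_hat = 1 | Y = 1, group = g): the quantity equalized by equal opportunity."""
    mask = (group == g) & (y_true == 1)
    return y_pred[mask].mean()

# Toy data: 1 = positive outcome (e.g., loan granted); group 0 = protected group.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])

for g in (0, 1):
    print(g, positive_rate(y_pred, group, g), true_positive_rate(y_true, y_pred, group, g))
```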

The two aforementioned concepts address fairness for protected groups, sets of individuals that share a value in some sensitive attribute, e.g., women, blacks, and thus target group fairness. In contrast, in individual fairness, the goal is to enforce that similar individuals receive similar outputs.

Fairness in Ranking

In ranking, fairness is typically defined for the objects to be ranked, e.g., candidates applying for a position. An overview of the related concepts and methods appears in 2018_CIKM_DAB Fairness and Transparency in Ranking (keynote).

A simple interpretation of statistical parity in ranking is to ensure that the proportion of protected individuals appearing within each prefix of the ranking stays above a given threshold, as in 2017_SSDBM Measuring Fairness in Ranked Outputs. One can then check whether a given ranking satisfies this condition by means of a statistical test, as discussed in 2017_CIKM FA*IR - A Fair Top-k Ranking Algorithm.
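
As a rough sketch of such a check (my own code, not taken from either paper), the snippet below requires that every prefix of length k contains at least as many protected candidates as a Binomial(k, p) model would plausibly produce at significance level alpha, using scipy's binomial quantile as the statistical test.

```python
import numpy as np
from scipy.stats import binom

def min_protected(k, p, alpha=0.1):
    """Smallest count of protected candidates in a top-k prefix that is still
    plausible under a Binomial(k, p) model at significance level alpha."""
    return int(binom.ppf(alpha, k, p))

def satisfies_prefix_parity(is_protected, p, alpha=0.1):
    """is_protected: the ranking as a boolean array, True = protected candidate."""
    counts = np.cumsum(is_protected)
    for k in range(1, len(is_protected) + 1):
        if counts[k - 1] < min_protected(k, p, alpha):
            return False
    return True

# Toy ranking where protected candidates only appear near the bottom.
ranking = np.array([False, False, False, False, True, True])
print(satisfies_prefix_parity(ranking, p=0.5))   # False: the top-4 prefix has too few
```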

To achieve statistical parity under this notion, one needs to ensure that a minimum number of protected members appears in each ranking prefix. A simple method that ensures fairness with minimal loss of utility was proposed in 2017_CIKM FA*IR - A Fair Top-k Ranking Algorithm. Individuals are considered in decreasing order of utility, and the ranked list is built incrementally: at each position, the highest-utility individual among those that would not violate the fairness constraint is inserted.
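
A minimal sketch of this greedy construction (my own simplification of the idea, not the authors' implementation; the 40% prefix constraint in the example is arbitrary):

```python
def greedy_fair_topk(utilities, is_protected, k, min_protected):
    """utilities: list of scores; is_protected: parallel list of booleans;
    min_protected(p): minimum protected count required in every top-p prefix."""
    order = sorted(range(len(utilities)), key=lambda i: -utilities[i])
    protected = [i for i in order if is_protected[i]]
    unprotected = [i for i in order if not is_protected[i]]
    ranking, n_prot = [], 0
    for pos in range(1, k + 1):
        need = min_protected(pos)
        if n_prot < need or not unprotected:
            i = protected.pop(0)        # forced to take the best protected candidate
            n_prot += 1
        elif not protected or utilities[unprotected[0]] >= utilities[protected[0]]:
            i = unprotected.pop(0)      # best remaining candidate is unprotected
        else:
            i = protected.pop(0)        # best remaining candidate is protected
            n_prot += 1
        ranking.append(i)
    return ranking

# Example: require at least 40% protected candidates in every prefix (rounded down).
print(greedy_fair_topk([0.9, 0.8, 0.7, 0.6, 0.5, 0.4],
                       [False, False, False, True, True, True],
                       k=4, min_protected=lambda p: int(0.4 * p)))   # [0, 1, 3, 2]
```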

Not all positions within a ranking carry the same importance, with top positions receiving higher levels of attention, a phenomenon known as position bias. Fairness of attention (or exposure) is an individual fairness notion arguing that each ranked object should receive attention proportional to its utility. As this is impossible to satisfy in a single ranking, the goal is to achieve attention/exposure fairness in expectation, as in 2018_KDD Fairness of Exposure in Rankings, or equivalently aim for amortized fairness over a series of rankings, as in 2018_SIGIR Equity of Attention - Amortizing Individual Fairness in Rankings.
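
To make the notion concrete, the sketch below (my own illustration; the logarithmic position-bias model is a common choice, not necessarily the one used in these papers) contrasts the exposure items actually receive in a ranking with the exposure they would deserve if attention were proportional to utility.

```python
import numpy as np

def exposure(n):
    """Position bias of ranks 1..n under a DCG-style logarithmic decay."""
    return 1.0 / np.log2(np.arange(2, n + 2))

def exposure_vs_merit(utilities, ranking):
    """Compare realized exposure with utility-proportional 'deserved' exposure."""
    utilities = np.asarray(utilities, dtype=float)
    realized = np.zeros(len(utilities))
    realized[np.asarray(ranking)] = exposure(len(ranking))
    deserved = realized.sum() * utilities / utilities.sum()
    return realized, deserved

realized, deserved = exposure_vs_merit([0.9, 0.85, 0.1], ranking=[0, 1, 2])
print(realized)   # roughly [1.0, 0.63, 0.5]: item 1 gets far less exposure than item 0
print(deserved)   # despite having almost the same utility
```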

To achieve attention fairness, one can either constrain utility and maximize fairness, or vice versa. 2018_SIGIR Equity of Attention - Amortizing Individual Fairness in Rankings opts for the former optimization problem in the amortized setting and formulates an integer linear program. 2018_KDD Fairness of Exposure in Rankings opts for the latter in expectation and poses a linear program, which is more efficient to solve.
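
As a very rough sketch of the linear-programming route (my own code, loosely in the spirit of 2018_KDD; the equal-average-exposure constraint used here is only one of several constraints discussed there, and the position-bias model, the two-group assumption, and all names are my own choices), the decision variable is a matrix P where P[i, j] is the probability of showing item i at rank j:

```python
import numpy as np
from scipy.optimize import linprog

def fair_exposure_lp(u, group):
    """Sketch of an LP for exposure fairness in expectation (two groups, 0 and 1).
    Objective: maximize expected utility sum_ij u[i] * P[i, j] * v[j], where v[j]
    is the position bias of rank j; constraint: equal average exposure per group."""
    n = len(u)
    v = 1.0 / np.log2(np.arange(2, n + 2))        # logarithmic position bias
    c = -np.outer(u, v).ravel()                   # linprog minimizes, so negate

    rows, rhs = [], []
    for i in range(n):                            # each item fills one rank in expectation
        m = np.zeros((n, n)); m[i, :] = 1.0
        rows.append(m.ravel()); rhs.append(1.0)
    for j in range(n):                            # each rank holds one item in expectation
        m = np.zeros((n, n)); m[:, j] = 1.0
        rows.append(m.ravel()); rhs.append(1.0)

    m = np.zeros((n, n))                          # equal average exposure across groups
    for i in range(n):
        sign = 1.0 if group[i] == 0 else -1.0
        m[i, :] = sign * v / np.sum(group == group[i])
    rows.append(m.ravel()); rhs.append(0.0)

    res = linprog(c, A_eq=np.array(rows), b_eq=np.array(rhs), bounds=(0, 1))
    return res.x.reshape(n, n)

P = fair_exposure_lp(u=np.array([0.9, 0.8, 0.3, 0.2]), group=np.array([0, 0, 1, 1]))
print(np.round(P, 2))   # a doubly stochastic matrix of rank probabilities
```

In the paper, the resulting doubly stochastic matrix is then decomposed (via a Birkhoff-von Neumann decomposition) into a distribution over concrete rankings, from which rankings can be sampled.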

Fairness in Recommenders

The following discussion primarily concerns collaborative filtering techniques. Typically, in recommender systems we talk about recommendations of items to users. However, in the fairness context, it makes sense to abstract to objects and subjects. First, individuals that may be discriminated against can take either role. For example, when recommending job offers, the subjects are people that may be discriminated based on race or gender; when looking for a ride, the recommended objects are taxi drivers; in social networks, friend suggestions involve individuals as subjects and objects. Therefore, fairness in recommenders can have multiple viewpoints, as discussed in 2017_FATML Multisided Fairness for Recommendation: fairness for subjects, termed consumer fairness or C-fairness, and fairness for objects, termed producer fairness or P-fairness.

Second, talking about subjects and objects helps to better draw analogies to the classification and ranking tasks. Recall that a recommender can be viewed in two ways: as a classifier, where a particular object is deemed relevant or not for each subject, and as a ranker, where for a particular subject the objects are to be ranked.

Fairness for subjects

2017_NIPS Beyond Parity - Fairness Objectives for Collaborative Filtering considers group fairness for subjects in model-based collaborative filtering techniques that predict ratings (e.g., matrix factorization). Fairness is achieved at the ideal state in which the protected group incurs rating prediction errors in parity with the non-protected group. Deviation from this ideal state is quantified with four different metrics, which are introduced into the learning process as regularization terms.
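
As an illustration of how such a term can look, here is a sketch of one of the metrics (to my understanding, the one the paper calls value unfairness; treat the exact formula as my assumption): it measures how differently the model over- or under-estimates ratings for the two groups, averaged over items, and would be added to the training loss with some weight.

```python
import numpy as np

def value_unfairness(pred, true, observed, protected):
    """pred, true: user x item rating matrices; observed: boolean mask of known ratings;
    protected: boolean per-user group indicator. Returns the average (over items) absolute
    difference between the two groups' mean signed prediction errors."""
    total = 0.0
    n_items = pred.shape[1]
    for j in range(n_items):
        errs = []
        for g_mask in (protected, ~protected):
            seen = observed[:, j] & g_mask
            errs.append((pred[seen, j] - true[seen, j]).mean() if seen.any() else 0.0)
        total += abs(errs[0] - errs[1])
    return total / n_items

# Toy example: the model systematically under-predicts for the protected group on item 0.
true = np.array([[4.0, 3.0], [4.0, 2.0], [4.0, 3.0], [4.0, 2.0]])
pred = np.array([[3.0, 3.0], [3.0, 2.0], [4.0, 3.0], [4.0, 2.0]])
observed = np.ones_like(true, dtype=bool)
protected = np.array([True, True, False, False])
print(value_unfairness(pred, true, observed, protected))   # 0.5 = (|-1 - 0| + 0) / 2
```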

The same concept is pursued in a different manner in 2018_FAT Balanced Neighborhoods for Multi-sided Fairness in Recommendation. When aggregating other subjects' opinions, as in user-based collaborative filtering, the idea is to ensure that the protected group is well represented. In a user-based variant of 2011_ICDM SLIM - Sparse Linear Methods for Top-N Recommender Systems, this means that the total weight assigned to protected and non-protected group members should be equal. To achieve this notion of fairness, a measure of deviation from this state is added as a regularization term.
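
A minimal sketch of how such a regularizer could look (my own illustration of the idea, not the paper's formulation or code): the neighborhood weights learned for a target user are penalized whenever the total weight on protected neighbors drifts away from the total weight on non-protected neighbors.

```python
import numpy as np

def balance_penalty(weights, is_protected):
    """weights: learned neighborhood coefficients for one target user;
    is_protected: boolean indicator per neighbor. The penalty is zero when the
    protected and non-protected neighbors carry equal total weight."""
    p = np.where(is_protected, 1.0, -1.0)
    return float(np.dot(p, weights) ** 2)

def objective(weights, ratings, target, is_protected, l2=0.1, l_bal=1.0):
    """Sketch of a user-based, SLIM-like objective with a balance regularizer:
    squared reconstruction error + L2 penalty + balance penalty."""
    recon = ratings.T @ weights            # predict the target user's ratings from neighbors
    err = np.sum((target - recon) ** 2)
    return err + l2 * np.sum(weights ** 2) + l_bal * balance_penalty(weights, is_protected)

# Equal total weight on the two groups zeroes the balance term.
w = np.array([0.3, 0.2, 0.25, 0.25])
print(balance_penalty(w, np.array([True, True, False, False])))   # 0.0
```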

Fairness for objects

Group fairness for objects is studied in 2018_RecSys Calibrated Recommendations; it may help to picture groups of objects in analogy to item categories, like movie genres. For a particular subject, a list of recommendations is fair (called calibrated in the paper) if it contains objects of various groups with a ratio that is equal to the group ratio present in the subject's input preferences. For example, if a user has liked 7 romance and 3 action movies, a recommended list is fair if it contains 70% romance and 30% action movies. Deviation from this state is quantified by a distance metric (KL divergence). To ensure fair recommendations, a greedy iterative re-ranking approach (post-processing), in the spirit of MMR, is taken. The goal is to construct a list that balances the utility of the objects selected and the list's deviation from the input preferences.
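
A rough sketch of the calibration metric and the greedy re-ranking (my own code; the smoothing of the list's genre distribution and the trade-off weight lambda are simplifications of the paper's formulation):

```python
import numpy as np

def kl(p, q, eps=0.01):
    """KL divergence between genre distributions, with light smoothing on q."""
    q = (1 - eps) * q + eps * p
    return float(np.sum(np.where(p > 0, p * np.log(p / q), 0.0)))

def calibrated_rerank(scores, genres, history_dist, k, lam=0.5):
    """Greedily build a top-k list trading off utility against calibration.
    scores: relevance per candidate; genres: candidate x genre distribution rows;
    history_dist: genre distribution of the user's past preferences."""
    chosen = []
    for _ in range(k):
        best, best_val = None, -np.inf
        for i in range(len(scores)):
            if i in chosen:
                continue
            cand = chosen + [i]
            q = genres[cand].mean(axis=0)                    # genre mix of the candidate list
            val = (1 - lam) * scores[cand].sum() - lam * kl(history_dist, q)
            if val > best_val:
                best, best_val = i, val
        chosen.append(best)
    return chosen

# User history: 70% romance, 30% action; candidates are pure-genre items.
genres = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
scores = np.array([0.9, 0.8, 0.7, 0.85, 0.6])
print(calibrated_rerank(scores, genres, history_dist=np.array([0.7, 0.3]), k=3))
```

With the 70/30 romance/action history above, the greedy pass picks two romance items and one action item, roughly matching the input ratio.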

In a similar spirit, 2018_CIKM_DAB Bias Disparity in Recommendation Systems is concerned with group fairness for objects. Acknowledging that biases can be inherent in the input data, a recommender is regarded as fair when it does not amplify existing biases, i.e., when it avoids bias disparity. Specifically, fairness is achieved when the recommender compiles a set of objects such that the ratio of objects from various groups (output bias) is the same as the ratio present in the subject's input preferences (input bias). Deviation from this ideal state is captured by the relative change in the ratio per group. A re-ranking method is proposed to achieve fairness in the case of a single protected group: it works by iteratively swapping a low-utility non-protected object with a high-utility protected object.
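
A small sketch of the metric (my own code and variable names): the input bias is the fraction of a user's input preferences that falls in a group, the output bias is the corresponding fraction in the recommendations, and bias disparity is their relative change.

```python
import numpy as np

def bias(items, item_group, g):
    """Fraction of the given items belonging to group g."""
    return np.mean(np.asarray([item_group[i] for i in items]) == g)

def bias_disparity(input_items, recommended_items, item_group, g):
    """Relative change between output bias and input bias for group g."""
    b_in = bias(input_items, item_group, g)
    b_out = bias(recommended_items, item_group, g)
    return (b_out - b_in) / b_in

# Toy example: the user's history is 40% group-0 items, but the recommendations
# contain only 20%, so group 0 is further underrepresented in the output.
item_group = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1}
history = [0, 1, 3, 4, 5]              # 2 of 5 items from group 0
recs = [2, 3, 4, 5, 6]                 # 1 of 5 items from group 0
print(bias_disparity(history, recs, item_group, g=0))   # -0.5
```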

Fairness for groups of subjects

A different line of work concerns group recommender systems, where the goal is to recommend to a group of subjects, e.g., friends looking to enjoy a movie. In this case, fairness is understood in terms of how well the system respects the individual preferences of the group members. Note that here, the term group refers to the set of subjects collectively receiving a recommendation, and is thus not to be confused with the concept of protected groups in group fairness. Actually, fairness in group recommendations is an individual fairness notion.

2017_WWW Fairness in Package-to-Group Recommendations

2017_RecSys Fairness-Aware Group Recommendation with Pareto-Efficiency

2019_SAC Top-N Group Recommendations with Fairness

Events

More events are listed at:

Projects

More projects are listed at:

Contact

For comments, please contact me by email.