Fairgen is on a mission to make applications of AI systematically fair. The first question we always get is: what does being fair even mean? Interestingly, the consensus among AI ethics experts is that there can be no such consensus. There are plenty of fairness definitions, and they are mutually exclusive -- they cannot all be satisfied at the same time.
The right notion of fairness should hence be decided upon by a community of stakeholders (e.g., employees, customers, executives, regulators) on a case-by-case basis. Outlined below are the four main fairness notions for anyone interested in making their data pipeline ethical.
1. Outcome-based Demographic Fairness
The most common notion of fairness is demographic parity, which evaluates the disparity in the rate of positive outcomes across subpopulations (e.g., by ethnicity, gender, or nationality).
Example: Loan fairness can be evaluated using the adverse impact ratio (AIR) -- which measures the disparity in loan-approval rates between males and females, or other protected subpopulations. In order to be fair according to the AIR, a bank needs to approve loans at the same rate for each protected subgroup (e.g., for males and females).
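As a rough sketch, the AIR can be computed directly from decision records. All data and group labels below are illustrative, and the 0.8 threshold mentioned at the end follows the common "four-fifths rule" of thumb:

```python
def positive_rate(outcomes, groups, g):
    # Share of positive outcomes (e.g., approved loans) within group g
    sel = [o for o, grp in zip(outcomes, groups) if grp == g]
    return sum(sel) / len(sel)

def adverse_impact_ratio(outcomes, groups, protected, reference):
    # AIR: positive-outcome rate of the protected group divided by that
    # of the reference group; 1.0 means perfect demographic parity
    return (positive_rate(outcomes, groups, protected)
            / positive_rate(outcomes, groups, reference))

# Toy loan decisions (1 = approved), purely illustrative
outcomes = [1, 0, 1, 0, 0, 1, 1, 1, 1, 0]
groups   = ["F", "F", "F", "F", "F", "M", "M", "M", "M", "M"]

air = adverse_impact_ratio(outcomes, groups, protected="F", reference="M")
# F rate = 2/5 = 0.4, M rate = 4/5 = 0.8, so air = 0.5
```

Under the four-fifths rule, an AIR below 0.8 is commonly flagged as evidence of adverse impact; the 0.5 ratio above would fail that check.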
Satisfying demographic notions of fairness typically requires positive discrimination: certain subpopulations may be less educated, or have a less favorable financial background, and would hence be less likely to obtain a positive outcome on merit alone.
2. Performance-based Demographic Fairness
Another fairness option consists of evaluating the performance of a decision pipeline across different subgroups. The goal in this case is to make sure the algorithm is equally performant when making decisions on each protected subgroup. A few variations exist, such as Equalised Odds, which requires the true positive and false positive rates in each protected subgroup to be equal.
Example: A healthcare company implemented a detection algorithm for illness X. However, they noticed that their algorithm has an accuracy of 99% when detecting said illness in male patients, but only 89% when applying it to female patients. In order to be fairer, they aim to bring the female accuracy up towards the level of the male accuracy.
Note that demographic fairness (see 1.) may be counter-productive in this example, as it would enforce detecting the same proportion of ill patients in the male and female subgroups even though there may be fewer ill patients in one of them.
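A minimal way to quantify an Equalised Odds violation is to compare per-group true and false positive rates. The toy labels and predictions below are hypothetical:

```python
def rates(y_true, y_pred):
    # True positive rate and false positive rate of a binary classifier
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    pos = sum(y_true)
    neg = len(y_true) - pos
    return tp / pos, fp / neg

def equalised_odds_gaps(y_true, y_pred, groups, a, b):
    # Absolute TPR and FPR differences between groups a and b;
    # Equalised Odds is satisfied when both gaps are zero
    def group_rates(g):
        yt = [t for t, grp in zip(y_true, groups) if grp == g]
        yp = [p for p, grp in zip(y_pred, groups) if grp == g]
        return rates(yt, yp)
    (tpr_a, fpr_a), (tpr_b, fpr_b) = group_rates(a), group_rates(b)
    return abs(tpr_a - tpr_b), abs(fpr_a - fpr_b)

# Toy diagnoses: identical ground truth, different error profiles by group
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0, 0, 0]
groups = ["M", "M", "M", "M", "F", "F", "F", "F"]

tpr_gap, fpr_gap = equalised_odds_gaps(y_true, y_pred, groups, "M", "F")
# TPR: 1.0 (M) vs 0.5 (F); FPR: 0.5 (M) vs 0.0 (F) -> gaps (0.5, 0.5)
```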
3. Fairness Through Unawareness (FTU)
FTU consists of removing the sensitive attribute from the data, i.e., dropping the sensitive column from the table. While it may seem intuitive that this makes an AI less biased, it is actually pretty ineffective in practice: the sensitive information is encoded in other proxy variables, which leak the bias into the rest of the decision pipeline.
Example: To reduce the bias in their loan decision process, a bank removes the gender column. However, they keep education as a variable, and since, e.g., 65% of Math students in universities are male, education is a good proxy for gender. If the bank leverages education as a feature in its decision pipeline, bias can still be perpetuated.
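The leakage can be sketched with a toy applicant pool built to mirror the 65% statistic above (all counts are hypothetical): a decision rule that never looks at gender still approves men at nearly twice the female rate, because it conditions on the proxy.

```python
# Hypothetical pool: 20 math applicants (13 M, 7 F, i.e. 65% male)
# and 20 applicants from other fields (7 M, 13 F)
applicants = ([("M", "math")] * 13 + [("F", "math")] * 7
              + [("M", "other")] * 7 + [("F", "other")] * 13)

# "Gender-blind" rule: the decision only reads the education field
decisions = [(gender, field == "math") for gender, field in applicants]

def approval_rate(decisions, gender):
    outcomes = [approved for g, approved in decisions if g == gender]
    return sum(outcomes) / len(outcomes)

male_rate = approval_rate(decisions, "M")    # 13/20 = 0.65
female_rate = approval_rate(decisions, "F")  # 7/20 = 0.35
```

Even though the gender column was dropped, the outcome disparity (0.65 vs. 0.35) survives intact through the education proxy.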
4. Individual Fairness
Finally, individual fairness is satisfied if similar individuals receive similar decisions. This fairness notion is hence related to the robustness of AI algorithms. However, it has little implication for subgroup notions of fairness -- and typically contradicts them.
Example: Insurance company X is interested in individual fairness. It aims to make sure that two similar individuals applying for insurance will obtain similar pricing.
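One common formalisation of individual fairness is a Lipschitz condition: the gap between two decisions should be bounded by the distance between the two individuals. A sketch, with an entirely hypothetical pricing model and distance metric (choosing the metric is the hard part in practice):

```python
def distance(x, y):
    # Illustrative similarity metric over feature vectors (L1 distance)
    return sum(abs(a - b) for a, b in zip(x, y))

def is_individually_fair(model, x1, x2, L=1.0):
    # Lipschitz-style check: the output gap between two individuals
    # must be bounded by L times the distance between them
    return abs(model(x1) - model(x2)) <= L * distance(x1, x2)

# Hypothetical pricing model: premium grows with risk score and age
def price(x):
    risk_score, age = x
    return 100 + 2 * risk_score + 0.5 * age

alice = (30.0, 40.0)  # (risk score, age)
bob   = (31.0, 40.0)  # nearly identical applicant

fair = is_individually_fair(price, alice, bob, L=5.0)
# |180 - 182| = 2 <= 5 * 1, so the check passes for L = 5
```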