Is my model fair? It depends on what you mean.
Everyone agrees: machine learning models should be fair, meaning that they should not discriminate against unprivileged groups or exacerbate social inequalities. While we can agree in principle, the details immediately lead to disagreement. Various definitions of fairness can and do conflict. Understanding and prioritizing among these notions of fairness is a key responsibility for our leaders, from government regulators down through corporate risk management officers.
Here, I’ll give a few key points on model fairness and save the detailed discussion for the future.
Ignorance as compliance?
A prominent default approach among data scientists is to restrict model inputs to exclude any protected class information (like race, religion, etc.)*. This approach avoids ‘direct discrimination’ and follows the basic fairness principle that two identical people of different protected classes should be treated the same. This definition of fairness has been highlighted recently in the debate over race-based college admissions, where the Supreme Court decided that colleges cannot explicitly consider students’ race when evaluating their applications.
The problem with this approach is that it does not address the danger of ‘indirect discrimination’, which occurs when model inputs act as proxies for protected class information. Redlining is the classic American example, where loan approvals were based on neighborhood. Since neighborhoods were so segregated, the decisions effectively discriminated based on race.
Today, the redlining example is misleading because it is too simple. Models now often have hundreds of inputs or more, and slight effects from many inputs can combine to create a large aggregate impact. This sort of indirect discrimination is difficult to detect and may often be the accidental product of well-meaning data scientists.
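To make this concrete, here is one way a data scientist might probe for aggregate proxies: check whether the protected attribute can be predicted from the model’s other inputs. This is only a sketch on synthetic data; the feature setup, sample size, and effect sizes are illustrative assumptions, not a recipe.

```python
# Sketch: testing whether a protected attribute is recoverable from the
# model's *other* inputs. If it is, those inputs jointly act as a proxy,
# even when no single input looks suspicious on its own.
# All data here is synthetic and purely illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 5_000

# Hypothetical protected attribute (never fed to the loan model itself).
group = rng.integers(0, 2, size=n)

# Fifty ordinary-looking inputs, each only weakly correlated with `group`.
features = np.column_stack(
    [rng.normal(loc=0.1 * group, scale=1.0, size=n) for _ in range(50)]
)

# If `group` can be predicted from the features, they form an aggregate proxy.
proxy_auc = cross_val_score(
    LogisticRegression(max_iter=1000), features, group,
    cv=5, scoring="roc_auc",
).mean()
print(f"Protected attribute recoverable with AUC of roughly {proxy_auc:.2f}")
# AUC near 0.5 suggests little proxy information; closer to 1.0 suggests
# strong indirect leakage, even though each input alone is nearly useless.
```

In this toy setup, each individual input barely distinguishes the groups, yet together they recover the protected attribute well above chance, which is exactly the aggregation problem described above.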
Bias testing: the promise and the peril
Alternatively, it is increasingly popular to require model creators to use protected class information, rather than ignoring it. In this approach, data scientists typically choose a ‘fairness metric’ and calculate it for each protected class group. A model is considered fair if the metric is similar across groups. The testing approach is attractive because it applies objective, measurable criteria.
But disputes immediately arise when it comes to defining a ‘fairness metric’. As an example, consider the potential racial discrimination in a model that determines car loan approval:
Is the model fair if it grants approvals to each racial group at similar rates? Here, the ‘fairness metric’ is approval rate.
Or is it more fair to focus on accuracy, perhaps by ensuring that a racial group’s high approval rate is justified by that group’s high rate of successful loan repayment? Here, the ‘fairness metric’ is model precision.
Reasonable people will make different choices, and there are countless additional options available. To make matters worse, it is often impossible to satisfy multiple metrics simultaneously.
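For concreteness, here is a minimal sketch of how both metrics above could be computed per group for a hypothetical loan model. The `fairness_report` helper and the tiny dataset are made up for illustration, not part of any standard library.

```python
# Sketch: per-group approval rate (the first metric) and precision (the second)
# for a hypothetical car-loan model. `approved` is the model's decision,
# `repaid` is the observed outcome; all values are made up.
import numpy as np

def fairness_report(group, approved, repaid):
    """Print each group's approval rate and precision
    (among those approved, the fraction who repaid)."""
    group, approved, repaid = map(np.asarray, (group, approved, repaid))
    for g in np.unique(group):
        mask = group == g
        approval_rate = approved[mask].mean()
        approved_mask = mask & (approved == 1)
        precision = repaid[approved_mask].mean() if approved_mask.any() else float("nan")
        print(f"group={g}: approval_rate={approval_rate:.2f}, precision={precision:.2f}")

# Tiny made-up example with equal approval rates but unequal precision,
# showing how the two metrics can point in different directions.
group    = ["A", "A", "A", "A", "B", "B", "B", "B"]
approved = [1,   1,   0,   0,   1,   1,   0,   0]
repaid   = [1,   1,   0,   1,   1,   0,   0,   0]
fairness_report(group, approved, repaid)
# group=A: approval_rate=0.50, precision=1.00
# group=B: approval_rate=0.50, precision=0.50
```

In this toy data, the model passes the approval-rate test but fails the precision test, a small illustration of why the choice of metric matters so much.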
What can we do?
Given the messy set of options available for bias testing, most companies will simply revert to the default option of restricting model inputs. In fact, a company that voluntarily performs bias testing opens itself up to legal liability or public scrutiny. As a relevant case study, consider when Meta performed internal studies finding that Instagram use is correlated with poor mental health outcomes, particularly for teenage girls. The study results eventually leaked and led to public outcry and congressional hearings. Meta’s leadership was either brave or ignorant in pursuing the internal study in the first place. From this episode, most companies will learn to avoid potentially self-critical analysis, rather than making their products safer.
While moving beyond the default to new fairness metrics is difficult, I still have hope. However we define fairness, simply raising awareness of the risks will lead to more thoughtful models. And some model governance practices, like including diverse voices, are more universally accepted. As we develop new regulations, I look forward to a more public debate where we can hash out the best approaches.
*note on language: “protected class” has a technical definition referring to any characteristic that it is illegal to discriminate based on. What is included will differ by context and jurisdiction. My arguments apply equally well to characteristics that, while not legally protected, we may still want to avoid discriminating based on.