Being colorblind doesn’t mean that color doesn’t exist. Similarly, leaving sensitive factors such as race and sex out of algorithms doesn’t mean the algorithms won’t carry biases rooted in race or sex. Those biases are ingrained in society, and therefore in the data. Most algorithms are literal; their outputs are a function of the patterns they observe.
Nonetheless, a common technique developers apply is straight omission, despite its repeated failure. Kwok of Yale’s School of Management explains that when race is removed from racially biased algorithms, a subtler bias, “latent discrimination,” is introduced: other factors correlated with race, such as income or location, essentially serve as proxies for it. The Harvard Business Review likewise investigated an employment-recruitment scenario and found that proxies in the data could predict gender with 91% accuracy.
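To make the proxy problem concrete, here is a minimal sketch of how an omitted attribute can leak back in through correlated features. The data is synthetic and the numbers are illustrative assumptions, not the Harvard Business Review study: the sensitive attribute is dropped from the feature set, yet a simple model recovers it from location and income alone.

```python
# Minimal sketch of proxy leakage on synthetic data (illustrative assumptions only).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5_000
sensitive = rng.integers(0, 2, n)                    # the attribute we "omit"
zip_code = sensitive * 5 + rng.integers(0, 8, n)     # proxy 1: residential segregation
income = 40 + 25 * sensitive + rng.normal(0, 10, n)  # proxy 2: income in $1,000s

# The sensitive attribute is deliberately excluded from the features.
X = np.column_stack([zip_code, income])
X_train, X_test, y_train, y_test = train_test_split(X, sensitive, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"Omitted attribute recovered with {model.score(X_test, y_test):.0%} accuracy")
```

The “colorblind” model never sees the sensitive attribute, yet the attribute remains largely recoverable from the remaining features, which is exactly the latent discrimination Kwok describes.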
The omission strategy extends beyond individual scenarios, though. During a recent conference on AI Regulation at California Western School of Law, a French panelist noted that France doesn’t have to deal with the racial bias issue because it simply does not collect race as a factor. This is due to the GDPR, whose Article 9 prohibits processing ‘special categories of data,’ which cover sensitive factors as well as proxies that may reveal them. The article reads as follows:
Processing of personal data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, or trade union membership, and the processing of genetic data, biometric data for the purpose of uniquely identifying a natural person, data concerning health or data concerning a natural person’s sex life or sexual orientation shall be prohibited.
Countries subject to the GDPR, such as France, still have racial biases; those biases simply cannot be measured because the data is never collected. One could argue, however, that biases don’t need to be “fixed” at all, since our algorithms should reflect real life. When ProPublica criticized the maker of COMPAS, a recidivism-risk algorithm, after finding that black defendants were nearly twice as likely as their white counterparts to be wrongly classified as high risk, the company and other researchers responded that, because the groups’ underlying recidivism rates differ, it is mathematically impossible for any such algorithm to avoid racial gaps of this kind.
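The mathematical claim behind that defense is real. Chouldechova’s impossibility result shows that when two groups reoffend at different base rates, a risk score cannot simultaneously be calibrated and produce equal false positive and false negative rates for both groups. Here is a minimal sketch with hypothetical numbers, not the COMPAS data:

```python
# Minimal sketch of the impossibility argument (hypothetical numbers, not COMPAS data).
# If a risk score is calibrated (same positive predictive value for both groups) and
# misses the same share of true reoffenders (same false negative rate), groups with
# different base rates must end up with different false positive rates, via
# FPR = p / (1 - p) * (1 - PPV) / PPV * (1 - FNR).

def false_positive_rate(base_rate: float, ppv: float, fnr: float) -> float:
    return base_rate / (1 - base_rate) * (1 - ppv) / ppv * (1 - fnr)

ppv, fnr = 0.6, 0.35  # identical for both groups: calibrated, same miss rate
for group, base_rate in [("group A", 0.5), ("group B", 0.3)]:
    fpr = false_positive_rate(base_rate, ppv, fnr)
    print(f"{group}: base rate {base_rate:.0%} -> false positive rate {fpr:.1%}")
# group A: ~43%, group B: ~19% -- the gap is forced by the differing base rates.
```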
That defense is problematic, though, because algorithms can amplify and perpetuate biases. Predictive policing, for example, tends to direct law enforcement toward black and brown neighborhoods on the basis of past data. But that past data is itself biased by heightened racial tensions and historically heavier policing, and increased enforcement in those areas produces more arrests, skewing future data and widening the racial disparity among arrestees.
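A minimal simulation of that feedback loop, with purely hypothetical numbers, is sketched below: both neighborhoods have identical underlying offense rates, but the historical arrest data starts out skewed, patrols are concentrated where past arrests are highest, and recorded arrests scale with patrol presence.

```python
# Minimal sketch of a predictive-policing feedback loop (hypothetical numbers).
arrests = [120.0, 80.0]      # historical recorded arrests, already skewed
offense_rate = [0.10, 0.10]  # identical true behavior in both neighborhoods
patrol_hours = 1000          # fixed enforcement budget per year

for year in range(1, 6):
    # "Hot spot" prioritization: patrols concentrate more than proportionally
    # on the neighborhood with the most recorded arrests (exponent 1.5).
    weights = [a ** 1.5 for a in arrests]
    patrols = [patrol_hours * w / sum(weights) for w in weights]
    # Recorded arrests depend on patrol presence, not just underlying behavior.
    new_arrests = [rate * hours for rate, hours in zip(offense_rate, patrols)]
    arrests = [a + n for a, n in zip(arrests, new_arrests)]
    share = arrests[0] / sum(arrests)
    print(f"year {year}: neighborhood A's share of recorded arrests = {share:.1%}")
```

Neighborhood A’s share of recorded arrests climbs every year even though behavior never differs between the two areas; the algorithm’s own outputs generate the data that appears to confirm them.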
We need a solution that prevents algorithms from perpetuating cycles of existing bias, and simply ignoring sensitive factors only masks the issue. The U.S. lacks a regulatory framework under which organizations measure and mitigate their own bias. The White House Office of Science and Technology Policy’s Blueprint for an AI Bill of Rights outlines thorough recommendations for best practices, but the lack of enforcement undermines its effectiveness, as evidenced by the harmful, biased algorithms still being deployed. Since sweeping bans such as GDPR Article 9 do little to mitigate bias, I argue that policymakers’ role shouldn’t be to tell developers how to minimize bias but rather to act as regulators who strictly hold developers accountable through audits.
Here is a sample auditing framework that draws heavily on the National Institute of Standards and Technology’s (NIST) identification of three primary categories of AI bias: systemic, computational, and human.
Assessment of AI System Objectives
  - Purpose of System
  - Assumptions Regarding Fairness and Bias
  - Organizational Norms (e.g., Implicit Bias Training)
  - Diversity of Team
Data Management and Analysis
  - Data Collection Oversight
  - Proxy Identification
Algorithm Development and Model Training
  - Transparent Design
  - Bias Mitigation Techniques Used
Testing and Evaluation
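As an illustration of what the Testing and Evaluation step might require auditors to report, here is a minimal sketch. The column names, toy data, and the four-fifths threshold are hypothetical placeholders; a real audit would define its own protected groups, metrics, and tolerances.

```python
# Minimal sketch of a Testing and Evaluation audit check (hypothetical data and names).
import pandas as pd

def audit_disparities(df: pd.DataFrame, group_col: str, label_col: str, pred_col: str) -> pd.DataFrame:
    """Report each group's selection rate and true positive rate."""
    rows = []
    for group, g in df.groupby(group_col):
        positives = g[g[label_col] == 1]
        rows.append({
            group_col: group,
            "selection_rate": g[pred_col].mean(),
            "true_positive_rate": positives[pred_col].mean() if len(positives) else float("nan"),
        })
    return pd.DataFrame(rows)

# Hypothetical audit data: model decisions alongside observed outcomes and group membership.
df = pd.DataFrame({
    "group":     ["a"] * 6 + ["b"] * 6,
    "outcome":   [1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0],
    "predicted": [1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0],
})
report = audit_disparities(df, "group", "outcome", "predicted")
print(report)

# One possible flag: the four-fifths rule compares the lowest selection rate to the highest.
ratio = report["selection_rate"].min() / report["selection_rate"].max()
print(f"Selection-rate ratio (four-fifths benchmark of 0.80): {ratio:.2f}")
```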