Artificial intelligence and machine learning are becoming common in research and everyday life, raising concerns about how these algorithms work and the predictions they make. For example, when Apple released its credit card over the summer, there were claims that women were given lower credit limits than otherwise identical men. In response, Sen. Elizabeth Warren warned that women “might have been discriminated against, on an unknown algorithm.”
On its face, her statement appears to contradict the way algorithms work. Algorithms are logical mathematical functions and processes, so how can they discriminate against a person or a certain demographic?
Creating an algorithm that discriminates or shows bias isn’t as hard as it might seem, however. When I was a first-year graduate student, my advisor asked me to create a machine-learning algorithm to analyze a survey sent to United States physics instructors about teaching computer programming in their courses. While programming is an essential skill for physicists, many undergraduate physics programs do not offer programming courses, leaving individual instructors to decide whether to teach programming.
The task seemed simple enough. I’d take an algorithm from Python’s scikit-learn library, supply the physics instructors’ responses to the survey, and train it to predict whether a respondent had experience teaching programming. My algorithm would then tell me whether the instructors taught programming and which questions on the survey were most useful in making that prediction.
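In scikit-learn, that workflow can be sketched in a few lines. To be clear, everything below is illustrative: the data are randomly generated stand-ins for the real survey responses, and the random forest classifier is one plausible choice, not necessarily the algorithm my analysis actually used.

```python
# A minimal sketch of the pipeline described above, using made-up data
# in place of the real survey responses.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical survey: 200 instructors, 5 multiple-choice questions
# encoded numerically; labels mark who taught programming.
X = rng.integers(0, 5, size=(200, 5)).astype(float)
y = rng.integers(0, 2, size=200)

# Train a classifier to predict who taught programming ...
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, y)

# ... then ask which survey questions drove the prediction.
for question, importance in enumerate(model.feature_importances_):
    print(f"Question {question}: importance {importance:.3f}")
```

The feature importances are the part that mattered for my project: they rank the survey questions by how useful the model found them, which is exactly the kind of output that can mislead you if the underlying algorithm is the wrong one for the data.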
When I did that, however, I noticed a problem. My algorithm kept finding that only the written response questions (and none of the multiple-choice questions) differentiated the two groups of instructors. When I analyzed those questions using a different technique, I didn’t find any differences between the instructors who taught and did not teach programming! It turned out that I had been using the wrong algorithm the whole time.
My example may seem silly. So what if I chose the wrong algorithm to predict which instructors teach programming? But what if I had instead been creating a model to predict which patients should receive extra care? Then using the wrong algorithm could be a significant problem.
Yet this isn’t hypothetical, as a recent study in Science showed. In the study, researchers examined an algorithm created to find patients who may be good fits for a “high-risk care management” program. They found that among white and black patients the algorithm identified as having equal risk, the black patients were, on average, sicker than the white patients. In other words, the algorithm treated patients with unequal needs as if their needs were the same.
Just as in my research, the health care company had used the wrong algorithm. The designers created the algorithm to predict health care costs rather than severity of illness. Because white patients have better access to care and therefore spend more on health care, the algorithm assigned less-ill white patients the same level of risk as sicker black patients. The researchers claim that similar algorithms are applied to around 200 million Americans each year, so who knows how many lives may have been lost to what the study authors called a “racial bias in an algorithm”?
What then can we do to combat this bias? I learned that I used an incorrect algorithm because I visualized my data, saw that my algorithm’s predictions were not aligned with what my data or previous research said, and could not remove the discrepancy regardless of how I changed my algorithm. Likewise, to combat any bias, policy ideas need to focus on the algorithms and the data.
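One concrete version of the sanity check I describe is comparing a model against a trivial baseline: if a model barely beats always guessing the majority class, the features it flags as “important” deserve skepticism. The sketch below uses deliberately meaningless, randomly generated data, where a trustworthy analysis should show no real signal.

```python
# Sanity check: compare a model's cross-validated accuracy against a
# trivial baseline, using made-up features unrelated to the labels.
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))      # hypothetical survey features
y = rng.integers(0, 2, size=200)   # labels unrelated to the features

model_acc = cross_val_score(
    RandomForestClassifier(random_state=0), X, y, cv=5).mean()
baseline_acc = cross_val_score(
    DummyClassifier(strategy="most_frequent"), X, y, cv=5).mean()

# With labels unrelated to the features, the model should not
# meaningfully beat the baseline; if it appears to, something is wrong.
print(f"model: {model_acc:.2f}  baseline: {baseline_acc:.2f}")
```

Had I run a check like this earlier, the mismatch between what my algorithm claimed and what the data could actually support would have surfaced much sooner.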
To address issues with the algorithm, we can push for algorithmic transparency, where anyone could see how an algorithm works and contribute improvements. However, since most commercial machine-learning algorithms are considered proprietary information, companies may not be willing to share them.
A more practical route may be to regularly test algorithms for potential bias and discrimination. The companies themselves could conduct this testing, as the House of Representatives’ Algorithmic Accountability Act would require, or the testing could be performed by an independent nonprofit accreditation board, such as the proposed Forum for Artificial Intelligence Regularization (FAIR).
To make sure the testing is fair, the data themselves need to be fair. For example, crime-predicting algorithms are trained on historical crime data, in which people from racial and ethnic minority groups are overrepresented, so the algorithm may make biased predictions even if it is constructed correctly. We therefore need to ensure that representative data sets are available for testing.
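What might such a bias test look like in practice? A toy illustration: given an algorithm’s flags, the true outcomes, and group membership, compare how often the algorithm misses truly high-risk people in each group. All of the data here are fabricated to make the disparity visible; a real audit would use real outcomes and a principled fairness metric.

```python
# Toy bias audit: compare miss rates across two hypothetical groups.
import numpy as np

rng = np.random.default_rng(1)

# Fabricated data: group membership, true risk, and the algorithm's flags.
group = rng.integers(0, 2, size=1000)
truly_high_risk = rng.integers(0, 2, size=1000)
# Built-in disparity: the algorithm catches every high-risk person in
# group 0 but misses roughly half of them in group 1.
flagged = np.where(group == 0,
                   truly_high_risk,
                   truly_high_risk & rng.integers(0, 2, size=1000))

for g in (0, 1):
    mask = (group == g) & (truly_high_risk == 1)
    miss_rate = 1 - flagged[mask].mean()
    print(f"Group {g}: fraction of high-risk patients missed = {miss_rate:.2f}")
```

An audit of this kind is essentially what the Science researchers did: they checked the algorithm’s output against a direct measure of how sick patients actually were, broken down by race.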
These changes will not come easily. As machine learning and artificial intelligence become more essential to our lives, we must ensure our laws and regulations keep pace. Machine learning is already revolutionizing entire industries, and we are only at the beginning of that revolution. We as citizens need to hold algorithm developers and users accountable to ensure that the benefits of machine learning are equitably distributed. By taking appropriate precautions, we can ensure that algorithmic bias is a bug and not a feature of future algorithms.