Human choices influence almost every aspect of AI, from curating datasets to labeling training data. This creates a particularly complex hurdle for computer scientists: how to stop human biases making their way into AI systems, with potentially harmful results.
The stakes are high. With AI now used to make weighty decisions in banking, hiring and law enforcement, it can help decide, for instance, who gets a bank loan, job interview or parole. Enter Aida Mostafazadeh Davani, a USC computer science Ph.D. student, whose work focuses on computational social sciences and the ethics of AI.
Advised by Morteza Dehghani, an associate professor of psychology and computer science, Davani is part of USC’s Computational Social Sciences Laboratory, where researchers and students come together to apply computer science methods to problems of human behavior and psychology.
When AI learns social stereotypes
One area where bias can creep in is hate speech detection—the automated task of determining if a piece of text contains hate speech, especially on social media. In particular, these biases are often based on stereotypes: the fixed, overgeneralized ideas we have about a particular type of person or thing.
If humans training AI identify hate speech based on social stereotypes, the thinking goes, then the AI will ultimately do the same. This can determine, for instance, which tweets go viral and which get removed.
In a recent paper, presented at the virtual CogSci 2020 conference Aug. 1, Davani and her co-authors found that hate speech classifiers did indeed learn human-like social stereotypes.
Specifically, the team analyzed the data in relation to the stereotype content model (SCM), a social psychology theory, which hypothesizes that all group stereotypes and interpersonal impressions form along two dimensions: warmth and competence.
“The data that’s out there is the result of biases that already exist in our society.” Aida Mostafazadeh Davani
In this case, the researchers determined that people are less likely to label text as hate speech when the groups referred to are perceived as highly competent, but less warm. Conversely, they are more likely to label text as hate speech when the group is perceived as warm, but less competent. Two categories people tend to stereotype as warmer but less competent? Women and immigrants.
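The kind of group-level labeling disparity described above can be illustrated with a simple audit: run a classifier over annotated text and compare how often it flags hate speech depending on which group is targeted. The sketch below is purely illustrative; the classifier and corpus are toy stand-ins, not the authors' actual model or data.

```python
# Illustrative bias audit: compare a classifier's flag rate across target
# groups. The "classifier" and corpus here are hypothetical stand-ins.
from collections import defaultdict

def toy_classifier(text: str) -> bool:
    """Stand-in hate-speech detector: flags text containing an explicit marker."""
    return "HATE_MARKER" in text

# Toy annotated corpus: (text, group targeted by the text).
corpus = [
    ("HATE_MARKER aimed at group A", "group_A"),
    ("hostile text about group A",   "group_A"),  # missed by the toy model
    ("HATE_MARKER aimed at group B", "group_B"),
    ("HATE_MARKER again, group B",   "group_B"),
]

# Flag rate per target group: unequal rates on comparable text is the
# kind of disparity a bias audit looks for.
counts = defaultdict(lambda: [0, 0])  # group -> [flagged, total]
for text, group in corpus:
    counts[group][0] += toy_classifier(text)
    counts[group][1] += 1

flag_rates = {g: flagged / total for g, (flagged, total) in counts.items()}
print(flag_rates)  # hostile text about group_A is flagged half as often
```

In a real audit, the toy classifier would be replaced by a trained model and the corpus by human-annotated data, and the disparity would be tested against the warmth/competence dimensions of the stereotype content model.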
Identifying this type of bias is important because making AI systems fair requires first knowing that the bias exists, and that takes the combined expertise of psychologists and computer scientists.
“The data that’s out there is the result of biases that already exist in our society,” said Davani. “Some people think we can use machine learning to find biases in the system, but I don’t think it’s that easy. We first have to know these biases exist. That’s why, in our lab, we focus on identifying this type of bias in society, and then connecting it to how the model becomes biased as a result.”
Under-reporting of hate crimes
Originally from Iran, Davani earned a master’s degree in computer software engineering from the Sharif University of Technology in Tehran. Fairness and equity are at the heart of her work.
“At a certain point in your education, I think you ask yourself: how will the work I’m doing impact society?” said Davani.
“Right now, machine learning is being used in many applications and people trust it, but if you have the knowledge to look deeper and see if it is doing harm, I think that is more important than just trying to make the models more accurate. You want to figure out who is being hurt, not just look at people as data points.”
In her previous work, Davani and her colleagues used computational methods to examine the under-reporting of hate crimes in the U.S. Hate crimes in the U.S. remain vastly under-reported by victims, law enforcement and the local press.
For instance, in 2017, agencies as large as the Miami Police Department reported zero incidents of hate. This “seems unrealistic,” wrote Davani and her co-authors in the paper, presented at the Conference on Empirical Methods in Natural Language Processing (EMNLP).
In addition, many U.S. cities file no official reports of hate incidents at all. To address this gap, Davani and her co-authors used event extraction methods and machine learning to analyze news articles and predict instances of hate crime.
“You want to figure out who is being hurt, not just look at people as data points.” Aida Mostafazadeh Davani
They then used this model to detect instances of hate crimes in cities for which the FBI lacks statistics. Comparing the model’s predictions with FBI reports, they established that hate incidents are under-reported in the local press compared with other types of crime.
Using event detection methods in conjunction with local news articles, they were then able to provide conservative estimates of the occurrence of hate crimes in cities with no official representation. The researchers state the models’ predictions are lower bound estimates—in other words, the real number is probably even higher.
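The counting step behind such an estimate can be sketched simply: classify each local news article as reporting a hate crime or not, then tally detections per city. The sketch below is a hypothetical illustration, not the authors' pipeline; a keyword check stands in for their trained event-extraction model, and the articles are invented. Because only detected, reported events are counted, the totals are lower-bound estimates by construction.

```python
# Illustrative sketch: per-city lower-bound hate crime estimates from news
# articles. The keyword check is a stand-in for a trained event-extraction
# model; the articles and cities are invented for illustration.

def reports_hate_crime(article: str) -> bool:
    """Stand-in classifier: does this article report a hate crime?"""
    keywords = ("hate crime", "bias-motivated", "hate incident")
    return any(k in article.lower() for k in keywords)

articles = [
    ("Miami",  "Police investigate a bias-motivated attack downtown."),
    ("Miami",  "City council debates new parking rules."),
    ("Austin", "Report of a hate crime near campus under review."),
]

# Tally detections per city; missed or unreported events are simply
# absent, so each total is a conservative (lower-bound) estimate.
estimates = {}
for city, text in articles:
    estimates[city] = estimates.get(city, 0) + reports_hate_crime(text)

print(estimates)  # {'Miami': 1, 'Austin': 1}
```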
A possible application of this method is a real-time hate crime detector built on online local news sources, giving researchers and community workers estimated hate crime counts where official statistics do not exist.
In her future work, Davani plans to continue to design approaches to reduce social group bias in machine learning to support the fair treatment of groups and individuals, which requires an understanding of social stereotypes and social hierarchies.
As her supervisor, Dehghani said he looks forward to seeing Davani make waves in the field.
“Aida is a brilliant computer scientist and, at the same time, can fully engage and understand social scientific theories,” he said. “She is well on her way to becoming a star in the field of fairness in AI.”