Language (Technology) is Power: A Critical Survey of “Bias” in NLP

Abstract

We survey 146 papers analyzing “bias” in NLP systems, finding that their motivations are often vague, inconsistent, and lacking in normative reasoning, despite the fact that analyzing “bias” is an inherently normative process. We further find that these papers’ proposed quantitative techniques for measuring or mitigating “bias” are poorly matched to their motivations and do not engage with the relevant literature outside of NLP. Based on these findings, we describe the beginnings of a path forward by proposing three recommendations that should guide work analyzing “bias” in NLP systems. These recommendations rest on a greater recognition of the relationships between language and social hierarchies, encouraging researchers and practitioners to articulate their conceptualizations of “bias”—i.e., what kinds of system behaviors are harmful, in what ways, to whom, and why, as well as the normative reasoning underlying these statements—and to center work around the lived experiences of members of communities affected by NLP systems, while interrogating and reimagining the power relations between technologists and such communities.

Press summary

Language plays many roles in communication: positive roles like conveying knowledge or relating to others, and problematic roles like transmitting stereotypes and serving as a barrier to entry. A large body of recent research focuses on these problematic uses of language, which natural language processing (NLP) systems pick up on, generally scoped as “bias” in NLP. We survey 146 papers on “bias” in NLP systems, finding vague and inconsistent conceptualizations of “bias” that describe a wide range of system behaviors. These conceptualizations are often given without explicit statements of why the system behaviors are harmful, in what ways, and to whom, and without articulating the social values underlying these statements. For example, some papers on “racial bias” focus on race-based stereotypes expressed in written text, while others focus on how NLP systems perform differently on language written by speakers from different racialized communities. This inconsistency is a problem because it limits the NLP community’s ability to recognize and measure progress, or even to debate whether particular “debiasing” approaches mitigate harms or actually create new ones.

We provide three recommendations for research on “bias” in NLP systems. First, we recommend that the NLP community discuss “bias” in the context of the broader social conditions in which language is used, including the ways that beliefs about language shape society. For example, society’s anti-Black beliefs stigmatize African-American English (AAE) as ungrammatical or offensive; this stigmatization is reinforced by toxicity detection systems that incorrectly flag AAE as more toxic than Mainstream U.S. English. Second, we urge the NLP community to be precise about which social values are being called upon when describing “bias,” and about the ways in which particular system behaviors are harmful. Third, because language is a social phenomenon, we encourage the NLP community to draw more heavily on research traditions in HCI, social computing, sociolinguistics, and beyond to understand how NLP systems affect different communities in different ways. The NLP community currently holds the power to decide how NLP systems are developed and deployed; for this process to be more equitable, development and deployment decisions must be made accessible and accountable to the communities most likely to be affected by these systems.

Press coverage

“Microsoft Researchers Say NLP Bias Studies Must Consider Role of Social Hierarchies like Racism”

Failures of imagination: Discovering and measuring harms in language technologies

Auditing natural language processing (NLP) systems for computational harms remains an elusive goal. Doing so is critical, however, as increasingly powerful natural language generation and representation models are enabling a proliferation of language technologies and applications. Computational harms arise not only from the content that people produce, but also from how content is embedded, represented, and generated by large-scale, sophisticated language models. This webinar covers the challenges of locating and measuring the potential harms that language technologies, and the data they ingest or generate, might surface, exacerbate, or cause. Such harms range from more overt issues, like surfacing offensive speech or reinforcing stereotypes, to more subtle issues, like nudging users toward undesirable patterns of behavior or triggering…