Microsoft at FAccT 2024: Advancing responsible AI research and practice

AI and other computational technologies are increasingly embedded in high-stakes sectors such as finance, healthcare, and government, where their capacity to influence critical decisions is growing. While these systems offer numerous benefits, they also introduce risks, such as entrenching systemic biases and reducing accountability. The ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT 2024) tackles these issues, bringing together experts from a wide range of disciplines who are committed to the responsible development of computational systems.

Microsoft is proud to return as a sponsor of ACM FAccT 2024, underscoring our commitment to supporting research on responsible AI. We’re pleased to share that members of our team helped organize the event, contributing to the program committee and serving as a program co-chair. Additionally, seven papers by Microsoft researchers and their collaborators were accepted to the program, including “Akal badi ya bias: An exploratory study of gender bias in Hindi language technology,” which received the Best Paper Award.

Collectively, these research projects emphasize the need for AI technologies that reflect the Microsoft Responsible AI principles of accountability, inclusiveness, reliability and safety, fairness, transparency, and privacy and security. They underscore the importance of addressing potential risks and harms associated with deployment and usage. This post highlights these advances.

Paper highlights

A framework for exploring the consequences of AI-mediated enterprise knowledge access and identifying risks to workers

Anna Gausen, Bhaskar Mitra, Siân Lindley

Recent AI developments, especially LLMs, are significantly changing how organizational knowledge is accessed and are reshaping workplaces. Because these systems interact with organizational power dynamics, they pose risks to workers. This paper introduces the Consequence-Mechanism-Risk framework to help identify those risks, categorizing them into issues related to value, power, and wellbeing. The framework is intended to help practitioners anticipate and mitigate risks to workers, and it can be applied to other technologies as well, enabling better protection for workers.

A structured regression approach for evaluating model performance across intersectional subgroups

Christine Herlihy, Kimberly Truong, Alex Chouldechova, Miro Dudík

Disaggregated evaluation is a process used in AI fairness assessment that measures an AI system’s performance across subgroups defined by combinations of demographic or other sensitive attributes. However, the sample sizes for intersectional subgroups are often very small, leading to their exclusion from analysis. This work introduces a structured regression approach that produces more reliable performance estimates for these subgroups. Tested on two publicly available datasets and several variants of semi-synthetic data, the method not only yielded more accurate results but also helped to identify key factors driving performance differences.
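
To give a flavor of the idea, the sketch below compares naive per-cell accuracy estimates with estimates from a simple main-effects logistic regression that pools information across intersectional cells. It is a minimal illustration of structured (pooled) estimation, not the authors’ exact method; all data and attribute names are synthetic.

```python
# Minimal sketch (not the paper's method): comparing naive per-cell accuracy
# with a pooled, main-effects logistic regression over sensitive attributes.
# All data below are synthetic and the attribute names are hypothetical.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "gender": rng.choice(["A", "B"], size=n, p=[0.9, 0.1]),
    "age": rng.choice(["young", "mid", "old"], size=n, p=[0.6, 0.3, 0.1]),
})
# Simulated per-example correctness of some AI system's predictions.
p_correct = 0.85 - 0.10 * (df["gender"] == "B") - 0.05 * (df["age"] == "old")
df["correct"] = rng.random(n) < p_correct

# Naive disaggregated evaluation: raw accuracy per intersectional cell,
# which is noisy when a cell contains only a handful of examples.
naive = df.groupby(["gender", "age"])["correct"].agg(["mean", "size"])
print(naive)

# Structured alternative: a main-effects model borrows strength across
# cells that share an attribute value, stabilizing small-cell estimates.
X = pd.get_dummies(df[["gender", "age"]], drop_first=True)
model = LogisticRegression().fit(X, df["correct"])

grid = df[["gender", "age"]].drop_duplicates().sort_values(["gender", "age"])
X_grid = pd.get_dummies(grid, drop_first=True).reindex(columns=X.columns, fill_value=0)
grid["estimated_accuracy"] = model.predict_proba(X_grid)[:, 1]
print(grid)
```

In practice, the structure of the regression and the strength of pooling must be chosen carefully, which is what makes a principled approach like the one studied in the paper valuable.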

Akal badi ya bias: An exploratory study of gender bias in Hindi language technology

Best Paper Award

Rishav Hada, Safiya Husain, Varun Gumma, Harshita Diddee, Aditya Yadavalli, Agrima Seth, Nidhi Kulkarni, Ujwal Gadiraju, Aditya Vashistha, Vivek Seshadri, Kalika Bali

Existing research on gender bias in language technologies primarily focuses on English, often overlooking non-English languages. This paper introduces the first comprehensive study on gender bias in Hindi, the third most spoken language globally. Employing diverse techniques and field studies, the authors expose the limitations in current methodologies and emphasize the need for more context-specific and community-centered research. The findings deepen the understanding of gender bias in language technologies in Hindi and lay the groundwork for expanded research into other Indic languages.

“I’m not sure, but…”: Examining the impact of large language models’ uncertainty expression on user reliance and trust

Sunnie S. Y. Kim, Q. Vera Liao, Mihaela Vorvoreanu, Stephanie Ballard, Jennifer Wortman Vaughan

LLMs can produce convincing yet incorrect responses, potentially misleading users who rely on them for accuracy. To mitigate this issue, it has been recommended that LLMs communicate uncertainty in their responses. The authors ran a large-scale study of how users perceive and act on LLMs’ expressions of uncertainty, asking participants to answer medical questions with the help of an LLM-based system. They found that first-person uncertainty expressions (e.g., “I’m not sure, but…”) decreased participants’ confidence in the system and their tendency to agree with the system’s answers, while increasing the accuracy of their own answers. In contrast, more general uncertainty expressions (e.g., “It’s unclear, but…”) were less effective. The findings underscore the importance of thorough user testing before deploying LLMs.

Investigating and designing for trust in AI-powered code generation tools

Ruotong Wang, Ruijia Cheng, Denae Ford, Tom Zimmermann

As tools like GitHub Copilot gain popularity, understanding the trust software developers place in these applications becomes crucial for their adoption and responsible use. In a two-stage qualitative study, the authors interviewed 17 developers to understand the challenges they face in building trust in AI code-generation tools. Challenges identified include setting expectations, configuring tools, and validating suggestions. The authors also explore several design concepts to help developers establish appropriate trust and provide design recommendations for AI-powered code-generation tools.

Less discriminatory algorithms

Emily Black, Logan Koepke, Pauline Kim, Solon Barocas, Mingwei Hsu

In fields such as housing, employment, and credit, organizations using algorithmic systems should seek to use less discriminatory alternatives. Research in computer science has shown that for any prediction problem, multiple algorithms can deliver the same level of accuracy but differ in their impacts across demographic groups. This phenomenon, known as model multiplicity, suggests that developers might be able to find an equally performant yet potentially less discriminatory alternative.
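
The sketch below illustrates the underlying idea (it is not the paper’s analysis): train several models that differ only in a random seed, then, among those within a small accuracy tolerance of the best, pick the one with the smallest selection-rate gap between two synthetic groups. The data, tolerance, and disparity metric here are all placeholders.

```python
# Hypothetical sketch of model multiplicity (not the paper's analysis):
# equally accurate models can differ in group-level impact, so we search
# for a less discriminatory model among near-equally-performant candidates.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=4000, n_features=10, random_state=0)
group = np.random.default_rng(0).integers(0, 2, size=len(y))  # synthetic group labels
X_tr, X_te, y_tr, y_te, g_tr, g_te = train_test_split(
    X, y, group, test_size=0.5, random_state=0)

candidates = []
for seed in range(10):  # vary only the random seed; real searches vary much more
    clf = RandomForestClassifier(n_estimators=50, random_state=seed).fit(X_tr, y_tr)
    pred = clf.predict(X_te)
    acc = (pred == y_te).mean()
    gap = abs(pred[g_te == 0].mean() - pred[g_te == 1].mean())  # selection-rate gap
    candidates.append({"seed": seed, "accuracy": acc, "gap": gap})

best_acc = max(c["accuracy"] for c in candidates)
# Among models within 0.5 accuracy points of the best, prefer the smallest gap.
viable = [c for c in candidates if c["accuracy"] >= best_acc - 0.005]
print(min(viable, key=lambda c: c["gap"]))
```

In a real setting, the search space would also cover hyperparameters, feature sets, and training procedures, and the choice of disparity metric would depend on the legal and policy context.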

Participation in the age of foundation models

Harini Suresh, Emily Tseng, Meg Young, Mary Gray, Emma Pierson, Karen Levy

The rise of foundation models in public services brings both potential benefits and risks, including reinforcing power imbalances and harming marginalized groups. This paper explores how participatory AI/ML methods, typically context-specific, can be adapted to these context-agnostic models to empower those most affected.

Conference organizers from Microsoft

Program Co-Chair

Alexandra Olteanu 

Program Committee

Steph Ballard 
Solon Barocas 
Su Lin Blodgett*
Kate Crawford 
Shipi Dhanorkar 
Amy Heger
Jake Hofman*
Emre Kiciman*
Vera Liao*
Daniela Massiceti 
Bhaskar Mitra 
Besmira Nushi*
Alexandra Olteanu 
Amifa Raj
Emily Sheng 
Jennifer Wortman Vaughan*
Mihaela Vorvoreanu*
Daricia Wilkinson

*Area Chairs

Career opportunities

Microsoft welcomes talented individuals across various roles at Microsoft Research, Azure Research, and other departments. We are always pushing the boundaries of computer systems to improve the scale, efficiency, and security of all our offerings. You can review our open research-related positions here.
