When studying responsible AI (artificial intelligence), we are most often studying its impact on people and society, areas in which sociologists, psychologists, and communication scientists have long accumulated expertise and research results. When we talk about fairness, we would do well to work with sociologists to analyze how AI could stratify society and polarize people’s opinions. When we study interpretability, we also hope to discuss with psychologists why people fundamentally need more transparent models, and how best to present the inner mechanisms of AI models. Communication scientists can help us gain a deeper understanding of AI models used in information distribution. From another perspective, we are also very interested in applying responsible AI in these interdisciplinary areas to help solve their problems. In this workshop, we invited researchers from different disciplines to discuss with us how we can jointly advance research in responsible AI.
Speakers
James A. Evans
Professor, Director of Knowledge Lab
The University of Chicago
Pascale Fung
Professor
Hong Kong University of Science & Technology
Rui Guo
Associate Professor of Law
Renmin University of China
Fang Luo
Professor
Beijing Normal University
Beibei Shi
Senior Research Program Manager
Microsoft Research Asia
David Stillwell
Professor
University of Cambridge Judge Business School
Xiaohong Wan
Professor
Beijing Normal University
Xiting Wang
Principal Researcher
Microsoft Research Asia
Fangzhao Wu
Principal Researcher
Microsoft Research Asia
Xing Xie
Senior Principal Research Manager
Microsoft Research Asia
Yongfeng Zhang
Assistant Professor
Rutgers University
Lidong Zhou
Corporate Vice President, Managing Director
Microsoft Research Asia
Jonathan Zhu
Chair Professor of Computational Social Science
City University of Hong Kong
Jun Zhu
Bosch AI Professor
Tsinghua University
Agenda
Time | Session |
---|---|
8:30–8:40 | Opening remarks | video Lidong Zhou, Microsoft Research Asia |
8:40–8:45 | Group photo All speakers and attendees |
8:45–9:15 | Keynote: Responsible AI Research at MSR Asia | video | slides Xing Xie, Microsoft Research Asia Abstract: With the rapid development of artificial intelligence, its social responsibility has received extensive attention. In this talk, I will present our recent research on important problems such as privacy protection, explainability, and ethics in AI. In particular, I will describe how we design methods to address the challenges posed by big models, including the huge overhead of computation and communication, and the impact of model complexity on transparency and bias. I will also share some of our thoughts on the crucial role of interdisciplinary research in this field. |
Session 1: Social Impact of AI Chair: Beibei Shi, Microsoft Research Asia | |
9:15–9:45 | Research Talk: Towards Human Value Based NLP | video | slides Pascale Fung, Hong Kong University of Science & Technology Abstract: The AI “arms race” has reached a point where different organizations in different countries are competing to build ever larger “language” models in text, speech, images, and so on, trained on ever larger collections of data. Our society in general, and our users in particular, are demanding that AI technology be more responsible – more robust, fairer, more explainable, more trustworthy. Natural language processing technologies built on top of these large pre-trained language models are expected to align with these and other human “values” because they impact our lives directly. The core challenge of “value-aligned” NLP (or AI in general) is twofold: 1) What are these values and who defines them? 2) How can NLP algorithms and models be made to align with these values? In fact, different cultures and communities might have different approaches to ethical issues. Even when people from different cultures happen to agree on a set of common principles, they might disagree on the implementation of such principles. It is therefore necessary that we anticipate value definition to be dynamic and multidisciplinary. I propose that we modularize the set of value definitions as external to the development of NLP algorithms and large pretrained language models, and encapsulate the language model to preserve its integrity. I also argue that value definition should not be left in the hands of NLP/AI researchers or engineers. At best, we engineers and developers can be involved at the stage of value definition, but we should not be the decision makers on what those values should be. In addition, some values are now enshrined in legal requirements, which further argues that value definition should be disentangled from algorithm and model development. In this talk, I will present initial experiments on value-based NLP in which we allow the input to an NLP system to include human-defined values or ethical principles for different output results. I propose that many NLP tasks, from classification to generation, should output results according to human-defined principles for better performance and explainability. |
9:45–10:15 | Research Talk: The Long March Towards AI Fairness | video | slides Rui Guo, Renmin University of China Abstract: To protect people from unfair treatment or discrimination, conventional wisdom in legal academia points to certain protected factors or social group categories to identify and prevent prohibited behaviors or biases. This has caused problems in the context of Artificial Intelligence (AI). This talk uses an example of disability discrimination to highlight the role of stereotypes and the difficulty of achieving AI fairness. Dealing with stereotypes requires deeper reflection on the problem of moral agency in AI. |
10:15–10:45 | Research Talk: On the Adversarial Robustness of Deep Learning | video | slides Jun Zhu, Tsinghua University Abstract: Although deep learning methods have made significant progress on many tasks, it is widely recognized that current methods are vulnerable to adversarial noise. This weakness poses serious risks to safety-critical applications. In this talk, I will present recent progress on adversarial attack and defense for deep learning, including theory, algorithms, and benchmarks. |
Session 2: Responsible AI: an Interdisciplinary Approach Chair: Xing Xie, Microsoft Research Asia | |
10:45–12:00 | Panel Discussion | video Host: Xing Xie, Microsoft Research Asia Panelists: Pascale Fung, Hong Kong University of Science & Technology Rui Guo, Renmin University of China Jun Zhu, Tsinghua University Jonathan Zhu, City University of Hong Kong Xiaohong Wan, Beijing Normal University |
Session 3: Responsibility in Personalization Chair: Fangzhao Wu, Microsoft Research Asia | |
14:00–14:30 | Research Talk: Towards Trustworthy Recommender Systems: From Shallow Models to Deep Models to Large Models | video | slides Yongfeng Zhang, Rutgers University Abstract: As the bridge between humans and AI, recommender systems are at the frontier of human-centered AI research. However, inappropriate use or development of recommendation techniques may bring negative effects to users and society at large, such as user distrust due to the non-transparency of the recommendation mechanism, unfairness of the recommendation algorithm, lack of user control over the recommender system, and user privacy risks due to the extensive use of users’ private data for personalization. In this talk, we will discuss how to build trustworthy recommender systems as recommendation algorithms advance from shallow models to deep models to large models, including but not limited to the unique role of recommender system research in the AI community as a representative Subjective AI task, the relationship between Subjective AI and trustworthy computing, and typical recommendation methods addressing different perspectives of trustworthy computing, such as causal and counterfactual reasoning, neural-symbolic modeling, natural language explanations, federated learning, user-controllable recommendation, echo chamber mitigation, personalized prompt learning, and beyond. |
14:30–15:00 | Research Talk: Evidence-based Evaluation for Responsible AI | video | slides Jonathan Zhu, City University of Hong Kong Abstract: Current efforts on responsible AI have focused on why AI should be socially responsible and how to produce responsible AI. An equally important question that has not been adequately addressed is how responsible deployed AI products actually are. The question is ignored most of the time and occasionally answered with anecdotal evidence or casual evaluation. We need to understand that good evaluations are not easy, quick, or cheap to carry out. On the contrary, good evaluations rely on evidence that is systematically collected using proven methods, completely independent of the process, data, and even research staff responsible for the relevant AI products. The practice of evidence-based medicine over the last two decades provides a relevant and informative role model for the AI industry to follow. |
15:00–15:30 | Research Talk: Personalizing Responsibility within AI Systems: A Case for Designing Diversity | video James Evans, The University of Chicago Abstract: Here I explore the importance of personalizing our assessment of particular humans’ values, objectives, and constraints, both at the outset of a task and on an ongoing basis in systems trusted to augment human capacity. Moreover, augmenting human capacity requires augmenting human perspectives. The wisdom of crowds hinges on the independence and diversity of their members’ information and approaches. I explore how the wisdom of scientific, technological, and business crowds sustains performance and advance through a process of collective abduction (the collision of deduction and induction), wherein unexpected findings stimulate innovators to forge new insights that make the surprising unsurprising. Drawing on tens of millions of research papers across the life sciences and physical sciences, as well as patented inventions, I show that surprising designs and discoveries are the best predictor of outsized success, and that surprising advances systematically emerge across, rather than within, researchers or teams, most commonly when innovators from one field surprisingly publish problem-solving results to an audience in a distant and diverse other. This scales insights from my prior work showing that, across innovators, teams, and fields, connection and conformity are associated with reduced replication and impeded innovation. Using these principles, I simulate processes of scientific and technological search to demonstrate the relationship between crowded fields and constrained collective inference, and I illustrate how inverting the traditional artificial intelligence approach to avoid rather than mimic human search enables the design of trusted diversity that systematically violates established field boundaries and is associated with marked success in innovation prediction. I conclude with a discussion of prospects and challenges in a connected age for trusted and sustainable augmentation through the design and preservation of personalized difference. |
Session 4: Interpretability and Psychology Chair: Xiting Wang, Microsoft Research Asia | |
15:30–16:00 | Research Talk: Personality Predictions from Automated Video Interviews: Explainable or Unexplainable Models? | video | slides David Stillwell, University of Cambridge Abstract: In automated video interviews (AVIs), candidates answer pre-set questions by recording responses on camera, and interviewers then use the recordings to guide their hiring decisions. To reduce the burden on interviewers, AVI companies commonly use black-box algorithms to assess the quality of responses, but little academic research has reported on their accuracy. We collected 694 video interviews (200 hours) along with self-reported Big Five personality scores. In Study 1, we use machine learning to predict personality from 1,710 verbal, facial, and audio features. In Study 2, we use a subset of 653 intuitively understandable features to build an explainable model using ridge regression. We report the accuracies of both models and opine on the question of whether it would be better to use an explainable algorithm. |
16:00–16:30 | Research Talk: Interpretability, Responsibility and Controllability of Human Behaviors | video | slides Xiaohong Wan, Beijing Normal University Abstract: When judging whether a person should take responsibility for their behavior, the judge often evaluates whether the behavior is interpretable and under the person’s control. However, it is difficult for external observers to evaluate such quantities, as the processes and internal states inside the brain are intangible. Furthermore, it is also difficult for the owner of the behavior to evaluate these internal states in detail, or to evaluate their causality. Many human behaviors are driven by fast, intuitive processes, leaving only post-hoc explanations of those processes. Even for controlled processes, the explanations remain largely unclear. In this talk, I will discuss these issues in terms of the neural mechanisms underlying human behaviors. |
16:30–17:00 | Research Talk: Development of a Game-Based Assessment to Measure Creativity | video | slides Fang Luo, Beijing Normal University Abstract: Creativity measurement is the basis of creativity research, yet traditional creativity tests have long had many limitations. First, traditional tests place too much emphasis on ‘novelty’ and ignore ‘suitability’. Second, divergent thinking is equated with creative thinking. Third, evaluation relies on a single index. Fourth, traditional test tasks are simple and abstract, divorced from real problems, and lack ecological validity. The purpose of this study was to develop a game-based assessment of creativity, collect log-file data, and enable the assessment of various creative thinking abilities. The creativity game test used an evidence-centered design framework to construct three problem situations around ‘prehistoric human life’, in which participants ‘synthesized’ creative solutions by combining cards. Log-file data and criterion test data from 515 college students were collected. Study 1 showed that the test had good psychometric properties. In Study 2, a Bayesian network was constructed with key operations as nodes to explore the influence of individual insight level on game responses, providing evidence for construct validity. Participants rated the experience of taking the creativity game test highly, reflecting the advantages and prospects of game-based testing. |
Workshop organizers
Xing Xie (Chair), Microsoft Research Asia
Beibei Shi (Chair), Microsoft Research Asia
Xiting Wang (Chair), Microsoft Research Asia
Fangzhao Wu (Chair), Microsoft Research Asia
Weizhe Shi, Microsoft Research Asia
Xiaoyuan Yi, Microsoft Research Asia
Bin Zhu, Microsoft Research Asia
Dongsheng Li, Microsoft Research Asia
Jindong Wang, Microsoft Research Asia
Microsoft’s Event Code of Conduct
Microsoft’s mission is to empower every person and every organization on the planet to achieve more. This includes events Microsoft hosts and participates in, where we seek to create a respectful, friendly, and inclusive experience for all participants. As such, we do not tolerate harassing or disrespectful behavior, messages, images, or interactions by any event participant, in any form, at any aspect of the program including business and social activities, regardless of location.
We do not tolerate any behavior that is degrading to any gender, race, sexual orientation or disability, or any behavior that would violate Microsoft’s Anti-Harassment and Anti-Discrimination Policy, Equal Employment Opportunity Policy, or Standards of Business Conduct. In short, the entire experience at the venue must meet our culture standards. We encourage everyone to assist in creating a welcoming and safe environment. Please report any concerns, harassing behavior, or suspicious or disruptive activity to venue staff, the event host or owner, or event staff. Microsoft reserves the right to refuse admittance to or remove any person from company-sponsored events at any time in its sole discretion.