
Microsoft XC Research

User research makes your AI smarter


By Penny Marsh Collisson and Gwyneth Hardiman


Note: This article was originally published on July 1, 2019, on Medium.

Some things we’re learning about doing UX research on AI at Microsoft

As AI grows more prevalent, it’s changing what people expect from technology and how they interact with it. That means that every UX-er and customer-obsessed product person needs to consider not only how to create effective AI experiences, but also how to collect customer feedback along the way. Easy…right?

The good news is that many traditional research tools can help gauge customer reactions to AI. Ethnographies, focus groups, prototype research, customer surveys, and logs are all still relevant. However, AI systems differ from classical systems in that they’re context aware, personal, and able to learn over time. They also have limited accuracy and unique failure modes. These things introduce new challenges and opportunities when researching the UX of AI.

Today we’ll share practical tips for researching the UX of AI that we’ve learned along the way at Microsoft.

Diversify your recruit


Illustrations by Michaelvincent Santos

As UX-ers, it’s our responsibility to ensure that the experiences we deliver embrace diversity and respect multiple contexts and capabilities. That’s especially important with AI. If your AI UX is only usable for a subset of users, potentially harmful bias will creep into your AI models. An arbitrary sample of participants, or even a split on basic demographics like gender and age, will not be enough to ensure your AI is inclusive.

Even during early feedback stages, recruit for a wide array of characteristics such as these:

  • Attitudes toward AI and privacy
  • Profiles of tech adoption
  • Levels of tech self-efficacy
  • Geographies
  • Social contexts and norms
  • Physical, cognitive, or emotional abilities
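
If you track screener responses digitally, a quick quota check can surface gaps in your recruit before the study starts. Here's a minimal sketch in Python; the characteristic names and quota targets are placeholders we made up for illustration, not a prescribed screener:

```python
from collections import Counter

# Hypothetical screener responses; in practice these come from your
# recruiting tool's export. Field names here are illustrative only.
screener_responses = [
    {"id": 1, "ai_attitude": "skeptical", "tech_adoption": "late", "geography": "rural"},
    {"id": 2, "ai_attitude": "enthusiastic", "tech_adoption": "early", "geography": "urban"},
    {"id": 3, "ai_attitude": "skeptical", "tech_adoption": "early", "geography": "urban"},
]

# Minimum participants we want per value of each characteristic
# (targets are invented for this example).
quotas = {
    "ai_attitude": {"skeptical": 2, "neutral": 2, "enthusiastic": 2},
    "tech_adoption": {"early": 2, "mainstream": 2, "late": 2},
    "geography": {"urban": 2, "suburban": 2, "rural": 2},
}

def report_gaps(responses, quotas):
    """Print which recruiting quotas are still unfilled."""
    for field, targets in quotas.items():
        counts = Counter(r[field] for r in responses)
        for value, target in targets.items():
            shortfall = target - counts.get(value, 0)
            if shortfall > 0:
                print(f"Need {shortfall} more participant(s) with {field}={value}")

report_gaps(screener_responses, quotas)
```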

Fake it till you make it with Wizard of Oz techniques


During early prototyping stages, it can be hard to get a good read on how your AI is going to work for people. Your prototype might be missing key functionality or interactivity that will impact how participants respond to the AI experience. Wizard of Oz studies have participants interact with what they believe to be an AI system, while a human “behind the curtain” simulates the behavior that the AI system would demonstrate.

For example, a participant might think the system is providing recommendations based on her previous selections, when a person in another room is actually providing them. When people earnestly engage with what they perceive to be an AI, they form more complete mental models and interact with the experience in more natural ways.
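
To make this concrete, a Wizard of Oz relay can be as simple as routing the participant's input to a hidden human and echoing the reply back as if the system generated it. This sketch is a toy, single-console version we wrote for illustration (a real study would put the wizard in another room behind a networked UI); the latency value is an assumption to keep the illusion plausible:

```python
import time

def wizard_of_oz_turn(participant_message: str) -> str:
    """Relay one conversational turn to a hidden human 'wizard'.

    In a real study the wizard sits in another room; here we simply
    prompt on the same console for the sake of a runnable sketch.
    """
    print(f"[wizard console] participant said: {participant_message!r}")
    reply = input("[wizard console] type the 'AI' response: ")
    time.sleep(1.5)  # simulated model latency keeps the illusion plausible
    return reply

if __name__ == "__main__":
    print("Prototype assistant (participant-facing). Type 'quit' to end.")
    while True:
        message = input("you: ")
        if message.lower() == "quit":
            break
        print(f"assistant: {wizard_of_oz_turn(message)}")
```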

Integrate people’s real stuff into your AI prototype


If study participants see generic content, their reactions may mislead you. People respond differently when the experience includes their real, personalized content such as photos, contacts, and documents.

Imagine how you feel about a program that automatically detects faces in photographs. Now, imagine seeing the faces of your loved ones identified by the system. Your reaction may be very different when you see people you know. You’ll need to spend extra time pre-populating your prototype with people’s “real” content, but it will be worth the effort.
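
One lightweight way to do this is a fixture loader that swaps your prototype's placeholder content for files each participant supplies ahead of the session. This is a hypothetical sketch; the folder layout and helper name are our own assumptions:

```python
from pathlib import Path

PLACEHOLDER_DIR = Path("fixtures/generic")       # stock photos, fake contacts
PARTICIPANT_DIR = Path("fixtures/participants")  # per-participant uploads

def load_fixtures(participant_id: str) -> list[Path]:
    """Prefer the participant's own files; fall back to generic content.

    Collecting photos, contacts, or documents ahead of the session lets
    the prototype feel personal instead of canned.
    """
    personal = sorted((PARTICIPANT_DIR / participant_id).glob("*"))
    if personal:
        return personal
    return sorted(PLACEHOLDER_DIR.glob("*"))

fixtures = load_fixtures("p07")  # "p07" is an illustrative participant ID
print(f"Loaded {len(fixtures)} items for the session")
```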

Reference a person instead of an AI


AI has a lot of hype and folklore around it. For that reason, referencing AI can cue participants to make certain assumptions — both good and bad — about their experience. For example, participants might key in on highly publicized stories of bias or failure in AI systems. Or they could assume AI is more capable and perfect than it will ever be. Getting participants to think about how a human could help them can be a good way to glean insight about where AI can be useful.

Here are some alternatives to talking about AI:

  • Invite participants to share how they currently enlist other people to achieve their goals.
  • Ask participants how they would want a human expert to behave.

Understand the impact of your AI getting it wrong


AI isn’t perfect. It’s probabilistic, fallible, and will make mistakes. Especially early in the design cycle, it can be easy to create perfect prototypes and get an overly optimistic response to your UX. While planning evaluations, build in realistic quirks or pitfalls to bridge the gulf between the shiny concept and realistic product execution. Once you understand how your AI’s failure modes impact people, you can design to mitigate their impact.

Here are a few methods to consider:

  • Intentionally introduce things into your prototype that are likely to be “wrong.”
  • Ensure that system interactions in your Wizard of Oz studies include different kinds of errors.
  • Take participants down different paths: things are right, a little right, a little wrong, totally wrong.
  • Invite conversation about where failures would be most impactful to their experience.
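
One way to cover those paths is to wrap your prototype's output in a "degrader" that returns right, slightly wrong, and totally wrong results at controlled rates. The sketch below is illustrative only; the stub recommender and the rates are assumptions you'd tune to your model's expected failure profile:

```python
import random

def recommend(query: str) -> str:
    """Stand-in for the real (or wizarded) recommendation logic."""
    return f"best match for {query!r}"

def degraded_recommend(query: str) -> tuple[str, str]:
    """Return a recommendation along a randomly chosen quality path.

    The weights below are illustrative; set them to match the failure
    profile you expect from your actual model.
    """
    path = random.choices(
        ["right", "a little wrong", "totally wrong"],
        weights=[0.6, 0.25, 0.15],
    )[0]
    if path == "right":
        return recommend(query), path
    if path == "a little wrong":
        return recommend(query) + " (stale, from last week)", path
    return "best match for something unrelated", path

result, path = degraded_recommend("vacation photos")
print(f"shown to participant: {result}  [moderator log: {path}]")
```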

Dive into mental models


People don’t need to understand the nuts and bolts behind the technology powering AI to have a positive experience with it. But they need a mental model of the system — what it delivers, when, and why — to have realistic expectations of its capabilities. It can be easy to assume that people correctly understand how your AI works, when frequently their understanding is wrong (even if they’re confident about it!). Once we locate the gaps in people’s mental models, we’re better equipped to shore them up with our designs.

To understand how participants envision your AI system, try this:

  • Ask participants to write down the “rules” for how the system works. For example, give them a result and ask them to explain why and how the system produced it.
  • Have participants imagine that a human gave them a specific result. Ask what it is about the data, or their interactions, that would have caused the human to give them that result.
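
If you want to quantify those gaps, you can score participants' predictions against the system's actual behavior on a fixed set of probes. This is a hypothetical scoring pass; the probe set and outputs are invented for illustration:

```python
# Probes: inputs whose actual system output we know in advance
# (a made-up photo-organizing example).
probes = {
    "photo of beach": "album: Summer trip",
    "photo of receipt": "album: Documents",
    "photo of dog": "album: Pets",
}

# What each participant predicted the system would do for each probe.
predictions = {
    "p01": {"photo of beach": "album: Summer trip",
            "photo of receipt": "album: Summer trip",
            "photo of dog": "album: Pets"},
}

for pid, guesses in predictions.items():
    hits = sum(guesses[probe] == actual for probe, actual in probes.items())
    print(f"{pid}: {hits}/{len(probes)} probes predicted correctly")
    # Misses point at the gaps in this participant's mental model.
    for probe, actual in probes.items():
        if guesses[probe] != actual:
            print(f"  gap: expected {guesses[probe]!r} for {probe!r}, "
                  f"system gives {actual!r}")
```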

Highlight exceptions as well as trends


People will have different experiences with an AI depending on their context, the content they bring in, and the way they interact with the system. Because every person’s experience is so personal, it’s hard to extract qualitative insights about AI systems from what most people do or how they react. As you roll up results, pay close attention to outliers. Understand why participants had the unique experiences they did within your sample. This is particularly important when evaluating the experience across a diverse audience.
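
On the quantitative side, even a simple z-score pass over per-participant metrics can flag the outliers worth a deeper qualitative look. A minimal sketch, assuming a single satisfaction score per participant and a 2-standard-deviation threshold (both assumptions to adjust for your study):

```python
from statistics import mean, stdev

# Hypothetical per-participant satisfaction scores from a study.
scores = {"p01": 4.5, "p02": 4.2, "p03": 4.4, "p04": 1.5, "p05": 4.3, "p06": 4.6}

mu = mean(scores.values())
sigma = stdev(scores.values())

for pid, score in scores.items():
    z = (score - mu) / sigma
    if abs(z) > 2:  # flag experiences far from the group for follow-up
        print(f"{pid} is an outlier (score={score}, z={z:+.1f}) - review their session")
```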

What are some things you’ve learned about doing UX research on AI? Tweet us your thoughts @MicrosoftRI or like us on Facebook and join the conversation.

Gwyneth Hardiman is a senior design researcher. Penny Collisson is a user research manager working on AI and Assistant in Office and Windows.