Beyond Success Rate: Utility as a Search Quality Metric for Online Experiments

Widad Machmouchi; Ahmed Awadallah; Imed Zitouni; Georg Buscher

2017 Conference on Information and Knowledge Management | November 2017

Published by ACM

User satisfaction metrics are an integral part of search engine development, as they help system developers understand and evaluate the quality of the user experience. Research to date has mostly focused on predicting success or frustration as a proxy for satisfaction. However, users’ search experience is more complex than merely being either successful or not. As such, using success rate as a measure of satisfaction can be limiting. In this work, we propose the use of utility as a measure of searcher satisfaction. This concept represents the fulfillment a user receives from consuming a service and explains how users aim to gain optimal overall satisfaction. Our utility metrics measure user satisfaction by aggregating all of a user's interactions with the search engine. These interactions are represented as a timeline of actions and their dwell times, where each action is classified as having a positive or negative effect on the user. We examine sessions mined from Bing logs, with multi-point scale assessments of searcher satisfaction, and show that utility is a better proxy for satisfaction than success. Leveraging that data, we design metrics of searcher satisfaction that assess the overall utility accumulated by a user during her search session. We use real user traffic from millions of users in an A/B setting to compare utility metrics to success rate metrics. We show that utility is a better metric for evaluating searcher satisfaction with the search engine, and a more sensitive and accurate metric than predicted success. These metrics are currently adopted as the top-level metric for evaluating the thousands of A/B experiments that are run on Bing each year.
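
To make the contrast between success and utility concrete, the following is a minimal sketch and not the paper's actual metric. The abstract does not give the aggregation formula, so the signed sum of dwell times used here is an assumption, as are the names `Action`, `session_utility`, and `session_success` and the example action labels.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Action:
    """One entry in a session timeline (hypothetical structure)."""
    kind: str             # illustrative label, e.g. "click" or "reformulation"
    dwell_seconds: float  # time the user spent on this action
    positive: bool        # assumed output of an upstream positive/negative classifier


def session_utility(timeline: List[Action]) -> float:
    """Assumed aggregation: positive dwell time counts for the user, negative against."""
    return sum(a.dwell_seconds if a.positive else -a.dwell_seconds for a in timeline)


def session_success(timeline: List[Action]) -> bool:
    """Binary view of the same session: did any action have a positive effect?"""
    return any(a.positive for a in timeline)


# Two sessions that a binary success metric cannot tell apart.
direct = [Action("click", 40.0, True)]
struggling = [
    Action("click", 5.0, False),
    Action("reformulation", 20.0, False),
    Action("click", 40.0, True),
]

for name, session in (("direct", direct), ("struggling", struggling)):
    print(name, session_success(session), session_utility(session))
```

Under these assumptions both sessions count as successful, yet the second accumulates less utility because the user spent time on actions with a negative effect before reaching a useful result. That difference is the kind of signal a success rate metric discards.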