Click-based Hot Fixes for Underperforming Torso Queries

  • Masrour Zoghi ,
  • Tomáš Tunys ,
  • Lihong Li ,
  • Damien Jose ,
  • Junyan Chen ,
  • Chun Ming Chin ,
  • Maarten de Rijke

Proceedings of the 39th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR) |

Published by ACM - Association for Computing Machinery

Ranking documents using their historical click-through rate (CTR) can improve relevance for frequently occurring queries, i.e., socalled head queries. It is difficult to use such click signals on nonhead queries as they receive fewer clicks. In this paper, we address the challenge of dealing with torso queries on which the production ranker is performing poorly. Torso queries are queries that occur frequently enough so that they are not considered as tail queries and yet not frequently enough to be head queries either. They comprise a large portion of most commercial search engines’ traffic, so the presence of a large number of underperforming torso queries can harm the overall performance significantly. We propose a practical method for dealing with such cases, drawing inspiration from the literature on learning to rank (LTR). Our method requires relatively few clicks from users to derive a strong re-ranking signal by comparing document relevance between pairs of documents instead of using absolute numbers of clicks per document. By infusing a modest amount of exploration into the ranked lists produced by a production ranker and extracting preferences between documents, we obtain substantial improvements over the production ranker in terms of page-level online metrics. We use an exploration dataset consisting of real user clicks from a large-scale commercial search engine to demonstrate the effectiveness of the method. We conduct further experimentation on public benchmark data using simulated clicks to gain insight into the inner workings of the proposed method. Our results indicate a need for LTR methods that make more explicit use of the query and other contextual information.