Characterizing and Predicting Voice Query Reformulation

The 24th ACM International Conference on Information and Knowledge Management (CIKM 2015) |

Published by ACM

Publication

Voice interactions are becoming more prevalent as the usage of voice search and intelligent assistants gains more popularity. Users frequently reformulate their requests in hope of getting better results either because the system was unable to recognize what they said or because it was able to recognize it but was unable to return the desired response. Query reformulation has been extensively studied in the context of text input. Many of the characteristics studied in the context of text query reformulation are potentially useful for voice query reformulation. However, voice query reformulation has its unique characteristics in terms of the reasons that lead users to reformulating their queries and how they reformulate them. In this paper, we study the problem of voice query reformulation. We perform a large scale human annotation study to collect thousands of labeled instances of voice reformulation and non-reformulation query pairs. We use this data to compare and contrast characteristics of reformulation and non-reformulation queries over a large a number of dimensions. We then train classifiers to distinguish between reformulation and non-reformulation query pairs and to predict the rationale behind reformulation. We demonstrate through experiments with the human labeled data that our classifiers achieve good performance in both tasks.