From Devices to People: Attribution of Search Activity in Multi-User Settings
- Ryen W. White
- Ahmed Awadallah
- Adish Singla
- Eric Horvitz
The 23rd International World Wide Web Conference (WWW 2014)
Online services rely on unique identifiers of machines to tailor offerings to their users. An implicit assumption is made that each machine identifier maps to an individual. However, shared machines are common, leading to interwoven search histories and noisy signals for applications such as personalized search and advertising. We present methods for attributing search activity to individual searchers. Using ground truth data for a sample of almost four million U.S. Web searchers, containing both machine identifiers and person identifiers, we show that over half of the machine identifiers comprise the queries of multiple people. We characterize how features of topic, time, and other aspects, such as the complexity of the information sought, vary with the number of searchers on a machine, and show significant differences in all measures. Based on these insights, we develop models to accurately estimate when multiple people contribute to the logs ascribed to a single machine identifier. We also develop models to cluster search behavior on a machine, allowing us to attribute historical data accurately and automatically assign new search activity to the correct searcher. The findings have implications for the design of applications such as personalized search and advertising that rely heavily on machine identifiers to custom-tailor their services.
Copyright is held by the International World Wide Web Conference Committee (IW3C2). IW3C2 reserves the right to provide a hyperlink to the author's site if the Material is used in electronic media. WWW'14, April 7-11, 2014, Seoul, Korea. ACM 978-1-4503-2744-2/14/04.