December 10, 2010

NIPS 2010 Workshop: Machine Learning in Online ADvertising (MLOAD 2010)

Hybrid Bidding for Keyword Auctions
Ashish Goel (Stanford University)
Abstract
Search auctions have become a dominant source of revenue generation on the Internet. Such auctions have typically used per-click bidding and pricing. We propose the use of hybrid auctions where an advertiser can make a per-impression as well as a per-click bid, and the auctioneer then chooses one of the two as the pricing mechanism. We assume that the advertiser and the auctioneer both have separate beliefs (called priors) on the click-probability of an advertisement. We first prove that the hybrid auction is truthful, assuming that the advertisers are risk-neutral. We then show that this auction is different from the existing per-click auction in multiple ways: 1) It takes into account the risk characteristics of the advertisers. 2) For obscure keywords, the auctioneer is unlikely to have a very sharp prior on the click-probabilities. In such situations, the hybrid auction can result in significantly higher revenue. 3) An advertiser who believes that its click-probability is much higher than the auctioneer’s estimate can use per-impression bids to correct the auctioneer’s prior without incurring any extra cost. 4) The hybrid auction can allow the advertiser and auctioneer to implement complex dynamic programming strategies. As Internet commerce matures, we need more sophisticated pricing models to exploit all the information held by each of the participants. We believe that hybrid auctions could be an important step in this direction.
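As a rough illustration of the idea in this abstract, the sketch below shows one way a single-slot hybrid auction could be scored and priced. It is an assumption-laden toy, not the mechanism proved truthful in the talk: the names HybridBid, effective_bid and run_hybrid_auction are hypothetical, the auctioneer's click-probability estimate q is taken as given, and second-price payment is assumed.

```python
from dataclasses import dataclass


@dataclass
class HybridBid:
    advertiser: str
    per_impression: float  # amount offered each time the ad is shown
    per_click: float       # amount offered each time the ad is clicked


def effective_bid(bid, q):
    """Per-impression value of a bid, given the auctioneer's click-probability estimate q.

    Assumption: the auctioneer takes whichever of the two bids is worth more per impression.
    """
    return max(bid.per_impression, bid.per_click * q)


def run_hybrid_auction(bids, ctr_estimates):
    """Toy single-slot auction. Returns (winner, pricing mode, price)."""
    ranked = sorted(
        bids,
        key=lambda b: effective_bid(b, ctr_estimates[b.advertiser]),
        reverse=True,
    )
    winner = ranked[0]
    q = ctr_estimates[winner.advertiser]
    # Second-highest effective bid, expressed per impression (0 if there is no competitor).
    if len(ranked) > 1:
        runner_up = ranked[1]
        second = effective_bid(runner_up, ctr_estimates[runner_up.advertiser])
    else:
        second = 0.0
    if winner.per_impression >= winner.per_click * q:
        # The per-impression bid dominates: charge per impression.
        return winner.advertiser, "per-impression", second
    # Otherwise charge per click; dividing by q converts the per-impression
    # price into a per-click price (q > 0 is guaranteed in this branch).
    return winner.advertiser, "per-click", second / q


bids = [
    HybridBid("A", per_impression=0.05, per_click=1.0),
    HybridBid("B", per_impression=0.0, per_click=2.0),
]
# The auctioneer believes both ads have a 2% click probability.
print(run_hybrid_auction(bids, {"A": 0.02, "B": 0.02}))
```

In this toy run, advertiser A would lose on its per-click bid alone (1.0 x 0.02 is less than 2.0 x 0.02), but wins by placing a per-impression bid that reflects its own higher belief about its click-probability, which is the situation described in point 3 of the abstract.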

AdPredictor – Large Scale Bayesian Click-Through Rate Prediction in Microsoft’s Bing Search Engine
Thore Graepel and Joaquin Quiñonero Candela
Abstract
In the past years online advertising has grown at least an order of magnitude faster than advertising on all other media. Bing and Yahoo! have recently joined forces: all ads on both search engines are now served by Microsoft adCenter and all search results on Yahoo! are powered by Bing. Accurate predictions of the probability that a user clicks on an advertisement for a given query increase the efficiency of the ads network and benefit all three parties involved: the user, the advertiser, and the search engine. This talk presents the core machine learning model used by Microsoft adCenter for click prediction: an online Bayesian probabilistic classification model that has the ability to learn efficiently from terabytes of web usage data. The model explicitly represents uncertainty allowing for fully probabilistic predictions: 2 positives out of 10 instances or 200 out of 1000 both give an average of 20%, but in the first case the uncertainty about the prediction is larger. We discuss some challenges in machine learning for online systems, such as valid metrics, causal loops and biases in the training data.
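The point about 2 out of 10 versus 200 out of 1000 can be made concrete with a small calculation. The sketch below uses a Beta-Bernoulli model with a uniform prior purely as an illustration; this is a swapped-in textbook model, not the model described in the talk, and the function name beta_posterior is hypothetical.

```python
import math


def beta_posterior(clicks, impressions, prior_a=1.0, prior_b=1.0):
    """Posterior mean and standard deviation of a click probability.

    Assumes a Beta(prior_a, prior_b) prior and independent Bernoulli clicks.
    """
    a = prior_a + clicks
    b = prior_b + (impressions - clicks)
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(variance)


for clicks, impressions in [(2, 10), (200, 1000)]:
    mean, std = beta_posterior(clicks, impressions)
    print(f"{clicks}/{impressions}: mean={mean:.4f}, std={std:.4f}")
```

Under these assumptions both cases have an empirical rate of 20%, but the posterior standard deviation is about 0.12 for 2 out of 10 and about 0.013 for 200 out of 1000, so the first estimate is far less certain.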

Click Modeling in Search Advertising: Challenges and Solutions
Jianchang Mao (Yahoo! Labs)
Abstract
Sponsored search is an important form of online advertising that serves ads matching a user’s query on the search results page. The goal is to select an optimal placement of eligible ads to maximize a total utility function that captures the expected revenue, user experience and advertiser return on investment. Most search engines use a pay-per-click model where advertisers pay the search engine a cost determined by an auction mechanism (e.g., generalized second price) only when users click on their ad. In this case, the expected revenue is directly tied to the probability of a click on ads. Clicks are also often used as a proxy for measuring search user experience, and are a traffic driver for advertisers. Therefore, estimation of the probability of click is the central problem in sponsored search. It affects ranking, placement, quality filtering and price of ads.
Estimating click probability given a query-ad-user tuple is a challenging statistical modeling problem for a large variety of reasons, including click sparsity for the long tail of query-ad-user tuples, noisy clicks, missing data, dynamic and seasonal effects, strong position bias, selection bias, and externalities (context of an ad being displayed). In this talk, I will provide an overview of some of the machine learning techniques recently developed by the Advertising Sciences team at Yahoo! Labs to deal with those challenges in click modeling. Specifically, I will briefly describe: (i) a temporal click model for estimating positional bias, externalities, and unbiased user-perceived ad quality in a combined model; (ii) techniques for reducing sparsity by aggregating click history for sub-queries extracted with a CRF model and by leveraging data hierarchies; and (iii) use of a generative model for handling missing click history features. The talk is intended to give a flavor of how machine learning techniques can help solve some of the challenging click modeling problems arising in online advertising.
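To see why click probability affects both ranking and price, consider a minimal sketch of a generalized second price auction in which ads are ranked by bid times predicted click probability. This is a common textbook formulation and only an assumption here; it is not a description of any particular search engine's system, and the function name gsp_auction is hypothetical.

```python
def gsp_auction(bids, pctr, num_slots):
    """Toy generalized second price auction.

    bids: dict mapping advertiser -> per-click bid
    pctr: dict mapping advertiser -> predicted click probability (assumed > 0)
    Returns a list of (advertiser, price per click), best slot first.
    """
    # Rank by expected revenue per impression: bid * predicted click probability.
    ranked = sorted(bids, key=lambda a: bids[a] * pctr[a], reverse=True)
    results = []
    for i, adv in enumerate(ranked[:num_slots]):
        if i + 1 < len(ranked):
            nxt = ranked[i + 1]
            # Smallest per-click price that keeps this ad ranked above the next one.
            price = bids[nxt] * pctr[nxt] / pctr[adv]
        else:
            price = 0.0  # no competitor below; a real system would apply a reserve price
        results.append((adv, price))
    return results


bids = {"A": 2.0, "B": 1.0, "C": 3.0}
pctr = {"A": 0.10, "B": 0.30, "C": 0.05}
for adv, price in gsp_auction(bids, pctr, num_slots=2):
    print(f"{adv}: pays {price:.2f} per click")
```

In this example B has the lowest bid but the highest predicted click probability, so it takes the top slot, and an error in any predicted click probability would change both the ordering and the prices.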
Dr. Jianchang (JC) Mao is a Vice President and the head of Advertising Sciences at Yahoo! Labs, overseeing the R&D of advertising technologies and products, including Search Advertising, Contextual Advertising, Display Advertising, Targeting, and Categorization. He was also a Science/Engineering director responsible for development of backend technologies for several Yahoo! Social Search products, including Y! Answers and Y! MyWeb (Social Bookmarks). Prior to joining Yahoo!, Dr. Mao was Director of Emerging Technologies & Principal Architect at Verity Inc., a leader in Enterprise Search (acquired by Autonomy), from 2000 to 2004. Prior to this, Dr. Mao was a research staff member at the IBM Almaden Research Center from 1994 to 2000. Dr. Mao’s research interests include Machine Learning, Data Mining, Information Retrieval, Computational Advertising, Social Networks, Pattern Recognition and Image Processing. He received an Honorable Mention Award in ACM KDD Cup 2002, the IEEE Transactions on Neural Networks Outstanding Paper Award in 1996, and an Honorable Mention Award from the International Pattern Recognition Society in 1993. Dr. Mao served as an associate editor of the IEEE Transactions on Neural Networks, 1999-2000. He received his Ph.D. degree in Computer Science from Michigan State University in 1994.

Digital Advertising: Going from Broadcast to Personalized Advertising
James G. Shanahan (Independent Consultant)
Abstract
Online advertising is a form of promotion that uses the Internet and World Wide Web for the express purpose of delivering marketing messages to attract customers. Examples of online advertising include text ads that appear on search engine results pages, banner ads, in-text ads, or Rich Media ads that appear on regular web pages, portals or applications. Since its inception over 15 years ago, online advertising has grown rapidly and currently accounts for 10% of the overall advertising spend (which is approximately $600 billion worldwide). A large part of the more recent success in this field has come from the following key factors:
* Personalization: offline advertising (via broadcast TV, radio, newspaper etc.) is largely a broadcast form of communication, whereas digital advertising is much more targeted and thus enables a personalized, and possibly informative, message to consumers.
* Interactivity: internet advertising is becoming increasingly interactive with the advent of new forms of advertising such as social advertising; this enables advertisers and consumers to operate in a more conversant manner.
* Engagement: consumers are spending more time online than with any other form of media thereby enabling a broader reach and deeper connection with consumers.
* Explainability: advertisers are beginning to understand their consumers better.
This shift in focus in digital advertising from location (i.e., publisher web pages) to personalization has brought with it numerous challenges, some of which have received a lot of research attention in the data mining and machine learning communities over the past 10-20 years. In this talk I will review, along the dimensions outlined above, some of these key technical problems and challenges that arise when advertising becomes personal. This will be done within the context of the elaborate (and ever-evolving) ecosystems of modern-day digital advertising, where one has to capture, store, and process petabytes of data within the constraints of a sometimes sequential workflow. The ultimate goal is to provide millisecond-based decision-making at each step of this workflow that enables customizable and engaging consumer experiences.

Machine Learning for Advertiser Engagement
Tao Qin (Microsoft Research Asia)
Abstract
Advertiser engagement, whose goal is to attract more advertisers, make them loyal to the ad platform, and make them willing to spend more money on (online) advertising, is very important for an ad platform to boost its long-term revenue. Industry has paid more and more attention to advertiser engagement. For example, many search engines have provided tools to help advertisers, including keyword suggestion, traffic (number of impressions/clicks) estimation, and bid suggestion. However, from the research point of view, the effort on advertiser engagement is still limited.
In this talk, we discuss the challenges in advertiser engagement, especially from the machine learning perspective. Machine learning algorithms can be used in many aspects of online advertising, such as CTR prediction. We propose a number of principles that should be considered when using machine learning technologies to help advertiser engagement.
(1) Accurate. The results of learning algorithms should be as accurate as possible. This principle is the same as that in other machine learning tasks.
(2) Socially fair. The learning algorithms should promote diversity and be fair even to tail advertisers. In this way, more advertisers will feel engaged and the entire ads eco-system will become healthier.
(3) Understandable. The evaluation metrics and learned models should be easy to interpret. In this way, it is easier for advertisers to diagnose their campaigns and identify the key aspects to improve. This will also make the ad platform more transparent to advertisers and increase their trust in the ad platform.
(4) Actionable. The learning algorithms should provide actionable suggestions/feedback to advertisers. In this way, the advertisers can take effective actions to improve their performance, and therefore stay with the ad platform more loyally.
We will show several example problems in online advertising (such as effectiveness evaluation and auction mechanism) and discuss possible solutions based on the above principles.
This is joint work with Bin Gao and Tie-Yan Liu.