Optimizing peer referrals for public awareness using contextual bandits

Programs that reward people for referring their friends are increasingly used to raise awareness about important topics. With a fixed budget for referral incentives, a natural goal for such programs is to maximize the number of people reached. Unlike in a typical influence maximization problem, however, the social network of potential adopters is unknown a priori. Further, people’s response to a referral incentive can depend on factors such as their preference for the content, the size of their social network, and their estimated value of sharing. We therefore introduce an incentive-aware variant of the influence maximization problem and formalize it in an online learning setting. Given the lack of initial information about the social network or about how people respond to referral incentives, we use an explore-exploit strategy and present a contextual bandit agent, CoBBI, that optimizes the incentive for each user by learning from the results of its past actions. We demonstrate the effectiveness of CoBBI on data from a real-world referral program for raising awareness of land rights among farmers. Compared to a wide range of baselines, CoBBI is consistently more cost-effective across a broad range of influence probabilities and of people’s responses to incentives.
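
To make the explore-exploit idea concrete, the following is a minimal sketch of a generic epsilon-greedy contextual bandit that picks one of a few discrete incentive levels per user. It is not the CoBBI algorithm itself: the class name, the per-incentive linear models, the feature vector, and the reward signal are all assumptions introduced here for illustration only.

```python
import random


class EpsilonGreedyIncentiveBandit:
    """Hypothetical sketch: one linear reward model per incentive level."""

    def __init__(self, incentives, n_features, epsilon=0.1, lr=0.05, seed=0):
        self.incentives = list(incentives)  # candidate incentive amounts (assumed discrete)
        self.epsilon = epsilon              # probability of exploring a random incentive
        self.lr = lr                        # step size for the online update
        self.rng = random.Random(seed)
        self.weights = [[0.0] * n_features for _ in self.incentives]

    def predict(self, arm, context):
        # Estimated reward of offering incentive `arm` to a user with these features.
        return sum(w * x for w, x in zip(self.weights[arm], context))

    def choose(self, context):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.incentives))
        scores = [self.predict(a, context) for a in range(len(self.incentives))]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm, context, reward):
        # One stochastic-gradient step on squared error for the chosen arm only.
        error = reward - self.predict(arm, context)
        for i, x in enumerate(context):
            self.weights[arm][i] += self.lr * error * x
```

In use, an agent of this kind would call `choose` with a user's feature vector, offer `incentives[arm]`, observe some reward (for example, referrals obtained net of cost, which is again an assumption), and pass it to `update`.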