WarningBird: Detecting Suspicious URLs in Twitter Stream

19th Network and Distributed System Security Symposium (NDSS 2012) |

Twitter can suffer from malicious tweets containing suspicious URLs for spam, phishing, and malware distribution. Previous Twitter spam detection schemes have used account features such as the ratio of tweets containing URLs and the account creation date, or relation features in the Twitter graph. Malicious users, however, can easily fabricate account features. Moreover, extracting relation features from the Twitter graph is time and resource consuming. Previous suspiciousURLdetectionschemeshaveclassifiedURLsusingseveralfeaturesincludinglexicalfeaturesofURLs,URL redirection, HTML content, and dynamic behavior. However, evading techniques exist, such as time-based evasion and crawler evasion. In this paper, we propose WARNINGBIRD, a suspicious URL detection system for Twitter. Instead of focusing on the landing pages of individual URLs in each tweet, we consider correlated redirect chains of URLs in a number of tweets. Because attackers have limited resources and thus have to reuse them, a portion of their redirect chains will be shared. We focus on these shared resources to detect suspicious URLs. We have collected a large number of tweets from the Twitter public timeline and trained a statistical classifier with features derived from correlated URLs and tweet context information. Our classifier has high accuracy and low false-positive and false negative rates. We also present WARNINGBIRD as a real-time system for classifying suspicious URLs in the Twitter stream.