Detecting Fake News with Weak Supervision

IEEE Intelligent Systems

Limited labeled data is becoming one of the largest bottlenecks for supervised
learning systems. This is especially the case for many real-world tasks where large-scale labeled
examples are either too expensive to acquire or unavailable due to privacy or data access
constraints. Weak supervision has been shown to be effective in mitigating the scarcity of labeled
data by leveraging weak labels or injecting constraints from heuristic rules and/or extrinsic
knowledge sources. Social media has little labeled data but possesses unique characteristics
that make it suitable for generating weak supervision, resulting in a new type of weak
supervision, i.e., weak social supervision. In this article, we illustrate how various aspects of
social media can be used as weak social supervision. Specifically, we use recent research on
fake news detection as the use case, where social engagements are abundant but annotated
examples are scarce, to show that weak social supervision is effective under labeled-data
scarcity. This article opens the door to learning with weak social supervision for
similar emerging tasks when labeled data is limited.