LDTM: A Latent Document Type Model for Cumulative Citation Recommendation

Jingang Wang; Dandan Song; Zhiwei Zhang; Lejian Liao; Luo Si; Chin-Yew Lin

Empirical Methods in Natural Language Processing | September 2015

Published by Association for Computational Linguistics

This paper studies Cumulative Citation Recommendation (CCR): given an entity in a knowledge base, how to effectively detect its potential citations in high-volume text streams. Most previous approaches treat all kinds of features uniformly when building a single global relevance model, so the prior knowledge embedded in documents cannot be exploited adequately. To address this problem, we propose a latent document type discriminative model that introduces a latent layer to capture the correlations between documents and their underlying types. The model adapts better to different types of documents and handles a broad range of document types flexibly. An extensive set of experiments on the TREC-KBA-2013 dataset demonstrates that the model yields a significant gain in recommendation quality over the state of the art.
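
The abstract does not spell out the form of the latent layer. As a rough illustration only, one common way to realize a latent type layer is a mixture of per-type relevance classifiers gated by a softmax over document features. The sketch below assumes that form; the function names, the feature vectors, and the weight layout are all hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def relevance_probability(doc_features, pair_features,
                          type_weights, relevance_weights):
    """Assumed mixture form (hypothetical, for illustration):

        P(relevant | entity, doc)
            = sum_z P(z | doc) * P(relevant | entity, doc, z)

    type_weights[z]      scores latent type z from document features.
    relevance_weights[z] scores relevance under type z from
                         entity-document pair features.
    """
    type_probs = softmax([dot(w, doc_features) for w in type_weights])
    per_type = [sigmoid(dot(w, pair_features)) for w in relevance_weights]
    return sum(p * r for p, r in zip(type_probs, per_type))


if __name__ == "__main__":
    # Two hypothetical latent types, two document features, one pair feature.
    p = relevance_probability(
        doc_features=[1.0, 0.0],
        pair_features=[1.0],
        type_weights=[[2.0, 0.0], [0.0, 2.0]],
        relevance_weights=[[3.0], [-3.0]],
    )
    print(p)
```

In this sketch a document whose features favor the first latent type is scored mostly by the first relevance classifier, which is the kind of type-dependent behavior a single global model cannot express. With all weights set to zero, every type is equally likely and every classifier returns 0.5, so the function returns 0.5.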