LDTM: A Latent Document Type Model for Cumulative Citation Recommendation
- Jingang Wang
- Dandan Song
- Zhiwei Zhang
- Lejian Liao
- Luo Si
- Chin-Yew Lin
Empirical Methods in Natural Language Processing
Published by Association for Computational Linguistics
This paper studies Cumulative Citation Recommendation (CCR): given an entity in a knowledge base, how to effectively detect its potential citations in high-volume text streams. Most previous approaches treat all kinds of features indiscriminately when building a single global relevance model, so the prior knowledge embedded in documents cannot be exploited adequately. To address this problem, we propose a latent document type discriminative model that introduces a latent layer to capture the correlations between documents and their underlying types. The model can better adapt to different types of documents and remains flexible when dealing with a broad range of document types. An extensive set of experiments has been conducted on the TREC-KBA-2013 dataset, and the results demonstrate that this model yields a significant gain in recommendation quality compared with state-of-the-art approaches.
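The abstract does not give the model's equations, so the following is only an illustrative sketch of one common way to realize a latent-type discriminative layer: a softmax gate over hypothetical document-type features, combined with one logistic relevance scorer per latent type, so that P(relevant | entity, document) = Σ_z P(z | document) · P(relevant | entity, document, z). All names here (`type_features`, `pair_features`, `gate_weights`, `expert_weights`) are assumptions made for illustration and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Logistic function mapping a real score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def relevance_probability(type_features, pair_features,
                          gate_weights, expert_weights):
    """Marginalize relevance over latent document types (assumed form).

    type_features:  features describing the document alone (hypothetical).
    pair_features:  features describing the entity-document pair (hypothetical).
    gate_weights:   one weight vector per latent type, scoring P(z | document).
    expert_weights: one weight vector per latent type, scoring
                    P(relevant | entity, document, z).
    """
    type_probs = softmax([dot(w, type_features) for w in gate_weights])
    expert_probs = [sigmoid(dot(w, pair_features)) for w in expert_weights]
    return sum(pz * pr for pz, pr in zip(type_probs, expert_probs))


if __name__ == "__main__":
    # Two latent types; the weights are arbitrary illustrative values.
    p = relevance_probability(
        type_features=[1.0, 0.0],
        pair_features=[0.5, 2.0],
        gate_weights=[[2.0, 0.0], [0.0, 2.0]],
        expert_weights=[[1.0, 1.0], [-1.0, -1.0]],
    )
    print(round(p, 4))
```

With a single latent type this reduces to an ordinary global logistic relevance model, which is the kind of one-size-fits-all baseline the abstract contrasts against.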