LDTM: A Latent Document Type Model for Cumulative Citation Recommendation

  • Jingang Wang
  • Dandan Song
  • Zhiwei Zhang
  • Lejian Liao
  • Luo Si

Empirical Methods in Natural Language Processing

Published by Association for Computational Linguistics

This paper studies Cumulative Citation Recommendation (CCR): given an entity in a knowledge base, how to effectively detect its potential citations in high-volume text streams. Most previous approaches treat all kinds of features uniformly when building a single global relevance model, so the prior knowledge embedded in documents cannot be exploited adequately. To address this problem, we propose a latent document type discriminative model that introduces a latent layer to capture the correlations between documents and their underlying types. The model adapts better to different types of documents and can therefore handle a broad range of document types flexibly. An extensive set of experiments has been conducted on the TREC-KBA-2013 dataset, and the results demonstrate that the model yields a significant gain in recommendation quality compared with the state of the art.
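
The abstract does not spell out the model's form, so the following is only an illustrative sketch of one common way to realize a latent type layer, not the authors' implementation. It assumes a mixture-style formulation in which a softmax over document features gives a soft type assignment P(z | d), each latent type has its own logistic relevance model P(r = 1 | e, d, z), and the final score marginalizes over types. All function names, feature vectors, and weight layouts here are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def relevance_probability(doc_features, pair_features,
                          type_weights, relevance_weights):
    """Hypothetical latent-type relevance score.

    doc_features:      features of the document alone (assumed input)
    pair_features:     features of the entity-document pair (assumed input)
    type_weights:      one weight vector per latent type, for P(z | d)
    relevance_weights: one weight vector per latent type, for P(r=1 | e, d, z)
    """
    # P(z | d): soft assignment of the document to each latent type.
    type_probs = softmax([dot(w, doc_features) for w in type_weights])
    # P(r = 1 | e, d, z): a separate logistic relevance model per type.
    per_type = [sigmoid(dot(w, pair_features)) for w in relevance_weights]
    # Marginalize over the latent type variable z.
    return sum(pz * pr for pz, pr in zip(type_probs, per_type))
```

Under this assumed formulation, a single global relevance model is the special case with one latent type; adding types lets different kinds of documents weight the same entity-document features differently.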