Mining Query Subtopics from Questions in Community Question Answering.

  • Yu Wu
  • Wei Wu
  • Zhoujun Li
  • Ming Zhou

AAAI'15

Published by ACM

This paper proposes mining query subtopics from questions
in community question answering (CQA). The subtopics are
represented as a number of clusters of questions with keywords
summarizing the clusters. The task is unique in that the
subtopics from questions can not only facilitate user browsing
in CQA search, but also describe aspects of queries from
a question-answering perspective. The challenges of the task
include how to group semantically similar questions and how
to find keywords capable of summarizing the clusters. We
formulate the subtopic mining task as a non-negative matrix
factorization (NMF) problem and further extend the NMF model
to incorporate question similarity estimated from CQA metadata
into learning. Compared with existing methods,
our method can jointly optimize question clustering and keyword
extraction and encourage the former task to enhance the
latter. Experimental results on large-scale real-world CQA
datasets show that the proposed method significantly outperforms
existing methods in keyword extraction,
while achieving performance comparable to state-of-the-art
methods in question clustering.
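
To make the NMF formulation concrete, here is a minimal sketch of how a question-term matrix could be factorized so that one factor assigns questions to clusters and the other ranks keywords per cluster. It is not the authors' model: the abstract does not give their objective, so the similarity term below is a standard graph-regularized NMF penalty swapped in as a stand-in, and the function name `nmf_subtopics`, the parameters `lam`, `iters`, and `seed`, the toy vocabulary, and the hand-written similarity matrix `S` are all assumptions made for illustration.

```python
import numpy as np

def nmf_subtopics(X, k, S=None, lam=0.0, iters=300, seed=0, eps=1e-9):
    """Factor X (questions x terms) into W (questions x k) and H (k x terms).

    Assumed objective (a stand-in, not taken from the paper):
        ||X - W H||_F^2 + lam * Tr(W^T (D - S) W)
    where S is a symmetric non-negative question-similarity matrix and
    D is its diagonal degree matrix. Solved with multiplicative updates.
    """
    X = np.asarray(X, dtype=float)
    n, m = X.shape
    rng = np.random.default_rng(seed)
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    S = np.zeros((n, n)) if S is None else np.asarray(S, dtype=float)
    D = np.diag(S.sum(axis=1))
    for _ in range(iters):
        # Update cluster memberships; similar questions are pulled together.
        W *= (X @ H.T + lam * (S @ W)) / (W @ (H @ H.T) + lam * (D @ W) + eps)
        # Update per-cluster term weights.
        H *= (W.T @ X) / ((W.T @ W) @ H + eps)
    return W, H

# Toy data (hypothetical): four questions over a six-term vocabulary.
terms = ["install", "driver", "windows", "price", "cost", "buy"]
X = np.array([
    [2, 1, 1, 0, 0, 0],
    [1, 2, 1, 0, 0, 0],
    [0, 0, 0, 2, 1, 1],
    [0, 0, 0, 1, 1, 2],
])
# Hypothetical metadata similarity: questions 0-1 and 2-3 are related.
S = np.array([
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

W, H = nmf_subtopics(X, k=2, S=S, lam=0.5)
labels = W.argmax(axis=1)  # cluster index for each question
keywords = [[terms[j] for j in np.argsort(-H[c])[:2]] for c in range(2)]
print(labels, keywords)
```

Because both factors come from the same factorization, the cluster assignment in `W` and the keyword ranking in `H` are optimized together, which is the joint behavior the abstract describes; how the paper actually couples them and estimates similarity from metadata is not specified here.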