Unsupervised Learning of Word Semantic Embedding using the Deep Structured Semantic Model

MSR-TR-2014-109

Deep neural network (DNN) based natural language processing models rely on a word embedding matrix to transform raw words into vectors. Recently, a deep structured semantic model (DSSM) was proposed to project raw text into a continuous-valued vector space for Web search. In this technical report, we propose learning word embeddings using the DSSM. We show that a DSSM trained on a large body of text produces meaningful word embedding vectors, as demonstrated on semantic word clustering and semantic word analogy tasks.
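To make the notion of a word embedding matrix concrete, the following is a minimal sketch (not from the report; vocabulary, dimensions, and function names are illustrative) of how such a matrix maps word indices to dense vectors:

```python
import numpy as np

# Hypothetical example: a toy vocabulary mapping words to row indices.
vocab = {"deep": 0, "semantic": 1, "model": 2}
embedding_dim = 4

# The embedding matrix E has one row per vocabulary word; here it is
# randomly initialized, whereas in a trained model (e.g. a DSSM) the
# rows are learned from data.
rng = np.random.default_rng(0)
E = rng.standard_normal((len(vocab), embedding_dim))

def embed(words):
    """Look up the embedding vector for each word in the list."""
    return E[[vocab[w] for w in words]]

vectors = embed(["deep", "model"])
print(vectors.shape)  # (2, 4): two words, each a 4-dimensional vector
```

Each raw word is thus replaced by a learned real-valued vector, which downstream DNN layers consume in place of discrete symbols.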