TREC-9 CLIR experiments at MSRCN

  • ,
  • Jian-Yun Nie
  • Jian Zhang
  • Endong Xun
  • Yi Su
  • Ming Zhou
  • Chang-Ning Huang

TREC-9

In TREC-9, we participated in the English-Chinese Cross-Language Information Retrieval (CLIR) track. Our work involved two aspects: finding good methods for Chinese IR, and finding effective means of translation between English and Chinese. For Chinese monolingual retrieval, we investigated the use of different entities as indexes, pseudo-relevance feedback, and length normalization, and examined their impact on Chinese IR. For English-Chinese CLIR, we focused on finding effective ways to translate queries. Our method incorporates three improvements over simple lexicon-based translation: (1) word/term disambiguation using co-occurrence, (2) phrase detection and translation using a statistical language model, and (3) translation coverage enhancement using a statistical translation model. This method is shown to be as effective as a good MT system.
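
To make improvement (1) concrete, the following is a minimal sketch of co-occurrence-based translation disambiguation, not the system described in the paper. It assumes a toy bilingual lexicon and a toy table of Chinese term co-occurrence counts; the names `LEXICON`, `COOCCURRENCE`, `cohesion`, and `translate_query`, and all the data, are hypothetical. For each query, it picks the combination of candidate translations whose terms co-occur most often.

```python
from itertools import product

# Hypothetical toy lexicon: English word -> candidate Chinese translations.
LEXICON = {
    "bank": ["银行", "河岸"],
    "loan": ["贷款"],
    "interest": ["利息", "兴趣"],
}

# Hypothetical symmetric co-occurrence counts between Chinese terms,
# as might be gathered from a Chinese corpus.
COOCCURRENCE = {
    frozenset(["银行", "贷款"]): 120,
    frozenset(["银行", "利息"]): 90,
    frozenset(["贷款", "利息"]): 150,
    frozenset(["河岸", "贷款"]): 1,
    frozenset(["河岸", "兴趣"]): 2,
    frozenset(["银行", "兴趣"]): 3,
    frozenset(["贷款", "兴趣"]): 4,
}


def cohesion(terms):
    """Sum the pairwise co-occurrence counts over a set of candidate terms."""
    total = 0
    for i in range(len(terms)):
        for j in range(i + 1, len(terms)):
            total += COOCCURRENCE.get(frozenset([terms[i], terms[j]]), 0)
    return total


def translate_query(words):
    """Pick one translation per word so that the chosen terms cohere best.

    Words missing from the lexicon are passed through unchanged.
    """
    candidates = [LEXICON.get(word, [word]) for word in words]
    # Exhaustive search over all combinations; fine for short queries only.
    return list(max(product(*candidates), key=cohesion))


print(translate_query(["bank", "loan", "interest"]))
```

With these toy counts, the financial senses of "bank" and "interest" are selected because they co-occur strongly with the translation of "loan". The exhaustive search grows exponentially with query length, so a practical system would likely need a greedy or otherwise approximate selection.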