Joint lab marks 10 years of collaborative research in natural language processing

The following is the first of three blogs on the contributions of the Microsoft Research Asia Joint Lab Program (JLP), which recently celebrated its tenth anniversary. The JLP brings together the resources of Microsoft Research and major Chinese universities, facilitating collaboration on state-of-the-art research, academic exchange, and talent incubation. This blog focuses on the Microsoft-Harbin Institute of Technology joint lab (Microsoft-HIT; officially the China Ministry of Education–Microsoft Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology).

Professor Sheng Li talks about joint labs at the Microsoft Research Asia Summer School, hosted by the Microsoft-HIT lab in 2005.

Think of countries that have more than one official language. Which ones come to mind? Canada, with two official tongues? Switzerland, with four? How about China, which has no fewer than eight official languages and more than 50 unofficial but widely spoken indigenous tongues? Each of these languages is cherished as a cultural treasure in China, but the multiplicity of minority languages seriously impedes economic, technological, scientific, and educational exchanges between minority groups and the Mandarin-speaking Han, who make up a majority of China’s population.

Resolving this linguistic tangle is exactly the sort of challenge that prompted the creation of the Microsoft Research Asia Joint Lab Program (JLP), and it is the research focus of Microsoft-HIT. Since 2004, Microsoft-HIT researchers have published over 500 academic journal papers and, during just the last five years, presented more than 30 papers at such high-level events as the ACM-SIGIR Conference and the International Joint Conference on Artificial Intelligence (IJCAI).

The fruits of this labor can be seen in a Microsoft-HIT project called Minority Language Machine Translation. The project’s goal is to bridge the linguistic and cultural gulfs that separate different ethnic and national groups, both in China and around the world, and potentially to help preserve endangered minority languages. The project prototype is based on Microsoft Research’s Microsoft Translator Hub, a platform for machine translation between different languages. Utilizing the Microsoft Azure cloud-computing service, the prototype allows users to upload language and translation data and thus build a repository of lexical and grammatical information that can facilitate bilingual translation. While the work to date has focused on machine translation between Mandarin, English, and Uyghur, the underlying principles can be applied to translating between any two languages.

But this project isn’t the only focus of Microsoft-HIT. The joint lab also aims to serve as a talent incubator, mentoring the young researchers who will be the leaders of tomorrow. Microsoft-HIT not only employs a large number of the university’s faculty and graduate students but also holds an annual summer seminar on natural language processing. Since 2004, the summer seminar has given more than 2,000 students an opportunity to develop their skills and has laid the foundation for advanced research in language processing and speech technology.

Professor Sheng Li, seen here at the 2014 Microsoft Research Asia Faculty Summit, was instrumental in establishing the Microsoft-HIT joint lab.

Although the Microsoft-HIT joint lab dates from 2004, its antecedents stretch back to the 1990s, when Microsoft Research Asia worked with Harbin Institute of Technology professor Sheng Li to set up a laboratory on machine translation. In 2000, it became one of the first labs in the Microsoft Research Joint Lab Program, and in 2004 the Chinese Ministry of Education (MOE) accorded official recognition to this joint effort, designating it an MOE-Microsoft Key Laboratory.

Professor Li, who is still deeply involved in the joint lab, credits it with having provided valuable experience to many young faculty members and promising students. He notes that many of these talented researchers have gone on to careers in related industries, but that a significant number have chosen to stay in the joint lab as either HIT professors or Microsoft researchers.

With the past 10 years of this program as a guide, we look forward to the next decade and beyond, confident that the Microsoft Research-HIT joint lab will foster even greater talent cultivation and research collaboration.

Tim Pan, Director of University Relations, Microsoft Research Asia
