Semi-Supervised Learning for Acoustic and Prosodic Modeling in Speech Recognition

Semi-supervised learning is a class of machine learning techniques that aims to use unlabeled data to improve on the performance of models trained on labeled data alone. It is especially useful when large amounts of unlabeled data are available at little cost. In this work, I investigate semi-supervised learning algorithms for prosodic and acoustic modeling. In particular, I propose an integrated training framework that uses untranscribed speech to discover prosodic boundaries or to improve phonetic classification and recognition accuracy. I show promising results for Mandarin prosodic boundary detection and a significant improvement in phonetic classification.

Speaker Details

Jui-Ting Huang is a PhD candidate in the statistical speech technology group at the University of Illinois at Urbana-Champaign. She expects to receive her PhD in Electrical and Computer Engineering in December 2011. Under the guidance of Professor Mark Hasegawa-Johnson, her thesis research focuses on semi-supervised learning for acoustic and prosodic modeling. She also maintains broad research interests in multilingual speech processing, natural language processing, and machine learning. She was an intern at Microsoft Research and Google in the summers of 2009 and 2011. Jui-Ting received her BS and MS degrees in Electrical Engineering from National Taiwan University.

Date:
Speakers: Jui-Ting Huang
Affiliation: University of Illinois at Urbana-Champaign

Series: Microsoft Research Talks