High-Accuracy Neural-Network Models for Speech Enhancement

In this talk we will discuss our recent work on AI techniques that improve the quality of audio signals for both machine understanding and sensory perception. Our best models use convolutional-recurrent neural networks. They improve the PESQ of noisy signals by 0.6 and boost SNRs by up to 34 dB in challenging capture conditions. We will compare the performance of our models with classical approaches based on statistical signal processing and with existing state-of-the-art data-driven methods that use DNNs. We will also discuss preliminary results from semi-supervised learning approaches that further improve enhancement performance.

Date:
Speakers:
Han Zhao
Affiliation:
Carnegie Mellon University