Text to Speech

Established: November 1, 2018

We are working on neural network-based text to speech (TTS), including acoustic models, vocoders, frontends, and end-to-end text-to-waveform models. Our research has been transferred to the Microsoft Azure TTS service to improve the product experience.

Product Transfer (Azure TTS page: https://azure.microsoft.com/en-us/services/cognitive-services/text-to-speech/)

Paper Publication (Speech demo page: https://speechresearch.github.io/)

  • Jiawei Chen, Xu Tan, Yichong Leng, Jin Xu, Guihua Wen, Tao Qin, Tie-Yan Liu, Speech-T: Transducer for Text to Speech and Beyond, NeurIPS 2021.
  • Xu Tan, Tao Qin, Frank Soong, Tie-Yan Liu, A Survey on Neural Speech Synthesis, arXiv 2021. [Paper] [Article-1] [Article-2] [GitHub]
  • Sang-gil Lee, Heeseung Kim, Chaehun Shin, Xu Tan, Chang Liu, Qi Meng, Tao Qin, Wei Chen, Sungroh Yoon, Tie-Yan Liu, PriorGrad: Improving Conditional Denoising Diffusion Models with Data-Driven Adaptive Prior, arXiv 2021. [Paper]
  • Yuzi Yan, Xu Tan, Bohan Li, Guangyan Zhang, Tao Qin, Sheng Zhao, Yuan Shen, Wei-Qiang Zhang, Tie-Yan Liu, AdaSpeech 3: Adaptive Text to Speech for Spontaneous Style, INTERSPEECH 2021.
  • Yuzi Yan, Xu Tan, Bohan Li, Tao Qin, Sheng Zhao, Yuan Shen, Tie-Yan Liu, AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data, ICASSP 2021. [Paper]
  • Renqian Luo, Xu Tan, Rui Wang, Tao Qin, Jinzhu Li, Sheng Zhao, Enhong Chen, Tie-Yan Liu, LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search, ICASSP 2021. [Paper]
  • Chen Zhang, Yi Ren, Xu Tan, Jinglin Liu, Kejun Zhang, Tao Qin, Sheng Zhao, Tie-Yan Liu, DenoiSpeech: Denoising Text to Speech with Frame-Level Noise Modeling, ICASSP 2021. [Paper]
  • Yichong Leng, Xu Tan, Sheng Zhao, Frank Soong, Xiang-Yang Li, Tao Qin, MBNet: MOS Prediction for Synthesized Speech with Mean-Bias Network, ICASSP 2021. [Paper]
  • Mingjian Chen, Xu Tan, Bohan Li, Yanqing Liu, Tao Qin, Sheng Zhao, Tie-Yan Liu, AdaSpeech: Adaptive Text to Speech for Custom Voice, ICLR 2021. [Paper]
  • Yi Ren, Chenxu Hu, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu, FastSpeech 2: Fast and High-Quality End-to-End Text to Speech, ICLR 2021. [Paper] [Blog]
  • Jiawei Chen, Xu Tan, Jian Luan, Tao Qin, Tie-Yan Liu, HiFiSinger: Towards High-Fidelity Neural Singing Voice Synthesis, arXiv 2020. [Paper]
  • Peiling Lu, Jie Wu, Jian Luan, Xu Tan, Li Zhou, XiaoiceSing: A High-Quality and Integrated Singing Voice Synthesis System, INTERSPEECH 2020. [Paper]
  • Mingjian Chen, Xu Tan, Yi Ren, Jin Xu, Hao Sun, Sheng Zhao, Tao Qin, Tie-Yan Liu, MultiSpeech: Multi-Speaker Text to Speech with Transformer, INTERSPEECH 2020. [Paper]
  • Yi Ren, Xu Tan, Tao Qin, Jian Luan, Zhou Zhao, Tie-Yan Liu, DeepSinger: Singing Voice Synthesis with Data Mined From the Web, KDD 2020. [Paper]
  • Jin Xu, Xu Tan, Yi Ren, Tao Qin, Jian Li, Sheng Zhao, Tie-Yan Liu, LRSpeech: Extremely Low-Resource Speech Synthesis and Recognition, KDD 2020. [Paper] [Blog]
  • Tomoki Hayashi, Ryuichi Yamamoto, Katsuki Inoue, Takenori Yoshimura, Shinji Watanabe, Tomoki Toda, Kazuya Takeda, Yu Zhang, Xu Tan, ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit, ICASSP 2020. [Paper]
  • Yi Ren, Yangjun Ruan, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu, FastSpeech: Fast, Robust and Controllable Text to Speech, NeurIPS 2019. [Paper] [Demo] [Article] [Reddit]
  • Hao Sun, Xu Tan, Jun-Wei Gan, Sheng Zhao, Dongxu Han, Hongzhi Liu, Tao Qin, Tie-Yan Liu, Knowledge Distillation from BERT in Pre-training and Fine-tuning for Polyphone Disambiguation, ASRU 2019. [Paper]
  • Hao Sun, Xu Tan, Jun-Wei Gan, Hongzhi Liu, Sheng Zhao, Tao Qin, Tie-Yan Liu, Token-Level Ensemble Distillation for Grapheme-to-Phoneme Conversion, INTERSPEECH 2019. [Paper]
  • Yi Ren, Xu Tan, Tao Qin, Zhou Zhao, Sheng Zhao, Tie-Yan Liu, Almost Unsupervised Text to Speech and Automatic Speech Recognition, ICML 2019. [Paper] [Demo] [Article] [Blog] [Slides] [Video]

People

Xu Tan

Principal Research Manager

Tao Qin

Senior Principal Research Manager

Rui Wang

Senior Researcher

Renqian Luo

Senior Researcher

Chang Liu

Senior Researcher

Tie-Yan Liu

Distinguished Scientist, Microsoft Research AI for Science