[2022-May-25] Deep-learning-based Speech Enhancement with Its Application to Assistive Oral Communications Devices

Institute of Information Systems and Applications



Prof. Yu Tsao, Research Fellow

Academia Sinica


Deep-learning-based Speech Enhancement with Its Application to Assistive Oral Communications Devices


13:20-15:00 Wednesday 25-May-2022




Hosted by:

Prof. Chun-Yi Lee


Speech enhancement (SE) is a key component in most speech-related applications. The goal of SE is to enhance speech signals by reducing the distortions caused by additive and convolutive noise, thereby improving the efficacy of human-human and human-machine communication. In this talk, we will review the system architecture and fundamental theories of deep-learning-based SE approaches. Next, we will present more recent advances, including end-to-end and goal-driven SE systems as well as SE systems with improved architectures and feature extraction procedures. Reinforcement-learning-based and generative adversarial network (GAN)-based SE methods will also be presented. Finally, we will discuss some applications of deep-learning-based SE systems, including impaired speech transformation and noise reduction for assistive hearing and speaking devices.


Yu Tsao received the B.S. and M.S. degrees in electrical engineering from National Taiwan University, Taipei, Taiwan, in 1999 and 2001, respectively, and the Ph.D. degree in electrical and computer engineering from the Georgia Institute of Technology, Atlanta, GA, USA, in 2008. From 2009 to 2011, he was a Researcher with the National Institute of Information and Communications Technology, Tokyo, Japan, where he engaged in research and product development in automatic speech recognition for multilingual speech-to-speech translation. He is currently a Research Fellow (Professor) and Deputy Director with the Research Center for Information Technology Innovation, Academia Sinica, Taipei. His research interests include speech and speaker recognition, acoustic and language modeling, audio coding, and bio-signal processing. He is currently an Associate Editor for the IEEE/ACM Transactions on Audio, Speech, and Language Processing and IEEE Signal Processing Letters, and a Distinguished Lecturer of APSIPA. He was the recipient of the Academia Sinica Career Development Award in 2017, the National Innovation Award in 2018, 2019, and 2020, the Future Tech Breakthrough Award in 2019, and the Outstanding Elite Award of the Chung Hwa Rotary Educational Foundation for 2019–2020. He is the corresponding author of a paper that received the 2021 IEEE Signal Processing Society (SPS) Young Author Best Paper Award.

All faculty and students are welcome to join.