Abstract
In this paper, we propose a new recurrent neural network (RNN)-based single-channel speech enhancement framework for off-line wind noise reduction. To adequately represent the highly non-stationary characteristics of wind noise, we first adopt a deep bi-directional long short-term memory (DBLSTM) structure. However, its enhanced output becomes muffled due to the spectral over-smoothing effect. To overcome this problem, we propose a new DBLSTM-based speech enhancement structure that internally incorporates the speech and noise power estimation processes in the spectral filtering framework. Furthermore, we propose a structure with an additional internal constraint of minimizing log a priori SNR, which provides an efficient learning mechanism. Experimental results show that the proposed method improves source-to-distortion ratio (SDR) by 6.9 dB and perceptual evaluation of speech quality (PESQ) by 0.24 in comparison to the conventional DBLSTM-based system.
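The spectral filtering idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the filter equations, so everything here is an assumption: the gain is taken to be the classical Wiener form G = ξ / (1 + ξ), where ξ is the a priori SNR (speech power over noise power), and the "log a priori SNR" constraint is read as a mean squared error between estimated and reference log SNR. The function names `wiener_gain` and `log_snr_error` are hypothetical, and in the actual system the power estimates would come from the network rather than from fixed arrays.

```python
import numpy as np


def wiener_gain(speech_power, noise_power, eps=1e-12):
    """Per-bin spectral gain G = xi / (1 + xi), xi = a priori SNR (assumed form)."""
    xi = speech_power / (noise_power + eps)
    return xi / (1.0 + xi)


def log_snr_error(speech_power, noise_power,
                  ref_speech_power, ref_noise_power, eps=1e-12):
    """Mean squared error between estimated and reference log a priori SNR.

    This is one possible reading of the abstract's internal constraint,
    not the paper's confirmed loss.
    """
    est = np.log(speech_power + eps) - np.log(noise_power + eps)
    ref = np.log(ref_speech_power + eps) - np.log(ref_noise_power + eps)
    return float(np.mean((est - ref) ** 2))


# Toy data: one frame, three frequency bins (values are illustrative only).
speech_power = np.array([[4.0, 1.0, 0.0]])
noise_power = np.array([[1.0, 1.0, 1.0]])
noisy_mag = np.array([[2.0, 1.5, 1.0]])

gain = wiener_gain(speech_power, noise_power)
enhanced_mag = gain * noisy_mag  # filtered magnitude spectrum
```

Because the gain is computed from explicit speech and noise power estimates, bins dominated by noise are attenuated strongly while high-SNR bins pass nearly unchanged.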
Original language | English |
---|---|
Title of host publication | 2017 Hands-Free Speech Communications and Microphone Arrays, HSCMA 2017 - Proceedings |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 41-45 |
Number of pages | 5 |
ISBN (Electronic) | 9781509059256 |
Publication status | Published - 2017 Apr 10 |
Event | 2017 Hands-Free Speech Communications and Microphone Arrays, HSCMA 2017 - San Francisco, United States. Duration: 2017 Mar 1 → 2017 Mar 3 |
Publication series
Name | 2017 Hands-Free Speech Communications and Microphone Arrays, HSCMA 2017 - Proceedings |
---|
Other
Other | 2017 Hands-Free Speech Communications and Microphone Arrays, HSCMA 2017 |
---|---|
Country/Territory | United States |
City | San Francisco |
Period | 2017 Mar 1 → 2017 Mar 3 |
Bibliographical note
Publisher Copyright: © 2017 IEEE.
All Science Journal Classification (ASJC) codes
- Artificial Intelligence
- Computer Networks and Communications
- Acoustics and Ultrasonics
- Instrumentation
- Communication