gms | German Medical Science

25. Jahrestagung der Deutschen Gesellschaft für Audiologie

Deutsche Gesellschaft für Audiologie e. V.

01.03. - 03.03.2023, Köln

Subjective evaluation of DNN-assisted WPE dereverberation algorithms with end-to-end optimization

Meeting Abstract

  • Jean-Marie Lemercier (presenting author) - Universität Hamburg, Signal Processing (SP), Hamburg, DE
  • Joachim Thiemann - Advanced Bionics GmbH, Hannover, DE
  • Raphael Koning - Advanced Bionics, Hannover, DE
  • Timo Gerkmann - Universität Hamburg, Signal Processing (SP), Hamburg, DE

Deutsche Gesellschaft für Audiologie e.V.. 25. Jahrestagung der Deutschen Gesellschaft für Audiologie. Köln, 01.-03.03.2023. Düsseldorf: German Medical Science GMS Publishing House; 2023. Doc040

doi: 10.3205/23dga040, urn:nbn:de:0183-23dga0405

Published: 1 March 2023

© 2023 Lemercier et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. For license information, see http://creativecommons.org/licenses/by/4.0/.


Text

Reverberant environments pose challenges for people with hearing impairments, especially for users of cochlear implants. Algorithms that enhance speech in reverberant conditions have therefore been developed. One such algorithm is Weighted Prediction Error (WPE) [1], which requires an estimate of the anechoic speech power spectral density (PSD). In recent work [2], we developed an online-capable version of WPE in which the PSD estimate is computed by a deep neural network (DNN). Rather than being trained to estimate the anechoic speech PSD directly, the DNN is trained to optimize the WPE output with respect to the desired criterion. We term this approach E2Ep-WPE.

The WPE algorithm proved very effective at suppressing early reflections and moderate reverberation, but it cannot remove the late reverberant tail of the room impulse response. We therefore extended the algorithm with a post-processing stage that uses a second DNN. We label this enhancement algorithm E2Ep-WPE+DNN-PF.

This contribution reports an initial subjective assessment of the proposed E2Ep-WPE and E2Ep-WPE+DNN-PF algorithms, using a MUSHRA-like presentation to normal-hearing subjects. This comparison allows us to evaluate the possible benefit of the post-filter, which incurs additional computational complexity but no additional delay. Our benchmark also includes GaGNet [3], a state-of-the-art DNN-based enhancement algorithm and the successor of the winning approach of the 2021 DNS challenge.
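
To make the role of the PSD estimate in WPE concrete, the following is a minimal sketch of the classical single-channel, offline (batch) WPE filter for one STFT frequency bin. It is not the online, DNN-supported implementation evaluated in this abstract. The function name `wpe_one_bin`, the parameters `delay`, `taps`, and `eps`, and their default values are assumptions chosen for illustration. The `psd` argument stands in for whatever PSD estimate is available, for example the output of a DNN.

```python
import numpy as np


def wpe_one_bin(y, psd, delay=3, taps=10, eps=1e-8):
    """Illustrative batch WPE for a single frequency bin (hypothetical helper).

    y   : complex STFT coefficients over time, shape (T,)
    psd : estimate of the anechoic speech PSD per frame, shape (T,)
    Returns the dereverberated STFT coefficients, shape (T,).
    """
    y = np.asarray(y, dtype=complex)
    num_frames = y.shape[0]

    # Row t holds the delayed past frames y[t-delay], ..., y[t-delay-taps+1].
    y_tilde = np.zeros((num_frames, taps), dtype=complex)
    for k in range(taps):
        shift = delay + k
        if shift < num_frames:
            y_tilde[shift:, k] = y[:num_frames - shift]

    # Frames with low estimated speech power receive a larger weight.
    weights = 1.0 / np.maximum(np.asarray(psd, dtype=float), eps)
    weighted = y_tilde.T * weights            # shape (taps, T)

    # PSD-weighted correlation matrix and cross-correlation vector.
    corr_matrix = weighted @ y_tilde.conj()   # shape (taps, taps)
    corr_vector = weighted @ y.conj()         # shape (taps,)

    # Solve the weighted least-squares problem for the prediction filter.
    g = np.linalg.solve(corr_matrix + eps * np.eye(taps), corr_vector)

    # Subtract the predicted late reverberation from the observation.
    return y - y_tilde @ g.conj()
```

In a complete system this function would be applied independently to every frequency bin of the STFT. The quality of `psd` determines the weighting of each frame, which is why the source of the PSD estimate matters for the dereverberation result.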


References

1. Nakatani T, Yoshioka T, Kinoshita K, Miyoshi M, Juang B. Blind speech dereverberation with multi-channel linear prediction based on short time Fourier transform representation. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 2008 Mar 31 – Apr 04; Las Vegas, NV, USA. New York: Institute of Electrical and Electronics Engineers (IEEE); 2008. p. 85–8.
2. Lemercier JM, Thiemann J, Koning R, Gerkmann T. Customizable end-to-end optimization of online neural network-supported dereverberation for hearing devices. 2022 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP); 2022 May 22–27; Singapore, Singapore. New York: Institute of Electrical and Electronics Engineers (IEEE); 2022. p. 171–5.
3. Li A, Liu W, Luo X, Yu G, Zheng C, Li X. A simultaneous denoising and dereverberation framework with target decoupling. ISCA Interspeech; 2021 Aug 30 – Sep 3; Brno, Czech Republic. p. 2801–5.