Publication detail
ŽMOLÍKOVÁ, K.; DELCROIX, M.; KINOSHITA, K.; HIGUCHI, T.; NAKATANI, T.; ČERNOCKÝ, J.
Original title
Optimization of Speaker-aware Multichannel Speech Extraction with ASR Criterion
Type
Conference proceedings paper indexed in WoS or Scopus
Language
English
Original abstract
This paper addresses the problem of recognizing speech corrupted by overlapping speakers in a multichannel setting. To extract a target speaker from the mixture, we use a neural network based beamformer which uses masks estimated by a neural network to compute statistically optimal spatial filters. Following our previous work, we inform the neural network about the target speaker using information extracted from an adaptation utterance, enabling the network to track the target speaker. While in the previous work, this method was used to separately extract the speaker and then pass such preprocessed speech to a speech recognition system, here we explore training both systems jointly with a common speech recognition criterion. We show that integrating the two systems and training for the final objective improves the performance. In addition, the integration enables further sharing of information between the acoustic model and the speaker extraction system, by making use of the predicted HMM-state posteriors to refine the masks used for beamforming.
Keywords
Speaker extraction, joint training, speaker-adaptive neural network, beamforming, speech recognition
Authors
ŽMOLÍKOVÁ, K.; DELCROIX, M.; KINOSHITA, K.; HIGUCHI, T.; NAKATANI, T.; ČERNOCKÝ, J.
Published
15 April 2018
Publisher
IEEE Signal Processing Society
Place
Calgary
ISBN
978-1-5386-4658-8
Book
Proceedings of ICASSP 2018
Pages from
6702
Pages to
6706
Page count
5
URL
https://www.fit.vut.cz/research/publication/11722/
BibTeX
@inproceedings{BUT155044,
  author="Kateřina {Žmolíková} and Marc {Delcroix} and Keisuke {Kinoshita} and Takuya {Higuchi} and Tomohiro {Nakatani} and Jan {Černocký}",
  title="Optimization of Speaker-aware Multichannel Speech Extraction with ASR Criterion",
  booktitle="Proceedings of ICASSP 2018",
  year="2018",
  pages="6702--6706",
  publisher="IEEE Signal Processing Society",
  address="Calgary",
  doi="10.1109/ICASSP.2018.8461533",
  isbn="978-1-5386-4658-8",
  url="https://www.fit.vut.cz/research/publication/11722/"
}
Documents
zmolikova_icassp2018_0006702.pdf