Publication detail
ŽMOLÍKOVÁ, K.; DELCROIX, M.; KINOSHITA, K.; HIGUCHI, T.; OGAWA, A.; NAKATANI, T.
Original title
Speaker-aware neural network based beamformer for speaker extraction in speech mixtures
Type
paper in proceedings indexed in WoS or Scopus
Language
English
Original abstract
This article is about the speaker-aware neural network based beamformer for speaker extraction in speech mixtures. In this work, we address the problem of extracting one target speaker from a multichannel mixture of speech. We use a neural network to estimate masks to extract the target speaker and derive beamformer filters using these masks, in a similar way to the recently proposed approach for extraction of speech in the presence of noise. To overcome the permutation ambiguity of neural network mask estimation, which arises in the presence of multiple speakers, we propose to inform the neural network about the target speaker so that it learns to follow the speaker characteristics through the utterance. We investigate and compare different methods of passing the speaker information to the network, such as making one layer of the network dependent on speaker characteristics. Experiments on mixtures of two speakers demonstrate that the proposed scheme can track and extract a target speaker for both closed and open speaker set cases.
Keywords
speaker extraction, speaker-aware neural network, beamforming, mask estimation
Authors
ŽMOLÍKOVÁ, K.; DELCROIX, M.; KINOSHITA, K.; HIGUCHI, T.; OGAWA, A.; NAKATANI, T.
Published
20 August 2017
Publisher
International Speech Communication Association
Place
Stockholm
ISSN
1990-9772
Periodical
Proceedings of Interspeech
Volume
2017
Issue
08
Country
France
Pages from
2655
Pages to
2659
Number of pages
5
URL
http://www.isca-speech.org/archive/Interspeech_2017/pdfs/0667.PDF
BibTeX
@inproceedings{BUT144496,
  author="Kateřina {Žmolíková} and Marc {Delcroix} and Keisuke {Kinoshita} and Takuya {Higuchi} and Atsunori {Ogawa} and Tomohiro {Nakatani}",
  title="Speaker-aware neural network based beamformer for speaker extraction in speech mixtures",
  booktitle="Proceedings of Interspeech 2017",
  year="2017",
  journal="Proceedings of Interspeech",
  volume="2017",
  number="08",
  pages="2655--2659",
  publisher="International Speech Communication Association",
  address="Stockholm",
  doi="10.21437/Interspeech.2017-667",
  issn="1990-9772",
  url="http://www.isca-speech.org/archive/Interspeech_2017/pdfs/0667.PDF"
}
Documents
zmolikova_interspeech2017_IS170667.pdf