Publication detail
DELCROIX, M.; ŽMOLÍKOVÁ, K.; KINOSHITA, K.; ARAKI, S.; OGAWA, A.; NAKATANI, T.
Original title
SpeakerBeam: A New Deep Learning Technology for Extracting Speech of a Target Speaker Based on the Speaker's Voice Characteristics
Type
journal article indexed in Scopus, Jsc
Language
English
Original abstract
In a noisy environment such as a cocktail party, humans can focus on listening to a desired speaker, an ability known as selective hearing. Current approaches developed to realize computational selective hearing require knowing the position of the target speaker, which limits their practical usage. This article introduces SpeakerBeam, a deep learning based approach for computational selective hearing based on the characteristics of the target speaker's voice. SpeakerBeam requires only a small amount of speech data from the target speaker to compute his/her voice characteristics. It can then extract the speech of that speaker regardless of his/her position or the number of speakers talking in the background.
Keywords
deep learning, target speaker extraction, SpeakerBeam
Authors
DELCROIX, M.; ŽMOLÍKOVÁ, K.; KINOSHITA, K.; ARAKI, S.; OGAWA, A.; NAKATANI, T.
Published
1 November 2018
ISSN
1348-3447
Periodical
NTT Technical Review
Volume
16
Issue
11
Country
Japan
Pages from
19
Pages to
24
Number of pages
6
URL
https://www.ntt-review.jp/archive/ntttechnical.php?contents=ntr201811all.pdf&mode=show_pdf
BibTeX
@article{BUT185149,
  author="DELCROIX, M. and ŽMOLÍKOVÁ, K. and KINOSHITA, K. and ARAKI, S. and OGAWA, A. and NAKATANI, T.",
  title="SpeakerBeam: A New Deep Learning Technology for Extracting Speech of a Target Speaker Based on the Speaker's Voice Characteristics",
  journal="NTT Technical Review",
  year="2018",
  volume="16",
  number="11",
  pages="19--24",
  issn="1348-3447",
  url="https://www.ntt-review.jp/archive/ntttechnical.php?contents=ntr201811all.pdf&mode=show_pdf"
}