Publication detail
RYANT, N.; BERGELSON, E.; CHURCH, K.; CRISTIA, A.; DU, J.; GANAPATHY, S.; KHUDANPUR, S.; KOWALSKI, D.; KRISHNAMOORTHY, M.; KULSHRESHTA, R.; LIBERMAN, M.; LU, Y.; MACIEJEWSKI, M.; METZE, F.; PROFANT, J.; SUN, L.; TSAO, Y.; YU, Z.
Original title
Enhancement and Analysis of Conversational Speech: JSALT 2017
Type
Paper in proceedings indexed in WoS or Scopus
Language
English
Original abstract
Automatic speech recognition is more and more widely and effectively used. Nevertheless, in some automatic speech analysis tasks the state of the art is surprisingly poor. One of these is "diarization", the task of determining who spoke when. Diarization is key to processing meeting audio and clinical interviews, extended recordings such as police body cam or child language acquisition data, and any other speech data involving multiple speakers whose voices are not cleanly separated into individual channels. Overlapping speech, environmental noise and suboptimal recording techniques make the problem harder. During the JSALT Summer Workshop at CMU in 2017, an international team of researchers worked on several aspects of this problem, including calibration of the state of the art, detection of overlaps, enhancement of noisy recordings, and classification of shorter speech segments. This paper sketches the workshop's results and announces plans for a "Diarization Challenge" to encourage further progress.
Keywords
diarization, overlap detection, speech enhancement, automatic speech recognition
Authors
RYANT, N.; BERGELSON, E.; CHURCH, K.; CRISTIA, A.; DU, J.; GANAPATHY, S.; KHUDANPUR, S.; KOWALSKI, D.; KRISHNAMOORTHY, M.; KULSHRESHTA, R.; LIBERMAN, M.; LU, Y.; MACIEJEWSKI, M.; METZE, F.; PROFANT, J.; SUN, L.; TSAO, Y.; YU, Z.
Published
15 April 2018
Publisher
IEEE Signal Processing Society
Location
Calgary
ISBN
978-1-5386-4658-8
Book
Proceedings of ICASSP 2018
Pages from
5154
Pages to
5158
Number of pages
5
URL
http://www.fit.vutbr.cz/research/groups/speech/publi/2018/profant_icassp2018_0005154.pdf
BibTeX
@inproceedings{BUT155050,
  author="RYANT, N. and BERGELSON, E. and CHURCH, K. and CRISTIA, A. and DU, J. and GANAPATHY, S. and KHUDANPUR, S. and KOWALSKI, D. and KRISHNAMOORTHY, M. and KULSHRESHTA, R. and LIBERMAN, M. and LU, Y. and MACIEJEWSKI, M. and METZE, F. and PROFANT, J. and SUN, L. and TSAO, Y. and YU, Z.",
  title="Enhancement and Analysis of Conversational Speech: JSALT 2017",
  booktitle="Proceedings of ICASSP 2018",
  year="2018",
  pages="5154--5158",
  publisher="IEEE Signal Processing Society",
  address="Calgary",
  doi="10.1109/ICASSP.2018.8462468",
  isbn="978-1-5386-4658-8",
  url="http://www.fit.vutbr.cz/research/groups/speech/publi/2018/profant_icassp2018_0005154.pdf"
}