Publication detail
VESELÝ, K. WATANABE, S. ŽMOLÍKOVÁ, K. KARAFIÁT, M. BURGET, L. ČERNOCKÝ, J.
Original Title
Sequence Summarizing Neural Network for Speaker Adaptation
Type
conference paper
Language
English
Original Abstract
In this paper, we propose a DNN adaptation technique in which the i-vector extractor is replaced by a Sequence Summarizing Neural Network (SSNN). Like an i-vector extractor, the SSNN produces a "summary vector" representing an acoustic summary of an utterance. This vector is then appended to the input of the main network, and both networks are trained together, optimizing a single loss function. The i-vector and SSNN speaker adaptation methods are compared on AMI meeting data. The results show comparable performance of both techniques on an FBANK system with frame-classification training. Moreover, appending both the i-vector and the "summary vector" to the FBANK features leads to an additional improvement, comparable to the performance of an FMLLR-adapted DNN system.
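The data flow described in the abstract can be illustrated with a minimal NumPy sketch of the forward pass. It is not the authors' implementation. The abstract does not say how the summary is formed or what the layers look like, so this sketch assumes mean pooling over time, a single tanh layer for the summarizing network, and a single softmax layer for the main network. All function names, parameter names, and dimensions are hypothetical.

```python
import numpy as np

def summary_vector(frames, w_sum, b_sum):
    """Hypothetical summarizing network: one tanh layer applied to every
    frame, followed by averaging over time (mean pooling is an assumption)."""
    hidden = np.tanh(frames @ w_sum + b_sum)   # shape (T, summary_dim)
    return hidden.mean(axis=0)                 # shape (summary_dim,)

def forward(frames, params):
    """Append the utterance-level summary to every frame and classify.
    The main network is reduced to one softmax layer for brevity."""
    s = summary_vector(frames, params["w_sum"], params["b_sum"])
    # The same summary vector is repeated for each frame of the utterance.
    tiled = np.broadcast_to(s, (frames.shape[0], s.shape[0]))
    augmented = np.concatenate([frames, tiled], axis=1)
    logits = augmented @ params["w_main"] + params["b_main"]
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

# Illustrative sizes only; none of these come from the paper.
rng = np.random.default_rng(0)
T, feat_dim, summary_dim, n_classes = 50, 40, 8, 5
params = {
    "w_sum": 0.1 * rng.standard_normal((feat_dim, summary_dim)),
    "b_sum": np.zeros(summary_dim),
    "w_main": 0.1 * rng.standard_normal((feat_dim + summary_dim, n_classes)),
    "b_main": np.zeros(n_classes),
}
frames = rng.standard_normal((T, feat_dim))   # stand-in for FBANK features
posteriors = forward(frames, params)          # shape (T, n_classes)
```

Because the summary feeds directly into the main network's input, a single frame-level loss on `posteriors` would, in an automatic-differentiation framework, produce gradients for both `w_main` and `w_sum`. That is one way to read "trained together optimizing a single loss function". The training loop itself is omitted here.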
Keywords
DNN, adaptation, i-vector, sequence summary, SSNN
Authors
VESELÝ, K.; WATANABE, S.; ŽMOLÍKOVÁ, K.; KARAFIÁT, M.; BURGET, L.; ČERNOCKÝ, J.
Released
20. 3. 2016
Publisher
IEEE Signal Processing Society
Location
Shanghai
ISBN
978-1-4799-9988-0
Book
Proceedings of the 41st IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2016), 2016
Pages from
5315
Pages to
5319
Pages count
5
URL
https://www.fit.vut.cz/research/publication/11145/
BibTeX
@inproceedings{BUT130964,
  author="Karel {Veselý} and Shinji {Watanabe} and Kateřina {Žmolíková} and Martin {Karafiát} and Lukáš {Burget} and Jan {Černocký}",
  title="Sequence Summarizing Neural Network for Speaker Adaptation",
  booktitle="Proceedings of the 41st IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2016), 2016",
  year="2016",
  pages="5315--5319",
  publisher="IEEE Signal Processing Society",
  address="Shanghai",
  doi="10.1109/ICASSP.2016.7472692",
  isbn="978-1-4799-9988-0",
  url="https://www.fit.vut.cz/research/publication/11145/"
}