Publication detail
LOZANO DÍEZ, A.; PLCHOT, O.; MATĚJKA, P.; GONZALEZ-RODRIGUEZ, J.
Original title
DNN Based Embeddings for Language Recognition
Type
Paper in proceedings indexed in WoS or Scopus
Language
English
Original abstract
In this work, we present a language identification (LID) system based on embeddings. In our case, an embedding is a fixed-length vector (similar to an i-vector) that represents the whole utterance, but unlike an i-vector it is designed to contain mostly information relevant to the target task (LID). To obtain these embeddings, we train a deep neural network (DNN) with a sequence summarization layer to classify languages. In particular, we trained a DNN based on bidirectional long short-term memory (BLSTM) recurrent neural network (RNN) layers, whose frame-by-frame outputs are summarized into mean and standard deviation statistics. After this pooling layer, we add two fully connected layers whose outputs correspond to embeddings. Finally, we add a softmax output layer and train the whole network with a multi-class cross-entropy objective to discriminate between languages. We report our results on NIST LRE 2015 and compare the performance of embeddings and the corresponding i-vectors, both modeled by a Gaussian Linear Classifier (GLC). Using only embeddings resulted in performance comparable to i-vectors, and by performing score-level fusion we achieved a 7.3% relative improvement over the baseline.
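The sequence summarization step described in the abstract (frame-by-frame outputs pooled into mean and standard deviation statistics) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `stats_pooling`, the plain-list data representation, and the choice of the population (divide-by-T) standard deviation are assumptions made for illustration, and the BLSTM layers that would produce the frame outputs are omitted.

```python
import math


def stats_pooling(frames):
    """Summarize a variable-length sequence of frame vectors into one fixed-length vector.

    frames: non-empty list of equal-length lists of floats (T frames x D dimensions).
    Returns a list of length 2*D: per-dimension means followed by per-dimension
    standard deviations.
    """
    if not frames:
        raise ValueError("need at least one frame")
    num_frames = len(frames)
    dim = len(frames[0])
    # Per-dimension mean over all frames.
    means = [sum(f[d] for f in frames) / num_frames for d in range(dim)]
    # Assumption: population standard deviation (divide by T); the abstract
    # does not say which estimator the network uses.
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / num_frames)
        for d in range(dim)
    ]
    return means + stds


# Two hypothetical "utterances" of different lengths map to vectors of the same size.
short = stats_pooling([[1.0, 2.0], [3.0, 6.0]])
long = stats_pooling([[1.0, 2.0], [3.0, 6.0], [1.0, 2.0], [3.0, 6.0]])
```

Whatever the number of frames, the pooled vector has length 2*D, which is what allows fixed-size fully connected layers (and hence a fixed-length embedding) to follow the pooling layer.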
Keywords
Embeddings, language recognition, LID, DNN
Authors
LOZANO DÍEZ, A.; PLCHOT, O.; MATĚJKA, P.; GONZALEZ-RODRIGUEZ, J.
Published
15 April 2018
Publisher
IEEE Signal Processing Society
Location
Calgary
ISBN
978-1-5386-4658-8
Book
Proceedings of ICASSP 2018
Pages from
5184
Pages to
5188
Page count
5
URL
https://www.fit.vut.cz/research/publication/11723/
BibTeX
@inproceedings{BUT155045,
  author="Alicia {Lozano Díez} and Oldřich {Plchot} and Pavel {Matějka} and Joaquin {Gonzalez-Rodriguez}",
  title="DNN Based Embeddings for Language Recognition",
  booktitle="Proceedings of ICASSP 2018",
  year="2018",
  pages="5184--5188",
  publisher="IEEE Signal Processing Society",
  address="Calgary",
  doi="10.1109/ICASSP.2018.8462403",
  isbn="978-1-5386-4658-8",
  url="https://www.fit.vut.cz/research/publication/11723/"
}