Publication detail

End-to-End DNN Based Speaker Recognition Inspired by i-Vector and PLDA

ROHDIN, J. SILNOVA, A. DIEZ SÁNCHEZ, M. PLCHOT, O. MATĚJKA, P. BURGET, L.

Original title

Type

conference paper indexed in WoS or Scopus

Language

English

Original abstract

Recently, several end-to-end speaker verification systems based on deep neural networks (DNNs) have been proposed. These systems have been proven to be competitive for text-dependent tasks as well as for text-independent tasks with short utterances. However, for text-independent tasks with longer utterances, end-to-end systems are still outperformed by standard i-vector + PLDA systems. In this work, we develop an end-to-end speaker verification system that is initialized to mimic an i-vector + PLDA baseline. The system is then further trained in an end-to-end manner but regularized so that it does not deviate too far from the initial system. In this way we mitigate overfitting which normally limits the performance of end-to-end systems. The proposed system outperforms the i-vector + PLDA baseline on both long and short duration utterances.

Keywords

Speaker verification, DNN, end-to-end

Authors

ROHDIN, J.; SILNOVA, A.; DIEZ SÁNCHEZ, M.; PLCHOT, O.; MATĚJKA, P.; BURGET, L.

Published

15 April 2018

Publisher

IEEE Signal Processing Society

Place

Calgary

ISBN

978-1-5386-4658-8

Book

Proceedings of ICASSP

Pages from

4874

Pages to

4878

Page count

URL

https://www.fit.vut.cz/research/publication/11724/

BibTeX

@inproceedings{BUT155046,
  author="Johan Andréas {Rohdin} and Anna {Silnova} and Mireia {Diez Sánchez} and Oldřich {Plchot} and Pavel {Matějka} and Lukáš {Burget}",
  title="End-to-End DNN Based Speaker Recognition Inspired by i-Vector and PLDA",
  booktitle="Proceedings of ICASSP",
  year="2018",
  pages="4874--4878",
  publisher="IEEE Signal Processing Society",
  address="Calgary",
  doi="10.1109/ICASSP.2018.8461958",
  isbn="978-1-5386-4658-8",
  url="https://www.fit.vut.cz/research/publication/11724/"
}

Documents

rohdin_icassp2018_0004874.pdf
