Publication detail

Dereverberation and Beamforming in Far-Field Speaker Recognition

MOŠNER, L. MATĚJKA, P. NOVOTNÝ, O. ČERNOCKÝ, J.

Original Title

Dereverberation and Beamforming in Far-Field Speaker Recognition

Type

conference paper

Language

English

Original Abstract

This paper deals with far-field speaker recognition. On a corpus of NIST SRE 2010 data retransmitted in a real room with multiple microphones, we first demonstrate how room acoustics cause significant degradation of a state-of-the-art i-vector-based speaker recognition system. We then investigate several techniques to improve performance, ranging from probabilistic linear discriminant analysis (PLDA) re-training, through dereverberation, to beamforming. We found that weighted prediction error (WPE) based dereverberation combined with a generalized eigenvalue beamformer with power spectral density (PSD) weighting masks generated by neural networks (NN) provides results approaching the clean close-microphone setup. Further improvement was obtained by re-training PLDA or the mask-generating NNs on simulated target data. The work shows that a speaker recognition system working robustly in the far-field scenario can be developed.
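
The abstract names a generalized eigenvalue (GEV) beamformer driven by mask-weighted PSD matrices. The following is a minimal NumPy sketch of that general idea, not the authors' implementation. It assumes the speech and noise masks are already given (the paper generates them with neural networks; here they are plain inputs) and that the multichannel STFT has shape (frequency, time, microphone). All function names are hypothetical, and WPE dereverberation is not shown.

```python
import numpy as np


def masked_psd(stft, mask):
    """Mask-weighted spatial PSD matrix per frequency bin.

    stft: complex array, shape (F, T, M) -- assumed layout
    mask: real array in [0, 1], shape (F, T)
    returns: complex array, shape (F, M, M)
    """
    weighted = stft * mask[..., None]
    # Sum over time of mask * x x^H for every frequency bin.
    psd = np.einsum("ftm,ftn->fmn", weighted, stft.conj())
    norm = np.maximum(mask.sum(axis=1), 1e-10)
    return psd / norm[:, None, None]


def gev_weights(psd_speech, psd_noise, eps=1e-6):
    """GEV beamformer: principal generalized eigenvector of (speech, noise) PSDs.

    Solves Phi_xx w = lambda Phi_nn w by whitening with a Cholesky factor
    of the (diagonally loaded) noise PSD matrix.
    """
    num_bins, num_mics, _ = psd_speech.shape
    weights = np.empty((num_bins, num_mics), dtype=complex)
    for f in range(num_bins):
        # Diagonal loading keeps the noise matrix positive definite.
        load = eps * np.trace(psd_noise[f]).real / num_mics + 1e-12
        phi_nn = psd_noise[f] + load * np.eye(num_mics)
        chol = np.linalg.cholesky(phi_nn)
        chol_inv = np.linalg.inv(chol)
        whitened = chol_inv @ psd_speech[f] @ chol_inv.conj().T
        whitened = (whitened + whitened.conj().T) / 2  # enforce Hermitian symmetry
        _, vecs = np.linalg.eigh(whitened)  # eigenvalues in ascending order
        weights[f] = chol_inv.conj().T @ vecs[:, -1]
    return weights


def apply_beamformer(weights, stft):
    """Single-channel output y(f, t) = w(f)^H x(f, t)."""
    return np.einsum("fm,ftm->ft", weights.conj(), stft)
```

In this sketch the GEV weights maximize the ratio of speech to noise power at the output for each frequency bin. GEV solutions are defined only up to a complex scale per bin, so practical systems usually add a normalization step, which is omitted here.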

Keywords

Speaker recognition, microphone array, beamforming, dereverberation, audio retransmission

Authors

MOŠNER, L.; MATĚJKA, P.; NOVOTNÝ, O.; ČERNOCKÝ, J.

Released

15. 4. 2018

Publisher

IEEE Signal Processing Society

Location

Calgary

ISBN

978-1-5386-4658-8

Book

Proceedings of ICASSP 2018

Pages from

5254

Pages to

5258

Pages count

5

URL

BibTeX

@inproceedings{BUT155039,
  author="Ladislav {Mošner} and Pavel {Matějka} and Ondřej {Novotný} and Jan {Černocký}",
  title="Dereverberation and Beamforming in Far-Field Speaker Recognition",
  booktitle="Proceedings of ICASSP 2018",
  year="2018",
  pages="5254--5258",
  publisher="IEEE Signal Processing Society",
  address="Calgary",
  doi="10.1109/ICASSP.2018.8462365",
  isbn="978-1-5386-4658-8",
  url="https://www.fit.vut.cz/research/publication/11717/"
}