Publication detail
MOŠNER, L.; PLCHOT, O.; ROHDIN, J.; ČERNOCKÝ, J.
Original Title
Utilizing VOiCES dataset for multichannel speaker verification with beamforming
Type
article in proceedings not indexed in WoS or Scopus
Language
English
Original Abstract
VOiCES from a Distance Challenge 2019 aimed at the evaluation of speaker verification (SV) systems using single-channel trials based on the Voices Obscured in Complex Environmental Settings (VOiCES) corpus. Since it comprises recordings of the same utterances captured simultaneously by multiple microphones in the same environments, it is also suitable for multichannel experiments. In this work, we design a multichannel dataset as well as development and evaluation trials for SV inspired by the VOiCES challenge. Alternatives discarding harmful microphones are presented as well. We assess the utilization of the created dataset for x-vector based SV with beamforming as a front end. Standard fixed beamforming and NN-supported beamforming using simulated data and ideal binary masks (IBM) are compared with another variant of NN-supported beamforming that is trained solely on the VOiCES data. The lack of data revealed by experiments with the beamformer trained on VOiCES data was tackled by means of a variant of SpecAugment applied to magnitude spectra. This approach led to as much as 10% relative improvement in EER, pushing results closer to those obtained by a good beamformer based on IBMs.
Keywords
multichannel speaker verification, application-aware beamforming
Authors
MOŠNER, L.; PLCHOT, O.; ROHDIN, J.; ČERNOCKÝ, J.
Released
1 November 2020
Publisher
International Speech Communication Association
Location
Tokyo
ISSN
2312-2846
Periodical
Proceedings of Odyssey: The Speaker and Language Recognition Workshop Odyssey 2014, Joensuu, Finland
Volume
2020
Number
11
State
Republic of Finland
Pages from
187
Pages to
193
Pages count
7
URL
https://www.isca-speech.org/archive/Odyssey_2020/abstracts/80.html
BibTex
@inproceedings{BUT164069,
  author="Ladislav {Mošner} and Oldřich {Plchot} and Johan Andréas {Rohdin} and Jan {Černocký}",
  title="Utilizing VOiCES dataset for multichannel speaker verification with beamforming",
  booktitle="Proceedings of Odyssey 2020 The Speaker and Language Recognition Workshop",
  year="2020",
  journal="Proceedings of Odyssey: The Speaker and Language Recognition Workshop Odyssey 2014, Joensuu, Finland",
  volume="2020",
  number="11",
  pages="187--193",
  publisher="International Speech Communication Association",
  address="Tokyo",
  doi="10.21437/Odyssey.2020-27",
  issn="2312-2846",
  url="https://www.isca-speech.org/archive/Odyssey_2020/abstracts/80.html"
}