Publication detail
PENG, J.; GU, R.; MOŠNER, L.; PLCHOT, O.; BURGET, L.; ČERNOCKÝ, J.
Original title
Learnable Sparse Filterbank for Speaker Verification
Type
Conference proceedings paper indexed in WoS or Scopus
Language
English
Original abstract
Recently, feature extraction with learnable filters was extensively investigated with speaker verification systems, with filters learned both in time- and frequency-domains. Most of the learned schemes however end up with filters close to their initialization (e.g. Mel filterbank) or filters strongly limited by their constraints. In this paper, we propose a novel learnable sparse filterbank, named LearnSF, by exclusively optimizing the sparsity of the filterbank, that does not explicitly constrain the filters to follow pre-defined distribution. After standard pre-processing (STFT and square of the magnitude spectrum), the learnable sparse filterbank is employed, with its normalized outputs fed into a neural network predicting the speaker identity. We evaluated the performance of the proposed approach on both VoxCeleb and CNCeleb datasets. The experimental results demonstrate the effectiveness of the proposed LearnSF compared to both widely-used acoustic features and existing parameterized learnable front-ends.
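The front-end pipeline the abstract describes (STFT, squared magnitude spectrum, filterbank, normalization) can be sketched roughly as below. This is a minimal NumPy illustration, not the authors' implementation: the frame sizes, the 16 kHz sampling rate, the log compression, the per-filter mean and variance normalization, and the L1 penalty are all assumptions, since the abstract does not specify the normalization or the exact sparsity objective. The function names are hypothetical, and the training loop and speaker network are omitted.

```python
import numpy as np

def power_spectrogram(signal, frame_len=400, hop=160, n_fft=512):
    """Frame the signal, apply a Hann window, and return |STFT|^2."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(frame_len)
    spectrum = np.fft.rfft(frames, n=n_fft, axis=1)
    return np.abs(spectrum) ** 2  # shape: (n_frames, n_fft // 2 + 1)

def filterbank_features(power, weights, eps=1e-6):
    """Apply the filterbank, log-compress, and normalize each filter over time."""
    # abs() keeps the filters non-negative (an assumption of this sketch).
    energies = power @ np.abs(weights).T
    feats = np.log(energies + eps)  # log compression is assumed, not from the abstract
    return (feats - feats.mean(axis=0)) / (feats.std(axis=0) + eps)

def l1_sparsity_penalty(weights):
    """Mean absolute weight: an assumed stand-in for the paper's sparsity objective."""
    return np.abs(weights).mean()

rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)        # 1 s of noise at an assumed 16 kHz
weights = rng.standard_normal((40, 257))   # 40 filters, random init with no Mel prior
power = power_spectrogram(signal)
feats = filterbank_features(power, weights)
penalty = l1_sparsity_penalty(weights)
```

In a trainable setting, `weights` would be a parameter updated by gradient descent, with the sparsity term added to the speaker classification loss. Here only the forward pass is shown.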
Keywords
learnable filter, sparse filtering, sparsity, speaker verification
Authors
PENG, J.; GU, R.; MOŠNER, L.; PLCHOT, O.; BURGET, L.; ČERNOCKÝ, J.
Published
18 September 2022
Publisher
International Speech Communication Association
Location
Incheon
ISSN
1990-9772
Periodical
Proceedings of Interspeech
Number
9
Country
French Republic
Pages from
5110
Pages to
5114
Page count
5
URL
https://www.isca-speech.org/archive/pdfs/interspeech_2022/peng22e_interspeech.pdf
BibTeX
@inproceedings{BUT179826,
  author="PENG, J. and GU, R. and MOŠNER, L. and PLCHOT, O. and BURGET, L. and ČERNOCKÝ, J.",
  title="Learnable Sparse Filterbank for Speaker Verification",
  booktitle="Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",
  year="2022",
  journal="Proceedings of Interspeech",
  number="9",
  pages="5110--5114",
  publisher="International Speech Communication Association",
  address="Incheon",
  doi="10.21437/Interspeech.2022-11309",
  issn="1990-9772",
  url="https://www.isca-speech.org/archive/pdfs/interspeech_2022/peng22e_interspeech.pdf"
}