Publication detail
PENG, J.; ZHANG, C.; ČERNOCKÝ, J.; YU, D.
Original title
Progressive contrastive learning for self-supervised text-independent speaker verification
Type
paper in conference proceedings not indexed in WoS or Scopus
Language
English
Original abstract
Self-supervised speaker representation learning has drawn extensive attention in recent years. Most existing work is based on an iterative clustering-classification learning framework, whose performance is sensitive to the predefined number of clusters. However, the cluster number is hard to estimate when dealing with large-scale unlabeled data. In this paper, we propose a progressive contrastive learning (PCL) algorithm that dynamically estimates the cluster number at each step from the statistical characteristics of the data itself; the estimated number progressively approaches the ground-truth speaker number as the number of steps increases. Specifically, we first update the data queue with the current augmented samples. Then, eigendecomposition is used to estimate the number of speakers in the updated data queue. Finally, we assign the queued data to the estimated cluster centroids and construct a contrastive loss, which encourages each speaker representation to be closer to its own cluster centroid and farther from the others. Experimental results on VoxCeleb1 demonstrate the effectiveness of the proposed PCL compared with existing self-supervised approaches.
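The three steps named in the abstract (queue update, eigendecomposition-based estimate of the speaker count, centroid assignment with a contrastive loss) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several choices are assumptions because the abstract does not specify them: the cluster count is estimated with the standard eigengap heuristic on a normalized graph Laplacian, the assignment uses a few cosine k-means iterations with farthest-point initialization, and the loss is a softmax over embedding-to-centroid similarities. All function names, the temperature value, and the synthetic data are hypothetical, and only the forward computation is shown (no gradients or encoder).

```python
import numpy as np

def l2_normalize(x):
    """Scale each row to unit Euclidean length."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def update_queue(queue, batch, max_size):
    """FIFO update: append the new embeddings, keep the newest max_size rows."""
    return np.concatenate([queue, batch], axis=0)[-max_size:]

def estimate_num_clusters(emb, max_k):
    """Eigengap heuristic (assumed here; the abstract only says 'eigendecomposition')."""
    z = l2_normalize(emb)
    max_k = min(max_k, len(z) - 1)
    affinity = np.clip(z @ z.T, 0.0, None)        # non-negative cosine affinity
    d_inv_sqrt = 1.0 / np.sqrt(affinity.sum(axis=1))
    laplacian = np.eye(len(z)) - d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]
    eigvals = np.linalg.eigvalsh(laplacian)       # sorted in ascending order
    gaps = np.diff(eigvals[: max_k + 1])
    return int(np.argmax(gaps)) + 1               # largest gap follows the k-th eigenvalue

def assign_to_centroids(emb, k, n_iter=10):
    """Cosine k-means with farthest-point initialization (an assumed choice)."""
    z = l2_normalize(emb)
    chosen = [0]
    while len(chosen) < k:
        max_sim = (z @ z[chosen].T).max(axis=1)   # similarity to nearest chosen seed
        chosen.append(int(np.argmin(max_sim)))
    centroids = z[chosen]
    for _ in range(n_iter):
        labels = np.argmax(z @ centroids.T, axis=1)
        for j in range(k):
            members = z[labels == j]
            if len(members) > 0:                  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
        centroids = l2_normalize(centroids)
    labels = np.argmax(z @ centroids.T, axis=1)
    return labels, centroids

def centroid_contrastive_loss(emb, labels, centroids, temperature=0.1):
    """Mean negative log-softmax of each embedding's similarity to its own centroid."""
    z = l2_normalize(emb)
    logits = (z @ centroids.T) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(len(z)), labels].mean())

# Toy usage on synthetic "speakers": three well-separated clusters in 16 dimensions.
rng = np.random.default_rng(0)
speaker_ids = np.repeat(np.arange(3), 20)
data = (5.0 * np.eye(3, 16))[speaker_ids] + 0.1 * rng.normal(size=(60, 16))
queue = np.empty((0, 16))
for batch in np.array_split(rng.permutation(data), 4):
    queue = update_queue(queue, batch, max_size=60)
k = estimate_num_clusters(queue, max_k=10)
labels, centroids = assign_to_centroids(queue, k)
print(k, centroid_contrastive_loss(queue, labels, centroids))
```

In a real training loop the embeddings would come from a speaker encoder applied to augmented utterances, and the loss would be computed in an autodiff framework so that it can update the encoder; the sketch only shows how the estimated cluster number feeds the assignment and the loss at a single step.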
Keywords
self-supervised, text-independent, speaker, verification
Authors
PENG, J.; ZHANG, C.; ČERNOCKÝ, J.; YU, D.
Published
28 June 2022
Publisher
International Speech Communication Association
Place
Beijing
Pages from
17
Pages to
24
Number of pages
8
URL
https://www.isca-speech.org/archive/pdfs/odyssey_2022/peng22_odyssey.pdf
BibTeX
@inproceedings{BUT179661,
  author="Junyi {Peng} and Chunlei {Zhang} and Jan {Černocký} and Dong {Yu}",
  title="Progressive contrastive learning for self-supervised text-independent speaker verification",
  booktitle="Proceedings of The Speaker and Language Recognition Workshop (Odyssey 2022)",
  year="2022",
  pages="17--24",
  publisher="International Speech Communication Association",
  address="Beijing",
  doi="10.21437/Odyssey.2022-3",
  url="https://www.isca-speech.org/archive/pdfs/odyssey_2022/peng22_odyssey.pdf"
}