Publication detail
Progressive contrastive learning for self-supervised text-independent speaker verification
PENG, J.; ZHANG, C.; ČERNOCKÝ, J.; YU, D.
Original title
Progressive contrastive learning for self-supervised text-independent speaker verification
Type
conference paper outside WoS and Scopus
Language
English
Original abstract
Self-supervised speaker representation learning has drawn attention extensively in recent years. Most of the work is based on the iterative clustering-classification learning framework, and the performance is sensitive to the pre-defined number of clusters. However, the cluster number is hard to estimate when dealing with large-scale unlabeled data. In this paper, we propose a progressive contrastive learning (PCL) algorithm to dynamically estimate the cluster number at each step based on the statistical characteristics of the data itself, and the estimated number will progressively approach the ground-truth speaker number with the increasing of step. Specifically, we first update the data queue by current augmented samples. Then, eigendecomposition is introduced to estimate the number of speakers in the updated data queue. Finally, we assign the queued data into the estimated cluster centroid and construct a contrastive loss, which encourages the speaker representation to be closer to its cluster centroid and away from others. Experimental results on VoxCeleb1 demonstrate the effectiveness of our proposed PCL compared with existing self-supervised approaches.
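The two core steps the abstract names, eigendecomposition-based speaker-count estimation and a centroid contrastive loss, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the eigengap heuristic on a cosine-affinity matrix, the function names, and the InfoNCE-style loss form are all assumptions for the sketch.

```python
import numpy as np

def estimate_num_clusters(embeddings, max_clusters=10):
    """Estimate the cluster count from the eigengap of a cosine-affinity
    matrix (an illustrative reading of the eigendecomposition step)."""
    x = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    affinity = x @ x.T
    # Eigenvalues of the affinity matrix, sorted in descending order.
    eigvals = np.sort(np.linalg.eigvalsh(affinity))[::-1]
    # The largest drop between consecutive eigenvalues marks the cluster count.
    gaps = eigvals[:max_clusters] - eigvals[1:max_clusters + 1]
    return int(np.argmax(gaps)) + 1

def centroid_contrastive_loss(embeddings, centroids, assignments, temperature=0.1):
    """InfoNCE-style loss pulling each embedding toward its assigned
    centroid and away from the other centroids."""
    x = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    c = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    logits = (x @ c.T) / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(x)), assignments].mean()
```

On well-separated data the eigengap recovers the true group count, and the loss decreases as embeddings move toward their centroids, which matches the progressive estimation behaviour the abstract describes.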
Keywords
self-supervised, text-independent, speaker, verification
Authors
PENG, J.; ZHANG, C.; ČERNOCKÝ, J.; YU, D.
Published
June 28, 2022
Publisher
International Speech Communication Association
Place
Beijing
Pages from
17
Pages to
24
Page count
8
URL
BibTex
@inproceedings{BUT179661,
author="Junyi {Peng} and Chunlei {Zhang} and Jan {Černocký} and Dong {Yu}",
title="Progressive contrastive learning for self-supervised text-independent speaker verification",
booktitle="Proceedings of The Speaker and Language Recognition Workshop (Odyssey 2022)",
year="2022",
pages="17--24",
publisher="International Speech Communication Association",
address="Beijing",
doi="10.21437/Odyssey.2022-3",
url="https://www.isca-speech.org/archive/pdfs/odyssey_2022/peng22_odyssey.pdf"
}
Documents