Publication detail
Progressive contrastive learning for self-supervised text-independent speaker verification
PENG, J.; ZHANG, C.; ČERNOCKÝ, J.; YU, D.
Original Title
Progressive contrastive learning for self-supervised text-independent speaker verification
Type
article in a collection outside WoS and Scopus
Language
English
Original Abstract
Self-supervised speaker representation learning has drawn extensive attention in recent years. Most of the work is based on the iterative clustering-classification learning framework, and the performance is sensitive to the pre-defined number of clusters. However, the cluster number is hard to estimate when dealing with large-scale unlabeled data. In this paper, we propose a progressive contrastive learning (PCL) algorithm to dynamically estimate the cluster number at each step based on the statistical characteristics of the data itself, and the estimated number progressively approaches the ground-truth speaker number as training steps increase. Specifically, we first update the data queue with the current augmented samples. Then, eigendecomposition is introduced to estimate the number of speakers in the updated data queue. Finally, we assign the queued data to the estimated cluster centroids and construct a contrastive loss, which encourages each speaker representation to be closer to its cluster centroid and away from the others. Experimental results on VoxCeleb1 demonstrate the effectiveness of our proposed PCL compared with existing self-supervised approaches.
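The eigendecomposition step described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the affinity construction, the threshold `ratio`, and the toy data below are all assumptions for illustration; the idea shown is simply that the number of dominant eigenvalues of a similarity matrix over the queued embeddings tracks the number of underlying speakers.

```python
import numpy as np

def estimate_num_clusters(embeddings, ratio=0.1):
    """Estimate the number of clusters in a queue of embeddings.

    Illustrative only: counts eigenvalues of the cosine-similarity
    (affinity) matrix that exceed a fraction `ratio` of the largest
    eigenvalue. The threshold is a made-up heuristic, not the
    paper's criterion.
    """
    # L2-normalize so dot products are cosine similarities
    x = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    affinity = x @ x.T
    # eigvalsh is valid here because the affinity matrix is symmetric
    eigvals = np.linalg.eigvalsh(affinity)[::-1]  # sort descending
    return int(np.sum(eigvals > ratio * eigvals[0]))

# Toy queue: 3 well-separated "speakers", 20 embeddings each
rng = np.random.default_rng(0)
centers = rng.normal(size=(3, 64)) * 5
data = np.vstack([c + rng.normal(scale=0.1, size=(20, 64)) for c in centers])
print(estimate_num_clusters(data))  # prints 3
```

For tight, near-orthogonal clusters the affinity matrix is close to block-diagonal, so it has one dominant eigenvalue per cluster and the count above recovers the speaker number; in PCL this estimate would then drive the assignment of queued data to centroids for the contrastive loss.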
Keywords
self-supervised, text-independent, speaker verification
Authors
PENG, J.; ZHANG, C.; ČERNOCKÝ, J.; YU, D.
Released
28. 6. 2022
Publisher
International Speech Communication Association
Location
Beijing
Pages from
17
Pages to
24
Pages count
8
URL
https://www.isca-speech.org/archive/pdfs/odyssey_2022/peng22_odyssey.pdf
BibTex
@inproceedings{BUT179661,
author="Junyi {Peng} and Chunlei {Zhang} and Jan {Černocký} and Dong {Yu}",
title="Progressive contrastive learning for self-supervised text-independent speaker verification",
booktitle="Proceedings of The Speaker and Language Recognition Workshop (Odyssey 2022)",
year="2022",
pages="17--24",
publisher="International Speech Communication Association",
address="Beijing",
doi="10.21437/Odyssey.2022-3",
url="https://www.isca-speech.org/archive/pdfs/odyssey_2022/peng22_odyssey.pdf"
}