Publication detail
PENG, J.; PLCHOT, O.; STAFYLAKIS, T.; MOŠNER, L.; BURGET, L.; ČERNOCKÝ, J.
Original title
Improving Speaker Verification with Self-Pretrained Transformer Models
Type
conference paper indexed in WoS or Scopus
Language
English
Original abstract
Recently, fine-tuning large pre-trained Transformer models using downstream datasets has received a rising interest. Despite their success, it is still challenging to disentangle the benefits of large-scale datasets and Transformer structures from the limitations of the pre-training. In this paper, we introduce a hierarchical training approach, named self-pretraining, in which Transformer models are pretrained and finetuned on the same dataset. Three pre-trained models including HuBERT, Conformer and WavLM are evaluated on four different speaker verification datasets with varying sizes. Our experiments show that these self-pretrained models achieve competitive performance on downstream speaker verification tasks with only one-third of the data compared to Librispeech pretraining, such as VoxCeleb1 and CNCeleb1. Furthermore, when pre-training only on the VoxCeleb2-dev, the Conformer model outperforms the one pre-trained on 94k hours of data using the same fine-tuning settings.
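The sketch below is a rough, non-authoritative illustration of the self-pretraining idea in the abstract: the same in-domain data (e.g. VoxCeleb2-dev) is used first for self-supervised pre-training of a Transformer encoder and then for supervised speaker-verification fine-tuning. The toy encoder, the masked-frame reconstruction objective, the mean pooling, and the classification head are assumptions made for brevity, not the exact HuBERT/WavLM/Conformer recipes of the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Toy Transformer encoder over acoustic feature frames."""
    def __init__(self, feat_dim=80, d_model=256, n_layers=4, n_heads=4):
        super().__init__()
        self.proj = nn.Linear(feat_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=1024,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, feats):                  # feats: (B, T, feat_dim)
        return self.encoder(self.proj(feats))  # (B, T, d_model)

def pretrain_step(encoder, predictor, feats, mask_prob=0.15):
    """Stage 1: masked-frame reconstruction on unlabeled in-domain audio."""
    mask = torch.rand(feats.shape[:2]) < mask_prob            # (B, T) bool
    corrupted = feats.masked_fill(mask.unsqueeze(-1), 0.0)    # zero out masked frames
    hidden = encoder(corrupted)
    recon = predictor(hidden)                                  # (B, T, feat_dim)
    return F.mse_loss(recon[mask], feats[mask])                # loss only on masked frames

def finetune_step(encoder, head, feats, speaker_ids):
    """Stage 2: speaker classification on the SAME dataset, now with labels."""
    hidden = encoder(feats)
    embedding = hidden.mean(dim=1)             # simple temporal pooling
    logits = head(embedding)
    return F.cross_entropy(logits, speaker_ids)

if __name__ == "__main__":
    n_speakers, feat_dim, d_model = 100, 80, 256
    encoder = Encoder(feat_dim, d_model)
    predictor = nn.Linear(d_model, feat_dim)   # pretraining head, discarded after stage 1
    head = nn.Linear(d_model, n_speakers)      # speaker-classification head for stage 2

    feats = torch.randn(8, 200, feat_dim)      # dummy batch of feature frames
    speakers = torch.randint(0, n_speakers, (8,))

    print("pretrain loss:", pretrain_step(encoder, predictor, feats).item())
    print("finetune loss:", finetune_step(encoder, head, feats, speakers).item())

In practice both stages would iterate over the full dataset with an optimizer; the point of the sketch is only the data flow, with one encoder shared across the self-supervised and supervised stages.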
Keywords
speaker verification, pre-trained speech transformer model, pre-training
Authors
PENG, J.; PLCHOT, O.; STAFYLAKIS, T.; MOŠNER, L.; BURGET, L.; ČERNOCKÝ, J.
Published
20 August 2023
Publisher
International Speech Communication Association
Location
Dublin
ISSN
1990-9772
Journal
Proceedings of Interspeech
Volume
2023
Issue
08
Country
French Republic
Pages from
5361
Pages to
5365
Page count
5
URL
https://www.isca-speech.org/archive/pdfs/interspeech_2023/peng23_interspeech.pdf
BibTeX
@inproceedings{BUT185575,
  author="Junyi {Peng} and Oldřich {Plchot} and Themos {Stafylakis} and Ladislav {Mošner} and Lukáš {Burget} and Jan {Černocký}",
  title="Improving Speaker Verification with Self-Pretrained Transformer Models",
  booktitle="Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",
  year="2023",
  journal="Proceedings of Interspeech",
  volume="2023",
  number="08",
  pages="5361--5365",
  publisher="International Speech Communication Association",
  address="Dublin",
  doi="10.21437/Interspeech.2023-453",
  issn="1990-9772",
  url="https://www.isca-speech.org/archive/pdfs/interspeech_2023/peng23_interspeech.pdf"
}