Publication result detail
PENG, J.; PLCHOT, O.; STAFYLAKIS, T.; MOŠNER, L.; BURGET, L.; ČERNOCKÝ, J.
Original Title
Improving Speaker Verification with Self-Pretrained Transformer Models
Type
Paper in proceedings (conference paper)
Original Abstract
Recently, fine-tuning large pre-trained Transformer models on downstream datasets has received rising interest. Despite their success, it is still challenging to disentangle the benefits of large-scale datasets and Transformer structures from the limitations of the pre-training. In this paper, we introduce a hierarchical training approach, named self-pretraining, in which Transformer models are pretrained and fine-tuned on the same dataset. Three pre-trained models, including HuBERT, Conformer and WavLM, are evaluated on four speaker verification datasets of varying sizes. Our experiments show that these self-pretrained models achieve competitive performance on downstream speaker verification tasks such as VoxCeleb1 and CNCeleb1 with only one-third of the data compared to Librispeech pretraining. Furthermore, when pre-training only on VoxCeleb2-dev, the Conformer model outperforms the one pre-trained on 94k hours of data under the same fine-tuning settings.
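The abstract describes self-pretraining as a two-stage recipe: a Transformer encoder is first pretrained with a self-supervised objective on a dataset and then fine-tuned for speaker verification on the same dataset. The following is a minimal, hypothetical PyTorch sketch of that two-stage idea; it uses a toy masked-prediction objective, a simple mean-pooled speaker-classification head, and random tensors in place of real speech features. It is not the authors' HuBERT, Conformer or WavLM setup, and all layer sizes, losses and hyperparameters are illustrative assumptions.

# Sketch of the self-pretraining recipe: pretrain and fine-tune on the same data.
# All configurations below are placeholders, not the paper's actual models.
import torch
import torch.nn as nn

class TransformerEncoder(nn.Module):
    def __init__(self, feat_dim=80, d_model=256, n_layers=4, n_heads=4):
        super().__init__()
        self.proj = nn.Linear(feat_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=1024, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)

    def forward(self, x):              # x: (batch, time, feat_dim)
        return self.encoder(self.proj(x))

def pretrain_step(encoder, recon_head, feats, mask_prob=0.08):
    """Self-supervised step: mask random frames and predict the original features."""
    mask = torch.rand(feats.shape[:2], device=feats.device) < mask_prob
    corrupted = feats.masked_fill(mask.unsqueeze(-1), 0.0)
    pred = recon_head(encoder(corrupted))               # reconstruct masked frames
    return nn.functional.l1_loss(pred[mask], feats[mask])

def finetune_step(encoder, spk_head, feats, spk_labels):
    """Supervised step: mean-pooled embeddings -> speaker classification loss."""
    emb = encoder(feats).mean(dim=1)                    # simple temporal pooling
    return nn.functional.cross_entropy(spk_head(emb), spk_labels)

if __name__ == "__main__":
    torch.manual_seed(0)
    encoder = TransformerEncoder()
    recon_head = nn.Linear(256, 80)      # pretraining target head
    spk_head = nn.Linear(256, 100)       # assume 100 speakers in the toy dataset

    feats = torch.randn(8, 200, 80)      # stand-in for the (single) training dataset
    labels = torch.randint(0, 100, (8,))

    # Stage 1: self-supervised pretraining on the dataset.
    opt = torch.optim.Adam(list(encoder.parameters()) + list(recon_head.parameters()), lr=1e-4)
    pretrain_step(encoder, recon_head, feats).backward()
    opt.step(); opt.zero_grad()

    # Stage 2: fine-tuning for speaker verification on the same dataset.
    opt = torch.optim.Adam(list(encoder.parameters()) + list(spk_head.parameters()), lr=1e-4)
    finetune_step(encoder, spk_head, feats, labels).backward()
    opt.step(); opt.zero_grad()

In the paper's setting, stage 1 would use a self-supervised objective such as HuBERT- or WavLM-style masked prediction and stage 2 a speaker verification fine-tuning pipeline; the sketch only illustrates the "same data for both stages" structure.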
Keywords
speaker verification, pre-trained speech transformer model, pre-training
RIV year
2024
Released
20.08.2023
Publisher
International Speech Communication Association
Location
Dublin
Book
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
ISSN
1990-9772
Periodical
Proceedings of Interspeech
Volume
2023
Number
08
Country
French Republic
Pages from
5361
Pages to
5365
Pages count
5
URL
https://www.isca-speech.org/archive/pdfs/interspeech_2023/peng23_interspeech.pdf
BibTex
@inproceedings{BUT185575,
  author    = "Junyi {Peng} and Oldřich {Plchot} and Themos {Stafylakis} and Ladislav {Mošner} and Lukáš {Burget} and Jan {Černocký}",
  title     = "Improving Speaker Verification with Self-Pretrained Transformer Models",
  booktitle = "Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",
  year      = "2023",
  journal   = "Proceedings of Interspeech",
  volume    = "2023",
  number    = "08",
  pages     = "5361--5365",
  publisher = "International Speech Communication Association",
  address   = "Dublin",
  doi       = "10.21437/Interspeech.2023-453",
  issn      = "1990-9772",
  url       = "https://www.isca-speech.org/archive/pdfs/interspeech_2023/peng23_interspeech.pdf"
}
Documents
peng23_interspeech2023_improving