Publication detail
ABC SYSTEM DESCRIPTION FOR NIST SRE 2024
ALAM, J.; BARAHONA QUIRÓS, S.; BOBOŠ, D.; BURGET, L.; CUMANI, S.; DAHMANE, M.; HAN, J.; HLAVÁČEK, M.; KODOVSKÝ, M.; LANDINI, F.; MOŠNER, L.; PÁLKA, P.; PAVLÍČEK, T.; PENG, J.; PLCHOT, O.; RAJASEKHAR, P.; ROHDIN, J.; SILNOVA, A.; STAFYLAKIS, T.; ZHANG, L.
Original title
ABC SYSTEM DESCRIPTION FOR NIST SRE 2024
Type
Conference proceedings paper not indexed in WoS or Scopus
Language
English
Original abstract
This paper presents the ABC team's submission to the NIST SRE 2024 evaluation, a collaboration among BUT, Polito, Phonexia, Omilia, UAM, and CRIM. Our team participated in all evaluation tracks (audio-only, visual-only, and audio-visual) under both fixed and open conditions. We developed a variety of frontends, backends, and strategies for calibration and fusion to optimize system performance. The fixed and open conditions share some solutions. In the audio-only systems, we employed ResNet variants and the newly introduced ReDimNet model as frontends for embedding extraction. Then, we explored various backends including cosine scoring, Probabilistic Linear Discriminant Analysis, and Pairwise Support Vector Machine. For the visual-only systems, we adopted the Insightface framework, utilizing ResNet100 and MagFace pre-trained on the MS1MV2 dataset. Cosine scoring under various strategies was applied, with logistic regression used for both calibration and fusion. Finally, scores from audio-only and visual-only systems were fused using logistic regression for submission to the audio-visual track. Building on the fixed condition, the open condition included enhancements such as larger ResNet models, additional training data from the VoxBlink2 dataset, and the pre-trained XLS-R foundation model.
Keywords
NIST, speaker, recognition, evaluation
Authors
ALAM, J.; BARAHONA QUIRÓS, S.; BOBOŠ, D.; BURGET, L.; CUMANI, S.; DAHMANE, M.; HAN, J.; HLAVÁČEK, M.; KODOVSKÝ, M.; LANDINI, F.; MOŠNER, L.; PÁLKA, P.; PAVLÍČEK, T.; PENG, J.; PLCHOT, O.; RAJASEKHAR, P.; ROHDIN, J.; SILNOVA, A.; STAFYLAKIS, T.; ZHANG, L.
Published
3 December 2024
Publisher
National Institute of Standards and Technology
Place
San Juan
Pages from
1
Pages to
9
Page count
9
URL
BibTeX
@inproceedings{BUT193961,
author="ALAM, J. and BARAHONA QUIRÓS, S. and BOBOŠ, D. and BURGET, L. and CUMANI, S. and DAHMANE, M. and HAN, J. and HLAVÁČEK, M. and KODOVSKÝ, M. and LANDINI, F. and MOŠNER, L. and PÁLKA, P. and PAVLÍČEK, T. and PENG, J. and PLCHOT, O. and RAJASEKHAR, P. and ROHDIN, J. and SILNOVA, A. and STAFYLAKIS, T. and ZHANG, L.",
title="ABC SYSTEM DESCRIPTION FOR NIST SRE 2024",
booktitle="Proceedings of NIST SRE 2024",
year="2024",
pages="1--9",
publisher="National Institute of Standards and Technology",
address="San Juan",
url="https://www.fit.vut.cz/research/publication/13341/"
}
Documents