Publication detail

Challenging margin-based speaker embedding extractors by using the variational information bottleneck

STAFYLAKIS, T.; SILNOVA, A.; ROHDIN, J.; PLCHOT, O.; BURGET, L.

Original title

Challenging margin-based speaker embedding extractors by using the variational information bottleneck

Type

paper in proceedings indexed in WoS or Scopus

Language

English

Original abstract

Speaker embedding extractors are typically trained using a classification loss over the training speakers. During the last few years, the standard softmax/cross-entropy loss has been replaced by the margin-based losses, yielding significant improvements in speaker recognition accuracy. Motivated by the fact that the margin merely reduces the logit of the target speaker during training, we consider a probabilistic framework that has a similar effect. The variational information bottleneck provides a principled mechanism for making deterministic nodes stochastic, resulting in an implicit reduction of the posterior of the target speaker. We experiment with a wide range of speaker recognition benchmarks and scoring methods and report competitive results to those obtained with the state-of-the-art Additive Angular Margin loss.

Keywords

speaker recognition, variational information bottleneck

Authors

STAFYLAKIS, T.; SILNOVA, A.; ROHDIN, J.; PLCHOT, O.; BURGET, L.

Published

1 September 2024

Publisher

International Speech Communication Association

Place

Kos

ISSN

1990-9772

Periodical

Proceedings of Interspeech

Volume

2024

Issue

9

Country

France

Pages from

3220

Pages to

3224

Page count

5

BibTeX

@inproceedings{BUT193738,
  author="Themos {Stafylakis} and Anna {Silnova} and Johan Andréas {Rohdin} and Oldřich {Plchot} and Lukáš {Burget}",
  title="Challenging margin-based speaker embedding extractors by using the variational information bottleneck",
  booktitle="Proceedings of Interspeech 2024",
  year="2024",
  journal="Proceedings of Interspeech",
  volume="2024",
  number="9",
  pages="3220--3224",
  publisher="International Speech Communication Association",
  address="Kos",
  doi="10.21437/Interspeech.2024-2058",
  issn="1990-9772",
  url="https://www.isca-archive.org/interspeech_2024/stafylakis24_interspeech.pdf"
}