Publication detail

Challenging margin-based speaker embedding extractors by using the variational information bottleneck

STAFYLAKIS, T.; SILNOVA, A.; ROHDIN, J.; PLCHOT, O.; BURGET, L.

Original Title

Challenging margin-based speaker embedding extractors by using the variational information bottleneck

Type

conference paper

Language

English

Original Abstract

Speaker embedding extractors are typically trained using a classification loss over the training speakers. During the last few years, the standard softmax/cross-entropy loss has been replaced by the margin-based losses, yielding significant improvements in speaker recognition accuracy. Motivated by the fact that the margin merely reduces the logit of the target speaker during training, we consider a probabilistic framework that has a similar effect. The variational information bottleneck provides a principled mechanism for making deterministic nodes stochastic, resulting in an implicit reduction of the posterior of the target speaker. We experiment with a wide range of speaker recognition benchmarks and scoring methods and report competitive results to those obtained with the state-of-the-art Additive Angular Margin loss.

Keywords

speaker recognition, variational information bottleneck

Authors

STAFYLAKIS, T.; SILNOVA, A.; ROHDIN, J.; PLCHOT, O.; BURGET, L.

Released

1. 9. 2024

Publisher

International Speech Communication Association

Location

Kos

ISSN

1990-9772

Periodical

Proceedings of Interspeech

Year of study

2024

Number

9

State

French Republic

Pages from

3220

Pages to

3224

Pages count

5

URL

https://www.isca-archive.org/interspeech_2024/stafylakis24_interspeech.pdf

BibTex

@inproceedings{BUT193738,
  author="Themos {Stafylakis} and Anna {Silnova} and Johan Andréas {Rohdin} and Oldřich {Plchot} and Lukáš {Burget}",
  title="Challenging margin-based speaker embedding extractors by using the variational information bottleneck",
  booktitle="Proceedings of Interspeech 2024",
  year="2024",
  journal="Proceedings of Interspeech",
  volume="2024",
  number="9",
  pages="3220--3224",
  publisher="International Speech Communication Association",
  address="Kos",
  doi="10.21437/Interspeech.2024-2058",
  issn="1990-9772",
  url="https://www.isca-archive.org/interspeech_2024/stafylakis24_interspeech.pdf"
}
