Publication detail

Multi-Channel Extension of Pre-trained Models for Speaker Verification

MOŠNER, L.; SERIZEL, R.; BURGET, L.; PLCHOT, O.; VINCENT, E.; PENG, J.; ČERNOCKÝ, J.

Original title

Multi-Channel Extension of Pre-trained Models for Speaker Verification

Type

Paper in proceedings indexed in WoS or Scopus

Language

English

Original abstract

In this work, we focus on designing a multi-channel speech processing system based on large pre-trained models. These models are typically trained for single-channel scenarios via self-supervised learning (SSL). A common approach to using the SSL models with microphone array data is to prepend it with a multi-channel speech enhancement. The downside is that spatial information can be leveraged only by the pre-processing stage, and enhancement errors get propagated to the SSL model. We aim to alleviate the issue by designing METRO, a Multi-channel ExTension of pRe-trained mOdels. It interleaves per-channel processing with cross-channel information exchange, eventually fusing channels into one. While our approach is general, here we focus on multi-channel speaker verification. Our experiments on the MultiSV corpus show noteworthy improvements over the best-published results on the dataset.

Keywords

multi-channel speaker verification, pre-trained models

Authors

MOŠNER, L.; SERIZEL, R.; BURGET, L.; PLCHOT, O.; VINCENT, E.; PENG, J.; ČERNOCKÝ, J.

Published

1 September 2024

Publisher

International Speech Communication Association

Place

Kos

ISSN

1990-9772

Periodical

Proceedings of Interspeech

Volume

2024

Issue

9

Country

French Republic

Pages from

2135

Pages to

2139

Number of pages

5

URL

https://www.isca-archive.org/interspeech_2024/mosner24_interspeech.pdf

BibTeX

@inproceedings{BUT193682,
  author="MOŠNER, L. and SERIZEL, R. and BURGET, L. and PLCHOT, O. and VINCENT, E. and PENG, J. and ČERNOCKÝ, J.",
  title="Multi-Channel Extension of Pre-trained Models for Speaker Verification",
  booktitle="Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",
  year="2024",
  journal="Proceedings of Interspeech",
  volume="2024",
  number="9",
  pages="2135--2139",
  publisher="International Speech Communication Association",
  address="Kos",
  doi="10.21437/Interspeech.2024-1260",
  issn="1990-9772",
  url="https://www.isca-archive.org/interspeech_2024/mosner24_interspeech.pdf"
}

Documents