Publication detail

DiaPer: End-to-End Neural Diarization With Perceiver-Based Attractors

LANDINI, F.; DIEZ SÁNCHEZ, M.; STAFYLAKIS, T.; BURGET, L.

Original title

DiaPer: End-to-End Neural Diarization With Perceiver-Based Attractors

Type

journal article in Web of Science, Jimp

Language

English

Original abstract

Until recently, the field of speaker diarization was dominated by cascaded systems. Due to their limitations, mainly regarding overlapped speech and cumbersome pipelines, end-to-end models have gained great popularity lately. One of the most successful models is end-to-end neural diarization with encoder-decoder based attractors (EEND-EDA). In this work, we replace the EDA module with a Perceiver-based one and show its advantages over EEND-EDA; namely obtaining better performance on the largely studied Callhome dataset, finding the quantity of speakers in a conversation more accurately, and faster inference time. Furthermore, when exhaustively compared with other methods, our model, DiaPer, reaches remarkable performance with a very lightweight design. Besides, we perform comparisons with other works and a cascaded baseline across more than ten public wide-band datasets. Together with this publication, we release the code of DiaPer as well as models trained on public and free data.

Keywords

Attractor, DiaPer, end-to-end neural diarization, perceiver, speaker diarization.

Authors

LANDINI, F.; DIEZ SÁNCHEZ, M.; STAFYLAKIS, T.; BURGET, L.

Published

3 July 2024

ISSN

1558-7916

Journal

IEEE Transactions on Audio, Speech, and Language Processing

Volume

32

Issue

7

Country

United States of America

Pages from

3450

Pages to

3465

Page count

16

URL

BibTeX

@article{BUT189802,
  author="Federico Nicolás {Landini} and Mireia {Diez Sánchez} and Themos {Stafylakis} and Lukáš {Burget}",
  title="DiaPer: End-to-End Neural Diarization With Perceiver-Based Attractors",
  journal="IEEE Transactions on Audio, Speech, and Language Processing",
  year="2024",
  volume="32",
  number="7",
  pages="3450--3465",
  doi="10.1109/TASLP.2024.3422818",
  issn="1558-7916",
  url="https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10584294"
}