Publication detail
LANDINI, F.; DIEZ SÁNCHEZ, M.; STAFYLAKIS, T.; BURGET, L.
Original Title
DiaPer: End-to-End Neural Diarization With Perceiver-Based Attractors
Type
journal article in Web of Science
Language
English
Original Abstract
Until recently, the field of speaker diarization was dominated by cascaded systems. Due to their limitations, mainly regarding overlapped speech and cumbersome pipelines, end-to-end models have gained great popularity lately. One of the most successful models is end-to-end neural diarization with encoder-decoder based attractors (EEND-EDA). In this work, we replace the EDA module with a Perceiver-based one and show its advantages over EEND-EDA; namely obtaining better performance on the largely studied Callhome dataset, finding the quantity of speakers in a conversation more accurately, and faster inference time. Furthermore, when exhaustively compared with other methods, our model, DiaPer, reaches remarkable performance with a very lightweight design. Besides, we perform comparisons with other works and a cascaded baseline across more than ten public wide-band datasets. Together with this publication, we release the code of DiaPer as well as models trained on public and free data.
Keywords
Attractor, DiaPer, end-to-end neural diarization, perceiver, speaker diarization.
Authors
LANDINI, F.; DIEZ SÁNCHEZ, M.; STAFYLAKIS, T.; BURGET, L.
Released
3 July 2024
ISSN
1558-7916
Periodical
IEEE Transactions on Audio, Speech, and Language Processing
Volume
32
Number
7
Country
United States of America
Pages from
3450
Pages to
3465
Pages count
16
URL
https://ieeexplore.ieee.org/document/10584294
BibTex
@article{BUT189802,
  author="Federico Nicolás {Landini} and Mireia {Diez Sánchez} and Themos {Stafylakis} and Lukáš {Burget}",
  title="DiaPer: End-to-End Neural Diarization With Perceiver-Based Attractors",
  journal="IEEE Transactions on Audio, Speech, and Language Processing",
  year="2024",
  volume="32",
  number="7",
  pages="3450--3465",
  doi="10.1109/TASLP.2024.3422818",
  issn="1558-7916",
  url="https://ieeexplore.ieee.org/document/10584294"
}
Documents
landini_IEEE_Transactions_2024_DiaPer_End-to-End_Neural_Diarization.pdf