Publication detail
LANDINI, F.; DIEZ SÁNCHEZ, M.; STAFYLAKIS, T.; BURGET, L.
Original title
DiaPer: End-to-End Neural Diarization With Perceiver-Based Attractors
Type
journal article in Web of Science, Jimp
Language
English
Original abstract
Until recently, the field of speaker diarization was dominated by cascaded systems. Due to their limitations, mainly regarding overlapped speech and cumbersome pipelines, end-to-end models have gained great popularity lately. One of the most successful models is end-to-end neural diarization with encoder-decoder based attractors (EEND-EDA). In this work, we replace the EDA module with a Perceiver-based one and show its advantages over EEND-EDA; namely obtaining better performance on the largely studied Callhome dataset, finding the quantity of speakers in a conversation more accurately, and faster inference time. Furthermore, when exhaustively compared with other methods, our model, DiaPer, reaches remarkable performance with a very lightweight design. Besides, we perform comparisons with other works and a cascaded baseline across more than ten public wide-band datasets. Together with this publication, we release the code of DiaPer as well as models trained on public and free data.
Keywords
Attractor, DiaPer, end-to-end neural diarization, perceiver, speaker diarization.
Authors
LANDINI, F.; DIEZ SÁNCHEZ, M.; STAFYLAKIS, T.; BURGET, L.
Published
3 July 2024
ISSN
1558-7916
Periodical
IEEE Transactions on Audio, Speech, and Language Processing
Volume
32
Issue
7
Country
United States of America
Pages from
3450
Pages to
3465
Number of pages
16
URL
https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10584294
BibTeX
@article{BUT189802,
  author="Federico Nicolás {Landini} and Mireia {Diez Sánchez} and Themos {Stafylakis} and Lukáš {Burget}",
  title="DiaPer: End-to-End Neural Diarization With Perceiver-Based Attractors",
  journal="IEEE Transactions on Audio, Speech, and Language Processing",
  year="2024",
  volume="32",
  number="7",
  pages="3450--3465",
  doi="10.1109/TASLP.2024.3422818",
  issn="1558-7916",
  url="https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10584294"
}