Publication detail
MOŠNER, L.; PLCHOT, O.; PENG, J.; BURGET, L.; ČERNOCKÝ, J.
Original title
Multi-Channel Speech Separation with Cross-Attention and Beamforming
Type
Paper in proceedings indexed in WoS or Scopus
Language
English
Original abstract
Originally, single-channel source separation gained more research interest. It resulted in immense progress. Multichannel (MC) separation comes with new challenges posed by adverse indoor conditions making it an important field of study. We seek to combine promising ideas from the two worlds. First, we build MC models by extending current single-channel time-domain separators relying on their strength. Our approach allows reusing pre-trained models by inserting designed lightweight reference channel attention (RCA) combiner, the only trained module. It comprises two blocks: the former allows attending to different parts of other channels w.r.t. the reference one, and the latter provides an attention-based combination of channels. Second, like many successful MC models, our system incorporates beamforming and allows for the fusion of the network and beamformer outputs. We compare our approach with the SOTA models on the SMS-WSJ dataset and show better or similar performance.
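The two blocks of the combiner described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`cross_channel_attention`, `combine_channels`), the tensor shapes, and the use of plain scaled dot-product attention without learned projections are all assumptions made for illustration. A trained module would normally include learned query, key, and value projections.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_channel_attention(ref, other):
    """Hypothetical block 1: each reference frame attends over all frames
    of another channel.

    ref, other: arrays of shape (T, D) with T frames and D features.
    Returns an array of shape (T, D) aligned to the reference frames.
    """
    d = ref.shape[-1]
    scores = ref @ other.T / np.sqrt(d)   # (T, T) frame-to-frame similarities
    weights = softmax(scores, axis=-1)    # each row sums to 1
    return weights @ other


def combine_channels(features, ref_index=0):
    """Hypothetical block 2: attention-based combination of channels.

    features: array of shape (C, T, D) with C channels.
    Returns the combined (T, D) features and the (C, T) channel weights.
    """
    ref = features[ref_index]
    d = ref.shape[-1]
    # Align every non-reference channel to the reference channel.
    aligned = np.stack([
        ref if c == ref_index else cross_channel_attention(ref, features[c])
        for c in range(features.shape[0])
    ])                                                            # (C, T, D)
    # Per-frame similarity of each aligned channel to the reference.
    scores = np.einsum("td,ctd->ct", ref, aligned) / np.sqrt(d)   # (C, T)
    weights = softmax(scores, axis=0)     # weights over channels sum to 1
    combined = np.einsum("ct,ctd->td", weights, aligned)          # (T, D)
    return combined, weights
```

Under these assumptions, calling `combine_channels` on a `(C, T, D)` array returns a single-channel-shaped `(T, D)` representation, which is the kind of output a pre-trained single-channel separator could consume.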
Keywords
multi-channel source separation, cross-channel attention, beamforming
Authors
MOŠNER, L.; PLCHOT, O.; PENG, J.; BURGET, L.; ČERNOCKÝ, J.
Published
20 August 2023
Publisher
International Speech Communication Association
Place
Dublin
ISSN
1990-9772
Periodical
Proceedings of Interspeech
Volume
2023
Issue
08
Country
France
Pages from
1693
Pages to
1697
Number of pages
5
URL
https://www.isca-speech.org/archive/interspeech_2023/mosner23_interspeech.html
BibTeX
@inproceedings{BUT185571,
  author="Ladislav {Mošner} and Oldřich {Plchot} and Junyi {Peng} and Lukáš {Burget} and Jan {Černocký}",
  title="Multi-Channel Speech Separation with Cross-Attention and Beamforming",
  booktitle="Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",
  year="2023",
  journal="Proceedings of Interspeech",
  volume="2023",
  number="08",
  pages="1693--1697",
  publisher="International Speech Communication Association",
  address="Dublin",
  doi="10.21437/Interspeech.2023-2537",
  issn="1990-9772",
  url="https://www.isca-speech.org/archive/interspeech_2023/mosner23_interspeech.html"
}
Documents
mosner23_interspeech2023_multi-channel.pdf