Publication detail
MAI, F. ZULUAGA-GOMEZ, J. PARCOLLET, T. MOTLÍČEK, P.
Original title
HyperConformer: Multi-head HyperMixer for Efficient Speech Recognition
Type
conference proceedings paper indexed in WoS or Scopus
Language
English
Original abstract
State-of-the-art ASR systems have achieved promising results by modeling local and global interactions separately. While the former can be computed efficiently, global interactions are usually modeled via attention mechanisms, which are expensive for long input sequences. Here, we address this by extending HyperMixer, an efficient alternative to attention exhibiting linear complexity, to the Conformer architecture for speech recognition, leading to HyperConformer. In particular, multi-head HyperConformer achieves comparable or higher recognition performance while being more efficient than Conformer in terms of inference speed, memory, parameter count, and available training data. HyperConformer achieves a word error rate of 2.9% on LibriSpeech test-clean with less than 8M neural parameters and a peak memory during training of 5.7GB, hence trainable with accessible hardware. Encoder speed is between 38% on mid-length speech and 56% on long speech faster than an equivalent Conformer.
Keywords
Hypernetworks, HyperMixer, Efficient Automatic Speech Recognition, LibriSpeech, SpeechBrain
Authors
MAI, F.; ZULUAGA-GOMEZ, J.; PARCOLLET, T.; MOTLÍČEK, P.
Published
20 August 2023
Publisher
International Speech Communication Association
Place
Dublin
ISSN
1990-9772
Periodical
Proceedings of Interspeech
Volume
2023
Number
08
Country
France
Pages from
2213
Pages to
2217
Page count
5
URL
https://www.isca-archive.org/interspeech_2023/mai23_interspeech.pdf
BibTeX
@inproceedings{BUT187786,
  author="MAI, F. and ZULUAGA-GOMEZ, J. and PARCOLLET, T. and MOTLÍČEK, P.",
  title="HyperConformer: Multi-head HyperMixer for Efficient Speech Recognition",
  booktitle="Proceedings of the Annual Conference of International Speech Communication Association, INTERSPEECH",
  year="2023",
  journal="Proceedings of Interspeech",
  volume="2023",
  number="08",
  pages="2213--2217",
  publisher="International Speech Communication Association",
  address="Dublin",
  doi="10.21437/Interspeech.2023-1611",
  issn="1990-9772",
  url="https://www.isca-archive.org/interspeech_2023/mai23_interspeech.pdf"
}