Publication detail
EGOROVA, E. VYDANA, H. BURGET, L. ČERNOCKÝ, J.
Original Title
Spelling-Aware Word-Based End-to-End ASR
Type
journal article in Web of Science
Language
English
Original Abstract
We propose a new end-to-end architecture for automatic speech recognition that extends the listen, attend and spell (LAS) paradigm. While the main network is trained to predict words, the secondary speller network is optimized to predict word spellings from inner representations of the main network (e.g., word embeddings or context vectors from the attention module). We show that this joint training improves the word error rate of a word-based system and enables solving additional tasks, such as out-of-vocabulary word detection and recovery. The tests are conducted on the LibriSpeech dataset, which consists of 1000 h of read speech.
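The joint training described in the abstract can be illustrated with a minimal sketch of a combined objective: a word-level cross-entropy term plus a weighted character-level term from the speller. This is not the authors' implementation. The function names, the averaging over characters, and the `speller_weight` interpolation factor are all assumptions made for illustration; the abstract does not state how the two losses are combined.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability of the target index under softmax(logits)."""
    return -math.log(softmax(logits)[target])


def joint_loss(word_logits, word_target, char_logits_seq, char_targets,
               speller_weight=0.5):
    """Hypothetical joint objective for one predicted word.

    word_logits:     scores over the word vocabulary (main network output)
    char_logits_seq: one score vector per character (speller output)
    speller_weight:  assumed interpolation weight, not taken from the paper
    """
    if not char_targets or len(char_logits_seq) != len(char_targets):
        raise ValueError("speller outputs and character targets must align")
    word_loss = cross_entropy(word_logits, word_target)
    # Average the speller loss over the characters of the word's spelling.
    spell_loss = sum(
        cross_entropy(logits, target)
        for logits, target in zip(char_logits_seq, char_targets)
    ) / len(char_targets)
    return word_loss + speller_weight * spell_loss
```

In a real system both terms would be backpropagated through the shared representations (word embeddings or attention context vectors), which is what lets the spelling signal influence the word-predicting network.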
Keywords
end-to-end, ASR, OOV, Listen Attend and Spell architecture
Authors
EGOROVA, E.; VYDANA, H.; BURGET, L.; ČERNOCKÝ, J.
Released
19. 7. 2022
ISSN
1558-2361
Periodical
IEEE SIGNAL PROCESSING LETTERS
Volume
29
Number
Country
United States of America
Pages from
1729
Pages to
1733
Pages count
5
URL
https://ieeexplore.ieee.org/document/9833231
BibTeX
@article{BUT178877,
  author="Ekaterina {Egorova} and Hari Krishna {Vydana} and Lukáš {Burget} and Jan {Černocký}",
  title="Spelling-Aware Word-Based End-to-End ASR",
  journal="IEEE SIGNAL PROCESSING LETTERS",
  year="2022",
  volume="29",
  number="29",
  pages="1729--1733",
  doi="10.1109/LSP.2022.3192199",
  issn="1558-2361",
  url="https://ieeexplore.ieee.org/document/9833231"
}