Publication detail
Spelling-Aware Word-Based End-to-End ASR
EGOROVA, E. VYDANA, H. BURGET, L. ČERNOCKÝ, J.
Original Title
Spelling-Aware Word-Based End-to-End ASR
Type
journal article in Web of Science
Language
English
Original Abstract
We propose a new end-to-end architecture for automatic speech recognition that expands the listen, attend and spell (LAS) paradigm. While the main word-predicting network is trained to predict words, the secondary, speller network, is optimized to predict word spellings from inner representations of the main network (e.g. word embeddings or context vectors from the attention module). We show that this joint training improves the word error rate of a word-based system and enables solving additional tasks, such as out-of-vocabulary word detection and recovery. The tests are conducted on LibriSpeech dataset consisting of 1000h of read speech.
Keywords
end-to-end, ASR, OOV, Listen Attend and Spell architecture
Authors
EGOROVA, E.; VYDANA, H.; BURGET, L.; ČERNOCKÝ, J.
Released
19. 7. 2022
ISSN
1558-2361
Periodical
IEEE SIGNAL PROCESSING LETTERS
Volume
29
Number
29
Country
United States of America
Pages from
1729
Pages to
1733
Pages count
5
URL
https://ieeexplore.ieee.org/document/9833231
BibTex
@article{BUT178877,
author="Ekaterina {Egorova} and Hari Krishna {Vydana} and Lukáš {Burget} and Jan {Černocký}",
title="Spelling-Aware Word-Based End-to-End ASR",
journal="IEEE SIGNAL PROCESSING LETTERS",
year="2022",
volume="29",
number="29",
pages="1729--1733",
doi="10.1109/LSP.2022.3192199",
issn="1558-2361",
url="https://ieeexplore.ieee.org/document/9833231"
}