Publication detail

Spelling-Aware Word-Based End-to-End ASR

EGOROVA, E. VYDANA, H. BURGET, L. ČERNOCKÝ, J.

Original Title

Spelling-Aware Word-Based End-to-End ASR

Type

journal article in Web of Science

Language

English

Original Abstract

We propose a new end-to-end architecture for automatic speech recognition that expands the listen, attend and spell (LAS) paradigm. While the main word-predicting network is trained to predict words, the secondary, speller network, is optimized to predict word spellings from inner representations of the main network (e.g. word embeddings or context vectors from the attention module). We show that this joint training improves the word error rate of a word-based system and enables solving additional tasks, such as out-of-vocabulary word detection and recovery. The tests are conducted on LibriSpeech dataset consisting of 1000h of read speech.

Keywords

end-to-end, ASR, OOV, Listen Attend and Spell architecture

Authors

EGOROVA, E.; VYDANA, H.; BURGET, L.; ČERNOCKÝ, J.

Released

19. 7. 2022

ISSN

1558-2361

Periodical

IEEE SIGNAL PROCESSING LETTERS

Volume

29

Number

29

State

United States of America

Pages from

1729

Pages to

1733

Pages count

5

URL

https://ieeexplore.ieee.org/document/9833231

BibTex

@article{BUT178877,
  author="Ekaterina {Egorova} and Hari Krishna {Vydana} and Lukáš {Burget} and Jan {Černocký}",
  title="Spelling-Aware Word-Based End-to-End ASR",
  journal="IEEE SIGNAL PROCESSING LETTERS",
  year="2022",
  volume="29",
  number="29",
  pages="1729--1733",
  doi="10.1109/LSP.2022.3192199",
  issn="1558-2361",
  url="https://ieeexplore.ieee.org/document/9833231"
}
