Publication detail

Spelling-Aware Word-Based End-to-End ASR

EGOROVA, E.; VYDANA, H.; BURGET, L.; ČERNOCKÝ, J.

Original Title

Spelling-Aware Word-Based End-to-End ASR

Type

journal article in Web of Science

Language

English

Original Abstract

We propose a new end-to-end architecture for automatic speech recognition that expands the listen, attend and spell (LAS) paradigm. While the main word-predicting network is trained to predict words, the secondary, speller network, is optimized to predict word spellings from inner representations of the main network (e.g. word embeddings or context vectors from the attention module). We show that this joint training improves the word error rate of a word-based system and enables solving additional tasks, such as out-of-vocabulary word detection and recovery. The tests are conducted on LibriSpeech dataset consisting of 1000h of read speech.

Keywords

end-to-end, ASR, OOV, Listen Attend and Spell architecture

Authors

EGOROVA, E.; VYDANA, H.; BURGET, L.; ČERNOCKÝ, J.

Released

19. 7. 2022

ISSN

1558-2361

Periodical

IEEE SIGNAL PROCESSING LETTERS

Year of study

Number

State

United States of America

Pages from

1729

Pages to

1733

Pages count

5

URL

https://ieeexplore.ieee.org/document/9833231

BibTeX

@article{BUT178877,
  author="Ekaterina {Egorova} and Hari Krishna {Vydana} and Lukáš {Burget} and Jan {Černocký}",
  title="Spelling-Aware Word-Based End-to-End ASR",
  journal="IEEE SIGNAL PROCESSING LETTERS",
  year="2022",
  volume="29",
  number="29",
  pages="1729--1733",
  doi="10.1109/LSP.2022.3192199",
  issn="1558-2361",
  url="https://ieeexplore.ieee.org/document/9833231"
}

Documents

egorova_ieee2022_Spelling-Aware_Word-Based_End-to-End_ASR.pdf
