Publication detail

Spelling-Aware Word-Based End-to-End ASR

EGOROVA, E.; VYDANA, H.; BURGET, L.; ČERNOCKÝ, J.

Original title

Spelling-Aware Word-Based End-to-End ASR

Type

journal article in Web of Science, Jimp

Language

English

Original abstract

We propose a new end-to-end architecture for automatic speech recognition that expands the listen, attend and spell (LAS) paradigm. While the main word-predicting network is trained to predict words, the secondary, speller network, is optimized to predict word spellings from inner representations of the main network (e.g. word embeddings or context vectors from the attention module). We show that this joint training improves the word error rate of a word-based system and enables solving additional tasks, such as out-of-vocabulary word detection and recovery. The tests are conducted on LibriSpeech dataset consisting of 1000h of read speech.
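The joint training described in the abstract can be pictured as optimizing two losses at once: one for the word prediction and one for the spelling of that word. The following is only a minimal sketch under stated assumptions, not the authors' implementation. The abstract does not give the form of the objective, so the weighted sum, the `spell_weight` parameter, the function names, and the toy probabilities are all hypothetical.

```python
import math


def cross_entropy(probs, target_index):
    """Negative log-probability assigned to the correct class."""
    return -math.log(probs[target_index])


def joint_loss(word_probs, word_target, char_probs_seq, char_targets,
               spell_weight=0.5):
    """Weighted sum of a word-prediction loss and a spelling loss.

    spell_weight is a hypothetical interpolation weight chosen for
    illustration; it is not a value taken from the paper.
    """
    word_loss = cross_entropy(word_probs, word_target)
    # Average the per-character losses over the spelling of the target word.
    spell_loss = sum(
        cross_entropy(p, t) for p, t in zip(char_probs_seq, char_targets)
    ) / len(char_targets)
    return word_loss + spell_weight * spell_loss


# Toy example: a 3-word vocabulary and a 2-letter spelling
# over a 2-character alphabet.
loss = joint_loss(
    word_probs=[0.5, 0.25, 0.25],
    word_target=0,
    char_probs_seq=[[0.5, 0.5], [0.25, 0.75]],
    char_targets=[0, 1],
)
```

In a real system the probabilities would come from the word-predicting network and the speller network, and the gradient of the combined loss would update both, which is how the spelling signal could shape the main network's inner representations.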

Keywords

end-to-end, ASR, OOV, Listen Attend and Spell architecture

Authors

EGOROVA, E.; VYDANA, H.; BURGET, L.; ČERNOCKÝ, J.

Published

19 July 2022

ISSN

1558-2361

Periodical

IEEE SIGNAL PROCESSING LETTERS

Volume

29

Issue

29

Country

United States of America

Pages from

1729

Pages to

1733

Page count

5

URL

BibTeX

@article{BUT178877,
  author="Ekaterina {Egorova} and Hari Krishna {Vydana} and Lukáš {Burget} and Jan {Černocký}",
  title="Spelling-Aware Word-Based End-to-End ASR",
  journal="IEEE SIGNAL PROCESSING LETTERS",
  year="2022",
  volume="29",
  number="29",
  pages="1729--1733",
  doi="10.1109/LSP.2022.3192199",
  issn="1558-2361",
  url="https://ieeexplore.ieee.org/document/9833231"
}