Publication detail

Spelling-Aware Word-Based End-to-End ASR

EGOROVA, E. VYDANA, H. BURGET, L. ČERNOCKÝ, J.

Original title

Type

journal article in Web of Science, Jimp

Language

English

Original abstract

We propose a new end-to-end architecture for automatic speech recognition that expands the listen, attend and spell (LAS) paradigm. While the main word-predicting network is trained to predict words, the secondary, speller network, is optimized to predict word spellings from inner representations of the main network (e.g. word embeddings or context vectors from the attention module). We show that this joint training improves the word error rate of a word-based system and enables solving additional tasks, such as out-of-vocabulary word detection and recovery. The tests are conducted on LibriSpeech dataset consisting of 1000h of read speech.

Keywords

end-to-end, ASR, OOV, Listen Attend and Spell architecture

Authors

EGOROVA, E.; VYDANA, H.; BURGET, L.; ČERNOCKÝ, J.

Published

19 July 2022

ISSN

1558-2361

Journal

IEEE SIGNAL PROCESSING LETTERS

Volume

Issue

Country

United States of America

Pages from

1729

Pages to

1733

Number of pages

URL

https://ieeexplore.ieee.org/document/9833231

BibTeX

@article{BUT178877,
  author="Ekaterina {Egorova} and Hari Krishna {Vydana} and Lukáš {Burget} and Jan {Černocký}",
  title="Spelling-Aware Word-Based End-to-End ASR",
  journal="IEEE SIGNAL PROCESSING LETTERS",
  year="2022",
  volume="29",
  number="29",
  pages="1729--1733",
  doi="10.1109/LSP.2022.3192199",
  issn="1558-2361",
  url="https://ieeexplore.ieee.org/document/9833231"
}

Documents

egorova_ieee2022_Spelling-Aware_Word-Based_End-to-End_ASR.pdf
