Publication detail

Out-of-Vocabulary Word Recovery Using FST-Based Subword Unit Clustering in a Hybrid ASR System

EGOROVA, E.; BURGET, L.

Original title

Out-of-Vocabulary Word Recovery Using FST-Based Subword Unit Clustering in a Hybrid ASR System

Type

Paper in conference proceedings indexed in WoS or Scopus

Language

English

Original abstract

The paper presents a new approach to extracting useful information from out-of-vocabulary (OOV) speech regions in ASR system output. The system makes use of a hybrid decoding network with both words and sub-word units. In the decoded lattices, candidates for OOV regions are identified as sub-graphs of sub-word units. To facilitate OOV word recovery, we search for recurring OOVs by clustering the detected candidate OOVs. The metrics for clustering is based on a comparison of the sub-graphs corresponding to the OOV candidates. The proposed method discovers repeating out-of-vocabulary words and finds their graphemic representation more robustly than more conventional techniques taking into account only one best sub-word string hypotheses.

Keywords

Out-of-vocabulary Words, Robust ASR

Authors

EGOROVA, E.; BURGET, L.

Published

15 April 2018

Publisher

IEEE Signal Processing Society

Location

Calgary

ISBN

978-1-5386-4658-8

Book

Proceedings of ICASSP 2018

Pages from

5919

Pages to

5923

Number of pages

5

BibTeX

@inproceedings{BUT155047,
  author="Ekaterina {Egorova} and Lukáš {Burget}",
  title="Out-of-Vocabulary Word Recovery Using FST-Based Subword Unit Clustering in a Hybrid ASR System",
  booktitle="Proceedings of ICASSP 2018",
  year="2018",
  pages="5919--5923",
  publisher="IEEE Signal Processing Society",
  address="Calgary",
  doi="10.1109/ICASSP.2018.8462221",
  isbn="978-1-5386-4658-8",
  url="https://www.fit.vut.cz/research/publication/11725/"
}
