Publication detail
EGOROVA, E.; BURGET, L.
Original Title
Out-of-Vocabulary Word Recovery Using FST-Based Subword Unit Clustering in a Hybrid ASR System
Type
conference paper
Language
English
Original Abstract
The paper presents a new approach to extracting useful information from out-of-vocabulary (OOV) speech regions in ASR system output. The system makes use of a hybrid decoding network with both words and sub-word units. In the decoded lattices, candidates for OOV regions are identified as sub-graphs of sub-word units. To facilitate OOV word recovery, we search for recurring OOVs by clustering the detected candidate OOVs. The clustering metric is based on a comparison of the sub-graphs corresponding to the OOV candidates. The proposed method discovers repeating out-of-vocabulary words and finds their graphemic representation more robustly than conventional techniques that take into account only the one-best sub-word string hypotheses.
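The clustering idea in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the paper compares FST sub-graphs, whereas this sketch assumes each OOV candidate has been reduced to a short weighted list of sub-word string hypotheses and compares candidates by expected length-normalized edit distance. The names `edit_distance`, `candidate_distance`, `cluster`, the threshold value, and the example phone strings are all hypothetical and chosen only for illustration.

```python
from itertools import product


def edit_distance(a, b):
    """Levenshtein distance between two sequences of sub-word units."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]


def candidate_distance(c1, c2):
    """Expected length-normalized edit distance between two OOV candidates.

    Each candidate is a list of (unit_sequence, probability) hypotheses,
    a simplified stand-in for a lattice sub-graph (assumption).
    """
    total = 0.0
    for (s1, p1), (s2, p2) in product(c1, c2):
        norm = max(len(s1), len(s2)) or 1
        total += p1 * p2 * edit_distance(s1, s2) / norm
    return total


def cluster(candidates, threshold=0.4):
    """Greedy single-pass clustering.

    A candidate joins the first cluster whose first member lies within
    `threshold`; otherwise it starts a new cluster. The threshold is an
    arbitrary illustrative value.
    """
    clusters = []
    for idx, cand in enumerate(candidates):
        for members in clusters:
            if candidate_distance(candidates[members[0]], cand) <= threshold:
                members.append(idx)
                break
        else:
            clusters.append([idx])
    return clusters


# Hypothetical candidates: two noisy realizations of one word, one unrelated.
candidates = [
    [(("k", "ae", "l", "g", "er", "iy"), 0.7), (("k", "ae", "l", "g", "r", "iy"), 0.3)],
    [(("k", "ae", "l", "g", "r", "iy"), 0.6), (("k", "eh", "l", "g", "er", "iy"), 0.4)],
    [(("b", "er", "g", "eh", "t"), 1.0)],
]
clusters = cluster(candidates)
```

Using several weighted hypotheses per candidate, rather than only the single best string, lets two detections of the same word match even when their top hypotheses differ, which is the intuition behind comparing sub-graphs instead of one-best strings.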
Keywords
Out-of-vocabulary Words, Robust ASR
Authors
EGOROVA, E.; BURGET, L.
Released
15. 4. 2018
Publisher
IEEE Signal Processing Society
Location
Calgary
ISBN
978-1-5386-4658-8
Book
Proceedings of ICASSP 2018
Pages from
5919
Pages to
5923
Pages count
5
URL
https://www.fit.vut.cz/research/publication/11725/
BibTeX
@inproceedings{BUT155047,
  author="Ekaterina {Egorova} and Lukáš {Burget}",
  title="Out-of-Vocabulary Word Recovery Using FST-Based Subword Unit Clustering in a Hybrid ASR System",
  booktitle="Proceedings of ICASSP 2018",
  year="2018",
  pages="5919--5923",
  publisher="IEEE Signal Processing Society",
  address="Calgary",
  doi="10.1109/ICASSP.2018.8462221",
  isbn="978-1-5386-4658-8",
  url="https://www.fit.vut.cz/research/publication/11725/"
}