Publication detail
VESELÝ, K.; BASKAR, M.; DIEZ SÁNCHEZ, M.; BENEŠ, K.
Original title
MGB-3 BUT System: Low-resource ASR on Egyptian YOUTUBE data
Type
Paper in proceedings indexed in WoS or Scopus
Language
English
Original abstract
This paper presents a series of experiments we performed during our work on the MGB-3 evaluations. We describe both the submitted system and the post-evaluation analysis. Our initial BLSTM-HMM system was trained on 250 hours of MGB-2 data (Al-Jazeera) and adapted with 5 hours of Egyptian data (YouTube). We included such techniques as diarization, n-gram language model adaptation, speed perturbation of the adaptation data, and the use of all 4 correct references. The 4 references were either used for supervision with a confusion network, or we included each sentence 4x with the transcripts from all the annotators. It was also helpful to blend the augmented MGB-3 adaptation data with 15 hours of MGB-2 data. Although our single system did not rank among the best teams in the evaluations, we believe that our analysis will be highly interesting not only for the other MGB-3 challenge participants.
Keywords
MGB-3, ASR adaptation, low-resource ASR, Egyptian Arabic, diarization
Authors
VESELÝ, K.; BASKAR, M.; DIEZ SÁNCHEZ, M.; BENEŠ, K.
Published
16 December 2017
Publisher
IEEE Signal Processing Society
Place
Okinawa
ISBN
978-1-5090-4788-8
Book
Proceedings of ASRU 2017
Pages from
368
Pages to
373
Number of pages
6
URL
https://www.fit.vut.cz/research/publication/11595/
BibTeX
@inproceedings{BUT144502,
  author="Karel {Veselý} and Murali Karthick {Baskar} and Mireia {Diez Sánchez} and Karel {Beneš}",
  title="MGB-3 BUT System: Low-resource ASR on Egyptian YOUTUBE data",
  booktitle="Proceedings of ASRU 2017",
  year="2017",
  pages="368--373",
  publisher="IEEE Signal Processing Society",
  address="Okinawa",
  doi="10.1109/ASRU.2017.8268959",
  isbn="978-1-5090-4788-8",
  url="https://www.fit.vut.cz/research/publication/11595/"
}
Documents
vesely_asru2017_mgb3-paper.pdf