Project detail

Multilingual recognition and search in speech for electronic dictionaries (Multilingvální rozpoznávání a vyhledávání v řeči pro elektronické slovníky)

Duration: 1.9.2009 - 31.8.2013

Funding sources

Ministry of Industry and Trade of the Czech Republic - TIP

On the project

The project aims at the research, development and verification of technologies that make it possible to prototype speech recognition and search systems with only a few hours of transcribed training data, without the need for phonetic or linguistic expertise. These technologies will be verified in the domain of electronic dictionaries.

Keywords
multilinguality, speech recognition, keyword spotting, electronic dictionaries

Project code

FR-TI1/034

Default language

Czech

People responsible

Černocký Jan, prof. Dr. Ing. - principal person responsible
Kubalík Jakub, Ing. - fellow researcher
Tomášek Pavel, Ing. - fellow researcher
Veselý Karel, Ing., Ph.D. - fellow researcher

Units

Department of Computer Graphics and Multimedia
- responsible department (1.1.1989 - not assigned)
Speech Data Mining Research Group BUT Speech@FIT
- internal (17.2.2010 - 31.8.2013)
Department of Computer Graphics and Multimedia
- co-beneficiary (1.9.2009 - 30.8.2013)

Results

POVEY, D.; GHOSHAL, A.; BOULIANNE, G.; BURGET, L.; GLEMBEK, O.; GOEL, N.; HANNEMANN, M.; MOTLÍČEK, P.; QIAN, Y.; SCHWARZ, P.; SILOVSKÝ, J.; STEMMER, G.; VESELÝ, K. The Kaldi Speech Recognition Toolkit. Proceedings of ASRU 2011. Hilton Waikoloa Village Resort, Hawaii: IEEE Signal Processing Society, 2011. p. 1-4. ISBN: 978-1-4673-0366-8.

KARAFIÁT, M.; GRÉZL, F.; EGOROVA, E.; JANDA, M.; ČERNOCKÝ, J.; KAŠPAR, M.: ZB - FR-TI1/034; Prototypování rozpoznávačů řeči pro nové jazyky [Prototyping speech recognizers for new languages]. URL: http://www.fit.vutbr.cz/research/groups/speech/publi/2013/Overena_technologie_2013_Projekt_FR_TI1_034.pdf. (verified technology)

KARAFIÁT, M.; GRÉZL, F.; EGOROVA, E.; JANDA, M.; ČERNOCKÝ, J.: R - MPO TIP FR-TI1/034; Multilingvální modely pro rozpoznávání řeči [Multilingual models for speech recognition]. The product is hosted on the servers of ÚPGM FIT VUT in Brno. URL: https://www.fit.vut.cz/research/product/375/. (software)

SOUFIFAR, M.; BURGET, L.; PLCHOT, O.; CUMANI, S.; ČERNOCKÝ, J. Regularized Subspace n-Gram Model for Phonotactic iVector Extraction. Proceedings of the 14th Annual Conference of the International Speech Communication Association (Interspeech 2013). Lyon: International Speech Communication Association, 2013. p. 74-78. ISBN: 978-1-62993-443-3. ISSN: 2308-457X.

EGOROVA, E.; VESELÝ, K.; KARAFIÁT, M.; JANDA, M.; ČERNOCKÝ, J. Manual and Semi-Automatic Approaches to Building a Multilingual Phoneme Set. In Proceedings of ICASSP 2013. Vancouver: IEEE Signal Processing Society, 2013. p. 7324-7328. ISBN: 978-1-4799-0355-9.

JANDA, M. Automatic Generation Of Pronunciation Dictionaries Based On Diarization. Proceedings of the 19th Conference Student EEICT 2013. Brno: Brno University of Technology, 2013. p. 228-232. ISBN: 978-80-214-4695-3.

TEJEDOR, J.; FAPŠO, M.; SZŐKE, I.; ČERNOCKÝ, J.; GRÉZL, F. Comparison of methods for language-dependent and language-independent query-by-example spoken term detection. ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2012, vol. 2012, no. 30, p. 1-34. ISSN: 1046-8188.

SZŐKE, I.; FAPŠO, M.; VESELÝ, K. BUT2012 přístup pro Spoken Web Search úkol na MediaEval2012 [BUT2012 approach for the Spoken Web Search task at MediaEval2012]. Working Notes Proceedings of the MediaEval 2012 Workshop. CEUR Workshop Proceedings. Pisa: CEUR-WS.org, 2012. p. 1-2. ISSN: 1613-0073.

JANDA, M.; KARAFIÁT, M.; ČERNOCKÝ, J. Dealing with Numbers in Grapheme-Based Speech Recognition. Proceedings of 15th International Conference on Text, Speech and Dialogue. Lecture Notes in Computer Science, Volume 7499. Berlin Heidelberg: Springer Verlag, 2012. p. 438-445. ISBN: 978-3-642-32789-6. ISSN: 0302-9743.

VESELÝ, K.; KARAFIÁT, M.; GRÉZL, F.; JANDA, M.; EGOROVA, E. The Language-Independent Bottleneck Features. Proceedings of IEEE 2012 Workshop on Spoken Language Technology. Miami: IEEE Signal Processing Society, 2012. p. 336-341. ISBN: 978-1-4673-5124-9.

MIKOLOV, T.; KOMBRINK, S.; DEORAS, A.; BURGET, L.; ČERNOCKÝ, J. RNNLM - Recurrent Neural Network Language Modeling Toolkit. Proceedings of ASRU 2011. Hilton Waikoloa Village, Big Island, Hawaii: IEEE Signal Processing Society, 2011. p. 1-4. ISBN: 978-1-4673-0366-8.

PLCHOT, O.; KARAFIÁT, M.; BRUMMER, J.; GLEMBEK, O.; MATĚJKA, P.; DE VILLIERS, E.; ČERNOCKÝ, J. Speaker vectors from Subspace Gaussian Mixture Model as complementary features for Language Identification. In Proceedings of Odyssey 2012, The Speaker and Language Recognition Workshop. Singapore: International Speech Communication Association, 2012. p. 330-333. ISBN: 978-981-07-3093-2.

BRUMMER, J.; CUMANI, S.; GLEMBEK, O.; KARAFIÁT, M.; MATĚJKA, P.; PEŠÁN, J.; PLCHOT, O.; SOUFIFAR, M.; DE VILLIERS, E.; ČERNOCKÝ, J. Description and analysis of the Brno276 system for LRE2011. In Proceedings of Odyssey 2012: The Speaker and Language Recognition Workshop. Singapore: International Speech Communication Association, 2012. p. 216-223. ISBN: 978-981-07-3093-2.

JANDA, M. Grapheme Based Speech Recognition. Proceedings of the 18th Conference STUDENT EEICT 2012. Brno: Brno University of Technology, 2012. p. 441-445. ISBN: 978-80-214-4460-7.

KARAFIÁT, M.; JANDA, M.; ČERNOCKÝ, J.; BURGET, L. Region Dependent Linear Transforms in Multilingual Speech Recognition. In Proc. International Conference on Acoustics, Speech, and Signal Processing 2012. Kyoto: IEEE Signal Processing Society, 2012. p. 4885-4888. ISBN: 978-1-4673-0044-5.

KOMBRINK, S.; MIKOLOV, T.; KARAFIÁT, M.; BURGET, L. Improving Language Models for ASR Using Translated In-domain Data. Proceedings of 2012 IEEE International Conference on Acoustics, Speech and Signal Processing. Kyoto: IEEE Signal Processing Society, 2012. p. 4405-4408. ISBN: 978-1-4673-0044-5.

GRÉZL, F.; KARAFIÁT, M.; JANDA, M. Study of Probabilistic and Bottle-Neck Features in Multilingual Environment. Proceedings of ASRU 2011. Hilton Waikoloa Village, Big Island, Hawaii: IEEE Signal Processing Society, 2011. p. 359-364. ISBN: 978-1-4673-0366-8.

MIKOLOV, T.; DEORAS, A.; POVEY, D.; BURGET, L.; ČERNOCKÝ, J. Strategies for Training Large Scale Neural Network Language Models. Proceedings of ASRU 2011. Hilton Waikoloa Village, Big Island, Hawaii: IEEE Signal Processing Society, 2011. p. 196-201. ISBN: 978-1-4673-0366-8.

VESELÝ, K.; KARAFIÁT, M.; GRÉZL, F. Convolutive Bottleneck Network Features for LVCSR. Proceedings of ASRU 2011. Big Island, Hawaii: IEEE Signal Processing Society, 2011. p. 42-47. ISBN: 978-1-4673-0366-8.

KARAFIÁT, M.; BURGET, L.; MATĚJKA, P.; GLEMBEK, O.; ČERNOCKÝ, J. iVector-Based Discriminative Adaptation for Automatic Speech Recognition. Proceedings of ASRU 2011. Hilton Waikoloa Village, Big Island, Hawaii: IEEE Signal Processing Society, 2011. p. 152-157. ISBN: 978-1-4673-0366-8.