Project detail

Voice technologies for support of information society

Duration: 01.01.2002 — 31.12.2004

Funding resources

Czech Science Foundation - Standard projects

- sole funder (2002-01-01 - 2004-12-31)

On the project

Description in Czech
Hlasové technologie v podpoře informační společnosti

Description in English
Voice technologies for support of information society

Key words in Czech
zpracování řeči, rozpoznávání, kódování

Key words in English
speech processing, recognition, coding





People responsible

Burget Lukáš, doc. Ing., Ph.D. - fellow researcher
Grézl František, Ing., Ph.D. - fellow researcher
Karafiát Martin, Ing., Ph.D. - fellow researcher
Motlíček Petr, doc. Ing., Ph.D. - fellow researcher
Schwarz Petr, Ing., Ph.D. - fellow researcher
Černocký Jan, prof. Dr. Ing. - principal person responsible


Department of Computer Graphics and Multimedia
- beneficiary (2002-01-01 - 2004-12-31)


MATĚJKA, P., SCHWARZ, P., ČERNOCKÝ, J., HEŘMANSKÝ, H. Phoneme Recognition using Temporal Patterns. In Proceedings of the conference TSD'2003. International Conference on Text, Speech and Dialogue, TSD 2003. 2003. p. 198. ISBN: 3-540-20024-X.

SCHWARZ, P., MATĚJKA, P., ČERNOCKÝ, J. Recognition Of Phoneme Strings Using TRAP Technique. In 8th European conference on speech communication and technology EUROSPEECH'03. Geneva, Switzerland: ISCA, 2003. p. 825.

VONDRA, M., VÍCH, R. Design of FIR Vocal Tract Models with Linear and Nonlinear Phase. In Proceedings of the 12th Czech-German Workshop SPEECH PROCESSING. URE AV CR Praha: Institute of Radio Engineering and Electronics, Academy of Sciences of the Czech Republic, Prague, 2002. p. 31. ISBN: 80-86269-09-4.

MATĚJKA, P.; SCHWARZ, P.; KARAFIÁT, M.; ČERNOCKÝ, J. Some like it Gaussian... Proc. 5th International Conference Text, Speech and Dialogue, TSD2002. Lecture notes in artificial intelligence 2448. Berlin: Springer Verlag, 2002. p. 321-324. ISBN: 3-540-44129-8.

SCHWARZ, P.; ČERNOCKÝ, J. Keyword detection in Czech fluent speech. Proc. 12th International scientific conference Radioelektronika 2002. Bratislava: Slovak University of Technology in Bratislava, 2002. p. 1-4. ISBN: 80-227-1700-2.

KARAFIÁT, M.; ČERNOCKÝ, J. Context dependent Hidden Markov models in recognition of Czech. Proc. 12th International scientific conference Radioelektronika 2002. Bratislava: Slovak University of Technology in Bratislava, 2002. p. 0-0. ISBN: 80-227-1700-2.

GRÉZL, F.; BURGET, L.; JAIN, P.; ČERNOCKÝ, J. Improving TRAPS features using LDA. Proc. 12th International scientific conference Radioelektronika 2002. Bratislava: Slovak University of Technology in Bratislava, 2002. p. 0-0. ISBN: 80-227-1700-2.

SCHWARZ, P. Modifications of Viterbi algorithms for keyword detection. Proceedings of 8th Conference STUDENT EEICT 2002. Brno: Faculty of Electrical Engineering and Communication BUT, 2002. p. 0-0. ISBN: 80-214-2116-9.

MOTLÍČEK, P.; BURGET, L. Noise estimation for efficient speech enhancement and robust speech recognition. Proc. 7th International Conference on Spoken Language Processing. Denver: International Speech Communication Association, 2002. p. 1033-1036. ISBN: 1-876346-42-6.

MOTLÍČEK, P. Application of Mel-scale Filter bank for Noise Estimation in Speech Processing. 12th International Czech-Slovak Scientific conference Radioelektronika 2002. Bratislava: Slovak University of Technology in Bratislava, 2002. p. 1-4. ISBN: 80-227-1700-2.

MOTLÍČEK, P.; BURGET, L. Efficient Noise Estimation and its Application for Robust Speech Recognition. 5th International Conference, TSD 2002 Brno, Czech Republic, September 2002 Proceedings. Berlin: Springer Verlag, 2002. p. 229-236. ISBN: 3-540-44129-8.

MOTLÍČEK, P. Noise Estimation for Spectral Subtraction in Speech Processing. Proceedings of 8th Conference STUDENT EEICT 2002. Brno: Faculty of Electrical Engineering and Communication BUT, 2002. p. 0-0. ISBN: 80-214-2116-9.

KARAFIÁT, M.; ČERNOCKÝ, J. Differences between context dependent and context independent Hidden Markov Models for recognition of Czech. Proc. of 8th student conference STUDENT EEICT 2002. Brno: Faculty of Electrical Engineering TUB, 2002. p. 328-332. ISBN: 80-214-2116-9.

MATĚJKA, P.; ČERNOCKÝ, J. Feature gaussianization in speech recognition. Proc. 12th International scientific conference Radioelektronika 2002. Bratislava: Slovak University of Technology in Bratislava, 2002. p. 0-0. ISBN: 80-227-1700-2.

GRÉZL, F. Classifiers in speech recognition systems based on TRAPS. Proceedings of 8th Conference STUDENT EEICT 2002. Brno: Faculty of Electrical Engineering and Communication BUT, 2002. p. 74-77. ISBN: 80-214-2116-9.

ČERNOCKÝ, J. Units for automatic language independent speech processing. Proc. LREC 2002 - workshop on Portability issues in human language technologies. Las Palmas: European Language Resources Association, 2002. p. 7-13.

BURGET, L.; DUPONT, S.; GARUDADRI, H.; GRÉZL, F.; HEŘMANSKÝ, H.; JAIN, P.; KAJAREKAR, S.; MORGAN, N. QUALCOMM-ICSI-OGI Features for ASR. Proc. 7th International Conference on Spoken Language Processing. Denver: International Speech Communication Association, 2002. p. 4-7. ISBN: 1-876346-42-6.

MOTLÍČEK, P. Modeling of Spectra and Temporal Trajectories in Speech Processing. Sborník příspěvků a prezentací akce Odborné semináře 2003 [Proceedings of contributions and presentations of the Expert Seminars 2003]. REL02V. Brno: Department of Radioelectronics FEEC BUT, 2003. p. 0-0.

MATĚJKA, P., SZŐKE, I., SCHWARZ, P., ČERNOCKÝ, J. Automatic Language Identification using Phoneme and Automatically Derived Unit Strings. In Proceedings of 7th International Conference Text, Speech and Dialogue 2004. Brno: Springer, 2004. p. 147. ISBN: 3-540-23049-1.

SCHWARZ, P., MATĚJKA, P., ČERNOCKÝ, J. Towards Lower Error Rates in Phoneme Recognition. In Proceedings of 7th International Conference Text, Speech and Dialogue 2004. Brno: Springer, 2004. p. 465. ISBN: 3-540-23049-1.

VONDRA, M. Voice Transformation in Parametric Speech Synthesis. In Speech Processing. Praha: Institute of Radio Engineering and Electronics, Academy of Sciences of the Czech Republic, 2003. p. 35. ISBN: 80-86269-10-8.

MATĚJKA, P. Review of Automatic Language Identification. Proceedings of 10th Conference and Competition STUDENT EEICT 2004 Volume 2. Brno: 2004. p. 344-348. ISBN: 80-214-2635-7.

MATĚJKA, P. Review of Automatic Language Identification. In Proceedings of 10th Conference and Competition STUDENT EEICT 2004 Volume 2. Brno, Czech Republic: FIT BUT & FEEC BUT, 2004. p. 344. ISBN: 80-214-2635-7.

MOTLÍČEK, P. Derivation of TRAPs in Auditory Domain. Proceedings of 9th Conference and Competition STUDENT EEICT 2003. Brno: Dean Office of FEEC BUT, 2003. p. 598-602. ISBN: 80-214-2379-X.

JENDERKA, P.; VÍCHA, T. Voice Activity Detection in Multimodal Meeting Manager. Proceedings of 9th Conference and Competition STUDENT EEICT 2003 Volume 3. Brno: Faculty of Electrical Engineering and Communication BUT, 2003. p. 588-592. ISBN: 80-214-2379-X.

SCHWARZ, P.; MATĚJKA, P.; ČERNOCKÝ, J. Recognition of Phoneme Strings using TRAP Technique. Proceedings of 8th International Conference Eurospeech. European Conference EUROSPEECH. Geneva: International Speech Communication Association, 2003. p. 1-4. ISSN: 1018-4074.

MOTLÍČEK, P. Derivation of TRAPs in Auditory Domain. Proceedings of the International Conference and Competition. Brno: Faculty of Electrical Engineering and Communication BUT, 2003. p. 315-319. ISBN: 80-214-2401-X.

MOTLÍČEK, P.; ČERNOCKÝ, J. Time-domain based Temporal Processing with Application of. Proc. EUROSPEECH 2003. European Conference EUROSPEECH. Geneva: Institute for Perceptual Artificial Intelligence, 2003. p. 821-824. ISSN: 1018-4074.

MOTLÍČEK, P.; ČERNOCKÝ, J. Autoregressive Modeling based Feature Extraction for Aurora3 DSR Task. Proc. EUROSPEECH 2003. European Conference EUROSPEECH. Geneva: Institute for Perceptual Artificial Intelligence, 2003. p. 1801-1804. ISSN: 1018-4074.

MOTLÍČEK, P.; ČERNOCKÝ, J. All-Pole Modeling for Definition of Speech Features in Aurora3 DSR Task. 6th International Conference, TSD 2003 České Budějovice, Czech Republic, September 2003 Proceedings. Lecture Notes in Computer Science. České Budějovice: University of West Bohemia in Pilsen, 2003. p. 295-300. ISBN: 3-540-20024-X. ISSN: 0302-9743.

SCHWARZ, P. Would You Like To Make Your Programs Understand Human Voice? Proceedings of 9th Conference STUDENT EEICT 2003. Brno: Faculty of Electrical Engineering and Communication BUT, 2003. p. 231-235. ISBN: 80-214-2379-X.

MATĚJKA, P.; SCHWARZ, P.; HEŘMANSKÝ, H.; ČERNOCKÝ, J. Phoneme Recognition using Temporal Patterns. Proc. 6th International Conference Text, Speech and Dialogue, TSD2003. České Budějovice: Springer Verlag, 2003. p. 465-472. ISBN: 3-540-20024-X.

MATĚJKA, P.; SCHWARZ, P.; GRÉZL, F.; ČERNOCKÝ, J. Phoneme Classification using Temporal Patterns. Proc. 13th International scientific conference Radioelektronika 2003. Brno: Faculty of Electrical Engineering and Communication BUT, 2003. p. 1-4. ISBN: 80-214-2383-8.

GRÉZL, F. Local time-frequency operators in TRAPs for speech recognition. 6th International Conference, TSD 2003 České Budějovice, Czech Republic, September 2003 Proceedings. Lecture Notes in Computer Science. České Budějovice: University of West Bohemia in Pilsen, 2003. p. 269-274. ISBN: 3-540-20024-X. ISSN: 0302-9743.

GRÉZL, F. Effect of normalization on TRAP based systems in ASR. Proc. 13th International scientific conference Radioelektronika 2003. Brno: Department of Radioelectronics FEEC BUT, 2003. p. 128-131. ISBN: 80-214-2383-8.

SCHWARZ, P.; HEŘMANSKÝ, H.; MATĚJKA, P. Použití časové dynamiky k rozpoznávání jazyků z mluvené řeči [Using temporal dynamics to recognize languages from spoken speech]. Proceedings of Language Recognition Workshop 2003. NIST Gaithersburg, MD USA: 2003. p. 56-62.

VONDRA, M. Voice Conversion Based on Nonlinear Spectrum Transformation. In SPEECH PROCESSING. Czech Republic: Radio Engineering and Electronics AS CR, 2004. p. 53. ISBN: 80-86269-11-6.

GRÉZL, F. Combinations of TRAP-based systems. Proc. Seventh International conference on Text, Speech and Dialogue. Brno: Faculty of Informatics MU, 2004. p. 323-330. ISBN: 3-540-23049-1.

SZŐKE, I. Speech units automatically generated by ergodic hidden Markov model. Proceedings of 10th Conference and Competition STUDENT EEICT 2004. Brno: Faculty of Electrical Engineering and Communication BUT, 2004. p. 1-5.

MATĚJKA, P.; SZŐKE, I.; SCHWARZ, P.; ČERNOCKÝ, J. Automatic Language Identification using Phoneme and Automatically Derived Unit Strings. Proceedings of 7th International Conference Text, Speech and Dialogue 2004. Brno: Springer Verlag, 2004. p. 147-154. ISBN: 3-540-23049-1.

MATĚJKA, P.; ČERNOCKÝ, J.; SIGMUND, M. Introduction to Automatic Language Identification. Conference Proceedings of Radioelektronika 2004. Brno: Slovak University of Technology in Bratislava, 2004. p. 112-115. ISBN: 80-227-2017-8.

BURGET, L. Combination of Speech Features Using Smoothed Heteroscedastic Linear Discriminant Analysis. Proc. 8th International Conference on Spoken Language Processing. Jeju island: Sunjin Printing Co, 2004. p. 2549-2552.

MOTLÍČEK, P.; ČERNOCKÝ, J. Multimodal Phoneme Recognition of Meeting Data. 7th International Conference, TSD 2004 Brno, Czech Republic, September 2004 Proceedings. Lecture Notes in Computer Science. Brno: Springer Verlag, 2004. p. 379-384. ISBN: 3-540-23049-1. ISSN: 0302-9743.

BURGET, L. Measurement of Complementarity of Recognition Systems. Proc. Seventh International conference on Text, Speech and Dialogue. Lecture Notes in Artificial Intelligence (LNAI) subseries of LNCS series as Volume 3206. Brno: Springer Verlag, 2004. p. 283-290. ISBN: 3-540-23049-1.

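MOTLÍČEK, P. Segmentace nahrávek živých jednání podle mluvčího [Speaker-based segmentation of recordings of live meetings]. Sborník příspěvků a prezentací akce Odborné semináře 2004 [Proceedings of contributions and presentations of the Expert Seminars 2004]. REL03V. Brno: Department of Radio Electronics FEEC BUT, 2004. p. 0-0.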

SCHWARZ, P.; MATĚJKA, P.; ČERNOCKÝ, J. Towards Lower Error Rates in Phoneme Recognition. Proceedings of 7th International Conference Text, Speech and Dialogue 2004. Brno: Springer Verlag, 2004. p. 465-472. ISBN: 3-540-23049-1.

SCHWARZ, P.; MATĚJKA, P.; ČERNOCKÝ, J. Phoneme Recognition from a Long Temporal Context. Poster at JOINT AMI/PASCAL/IM2/M4 Workshop on Multimodal Interaction and Related Machine Learning Algorithms. Martigny: Institute for Perceptual Artificial Intelligence, 2004. p. 1 (1 p.).

FOUSEK, P.; SVOJANOVSKÝ, P.; GRÉZL, F.; HEŘMANSKÝ, H. New Nonsense Syllables Database - Analyses and Preliminary ASR Experiments. Proc. 8th International Conference on Spoken Language Processing. Jeju Island: Sunjin Printing Co, 2004. p. 348-351. ISSN: 1225-4111.

ČERNOCKÝ, J.; LAMPA, P. Teaching signals - making it automatic, making it fun. Proc. Radioelektronika 2005. Brno: Faculty of Electrical Engineering and Communication BUT, 2005. p. 318-321. ISBN: 80-214-2904-6.

BURGET, L.; ČERNOCKÝ, J. Recognition of Speech with Non-random Attributes. 6th International Conference, TSD 2003 České Budějovice, Czech Republic, September 2003 Proceedings. Lecture Notes in Computer Science. České Budějovice: Springer Verlag, 2003. p. 1-6. ISBN: 3-540-20024-X. ISSN: 0302-9743.

FAPŠO, M.; SCHWARZ, P.; SZŐKE, I.; SMRŽ, P.; SCHWARZ, M.; ČERNOCKÝ, J.; KARAFIÁT, M.; BURGET, L. Search Engine for Information Retrieval from Speech Records. Proceedings of the Third International Seminar on Computer Treatment of Slavic and East European Languages. Bratislava: 2006. p. 100-101.

MATĚJKA, P.; ČERNOCKÝ, J.; SIGMUND, M. Introduction to Automatic Languages Identification. In Proceedings of the 14th international Czech-Slovak scientific conference RADIOELEKTRONIKA 2004. Bratislava: 2004. p. 112. ISBN: 80-227-2017-8.

BAUDOIN, G.; CAPMAN, F.; ČERNOCKÝ, J.; EL CHAMI, F.; CHARBIT, M.; CHOLLET, G.; PETROVSKA-DELACRETAZ, D. Advances in very low bit-rate speech coding using recognition and synthesis techniques. Lecture Notes in Computer Science, 2002, vol. 2002, no. 2448, p. 269-276. ISSN: 0302-9743.

BURGET, L.; MOTLÍČEK, P.; GRÉZL, F.; JAIN, P. Distributed speech recognition. Radioengineering, 2002, vol. 2002, no. 4, p. 12-16. ISSN: 1210-2512.

KARAFIÁT, M.; GRÉZL, F. Using MATLAB for Analysis of TRAP system. Radioengineering, 2003, vol. 2003, no. 4, p. 38-41. ISSN: 1210-2512.

VONDRA, M., VÍCH, R. Speech Identity Conversion. Lecture Notes in Computer Science, 2005, vol. 2005, no. 3445, p. 421. ISSN: 0302-9743.

MATĚJKA, P., SZŐKE, I., SCHWARZ, P., ČERNOCKÝ, J. Automatic Language Identification using Phoneme and Automatically Derived Unit Strings. Lecture Notes in Computer Science, 2004, vol. 2004, no. 3206, p. 147. ISSN: 0302-9743.

SCHWARZ, P., MATĚJKA, P., ČERNOCKÝ, J. Towards Lower Error Rates in Phoneme Recognition. Lecture Notes in Computer Science, 2004, vol. 2004, no. 3206, p. 465. ISSN: 0302-9743.

MATĚJKA, P.; SZŐKE, I.; SCHWARZ, P.; ČERNOCKÝ, J. Automatic Language Identification using Phoneme and Automatically Derived Unit Strings. Lecture Notes in Computer Science, 2004, vol. 2004, no. 3206, p. 147-154. ISSN: 0302-9743.

SCHWARZ, P.; MATĚJKA, P.; ČERNOCKÝ, J. Towards Lower Error Rates in Phoneme Recognition. Lecture Notes in Computer Science, 2004, vol. 2004, no. 3206, p. 465-472. ISSN: 0302-9743.

MOTLÍČEK, P.; ČERNOCKÝ, J. Multimodal Phoneme Recognition of Meeting Data. Lecture Notes in Computer Science, 2004, vol. 2004, no. 3206, p. 379-384. ISSN: 0302-9743.


ČERNOCKÝ, J. Temporal processing for feature extraction in speech recognition. In Vědecké spisy VUT. Edice Habilitační a inaugurační spisy, sv. 112. Brno: Publishing house of Brno University of Technology VUTIUM, 2003. p. 1-30. ISBN: 80-214-2395-1.

GARUDADRI, H.; HEŘMANSKÝ, H.; MORGAN, N.; BENITEZ, C.; BURGET, L.; KAJAREKAR, S.; GRÉZL, F.; JAIN, P.; MOTLÍČEK, P. Distributed Voice Recognition System Utilizing Multistream Network Feature Processing. San Diego: Qualcomm, 2002. p. 0-0.

MOTLÍČEK, P. Feature Extraction in Speech Coding and Recognition. Portland: Oregon Graduate Institute of Science and Technology, 2002. p. 1-50.

MOTLÍČEK, P. Visual Feature Extraction for Phoneme Recognition of Meetings. Brno: Department of Computer Graphics and Multimedia FIT BUT, 2004. p. 0-0.

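MOTLÍČEK, P. Modelování spektra a časových trajektorií v rozpoznávání řeči [Modeling of spectra and temporal trajectories in speech recognition]. GACR 102/02/0124 "Hlasové technologie v podpoře informační společnosti", souhrnný přehled aktivit řešitelských kolektivů [GACR 102/02/0124 "Voice technologies for support of information society", summary overview of the research teams' activities]. Praha: 2004. p. 0-0. ISBN: 80-01-02957-3.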

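SZŐKE, I.; MOTLÍČEK, P. Kódování řeči na velmi nízkých bitových rychlostech [Speech coding at very low bit rates]. GACR 102/02/0124 "Hlasové technologie v podpoře informační společnosti", souhrnný přehled aktivit řešitelských kolektivů [GACR 102/02/0124 "Voice technologies for support of information society", summary overview of the research teams' activities]. Praha: Faculty of Electrical Engineering, Czech Technical University (ČVUT), 2004. p. 0-0. ISBN: 80-01-02957-3.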


KARAFIÁT, M.; GRÉZL, F.; BURGET, L. Combination of MFCC and TRAP features for LVCSR of meeting data. Martigny: 2004. p. 0-0.

SCHWARZ, P.; MATĚJKA, P. Phoneme Recognition from a Long Temporal Context. Martigny: 2004. p. 0 (1 p.).

MATĚJKA, P., ČERNOCKÝ, J., SIGMUND, M. Introduction to Automatic Language Identification. Conference Proceedings of Radioelektronika 2004. Bratislava, Slovak Republic: Institute of Radio Electronics, Slovak Technical University in Bratislava, 2004. p. 112. ISBN: 80-227-2017-8.

SCHWARZ, P., MATĚJKA, P., ČERNOCKÝ, J. Phoneme Recognition. AMI Workshop. 2004. p. 1.

MOTLÍČEK, P. Modeling of Spectra and Temporal Trajectories in Speech Processing, PhD thesis. Brno: Faculty of Information Technology BUT, 2003. p. 1-138.

ČERNOCKÝ, J. Temporal processing for feature extraction in speech recognition, habilitation thesis. Brno: 2002. p. 0-0.

SCHWARZ, P.; MATĚJKA, P.; BURGET, L.; GLEMBEK, O.: VUT-SW-Search; Phoneme recognizer based on long temporal context. (software)