Project detail

Voice technologies for support of information society (Hlasové technologie v podpoře informační společnosti)

Project period: 01.01.2002 - 31.12.2004

Funding sources

Grant Agency of the Czech Republic (Grantová agentura České republiky) - Standard projects

- full funder (2002-01-01 - 2004-12-31)

About the project

Voice technologies for support of information society

Keywords
speech processing, recognition, coding



Department of Computer Graphics and Multimedia
- recipient (01.01.2002 - 31.12.2004)


MATĚJKA, P.; SCHWARZ, P.; ČERNOCKÝ, J.; HEŘMANSKÝ, H. Phoneme Recognition using Temporal Patterns. In Proceedings of the conference TSD'2003. International Conference on Text Speech and Dialogue, TSD 2003. 2003. p. 198. ISBN: 3-540-20024-X.

SCHWARZ, P.; MATĚJKA, P.; ČERNOCKÝ, J. Recognition Of Phoneme Strings Using TRAP Technique. In 8th European conference on speech communication and technology EUROSPEECH'03. Geneva, Switzerland: ISCA, 2003. p. 825.

VONDRA, M.; VÍCH, R. Design of FIR Vocal Tract Models with Linear and Nonlinear Phase. In Proceedings of the 12th Czech-German Workshop SPEECH PROCESSING. URE AV CR Praha: Institute of Radio Engineering and Electronics, Academy of Sciences of the Czech Republic, Prague, 2002. p. 31. ISBN: 80-86269-09-4.

MATĚJKA, P.; SCHWARZ, P.; KARAFIÁT, M.; ČERNOCKÝ, J. Some like it Gaussian... Proc. 5th International Conference Text, Speech and Dialogue, TSD2002. Lecture notes in artificial intelligence 2448. Berlin: Springer Verlag, 2002. p. 321-324. ISBN: 3-540-44129-8.

SCHWARZ, P.; ČERNOCKÝ, J. Keyword detection in Czech fluent speech. Proc. 12th International scientific conference Radioelektronika 2002. Bratislava: Slovak University of Technology in Bratislava, 2002. p. 1-4. ISBN: 80-227-1700-2.

KARAFIÁT, M.; ČERNOCKÝ, J. Context dependent Hidden Markov models in recognition of Czech. Proc. 12th International scientific conference Radioelektronika 2002. Bratislava: Slovak University of Technology in Bratislava, 2002. p. 0-0. ISBN: 80-227-1700-2.

GRÉZL, F.; BURGET, L.; JAIN, P.; ČERNOCKÝ, J. Improving TRAPS features using LDA. Proc. 12th International scientific conference Radioelektronika 2002. Bratislava: Slovak University of Technology in Bratislava, 2002. p. 0-0. ISBN: 80-227-1700-2.

SCHWARZ, P. Modifications of Viterbi algorithms for keyword detection. Proceedings of 8th Conference STUDENT EEICT 2002. Brno: Faculty of Electrical Engineering and Communication BUT, 2002. p. 0-0. ISBN: 80-214-2116-9.

MOTLÍČEK, P.; BURGET, L. Noise estimation for efficient speech enhancement and robust speech recognition. Proc. 7th International Conference on Spoken Language Processing. Denver: International Speech Communication Association, 2002. p. 1033-1036. ISBN: 1-876346-42-6.

MOTLÍČEK, P. Application of Mel-scale Filter bank for Noise Estimation in Speech Processing. 12th International Czech-Slovak Scientific conference Radioelektronika 2002. Bratislava: Slovak University of Technology in Bratislava, 2002. p. 1-4. ISBN: 80-227-1700-2.

MOTLÍČEK, P.; BURGET, L. Efficient Noise Estimation and its Application for Robust Speech Recognition. 5th International Conference, TSD 2002 Brno, Czech Republic, September 2002 Proceedings. Berlin: Springer Verlag, 2002. p. 229-236. ISBN: 3-540-44129-8.

MOTLÍČEK, P. Noise Estimation for Spectral Subtraction in Speech Processing. Proceedings of 8th Conference STUDENT EEICT 2002. Brno: Faculty of Electrical Engineering and Communication BUT, 2002. p. 0-0. ISBN: 80-214-2116-9.

KARAFIÁT, M.; ČERNOCKÝ, J. Differences between context dependent and context independent Hidden Markov Models for recognition of Czech. Proc. of 8th student conference STUDENT EEICT 2002. Brno: Faculty of Electrical Engineering TUB, 2002. p. 328-332. ISBN: 80-214-2116-9.

MATĚJKA, P.; ČERNOCKÝ, J. Feature gaussianization in speech recognition. Proc. 12th International scientific conference Radioelektronika 2002. Bratislava: Slovak University of Technology in Bratislava, 2002. p. 0-0. ISBN: 80-227-1700-2.

GRÉZL, F. Classifiers in speech recognition systems based on TRAPS. Proceedings of 8th Conference STUDENT EEICT 2002. Brno: Faculty of Electrical Engineering and Communication BUT, 2002. p. 74-77. ISBN: 80-214-2116-9.

ČERNOCKÝ, J. Units for automatic language independent speech processing. Proc. LREC 2002 - workshop on Portability issues in human language technologies. Las Palmas: European Language Resources Association, 2002. p. 7-13.

BURGET, L.; DUPONT, S.; GARUDADRI, H.; GRÉZL, F.; HEŘMANSKÝ, H.; JAIN, P.; KAJAREKAR, S.; MORGAN, N. QUALCOMM-ICSI-OGI Features for ASR. Proc. 7th International Conference on Spoken Language Processing. Denver: International Speech Communication Association, 2002. p. 4-7. ISBN: 1-876346-42-6.

MOTLÍČEK, P. Modeling of Spectra and Temporal Trajectories in Speech Processing. Sborník příspěvků a prezentací akce Odborné semináře 2003. REL02V. Brno: Department of Radioelectronics FEEC BUT, 2003. p. 0-0.

VONDRA, M. Voice Transformation in Parametric Speech Synthesis. In Speech Processing. Praha: Institute of Radio Engineering and Electronics, Academy of Sciences of the Czech Republic, 2003. p. 35. ISBN: 80-86269-10-8.

MATĚJKA, P. Review of Automatic Language Identification. In Proceedings of 10th Conference and Competition STUDENT EEICT 2004 Volume 2. Brno, Czech Republic: FIT BUT & FEEC BUT, 2004. p. 344-348. ISBN: 80-214-2635-7.

MOTLÍČEK, P. Derivation of TRAPs in Auditory Domain. Proceedings of 9th Conference and Competition STUDENT EEICT 2003. Brno: Dean Office of FEEC BUT, 2003. p. 598-602. ISBN: 80-214-2379-X.

JENDERKA, P.; VÍCHA, T. Voice Activity Detection in Multimodal Meeting Manager. Proceedings of 9th Conference and Competition STUDENT EEICT 2003 Volume 3. Brno: Faculty of Electrical Engineering and Communication BUT, 2003. p. 588-592. ISBN: 80-214-2379-X.

SCHWARZ, P.; MATĚJKA, P.; ČERNOCKÝ, J. Recognition of Phoneme Strings using TRAP Technique. Proceedings of 8th International Conference Eurospeech. European Conference EUROSPEECH. Geneva: International Speech Communication Association, 2003. p. 1-4. ISSN: 1018-4074.

MOTLÍČEK, P. Derivation of TRAPs in Auditory Domain. Proceedings of the International Conference and Competition. Brno: Faculty of Electrical Engineering and Communication BUT, 2003. p. 315-319. ISBN: 80-214-2401-X.

MOTLÍČEK, P.; ČERNOCKÝ, J. Time-domain based Temporal Processing with Application of. Proc. EUROSPEECH 2003. European Conference EUROSPEECH. Geneva: Institute for Perceptual Artificial Intelligence, 2003. p. 821-824. ISSN: 1018-4074.

MOTLÍČEK, P.; ČERNOCKÝ, J. Autoregressive Modeling based Feature Extraction for Aurora3 DSR Task. Proc. EUROSPEECH 2003. European Conference EUROSPEECH. Geneva: Institute for Perceptual Artificial Intelligence, 2003. p. 1801-1804. ISSN: 1018-4074.

MOTLÍČEK, P.; ČERNOCKÝ, J. All-Pole Modeling for Definition of Speech Features in Aurora3 DSR Task. 6th International Conference, TSD 2003 České Budějovice, Czech Republic, September 2003 Proceedings. Lecture Notes in Computer Science. České Budějovice: University of West Bohemia in Pilsen, 2003. p. 295-300. ISBN: 3-540-20024-X. ISSN: 0302-9743.

SCHWARZ, P. Would You Like To Make Your Programs Understand Human Voice? Proceedings of 9th Conference STUDENT EEICT 2003. Brno: Faculty of Electrical Engineering and Communication BUT, 2003. p. 231-235. ISBN: 80-214-2379-X.

MATĚJKA, P.; SCHWARZ, P.; HEŘMANSKÝ, H.; ČERNOCKÝ, J. Phoneme Recognition using Temporal Patterns. Proc. 6th International Conference Text, Speech and Dialogue, TSD2003. České Budějovice: Springer Verlag, 2003. p. 465-472. ISBN: 3-540-20024-X.

MATĚJKA, P.; SCHWARZ, P.; GRÉZL, F.; ČERNOCKÝ, J. Phoneme Classification using Temporal Patterns. Proc. 13th International scientific conference Radioelektronika 2003. Brno: Faculty of Electrical Engineering and Communication BUT, 2003. p. 1-4. ISBN: 80-214-2383-8.

GRÉZL, F. Local time-frequency operators in TRAPs for speech recognition. 6th International Conference, TSD 2003 České Budějovice, Czech Republic, September 2003 Proceedings. Lecture Notes in Computer Science. České Budějovice: University of West Bohemia in Pilsen, 2003. p. 269-274. ISBN: 3-540-20024-X. ISSN: 0302-9743.

GRÉZL, F. Effect of normalization on TRAP based systems in ASR. Proc. 13th International scientific conference Radioelektronika 2003. Brno: Department of Radioelectronics FEEC BUT, 2003. p. 128-131. ISBN: 80-214-2383-8.

SCHWARZ, P.; HEŘMANSKÝ, H.; MATĚJKA, P. Použití časové dynamiky k rozpoznávání jazyků z mluvené řeči [Using temporal dynamics for language recognition from speech]. Proceedings of Language Recognition Workshop 2003. NIST Gaithersburg, MD USA: 2003. p. 56-62.

VONDRA, M. Voice Conversion Based on Nonlinear Spectrum Transformation. In SPEECH PROCESSING. Czech Republic: Radio Engineering and Electronics AS CR, 2004. p. 53. ISBN: 80-86269-11-6.

GRÉZL, F. Combinations of TRAP-based systems. Proc. Seventh International conference on Text, Speech and Dialogue. Brno: Faculty of Informatics MU, 2004. p. 323-330. ISBN: 3-540-23049-1.

SZŐKE, I. Speech units automatically generated by ergodic hidden Markov model. Proceedings of 10th Conference and Competition STUDENT EEICT 2004. Brno: Faculty of Electrical Engineering and Communication BUT, 2004. p. 1-5.

MATĚJKA, P.; SZŐKE, I.; SCHWARZ, P.; ČERNOCKÝ, J. Automatic Language Identification using Phoneme and Automatically Derived Unit Strings. Proceedings of 7th International Conference Text, Speech and Dialogue 2004. Brno: Springer Verlag, 2004. p. 147-154. ISBN: 3-540-23049-1.

MATĚJKA, P.; ČERNOCKÝ, J.; SIGMUND, M. Introduction to Automatic Language Identification. Conference Proceedings of Radioelektronika 2004. Bratislava: Slovak University of Technology in Bratislava, 2004. p. 112-115. ISBN: 80-227-2017-8.

BURGET, L. Combination of Speech Features Using Smoothed Heteroscedastic Linear Discriminant Analysis. Proc. 8th International Conference on Spoken Language Processing. Jeju island: Sunjin Printing Co, 2004. p. 2549-2552.

MOTLÍČEK, P.; ČERNOCKÝ, J. Multimodal Phoneme Recognition of Meeting Data. 7th International Conference, TSD 2004 Brno, Czech Republic, September 2004 Proceedings. Lecture Notes in Computer Science. Brno: Springer Verlag, 2004. p. 379-384. ISBN: 3-540-23049-1. ISSN: 0302-9743.

BURGET, L. Measurement of Complementarity of Recognition Systems. Proc. Seventh International conference on Text, Speech and Dialogue. Lecture Notes in Artificial Intelligence (LNAI) subseries of LNCS series as Volume 3206. Brno: Springer Verlag, 2004. p. 283-290. ISBN: 3-540-23049-1.

MOTLÍČEK, P. Segmentace nahrávek živých jednání podle mluvčího [Speaker-based segmentation of live meeting recordings]. Sborník příspěvků a prezentací akce Odborné semináře 2004. REL03V. Brno: Department of Radioelectronics FEEC BUT, 2004. p. 0-0.

SCHWARZ, P.; MATĚJKA, P.; ČERNOCKÝ, J. Towards Lower Error Rates in Phoneme Recognition. Proceedings of 7th International Conference Text, Speech and Dialogue 2004. Brno: Springer Verlag, 2004. p. 465-472. ISBN: 3-540-23049-1.

SCHWARZ, P.; MATĚJKA, P.; ČERNOCKÝ, J. Phoneme Recognition from a Long Temporal Context. Poster at JOINT AMI/PASCAL/IM2/M4 Workshop on Multimodal Interaction and Related Machine Learning Algorithms. Martigny: Institute for Perceptual Artificial Intelligence, 2004. p. 1 (1 p.).

FOUSEK, P.; SVOJANOVSKÝ, P.; GRÉZL, F.; HEŘMANSKÝ, H. New Nonsense Syllables Database - Analyses and Preliminary ASR Experiments. Proc. 8th International Conference on Spoken Language Processing. 8th International Conference on Spoken Language Processing. Jeju Island: Sunjin Printing Co, 2004. p. 348-351. ISSN: 1225-4111.

ČERNOCKÝ, J.; LAMPA, P. Teaching signals - making it automatic, making it fun. Proc. Radioelektronika 2005. Brno: Faculty of Electrical Engineering and Communication BUT, 2005. p. 318-321. ISBN: 80-214-2904-6.

BURGET, L.; ČERNOCKÝ, J. Recognition of Speech with Non-random Attributes. 6th International Conference, TSD 2003 České Budějovice, Czech Republic, September 2003 Proceedings. Lecture Notes in Computer Science. České Budějovice: Springer Verlag, 2003. p. 1-6. ISBN: 3-540-20024-X. ISSN: 0302-9743.

FAPŠO, M.; SCHWARZ, P.; SZŐKE, I.; SMRŽ, P.; SCHWARZ, M.; ČERNOCKÝ, J.; KARAFIÁT, M.; BURGET, L. Search Engine for Information Retrieval from Speech Records. Proceedings of the Third International Seminar on Computer Treatment of Slavic and East European Languages. Bratislava: 2006. p. 100-101.

BAUDOIN, G.; CAPMAN, F.; ČERNOCKÝ, J.; EL CHAMI, F.; CHARBIT, M.; CHOLLET, G.; PETROVSKA-DELACRETAZ, D. Advances in very low bit-rate speech coding using recognition and synthesis techniques. Lecture Notes in Computer Science, 2002, vol. 2002, no. 2448, p. 269-276. ISSN: 0302-9743.

BURGET, L.; MOTLÍČEK, P.; GRÉZL, F.; JAIN, P. Distributed speech recognition. Radioengineering, 2002, vol. 2002, no. 4, p. 12-16. ISSN: 1210-2512.

KARAFIÁT, M.; GRÉZL, F. Using MATLAB for Analysis of TRAP system. Radioengineering, 2003, vol. 2003, no. 4, p. 38-41. ISSN: 1210-2512.

VONDRA, M.; VÍCH, R. Speech Identity Conversion. Lecture Notes in Computer Science, 2005, vol. 2005, no. 3445, p. 421. ISSN: 0302-9743.

MATĚJKA, P.; SZŐKE, I.; SCHWARZ, P.; ČERNOCKÝ, J. Automatic Language Identification using Phoneme and Automatically Derived Unit Strings. Lecture Notes in Computer Science, 2004, vol. 2004, no. 3206, p. 147-154. ISSN: 0302-9743.

SCHWARZ, P.; MATĚJKA, P.; ČERNOCKÝ, J. Towards Lower Error Rates in Phoneme Recognition. Lecture Notes in Computer Science, 2004, vol. 2004, no. 3206, p. 465-472. ISSN: 0302-9743.

MOTLÍČEK, P.; ČERNOCKÝ, J. Multimodal Phoneme Recognition of Meeting Data. Lecture Notes in Computer Science, 2004, vol. 2004, no. 3206, p. 379-384. ISSN: 0302-9743.

ČERNOCKÝ, J. Temporal processing for feature extraction in speech recognition. In Vědecké spisy VUT. Edice Habilitační a inaugurační spisy, sv. 112. Brno: Publishing house of Brno University of Technology VUTIUM, 2003. p. 1-30. ISBN: 80-214-2395-1.

GARUDADRI, H.; HEŘMANSKÝ, H.; MORGAN, N.; BENITEZ, C.; BURGET, L.; KAJAREKAR, S.; GRÉZL, F.; JAIN, P.; MOTLÍČEK, P. Distributed Voice Recognition System Utilizing Multistream Network Feature Processing. San Diego: Qualcomm, 2002. p. 0-0.

MOTLÍČEK, P. Feature Extraction in Speech Coding and Recognition. Portland: Oregon Graduate Institute of Science and Technology, 2002. p. 1-50.

MOTLÍČEK, P. Visual Feature Extraction for Phoneme Recognition of Meetings. Brno: Department of Computer Graphics and Multimedia FIT BUT, 2004. p. 0-0.

MOTLÍČEK, P. Modelování spektra a časových trajektorií v rozpoznávání řeči [Modeling of spectra and temporal trajectories in speech recognition]. GACR 102/02/0124 "Hlasové technologie v podpoře informační společnosti", souhrnný přehled aktivit řešitelských kolektivů. Praha: 2004. p. 0-0. ISBN: 80-01-02957-3.

SZŐKE, I.; MOTLÍČEK, P. Kódování řeči na velmi nízkých bitových rychlostech [Speech coding at very low bit rates]. GACR 102/02/0124 "Hlasové technologie v podpoře informační společnosti", souhrnný přehled aktivit řešitelských kolektivů. Praha: Fakulta elektrotechniky ČVUT, 2004. p. 0-0. ISBN: 80-01-02957-3.


KARAFIÁT, M.; GRÉZL, F.; BURGET, L. Combination of MFCC and TRAP features for LVCSR of meeting data. Martigny: 2004. p. 0-0.

SCHWARZ, P.; MATĚJKA, P. Phoneme Recognition from a Long Temporal Context. Martigny: 2004. p. 0 (1 p.).

SCHWARZ, P.; MATĚJKA, P.; ČERNOCKÝ, J. Phoneme Recognition. AMI Workshop. 2004. p. 1.

MOTLÍČEK, P. Modeling of Spectra and Temporal Trajectories in Speech Processing, PhD thesis. Brno: Faculty of Information Technology BUT, 2003. p. 1-138.

ČERNOCKÝ, J. Temporal processing for feature extraction in speech recognition, habilitation thesis. Brno: 2002. p. 0-0.

SCHWARZ, P.; MATĚJKA, P.; BURGET, L.; GLEMBEK, O. VUT-SW-Search; Phoneme recognizer based on long temporal context. (software)