Course detail
Modern Methods of Speech Processing
FIT-MZD
Acad. year: 2021/2022
From simple systems to stochastic modelling. Hidden Markov models. Large vocabulary continuous speech recognition. Language models. Speech production, speech perception: time and frequency. Data-driven methods for feature extraction. Speech databases. Excitation in speech coding, CELP. Speaker identification.
Recommended reading
Fukunaga, K.: Introduction to Statistical Pattern Recognition, Academic Press, 1990
Gold, B., Morgan, N.: Speech and Audio Signal Processing, John Wiley & Sons, 2000
Jelinek, F.: Statistical Methods for Speech Recognition, MIT Press, 1998
Moore, B.C.J.: An Introduction to the Psychology of Hearing, Academic Press, 1989
Psutka, J.: Komunikace s počítačem mluvenou řečí, Academia, Praha, 1995
Texts from http://www.fit.vutbr.cz/~cernocky/speech/
Vapnik, V. N.: Statistical Learning Theory, Wiley-Interscience, 1998
Classification of course in study plans
- Programme DIT Doctoral, 0 year of study, winter semester, compulsory-optional
- Programme CSE-PHD-4 Doctoral, branch DVI4, 0 year of study, winter semester, elective
- Programme DIT-EN Doctoral, 0 year of study, winter semester, compulsory-optional
Type of course unit
Lecture
Syllabus
- Review of notions: signal vectors and parameter matrices, basic statistics.
- Stochastic modeling of parameters, modeling of time by state sequences.
- Hidden Markov models: basic structure, training.
- Recognition of speech using HMM: Viterbi search, token passing.
- Pronunciation dictionaries and language models.
- Speech production and derived parameters: LPC, log area ratios, line spectral pairs.
- Speech perception and derived parameters: Mel-frequency cepstral coefficients, perceptual linear prediction.
- Temporal properties of hearing: RASTA filtering.
- Training the feature extractor on the data: linear discriminant analysis.
- Speech databases: standards, contents, speakers, annotations.
- Vocoders and modeling of the excitation: multi-pulse and stochastic excitations (GSM coding).
- CELP coding: long-term predictor, codebooks. Very low bit-rate coders.
- Current methods of speaker identification and verification.
Guided consultation in combined form of studies