Course detail
Speech Processing
FEKT-MKC-ZPR
Acad. year: 2022/2023
The course gives a comprehensive overview of speech processing as it occurs in verbal communication. It first introduces speech production and perception, the human auditory system, and the process of hearing. It then discusses the segmental and suprasegmental parameters frequently used in speech analysis. The main areas of speech processing are covered, in particular speech analysis, pattern recognition, speech synthesis, and speech coding, together with pitch analysis, prosody modelling, emotion analysis, analysis of pathological voice, speech de-identification, and speech watermarking. Attention is also paid to single-channel and multi-channel speech enhancement methods and to noise cancellation. Finally, subjective and objective methods for assessing the quality and intelligibility of speech are introduced.
Language of instruction
Number of ECTS credits
Mode of study
Guarantor
Department
Learning outcomes of the course unit
- describe the vocal and auditory tracts and how speech is produced and perceived
- analyse speech using the most common segmental and suprasegmental parameters
- apply cepstral and linear predictive analysis
- use machine learning in the field of speech processing (speech recognition, speaker recognition, speech pathology identification, emotion detection, etc.)
- design and implement a text-to-speech system based on concatenative synthesis
- model the vocal tract and perform speech coding
- use objective and subjective tests to assess speech quality and intelligibility
- enhance speech using single- and multi-channel methods
- design a speech watermarking and de-identification system
- process and analyse speech signals in the MATLAB environment
Prerequisites
Co-requisites
Planned learning activities and teaching methods
Assessment methods and criteria linked to learning outcomes
Course curriculum
2. Speech signal analysis, segmental and suprasegmental parameters I, fundamental frequency analysis
3. Speech signal analysis, segmental and suprasegmental parameters II
4. Speech signal analysis III, pattern recognition (classification based on distances)
5. Pattern recognition (statistical classifiers)
6. Speech synthesis, text-to-speech systems, prosody modelling
7. Speech coding and its transmission
8. Objective and subjective methods of speech quality and intelligibility assessment
9. Single- and multi-channel speech enhancement methods
10. Emotion analysis and its application
11. Analysis of neurodegenerative disorders
12. Speech watermarking, speech de-identification
Work placements
Aims
Specification of controlled education, way of implementation and compensation for absences
Recommended optional programme components
Prerequisites and corequisites
Basic literature
SMÉKAL, Z. Zpracování řeči. Brno: Vysoké učení technické v Brně, 2012. pp. 1-171. ISBN: 978-80-214-4896-4. (CS)
Recommended reading
Classification of course in study plans
- Programme MPC-TIT, Master's, 1st year of study, summer semester, compulsory-optional