Course detail

Speech Signal Processing

FIT-ZRE, Acad. year: 2023/2024

Applications of speech processing, digital processing of speech signals, production and perception of speech, introduction to phonetics, pre-processing and basic parameters of speech, linear-predictive model, cepstrum, fundamental frequency estimation, coding - time domain and vocoders, recognition - DTW and HMM, synthesis. Software and libraries for speech processing.

Language of instruction

Czech

Number of ECTS credits

5

Mode of study

Not applicable.

Entry knowledge

Not applicable.

Rules for evaluation and completion of the course

  • mid-term test 14 pts
  • project 29 pts
  • presentation of results in computer labs 6 pts

Aims

To provide students with knowledge of the basic characteristics of the speech signal in relation to human speech production and hearing. To describe the basic speech analysis algorithms common to many applications. To give an overview of applications (recognition, synthesis, coding) and to cover practical aspects of implementing speech algorithms.
Students will become familiar with the basic characteristics of the speech signal in relation to human speech production and hearing. They will understand the basic speech analysis algorithms common to many applications. They will be given an overview of applications (recognition, synthesis, coding) and will learn about practical aspects of implementing speech algorithms. Students will be able to design a simple speech processing system (a speech activity detector, a recognizer of a limited number of isolated words), including its implementation in application programs.

Study aids

Not applicable.

Prerequisites and corequisites

Not applicable.

Basic literature

Psutka, J.: Komunikace s počítačem mluvenou řečí. Academia, Praha, 1995, ISBN 80-200-0203-0
Gold, B., Morgan, N.: Speech and Audio Signal Processing, John Wiley & Sons, 2000, ISBN 0-471-35154-7
Course web page: https://www.fit.vutbr.cz/study/courses/ZRE/public/

Recommended literature

Gold, B., Morgan, N.: Speech and Audio Signal Processing, Wiley-Interscience, 2nd edition.
Yu, D., Deng, L.: Automatic Speech Recognition, Springer, 2016.
Rabiner, L. R., Schafer, R. W.: Theory and Applications of Digital Speech Processing, Pearson, 2011.
Psutka, J., Müller, L., Matoušek, J., Radová, V.: Mluvíme s počítačem česky, Academia, 2006.

Classification of course in study plans

  • Programme IT-MSC-2 Master's

    branch MPV, 0 year of study, summer semester, compulsory-optional
    branch MIN, 0 year of study, summer semester, compulsory-optional
    branch MBI, 0 year of study, summer semester, compulsory-optional
    branch MSK, 2 year of study, summer semester, compulsory-optional
    branch MBS, 0 year of study, summer semester, elective
    branch MIS, 0 year of study, summer semester, elective
    branch MGM, 1 year of study, summer semester, compulsory
    branch MMM, 0 year of study, summer semester, elective

  • Programme MITAI Master's

    specialization NISY, 0 year of study, summer semester, elective
    specialization NSPE, 0 year of study, summer semester, compulsory
    specialization NBIO, 0 year of study, summer semester, elective
    specialization NSEN, 0 year of study, summer semester, elective
    specialization NVIZ, 0 year of study, summer semester, elective
    specialization NGRI, 0 year of study, summer semester, elective
    specialization NADE, 0 year of study, summer semester, elective
    specialization NISD, 0 year of study, summer semester, elective
    specialization NMAT, 0 year of study, summer semester, elective
    specialization NSEC, 0 year of study, summer semester, elective
    specialization NISY up to 2020/21, 0 year of study, summer semester, elective
    specialization NCPS, 0 year of study, summer semester, elective
    specialization NHPC, 0 year of study, summer semester, elective
    specialization NNET, 0 year of study, summer semester, elective
    specialization NMAL, 0 year of study, summer semester, elective
    specialization NVER, 0 year of study, summer semester, elective
    specialization NIDE, 0 year of study, summer semester, elective
    specialization NEMB, 0 year of study, summer semester, elective
    specialization NEMB up to 2021/22, 0 year of study, summer semester, elective

Type of course unit

 

Lecture

26 hours, optional

Syllabus

  1. Introduction, applications of speech processing. 
  2. Digital processing of speech signals.
  3. Speech production and its signal processing model. 
  4. Pre-processing and basic parameters of speech, cepstrum.
  5. Linear-predictive model. 
  6. Fundamental frequency estimation.
  7. Speech coding - basics.
  8. CELP speech coding.
  9. Speech recognition - basics, DTW.
  10. Hidden Markov models (HMM).
  11. Large vocabulary continuous speech recognition (LVCSR) systems. 
  12. Speaker and language recognition. Neural networks in speech processing. 
  13. Text to speech synthesis. 

Fundamentals seminar

2 hours, compulsory

Syllabus

  1. Parameterization, DTW, HMM.

Exercise in computer lab

12 hours, compulsory

Syllabus

    Matlab is used in all labs except the last one.
  1. Introduction. 
  2. Linear prediction and vector quantization. 
  3. Fundamental frequency estimation and speech coding. 
  4. Basics of classification. 
  5. Recognition - dynamic time warping (DTW).
  6. Recognition - hidden Markov models (using HTK).

Project

12 hours, compulsory
