Course detail

Audio and Speech Processing by Humans and Machines

FIT-ASD

Acad. year: 2017/2018

3-day intensive course
Interaction between humans and machines could be greatly enhanced through communication using human sensory signals such as speech. Knowledge of human information processing is critical in the design of such human-machine interfaces. The course covers the concept of a signal as a carrier of information and the basic principles of processing cognitive signals, and introduces selected phenomena in auditory and visual perception. Students learn how to interpret empirical data, how to incorporate these data into models, and how to apply these models to engineering problems. The emphasis is on active research in auditory modeling that exploits special properties of speech.

Language of instruction

Czech

Mode of study

Not applicable.

Learning outcomes of the course unit

No learning outcomes are specified.

Prerequisites

There are no prerequisites.

Co-requisites

Not applicable.

Planned learning activities and teaching methods

Not applicable.

Assessment methods and criteria linked to learning outcomes

Study evaluation is based on marks obtained for specified items. The minimum number of marks to pass is 50.

Course curriculum

    Syllabus of lectures:
    • Day 1
      Introduction to processing of information-bearing sensory signals such as speech. Fundamentals of information theory and of pattern classification. Fundamentals of speech production. Conventional techniques for speech analysis (concept of short-term analysis, band-pass filtering, Fourier-like transforms, cepstrum, linear prediction).
    • Day 2
      Fundamentals of human auditory perception. Perception of pitch and loudness. Spectral and temporal resolution of hearing. Masking in frequency and in time. Some important speech perception phenomena.
    • Day 3
      Introduction to auditory-like speech analysis techniques. Linear discriminant analysis and its use for deriving an optimized spectral basis. Temporal domain for speech analysis. Dynamic features of speech and the RASTA technique. Multi-stream speech recognition. Recognition from temporal patterns and nonlinear discriminant mapping approaches to speech.

Work placements

Not applicable.

Aims

No aims are specified.

Specification of controlled education, way of implementation and compensation for absences

There is no controlled instruction.

Recommended optional programme components

Not applicable.

Prerequisites and corequisites

Not applicable.

Basic literature

Ben Gold and Nelson Morgan: Speech and Audio Signal Processing, Wiley and Sons, 2000.
Psutka et al.: Hovoříme s počítačem česky, Academia, Praha, 2006.
Additional materials will be distributed as needed during the course.

Recommended reading

Not applicable.

Classification of course in study plans

  • Programme CSE-PHD-4 Doctoral

    branch DVI4, 0 year of study, winter semester, elective

Type of course unit

 

Lecture

39 hours, optional

Teacher / Lecturer
