Course detail

Speech processing

FEKT-MZPR
Acad. year: 2011/2012

The course gives a comprehensive view of present-day solutions to the major problems occurring in speech communication systems. It is designed for students who want to understand and master both basic and advanced techniques of speech processing, synthesis and recognition.

Language of instruction

Czech

Number of ECTS credits

Mode of study

Not applicable.

Guarantor

prof. Ing. Zdeněk Smékal, CSc.

Department

Department of Telecommunications (UTKO)

Learning outcomes of the course unit

The students will have a clear idea of the model of speech generation, the analysis of speech signals and the classical features of speech signals. They will further become familiar with prediction analysis, spectrograms and homomorphic analysis as used in automatic command recognition. In addition to the classical methods, the students will be introduced to the basic principles of speaker identification, to the problems of separating speech from a noisy acoustic background, and to the latest trends in automatic speech recognition.

Prerequisites

Subject knowledge at the Bachelor's degree level is required.

Co-requisites

Not applicable.

Planned learning activities and teaching methods

Teaching methods depend on the type of course unit, as specified in Article 7 of the BUT Rules for Studies and Examinations.

Assessment methods and criteria linked to learning outcomes

Requirements for completion of the course are specified by a regulation issued by the lecturer responsible for the course and updated for every academic year.

Course curriculum

Not applicable.

Work placements

Not applicable.

Aims

The aim of the course is to give a comprehensive overview of speech communication systems. It is designed for students who want to learn the basic and advanced techniques of speech processing, synthesis and recognition. Apart from the basic principles of speaker identification, the students will become familiar with the problems of separating speech from a noisy background and with the principles of automatic speech recognition.

Specification of controlled education, way of implementation and compensation for absences

The content and forms of instruction in the evaluated course are specified by a regulation issued by the lecturer responsible for the course and updated for every academic year.

Recommended optional programme components

Not applicable.

Prerequisites and corequisites

Not applicable.

Basic literature

DELLER, J.R., HANSEN, J.H.L., PROAKIS, J.G.: Discrete-Time Processing of Speech Signals. John Wiley, New York, 2000. ISBN 0-7803-5386-2
O'SHAUGHNESSY, D., DENG, L.: Speech Processing: A Dynamic and Optimization-Oriented Approach. Marcel Dekker, New York, 2003. ISBN 0-8247-4040-8
PSUTKA, J.: Komunikace s počítačem mluvenou řečí. ACADEMIA, Praha, 1995. ISBN 80-200-0203-0 (In Czech)
QUATIERI, T.F.: Discrete-Time Speech Signal Processing: Principles and Practice. Prentice Hall, NJ, 2002. ISBN 0-13-242942-X
UHLÍŘ, J., SOVKA, P.: Digital Signal Processing (Číslicové zpracování signálů). ČVUT, Praha, 1995. (In Czech)

Type of course unit

Lecture

26 hours, optional

Teacher / Lecturer

prof. Ing. Zdeněk Smékal, CSc.

Syllabus

The nature and information content of speech signal.
Phonetic description of the Czech language.
Introduction to speech signal analysis, model of speech generation.
Features used in analyzing speech signals.
Homomorphic analysis in detail (LPCC, LFCC and MFCC coefficients).
Automatic recognition of commands.
Automatic speaker recognition.
Time-domain and frequency-domain synthesis of speech.
Speech encoding techniques.
Speech signal and interference.
Single-channel filtering techniques.
Multi-channel filtering techniques.
Technical tools for implementation.

Laboratory exercise

39 hours, compulsory

Teacher / Lecturer

doc. Ing. Jiří Mekyska, Ph.D.

Syllabus

Modifying a WAV file in the Matlab environment.
Calculation of autocorrelation and LPC coefficients.
Spectrogram-based analysis of speech signals.
Calculation of cepstral coefficients (LPCC, LFCC and MFCC coefficients).
Calculating the AMDF (average magnitude difference function) and estimating the fundamental frequency (pitch).
Selecting features for automatic command recognition.
Selecting features for automatic speaker recognition.
Detecting utterance boundaries in noisy recordings.
Speech synthesis in the time domain.
Assignment of individual projects.
Solving and consulting individual projects.
Solving and consulting individual projects.
Handing in the projects and awarding the credit pass.
