Publication detail
KARAFIÁT, M.
Original title
The Development of the AMI System for the Transcription of Speech in Meetings
Type
conference proceedings paper indexed in WoS or Scopus
Language
English
Original abstract
The automatic processing of speech collected in conference-style meetings has attracted considerable interest with several large scale projects devoted to this area. This paper describes the development of a baseline automatic speech transcription system for meetings in the context of the AMI (Augmented Multiparty Interaction) project. We present several techniques important to processing of this data and show the performance in terms of word error rates (WERs). An important aspect of transcription of this data is the necessary flexibility in terms of audio pre-processing. Real world systems have to deal with flexible input, for example by using microphone arrays or randomly placed microphones in a room. Automatic segmentation and microphone array processing techniques are described and the effect on WERs is discussed. The system and its components presented in this paper yield competitive performance and form a baseline for future research in this domain.
Keywords
speech recognition, LVCSR, speech processing, signal processing, HMM, Language modeling, meeting transcriptions
Authors
RIV year
2005
Published
13 July 2005
Publisher
University of Edinburgh
Place
Edinburgh
ISBN
978-3-540-32549-9
Book
Machine Learning for Multimodal Interaction, Second International Workshop, MLMI 2005, Edinburgh, UK, July 11-13, 2005, Revised Selected Papers
Series
Lecture Notes in Computer Science Volume 3869, Springer 2006
Pages from
344
Pages to
356
Page count
12
URL
https://www.fit.vutbr.cz/~karafiat/publi/2005/hain_mlmi2005.pdf
BibTeX
@inproceedings{BUT18265,
  author="Thomas {Hain} and Martin {Karafiát} and John {Dines} and Iain {McCowan} and Mike {Lincoln} and Giulia {Garau} and Vincent {Wan} and Roeland {Ordelman} and Steve {Renals}",
  title="The Development of the AMI System for the Transcription of Speech in Meetings",
  booktitle="Machine Learning for Multimodal Interaction, Second International Workshop, MLMI 2005, Edinburgh, UK, July 11-13, 2005, Revised Selected Papers",
  year="2005",
  series="Lecture Notes in Computer Science Volume 3869, Springer 2006",
  pages="344--356",
  publisher="University of Edinburgh",
  address="Edinburgh",
  isbn="978-3-540-32549-9",
  url="https://www.fit.vutbr.cz/~karafiat/publi/2005/hain_mlmi2005.pdf"
}