Publication detail

Combination of MFCC and TRAP features for LVCSR of meeting data

KARAFIÁT, M.; GRÉZL, F.; BURGET, L.

Original Title

Type

presentation, poster

Language

English

Original Abstract

The aim of this work is to examine TempoRAl Patterns (TRAPs) based feature extraction for the task of large vocabulary continuous speech recognition (LVCSR). Previously, TRAPs based features were mainly used in conjunction with hybrid NN-HMM recognition systems (the connectionist approach). In this work, we use a Tandem-TRAPS system to generate speech features, which are then used as input for a standard GMM-HMM system. This approach allows for more precise modeling of phonetic context (context-dependent models), which is important for LVCSR. Experiments are carried out on the ICSI meetings database. For TRAPS processing, it is shown that the use of frequency differentiation and local operators can significantly improve recognition performance. Performances obtained with TRAPs based features and conventional MFCC features are compared. Although stand-alone TRAPs based features never outperform MFCC in our experiments, we report an improvement over MFCC when TRAPs based features and MFCC features are combined. The combined features are created by concatenation of the original feature streams followed by Heteroscedastic Linear Discriminant Analysis (HLDA) to perform decorrelation and dimensionality reduction. Compared to previous work, the main advantage is brought by HLDA, which combines the two feature streams optimally without the strong assumptions imposed on the data by previously used transforms (such as PCA and LDA).

Keywords

speech recognition, TRAP, feature extraction, feature combination, HLDA

Authors

KARAFIÁT, M.; GRÉZL, F.; BURGET, L.

Released

9. 12. 2004

Location

Martigny

Pages count

BibTex

@misc{BUT63339,
  author="Martin {Karafiát} and František {Grézl} and Lukáš {Burget}",
  title="Combination of MFCC and TRAP features for LVCSR of meeting data",
  year="2004",
  pages="1",
  address="Martigny",
  note="presentation, poster"
}
