Publication detail
ONDEL YANG, L.; YUSUF, B.; BURGET, L.; SARAÇLAR, M.
Original title
Non-Parametric Bayesian Subspace Models for Acoustic Unit Discovery
Type
journal article in Web of Science, Jimp
Language
English
Original abstract
This work investigates subspace non-parametric models for the task of learning a set of acoustic units from unlabeled speech recordings. We constrain the base-measure of a Dirichlet-Process mixture with a phonetic subspace, estimated from other source languages, to build an educated prior, thereby forcing the learned acoustic units to resemble phones of known source languages. Two types of models are proposed: (i) the Subspace HMM (SHMM) which assumes that the phonetic subspace is the same for every language, (ii) the Hierarchical-Subspace HMM (H-SHMM) which relaxes this assumption and allows to have a language-specific subspace estimated on the unlabeled target data. These models are applied on 3 languages: English, Yoruba and Mboshi and they are compared with various competitive acoustic units discovery baselines. Experimental results show that both subspace models outperform other systems in terms of clustering quality and segmentation accuracy. Moreover, we observe that the H-SHMM provides results superior to the SHMM supporting the idea that language-specific priors are preferable to language-agnostic priors for acoustic unit discovery.
Keywords
Unsupervised learning, non-parametric Bayesian models, acoustic unit discovery
Authors
ONDEL YANG, L.; YUSUF, B.; BURGET, L.; SARAÇLAR, M.
Published
3 May 2022
ISSN
2329-9290
Periodical
IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING
Volume
30
Number
5
Country
United States of America
Pages from
1902
Pages to
1917
Page count
16
URL
https://ieeexplore.ieee.org/document/9767690
BibTeX
@article{BUT178412,
  author="ONDEL YANG, L. and YUSUF, B. and BURGET, L. and SARAÇLAR, M.",
  title="Non-Parametric Bayesian Subspace Models for Acoustic Unit Discovery",
  journal="IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING",
  year="2022",
  volume="30",
  number="5",
  pages="1902--1917",
  doi="10.1109/TASLP.2022.3171975",
  issn="2329-9290",
  url="https://ieeexplore.ieee.org/document/9767690"
}