Publication detail
Unsupervised Word Segmentation from Speech with Attention
GODARD, P. BOITO, M. ONDEL YANG, L. BERARD, A. YVON, F. VILLAVICENCIO, A. BESACIER, L.
Original Title
Unsupervised Word Segmentation from Speech with Attention
Type
conference paper
Language
English
Original Abstract
We present a first attempt to perform attentional word segmentation directly from the speech signal, with the final goal to automatically identify lexical units in a low-resource, unwritten language (UL). Our methodology assumes a pairing between recordings in the UL with translations in a well-resourced language. It uses Acoustic Unit Discovery (AUD) to convert speech into a sequence of pseudo-phones that is segmented using neural soft-alignments produced by a neural machine translation model. Evaluation uses an actual Bantu UL, Mboshi; comparisons to monolingual and bilingual baselines illustrate the potential of attentional word segmentation for language documentation.
Keywords
computational language documentation, encoder-decoder models, attentional models, unsupervised word segmentation.
Authors
GODARD, P.; BOITO, M.; ONDEL YANG, L.; BERARD, A.; YVON, F.; VILLAVICENCIO, A.; BESACIER, L.
Released
2. 9. 2018
Publisher
International Speech Communication Association
Location
Hyderabad
ISSN
1990-9772
Periodical
Proceedings of Interspeech
Year of study
2018
Number
9
State
French Republic
Pages from
2678
Pages to
2682
Pages count
5
URL
https://www.isca-speech.org/archive/Interspeech_2018/pdfs/1308.pdf
BibTeX
@inproceedings{BUT163406,
author="GODARD, P. and BOITO, M. and ONDEL YANG, L. and BERARD, A. and YVON, F. and VILLAVICENCIO, A. and BESACIER, L.",
title="Unsupervised Word Segmentation from Speech with Attention",
booktitle="Proceedings of Interspeech 2018",
year="2018",
journal="Proceedings of Interspeech",
volume="2018",
number="9",
pages="2678--2682",
publisher="International Speech Communication Association",
address="Hyderabad",
doi="10.21437/Interspeech.2018-1308",
issn="1990-9772",
url="https://www.isca-speech.org/archive/Interspeech_2018/pdfs/1308.pdf"
}