Publication details

i-vectors in language modeling: An efficient way of domain adaptation for feed-forward models

BENEŠ, K.; KESIRAJU, S.; BURGET, L.

Original title

i-vectors in language modeling: An efficient way of domain adaptation for feed-forward models

Type

conference proceedings paper indexed in WoS or Scopus

Language

English

Original abstract

We show an effective way of adding context information to shallow neural language models. We propose to use the Subspace Multinomial Model (SMM) for context modeling and we add the extracted i-vectors in a computationally efficient way. By adding this information, we shrink the gap between a shallow feed-forward network and an LSTM from 65 to 31 points of perplexity on the Wikitext-2 corpus (in the case of a neural 5-gram model). Furthermore, we show that SMM i-vectors are suitable for domain adaptation and that a very small amount of adaptation data (e.g. the endmost 5% of a Wikipedia article) brings a substantial improvement. Our proposed changes are compatible with most optimization techniques used for shallow feed-forward LMs.

Keywords

language modeling, feed-forward models, subspace multinomial model, domain adaptation

Authors

BENEŠ, K.; KESIRAJU, S.; BURGET, L.

Published

2 September 2018

Publisher

International Speech Communication Association

Place

Hyderabad

ISSN

1990-9772

Periodical

Proceedings of Interspeech

Volume

2018

Issue

9

Country

France

First page

3383

Last page

3387

Page count

5

URL

BibTeX

@inproceedings{BUT155102,
  author="Karel {Beneš} and Santosh {Kesiraju} and Lukáš {Burget}",
  title="i-vectors in language modeling: An efficient way of domain adaptation for feed-forward models",
  booktitle="Proceedings of Interspeech 2018",
  year="2018",
  journal="Proceedings of Interspeech",
  volume="2018",
  number="9",
  pages="3383--3387",
  publisher="International Speech Communication Association",
  address="Hyderabad",
  doi="10.21437/Interspeech.2018-1070",
  issn="1990-9772",
  url="https://www.isca-speech.org/archive/Interspeech_2018/abstracts/1070.html"
}

Documents