Publication detail

Speculative Speech Recognition by Audio-Prefixed Low-Rank Adaptation of Language Models

YUSUF, B.; BASKAR, M.; ROSENBERG, A.; RAMABHADRAN, B.

Original title

Type

Paper in proceedings indexed in WoS or Scopus

Language

English

Original abstract

This paper explores speculative speech recognition (SSR), where we empower conventional automatic speech recognition (ASR) with speculation capabilities, allowing the recognizer to run ahead of audio. We introduce a metric for measuring SSR performance and we propose a model which does SSR by combining an RNN-Transducer-based ASR system with an audio-prefixed language model (LM). The ASR system transcribes ongoing audio and feeds the resulting transcripts, along with an audio-dependent prefix, to the LM, which speculates likely completions for the transcriptions. We experiment with a variety of ASR datasets, on which we show the efficacy of our method and the feasibility of SSR as a method of reducing ASR latency.

Keywords

low-latency speech recognition, speculative speech recognition, prefix language model, low-rank adaptation

Authors

YUSUF, B.; BASKAR, M.; ROSENBERG, A.; RAMABHADRAN, B.

Published

1 September 2024

Publisher

International Speech Communication Association

Place

Kos

ISSN

1990-9772

Periodical

Proceedings of Interspeech

Volume

2024

Number

Country

French Republic

Pages from

792

Pages to

796

Page count

URL

https://www.isca-archive.org/interspeech_2024/yusuf24_interspeech.pdf

BibTeX

@inproceedings{BUT193739,
  author="YUSUF, B. and BASKAR, M. and ROSENBERG, A. and RAMABHADRAN, B.",
  title="Speculative Speech Recognition by Audio-Prefixed Low-Rank Adaptation of Language Models",
  booktitle="Proceedings of Interspeech 2024",
  year="2024",
  journal="Proceedings of Interspeech",
  volume="2024",
  number="9",
  pages="792--796",
  publisher="International Speech Communication Association",
  address="Kos",
  doi="10.21437/Interspeech.2024-298",
  issn="1990-9772",
  url="https://www.isca-archive.org/interspeech_2024/yusuf24_interspeech.pdf"
}

Documents

yusuf24_interspeech_2024.pdf
