Publication detail

Speculative Speech Recognition by Audio-Prefixed Low-Rank Adaptation of Language Models

YUSUF, B.; BASKAR, M.; ROSENBERG, A.; RAMABHADRAN, B.

Original Title

Type

conference paper

Language

English

Original Abstract

This paper explores speculative speech recognition (SSR), where we empower conventional automatic speech recognition (ASR) with speculation capabilities, allowing the recognizer to run ahead of audio. We introduce a metric for measuring SSR performance and we propose a model which does SSR by combining an RNN-Transducer-based ASR system with an audio-prefixed language model (LM). The ASR system transcribes ongoing audio and feeds the resulting transcripts, along with an audio-dependent prefix, to the LM, which speculates likely completions for the transcriptions. We experiment with a variety of ASR datasets, on which we show the efficacy of our method and the feasibility of SSR as a method of reducing ASR latency.

Keywords

low-latency speech recognition, speculative speech recognition, prefix language model, low-rank adaptation

Authors

YUSUF, B.; BASKAR, M.; ROSENBERG, A.; RAMABHADRAN, B.

Released

1. 9. 2024

Publisher

International Speech Communication Association

Location

Kos

ISSN

1990-9772

Periodical

Proceedings of Interspeech

Year of study

2024

Number

State

French Republic

Pages from

792

Pages to

796

Pages count

URL

https://www.isca-archive.org/interspeech_2024/yusuf24_interspeech.pdf

BibTex

@inproceedings{BUT193739,
  author="YUSUF, B. and BASKAR, M. and ROSENBERG, A. and RAMABHADRAN, B.",
  title="Speculative Speech Recognition by Audio-Prefixed Low-Rank Adaptation of Language Models",
  booktitle="Proceedings of Interspeech 2024",
  year="2024",
  journal="Proceedings of Interspeech",
  volume="2024",
  number="9",
  pages="792--796",
  publisher="International Speech Communication Association",
  address="Kos",
  doi="10.21437/Interspeech.2024-298",
  issn="1990-9772",
  url="https://www.isca-archive.org/interspeech_2024/yusuf24_interspeech.pdf"
}

Documents

yusuf24_interspeech_2024.pdf
