Publication result detail

Evaluating the Santa Barbara Corpus: Challenges of the Breadth of Conversational Spoken Language

MACIEJEWSKI, M.; KLEMENT, D.; HUANG, R.; WIESNER, M.; KHUDANPUR, S.

Original title

Evaluating the Santa Barbara Corpus: Challenges of the Breadth of Conversational Spoken Language

Type

Paper in proceedings indexed in WoS or Scopus

Original abstract

As speech technology has matured, there has been a push towards systems that can process conversational speech, reflecting the so-called "cocktail party problem," which includes not only more challenging acoustic conditions, but also necessitates solutions to new problems, such as identifying who spoke when and processing multiple concurrent streams of speech. Such problems have been approached primarily via corpora comprising business meetings and dinner parties, overlooking the broad range of conversational dynamics and speaker demographics that fall under the category of multi-talker speech. To this end, we introduce the use of the Santa Barbara Corpus of Spoken American English for evaluation of speech technology, including preparing the corpus and annotations for automatic processing, demonstrating the failure of state-of-the-art systems to withstand the heterogeneity of conditions, and highlighting the situations where standard methods struggle to perform at all.

Keywords

conversational speech, diarization, speech recognition

Authors

MACIEJEWSKI, M.; KLEMENT, D.; HUANG, R.; WIESNER, M.; KHUDANPUR, S.

RIV year

2025

Published

1 September 2024

Publisher

International Speech Communication Association

Place

Kos

Book

Proceedings of Interspeech 2024

ISSN

1990-9772

Periodical

Proceedings of Interspeech

Volume

2024

Number

9

Country

French Republic

Pages from

2155

Pages to

2160

Page count

5

BibTeX

@inproceedings{BUT193741,
  author="MACIEJEWSKI, M. and KLEMENT, D. and HUANG, R. and WIESNER, M. and KHUDANPUR, S.",
  title="Evaluating the Santa Barbara Corpus: Challenges of the Breadth of Conversational Spoken Language",
  booktitle="Proceedings of Interspeech 2024",
  year="2024",
  journal="Proceedings of Interspeech",
  volume="2024",
  number="9",
  pages="2155--2160",
  publisher="International Speech Communication Association",
  address="Kos",
  doi="10.21437/Interspeech.2024-2119",
  issn="1990-9772",
  url="https://www.isca-archive.org/interspeech_2024/maciejewski24_interspeech.pdf"
}
