Project Detail

Funding resources

European Union - Horizon 2020

On the project

The ESPERANTO project aims to advance speech processing technologies to the next stage, in order to spread them among European SMEs and to maximize and secure their use in civil society for forensic, health or education applications. The ESPERANTO consortium foresees that the next generation of artificial intelligence algorithms for speech processing should:

1. be more accessible: cover a larger number of spoken languages and serve applications where resources are strongly limited (health, education, robotics);
2. integrate a human in the loop to guarantee higher usability and ease of deployment and maintenance;
3. be explainable, in order to enable sensitive applications related to forensics or health and to contribute to personal data preservation by detecting and characterizing existing biases due to the data-driven nature of current speech technologies.

ESPERANTO intends to lead the scientific community by releasing evaluation metrics, protocols and standards that will boost the development and evaluation of this new generation of algorithms. To achieve this ambitious goal, the project gathers a large, cross-sector community of experts in speech-related applications such as speech transcription, separation, enhancement, translation and understanding, and speaker recognition and diarization, to transfer knowledge and to organize, produce and standardize resources, with the aim of catalyzing and cross-pollinating this area.

The main goals of the ESPERANTO project are:

- support the development of open-source tools that encourage fast development, exchanges and reproducibility;
- produce tutorials and competitive baselines on various topics of speech processing in order to foster new speech-AI students, researchers and engineers;
- facilitate the collection and sharing of linguistic and speech resources through standards;
- organize workshops to advance speech technologies and promote the transfer of knowledge.

Keywords
artificial intelligence, intelligent systems, multi agent systems, machine learning, data mining, statistical data processing and application, modelling engineering, human computer interaction, natural language processing, speech processing, neural networks, explainability, human assisted learning, low resources, standardization, evaluation

Default language

English

People responsible

Kudla Radim, Ing. - principal person responsible
Kohlová Renata, Ing. - fellow researcher
Landini Federico Nicolás, Ph.D. - fellow researcher
Matějka Pavel, Ing., Ph.D. - fellow researcher
Mošner Ladislav, Ing. - fellow researcher
Silnova Anna, M.Sc., Ph.D. - fellow researcher

Units

Department of Computer Graphics and Multimedia
- responsible department (12.5.2020 - not assigned)
Speech Data Mining Research Group BUT Speech@FIT
- internal (12.5.2020 - 31.12.2025)
Department of Computer Graphics and Multimedia
- co-beneficiary (12.5.2020 - 31.12.2025)
Johns Hopkins University
- co-beneficiary (12.5.2020 - 31.12.2025)
Omilia
- co-beneficiary (12.5.2020 - 31.12.2025)
The University of Sheffield
- co-beneficiary (12.5.2020 - 31.12.2025)
Universidad de Zaragoza
- co-beneficiary (12.5.2020 - 31.12.2025)
Universiti Sains Malaysia
- co-beneficiary (12.5.2020 - 31.12.2025)
Le Mans University
- beneficiary (12.5.2020 - 31.12.2025)

Results

VILLATORO-TELLO, E.; MADIKERI, S.; SHARMA, B.; KHALIL, D.; KUMAR, S.; NIGMATULINA, I.; MOTLÍČEK, P.; GANAPATHIRAJU, A. Probability-Aware Word-Confusion-Network-to-Text Alignment Approach for Intent Classification. ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Seoul: IEEE Signal Processing Society, 2024. p. 12617-12621. ISBN: 979-8-3503-4485-1.

Link

http://esperanto.univ-lemans.fr/en/index.html

Responsibility: Kudla Radim, Ing.

Exchanges for SPEech ReseArch aNd TechnOlogies