Publication detail
KESIRAJU, S.; SARVAŠ, M.; PAVLÍČEK, T.; MACAIRE, C.; CIUBA, A.
Original Title
Strategies for Improving Low Resource Speech to Text Translation Relying on Pre-trained ASR Models
Type
conference paper
Language
English
Original Abstract
This paper presents techniques and findings for improving the performance of low-resource speech-to-text translation (ST). We conducted experiments on both simulated and real low-resource setups, on the language pairs English - Portuguese and Tamasheq - French, respectively. Using the encoder-decoder framework for ST, our results show that a multilingual automatic speech recognition system acts as a good initialization under low-resource scenarios. Furthermore, using CTC as an additional objective for translation during training and decoding helps to reorder the internal representations and improves the final translation. Through our experiments, we try to identify the factors (initializations, objectives, and hyperparameters) that contribute the most to improvements in low-resource setups. With only 300 hours of pre-training data, our model achieved a BLEU score of 7.3 on Tamasheq - French data, outperforming prior published works from IWSLT 2022 by 1.6 points.
Keywords
speech translation, low-resource, multilingual, speech recognition
Authors
KESIRAJU, S.; SARVAŠ, M.; PAVLÍČEK, T.; MACAIRE, C.; CIUBA, A.
Released
20. 8. 2023
Publisher
International Speech Communication Association
Location
Dublin
ISSN
1990-9772
Periodical
Proceedings of Interspeech
Volume
2023
Number
08
State
Ireland
Pages from
2148
Pages to
2152
Pages count
5
URL
https://www.isca-speech.org/archive/pdfs/interspeech_2023/kesiraju23_interspeech.pdf
BibTeX
@inproceedings{BUT185572,
  author="KESIRAJU, S. and SARVAŠ, M. and PAVLÍČEK, T. and MACAIRE, C. and CIUBA, A.",
  title="Strategies for Improving Low Resource Speech to Text Translation Relying on Pre-trained ASR Models",
  booktitle="Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",
  year="2023",
  journal="Proceedings of Interspeech",
  volume="2023",
  number="08",
  pages="2148--2152",
  publisher="International Speech Communication Association",
  address="Dublin",
  doi="10.21437/Interspeech.2023-2506",
  issn="1990-9772",
  url="https://www.isca-speech.org/archive/pdfs/interspeech_2023/kesiraju23_interspeech.pdf"
}