Publication detail
PENG, J.; QU, X.; GU, R.; WANG, J.; XIAO, J.; BURGET, L.; ČERNOCKÝ, J.
Original title
Effective Phase Encoding for End-To-End Speaker Verification
Type
Conference proceedings paper indexed in WoS or Scopus
Language
English
Original abstract
The widely used magnitude spectrum based features have shown their superiority in the field of speech processing. In contrast, the importance of phase spectrum is always ignored. This is because the patterns hidden in phase cannot be intuitively modelled and interpreted, due to phase wrapping phenomenon. In this paper, we explore novel phase spectrum based features, named Learnable Group Delay (LearnGD), to capture useful information in speech signals. Specifically, firstly, the negative of the spectral derivative of the phase spectrum, called group delay (GD), is used to unwrap the phase. Then, to suppress the spiky nature of GD, which is caused by its roots close to the unit circle in the Z domain, a carefully designed light convolutional smoothing layer is employed to reconstruct the GD. Finally, an exponential hyper-parameter is introduced to reconstruct GD features to restore the spectrum range and generate LearnGD features. For performance evaluation, speaker verification experiments are conducted on the VoxCeleb2 corpus. Compared to the traditional acoustic feature derived from the magnitude spectrum, the proposed phase-based features reach a 27.8% relative improvement in terms of EER. Furthermore, experimental results on TIMIT phoneme recognition task also demonstrate the effectiveness of our proposed phase-based features.
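The three steps named in the abstract (group delay, smoothing, exponent) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the fixed moving-average kernel (standing in for the learnable convolutional smoothing layer), the signed-power form of the exponent, and the values of `alpha` and `n_fft` are all assumptions made for illustration. The group delay itself uses the standard identity that avoids explicit phase unwrapping.

```python
import numpy as np

def group_delay(frame, n_fft=512, eps=1e-8):
    """Group delay of one frame, computed without explicit phase unwrapping."""
    n = np.arange(len(frame))
    X = np.fft.rfft(frame, n_fft)        # spectrum of x[n]
    Y = np.fft.rfft(n * frame, n_fft)    # spectrum of n * x[n]
    # tau(w) = -d(phase)/dw = Re(Y / X) = (Xr*Yr + Xi*Yi) / |X|^2
    return (X.real * Y.real + X.imag * Y.imag) / (np.abs(X) ** 2 + eps)

def smooth_and_compress(gd, kernel, alpha):
    """Hypothetical stand-in: smooth along frequency, then apply a signed power."""
    smoothed = np.convolve(gd, kernel, mode="same")
    return np.sign(smoothed) * np.abs(smoothed) ** alpha

# Sanity check: a unit impulse delayed by 5 samples has a constant group delay of 5.
frame = np.zeros(64)
frame[5] = 1.0
gd = group_delay(frame, n_fft=64)
features = smooth_and_compress(gd, kernel=np.ones(3) / 3, alpha=0.5)
```

In a trainable front end, the fixed kernel and `alpha` would presumably be replaced by learned or tuned parameters; the sketch only shows the order of operations.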
Keywords
end-to-end speaker verification, phase information, group delay, on-the-fly
Authors
PENG, J.; QU, X.; GU, R.; WANG, J.; XIAO, J.; BURGET, L.; ČERNOCKÝ, J.
Published
30 August 2021
Publisher
International Speech Communication Association
Place
Brno
ISSN
1990-9772
Periodical
Proceedings of Interspeech
Volume
2021
Issue
8
Country
French Republic
Pages from
2366
Pages to
2370
Page count
5
URL
https://www.isca-speech.org/archive/interspeech_2021/peng21c_interspeech.html
BibTeX
@inproceedings{BUT175842,
  author="PENG, J. and QU, X. and GU, R. and WANG, J. and XIAO, J. and BURGET, L. and ČERNOCKÝ, J.",
  title="Effective Phase Encoding for End-To-End Speaker Verification",
  booktitle="Proceedings Interspeech 2021",
  year="2021",
  journal="Proceedings of Interspeech",
  volume="2021",
  number="8",
  pages="2366--2370",
  publisher="International Speech Communication Association",
  address="Brno",
  doi="10.21437/Interspeech.2021-2025",
  issn="1990-9772",
  url="https://www.isca-speech.org/archive/interspeech_2021/peng21c_interspeech.html"
}