Project detail

Funding sources

Brno University of Technology (VUT) - VUT internal projects

About the project

Multimedia and 3D data are essential to a growing number of applications of modern computer systems, where nothing else can replace them. Processing such data is known to be difficult and computationally demanding, and the same holds for displaying and analysing it. Research in this area is therefore both challenging and important. The project builds on the earlier project "Modern Methods of Processing, Analysis, and Display of Multimedia and 3D Data" (original title: "Moderní metody zpracování, analýzy a zobrazování multimediálních a 3D dat").

Project code

FIT-S-23-8278

Original language

Czech

Investigators

Units

Department of Computer Graphics and Multimedia
- internal (1.1.2023 - 31.12.2025)
Faculty of Information Technology
- recipient (1.1.2023 - 31.12.2025)

Results

KUMAR, S.; MADIKERI, S.; NIGMATULINA, I.; VILLATORO-TELLO, E.; MOTLÍČEK, P.; PANDIA, K.; DUBAGUNTA, P.; GANAPATHIRAJU, A. Multitask Speech Recognition and Speaker Change Detection for Unknown Number of Speakers. ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Seoul: IEEE Signal Processing Society, 2024. p. 12592-12596. ISBN: 979-8-3503-4485-1.

RANGAPPA, P.; MUSCAT, A.; SANCHEZ-LARA, A.; MOTLÍČEK, P.; ANTONOPOULOU, M.; FOURFOURIS, I.; SKARLATOS, A.; AVGERINOS, N.; TSANGARIS, M.; KOSTKA, K. Detecting Criminal Networks via Non-Content Communication Data Analysis Techniques from the TRACY Project. Proceedings of the 15th EAI International Conference on Digital Forensics & Cyber Crime (EAI-ICDF2C24). Dubrovnik: 2024. p. 1-15.

BURDISSO, S.; RAMIREZ, A.; VILLATORO-TELLO, E.; SÁNCHEZ-VEGA, F.; LÓPEZ-MONROY, P.; MOTLÍČEK, P. DAIC-WOZ: On the Validity of Using the Therapist's prompts in Automatic Depression Detection from Clinical Interviews. Proceedings of the 6th Clinical Natural Language Processing Workshop. Mexico City: Association for Computational Linguistics, 2024. p. 82-90.

ASHIHARA, T.; MORIYA, T.; HORIGUCHI, S.; PENG, J.; OCHIAI, T.; DELCROIX, M.; MATSUURA, K.; SATO, H. Investigation of Speaker Representation for Target-Speaker Speech Processing. Proc. 2024 IEEE Spoken Language Technology Workshop (SLT). Macao: IEEE Signal Processing Society, 2024. p. 423-430. ISBN: 979-8-3503-9225-8.

ZULUAGA-GOMEZ, J.; VESELÝ, K.; SZŐKE, I.; BLATT, A.; MOTLÍČEK, P.; KOCOUR, M.; RIGAULT, M.; CHOUKRI, K.; PRASAD, A.; SARFJOO, S.; NIGMATULINA, I.; CEVENINI, C.; KOLČÁREK, P.; TART, A.; ČERNOCKÝ, J.; KLAKOW, D. ATCO2 corpus: A Large-Scale Dataset for Research on Automatic Speech Recognition and Natural Language Understanding of Air Traffic Control Communications. Journal of Machine Learning Research, vol. 2, no. 1, p. 1-45. ISSN: 1533-7928.

ŠILLING, P.; ŠPANĚL, M. DEMIS: Electron Microscopy Image Stitching using Deep Learning Features and Global Optimisation. Proceedings of the 18th International Joint Conference on Biomedical Engineering Systems and Technologies - BIOIMAGING. Porto: Institute for Systems and Technologies of Information, Control and Communication, 2025. p. 255-256. ISBN: 978-989-758-731-3.

NOVÁK, J.; CHUDÝ, P.; HANÁK, J. Model Predictive Control Driven Aerial Grasping with Soft Operational Constraints. In ICAS Proceedings. Florence: International Council of the Aeronautical Sciences, 2024. p. 1-15. ISSN: 2958-4647.

CHLUBNA, T.; MILET, T.; ZEMČÍK, P. Out-of-Focus Artifacts Mitigation and Autofocus Methods for 3D Displays. Visual Informatics, 2024, vol. 9, no. 1, p. 31-42. ISSN: 2468-502X.

MACIEJEWSKI, M.; KLEMENT, D.; HUANG, R.; WIESNER, M.; KHUDANPUR, S. Evaluating the Santa Barbara Corpus: Challenges of the Breadth of Conversational Spoken Language. In Proceedings of Interspeech 2024. Proceedings of Interspeech. Kos: International Speech Communication Association, 2024. p. 2155-2160. ISSN: 1990-9772.

PEŠÁN, J.; JUŘÍK, V.; KARAFIÁT, M.; ČERNOCKÝ, J. BESST Dataset: A Multimodal Resource for Speech-based Stress Detection and Analysis. In Proceedings of Interspeech 2024. Proceedings of Interspeech. Kos: International Speech Communication Association, 2024. p. 1355-1359. ISSN: 1990-9772.

YUSUF, B.; BASKAR, M.; ROSENBERG, A.; RAMABHADRAN, B. Speculative Speech Recognition by Audio-Prefixed Low-Rank Adaptation of Language Models. In Proceedings of Interspeech 2024. Proceedings of Interspeech. Kos: International Speech Communication Association, 2024. p. 792-796. ISSN: 1990-9772.

PRASAD, A.; MADIKERI, S.; KHALIL, D.; MOTLÍČEK, P.; SCHUEPBACH, C. Speech and Language Recognition with Low-rank Adaptation of Pretrained Models. In Proceedings of Interspeech. Proceedings of Interspeech. Kos Island: International Speech Communication Association, 2024. p. 2825-2829. ISSN: 1990-9772.

ESPUNA, A.; PRASAD, A.; MOTLÍČEK, P.; MADIKERI, S.; SCHUEPBACH, C. Normalising Flows for Speaker and Language Recognition Backend. Proceedings of Odyssey 2024: The Speaker and Language Recognition Workshop. Quebec: International Speech Communication Association, 2024. p. 74-80.

BHATTACHARJEE, M.; NIGMATULINA, I.; PRASAD, A.; RANGAPPA, P.; MADIKERI, S.; MOTLÍČEK, P.; HELMKE, H.; KLEINERT, M. Contextual Biasing Methods for Improving Rare Word Detection in Automatic Speech Recognition. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Seoul: IEEE Signal Processing Society, 2024. p. 12652-12656. ISBN: 979-8-3503-4485-1.

PRASAD, A.; CAROFILIS, A.; VANDERREYDT, G.; KHALIL, D.; MADIKERI, S.; MOTLÍČEK, P.; SCHUEPBACH, C. Fine-Tuning Self-Supervised Models for Language Identification Using Orthonormal Constraint. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Seoul: IEEE Signal Processing Society, 2024. p. 11921-11925. ISBN: 979-8-3503-4485-1.

KIŠŠ, M.; HRADIŠ, M. Self-supervised Pre-training of Text Recognizers. In Barney Smith, E.H., Liwicki, M., Peng, L. (eds) Document Analysis and Recognition - ICDAR 2024. Lecture Notes in Computer Science. Athens: Springer Nature Switzerland AG, 2024. p. 218-235. ISBN: 978-3-031-70545-8.

KUBÍK, T.; ŠPANĚL, M. LMVSegRNN and Poseidon3D: Addressing Challenging Teeth Segmentation Cases in 3D Dental Surface Orthodontic Scans. Bioengineering, 2024, vol. 11, no. 10, p. 1-18. ISSN: 2306-5354.

BENEŠ, K.; KOCOUR, M.; BURGET, L. Hystoc: Obtaining Word Confidences for Fusion of End-To-End ASR Systems. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Seoul: IEEE Signal Processing Society, 2024. p. 11276-11280. ISBN: 979-8-3503-4485-1.

HANÁK, J.; NOVÁK, J.; CHUDÝ, P. Tactical Scenario Adaptation for Pilot Training. In AIAA/IEEE Digital Avionics Systems Conference - Proceedings. San Diego: Institute of Electrical and Electronics Engineers, 2024. p. 1-7. ISBN: 979-8-3503-4961-0. ISSN: 2155-7195.

NOVÁK, J.; HANÁK, J.; CHUDÝ, P. Predictive Control Driven Tactical Maneuvering. In ICAS Proceedings. Florence: International Council of the Aeronautical Sciences, 2024. p. 1-12. ISSN: 2958-4647.

Responsible person: Zemčík Pavel, prof. Dr. Ing., dr. h. c.

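Contemporary Methods of Processing, Analysis, and Display of Multimedia and 3D Data (original title: Soudobé metody zpracování, analýzy a zobrazování multimediálních a 3D dat)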