Publication detail
VRÁBEL, J. KÉPEŠ, E. POŘÍZKA, P. KAISER, J.
Original Title
Distance of Spectroscopic Data
Type
abstract
Language
English
Original Abstract
Machine learning (ML) techniques are essential in a wide variety of modern spectroscopic applications. Most ML models rely on some form of distance computation. In supervised learning, we may need to compute the distance between an unknown spectrum and labeled representatives to decide its class membership. In unsupervised learning, a distance appears as well, e.g. as the reconstruction error of an autoencoder, which is computed between the original input and the model output. Because spectra are high-dimensional and sparse [1], the curse of dimensionality (COD) emerges and poses many challenges for distance computation. A well-known consequence of the COD [2] is that the commonly used Euclidean metric behaves poorly in high-dimensional spaces. In the present work, we study alternative metrics for measuring the distance of spectroscopic data and discuss the consequences for various ML models. Additionally, we exploit properties of spectroscopic data (high dimensionality, sparsity, redundancy [1]) to design novel custom metrics for distance measurement. All metrics are compared to baseline approaches in both supervised (k-nearest neighbors, KNN) and unsupervised (autoencoder) tasks. The methodology is demonstrated on Laser-Induced Breakdown Spectroscopy (LIBS) data.
References:
[1] Vrábel et al. (2020). Restricted Boltzmann Machine method for dimensionality reduction of large spectroscopic data. Spectrochimica Acta Part B: Atomic Spectroscopy, 167, 105849.
[2] Aggarwal et al. (2001). On the Surprising Behavior of Distance Metrics in High Dimensional Space. In: Van den Bussche J., Vianu V. (eds) Database Theory – ICDT 2001. Lecture Notes in Computer Science, vol 1973. Springer, Berlin, Heidelberg.
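The role of the distance function described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not reproduce their custom metrics. It assumes synthetic, uniformly random vectors in place of real spectra, and the helper names (`minkowski_distance`, `relative_contrast`, `knn_predict`) are hypothetical. The sketch compares the Euclidean distance (p = 2) with the Manhattan distance (p = 1) and a fractional variant (p = 0.5, which is not a true metric), shows how the relative contrast between the farthest and nearest point changes with dimensionality, and plugs the same distance into a simple KNN classifier.

```python
import numpy as np


def minkowski_distance(a, b, p=2.0):
    """Minkowski-type distance between two vectors.

    p = 2 is Euclidean, p = 1 is Manhattan; 0 < p < 1 gives a
    'fractional' distance, which violates the triangle inequality.
    """
    diff = np.abs(np.asarray(a, dtype=float) - np.asarray(b, dtype=float))
    return float(np.sum(diff ** p) ** (1.0 / p))


def relative_contrast(points, query, p=2.0):
    """(d_max - d_min) / d_min over all points, as seen from the query."""
    d = np.array([minkowski_distance(x, query, p) for x in points])
    return float((d.max() - d.min()) / d.min())


def knn_predict(train_x, train_y, query, k=3, p=2.0):
    """Majority vote among the k training vectors nearest to the query."""
    d = np.array([minkowski_distance(x, query, p) for x in train_x])
    nearest = np.argsort(d)[:k]
    labels, counts = np.unique(np.asarray(train_y)[nearest], return_counts=True)
    return labels[np.argmax(counts)]


if __name__ == "__main__":
    # Assumption: uniform random vectors stand in for spectra.
    rng = np.random.default_rng(0)
    for dim in (2, 1000):
        points = rng.random((200, dim))
        query = rng.random(dim)
        for p in (2.0, 1.0, 0.5):
            contrast = relative_contrast(points, query, p)
            print(f"dim={dim:5d}  p={p:3.1f}  relative contrast={contrast:.3f}")
```

On such synthetic data the relative contrast is expected to shrink as the dimension grows, which is the distance-concentration effect discussed in [2]. Real spectra are sparse and strongly correlated, so the numbers from this toy setup should not be read as results for LIBS data.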
Keywords
distance metric, spectroscopic data, machine learning, high-dimensional space, LIBS
Authors
VRÁBEL, J.; KÉPEŠ, E.; POŘÍZKA, P.; KAISER, J.
Released
2 October 2022
BibTeX
@misc{BUT180065,
  author="Jakub {Vrábel} and Erik {Képeš} and Pavel {Pořízka} and Jozef {Kaiser}",
  title="Distance of Spectroscopic Data",
  year="2022",
  note="abstract"
}