Publication detail

Distance of Spectroscopic Data

VRÁBEL, J. KÉPEŠ, E. POŘÍZKA, P. KAISER, J.

Original Title

Distance of Spectroscopic Data

Type

abstract

Language

English

Original Abstract

Machine learning (ML) techniques are essential in a wide variety of modern spectroscopic applications. The majority of ML models use some form of distance computation. In supervised learning, we may need to compute the distance from unknown spectra to labeled representatives to decide class membership. In unsupervised learning, distance appears, for example, as the reconstruction error of an autoencoder, where the distance between the original input and the model output is computed. Because spectra are high-dimensional and sparse [1], the curse of dimensionality (COD) emerges, which poses many challenges for distance computation. It is a well-known consequence of the COD [2] that the commonly used Euclidean metric behaves poorly in high-dimensional spaces. In the present work, we study alternative metrics for measuring the distance between spectroscopic data and discuss the consequences for various ML models. Additionally, we exploit properties of spectroscopic data (high dimensionality, sparsity, redundancy [1]) to design novel custom metrics for distance measurement. All metrics are compared to the baseline approaches in both supervised (KNN) and unsupervised (autoencoder) tasks. The methodology is demonstrated on Laser-Induced Breakdown Spectroscopy data.

References:
[1] Vrábel et al. (2020). Restricted Boltzmann Machine method for dimensionality reduction of large spectroscopic data. Spectrochimica Acta Part B: Atomic Spectroscopy, 167, 105849.
[2] Aggarwal et al. (2001). On the Surprising Behavior of Distance Metrics in High Dimensional Space. In: Van den Bussche J., Vianu V. (eds) Database Theory, ICDT 2001. Lecture Notes in Computer Science, vol 1973. Springer, Berlin, Heidelberg.
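The abstract does not state which alternative metrics were studied, so the following is only a minimal sketch of the general idea, assuming the Minkowski (L_p) family with an adjustable exponent p, including the fractional values p < 1 discussed in reference [2]. The function names `minkowski_distance`, `relative_contrast`, and `knn_predict` are hypothetical and introduced here for illustration; they are not the authors' implementation, and the custom metrics mentioned in the abstract are not reproduced.

```python
def minkowski_distance(x, y, p):
    """L_p distance between two spectra given as equal-length sequences.

    p = 2 is the Euclidean baseline and p = 1 the Manhattan distance.
    For 0 < p < 1 the triangle inequality fails, so the result is a
    dissimilarity rather than a true metric.
    """
    if len(x) != len(y):
        raise ValueError("spectra must have the same number of channels")
    if p <= 0:
        raise ValueError("p must be positive")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def relative_contrast(query, points, p):
    """(d_max - d_min) / d_min over the distances from query to points.

    Values near zero mean the nearest and farthest points are almost
    equally far away, which is the symptom of distance concentration.
    """
    distances = [minkowski_distance(query, pt, p) for pt in points]
    d_min, d_max = min(distances), max(distances)
    if d_min == 0:
        raise ValueError("query coincides with one of the points")
    return (d_max - d_min) / d_min


def knn_predict(query, spectra, labels, k=3, p=2.0):
    """Majority vote among the k labeled spectra closest to query under L_p."""
    ranked = sorted(
        zip(spectra, labels),
        key=lambda item: minkowski_distance(query, item[0], p),
    )
    votes = {}
    for _, label in ranked[:k]:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)
```

Comparing `relative_contrast` for several values of p on the same set of spectra is one simple way to check whether a given exponent separates near and far neighbors better than the Euclidean baseline. The same exponent can then be passed to `knn_predict` to see its effect on classification.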

Keywords

distance metric, spectroscopic data, machine learning, high-dimensional space, LIBS

Authors

VRÁBEL, J.; KÉPEŠ, E.; POŘÍZKA, P.; KAISER, J.

Released

2. 10. 2022

BibTeX

@misc{BUT180065,
  author="Jakub {Vrábel} and Erik {Képeš} and Pavel {Pořízka} and Jozef {Kaiser}",
  title="Distance of Spectroscopic Data",
  year="2022",
  note="abstract"
}