Publication detail

Distance of spectroscopic data

VRÁBEL, J.; KÉPEŠ, E.; POŘÍZKA, P.; KAISER, J.

Type

abstract

Language

English

Original Abstract

Machine learning (ML) techniques are essential in a wide variety of modern spectroscopic applications. The majority of ML models rely on some form of distance computation. In supervised learning, we may need to compute the distance between unknown spectra and labeled representatives to decide class membership. In unsupervised learning, the reconstruction error (e.g. in autoencoders) is itself a distance. The most prominent properties of spectroscopic data are high dimensionality, sparsity, and redundancy [1]. Processing such data therefore runs into the curse of dimensionality (COD). A well-known consequence of COD is that the commonly used Euclidean metric behaves poorly in high-dimensional spaces [2]. In the present work, we study alternative metrics for measuring the distance between spectroscopic data and discuss the resulting improvements in the performance of ML models.
References:
[1] Vrábel, J., Pořízka, P., & Kaiser, J. (2020). Restricted Boltzmann Machine method for dimensionality reduction of large spectroscopic data. Spectrochimica Acta Part B: Atomic Spectroscopy, 167, 105849. https://doi.org/10.1016/j.sab.2020.105849
[2] Aggarwal, C. C., Hinneburg, A., & Keim, D. A. (2001). On the Surprising Behavior of Distance Metrics in High Dimensional Space. In J. Van den Bussche & V. Vianu (Eds.), Database Theory — ICDT 2001. Lecture Notes in Computer Science, vol. 1973. Springer, Berlin, Heidelberg.
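The abstract's remark that the Euclidean metric behaves poorly in high-dimensional spaces can be illustrated with a small numerical sketch. The code below is not the authors' method, and the abstract does not say which alternative metrics were studied. It assumes uniformly distributed random vectors as a stand-in for spectra and compares the relative contrast, (d_max - d_min) / d_min, of Minkowski-type distances for several exponents p, in the spirit of reference [2]. The function names `minkowski_distances`, `relative_contrast`, and `demo` are hypothetical and introduced only for illustration.

```python
import numpy as np


def minkowski_distances(data, query, p):
    """Distances from `query` to every row of `data` using the L_p form.

    For p >= 1 this is a true metric. For 0 < p < 1 it is a "fractional"
    dissimilarity that no longer satisfies the triangle inequality.
    """
    diff = np.abs(np.asarray(data, dtype=float) - np.asarray(query, dtype=float))
    return np.sum(diff ** p, axis=1) ** (1.0 / p)


def relative_contrast(data, query, p):
    """(d_max - d_min) / d_min: how well the nearest and farthest points separate."""
    d = minkowski_distances(data, query, p)
    return (d.max() - d.min()) / d.min()


def demo(n_points=1000, dims=(2, 1000), ps=(0.5, 1.0, 2.0), seed=0):
    """Relative contrast on synthetic uniform data (an assumption, not real spectra)."""
    rng = np.random.default_rng(seed)
    results = {}
    for dim in dims:
        data = rng.random((n_points, dim))  # rows play the role of spectra
        query = rng.random(dim)             # the "unknown" spectrum
        for p in ps:
            results[(dim, p)] = relative_contrast(data, query, p)
    return results


if __name__ == "__main__":
    for (dim, p), contrast in demo().items():
        print(f"dim={dim:5d}  p={p:3.1f}  relative contrast={contrast:.3f}")
```

On such synthetic data the contrast for p = 2 drops sharply as the dimension grows, while smaller exponents retain somewhat more contrast at the same dimension. Whether this carries over to real spectroscopic data, which is sparse and redundant rather than uniform, is exactly the kind of question the abstract raises.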

Keywords

machine learning, spectroscopic data, laser-induced breakdown spectroscopy, distance, curse of dimensionality, metric

Authors

VRÁBEL, J.; KÉPEŠ, E.; POŘÍZKA, P.; KAISER, J.

Released

21. 9. 2020

BibTex

@misc{BUT165755,
  author="Jakub {Vrábel} and Erik {Képeš} and Pavel {Pořízka} and Jozef {Kaiser}",
  title="Distance of spectroscopic data",
  year="2020",
  note="abstract"
}