Author manuscript; available in PMC: 2022 Apr 8.
Published in final edited form as: Appl Spectrosc. 2021 Aug 3;76(4):485–495. doi: 10.1177/00037028211034543

Figure 2.

Open-source data pipeline developed for this study. The first stage converts and processes raw spectra files into the binary NetCDF file format. The second stage, data labeling, uses a custom Python “Labeler” app that allows an expert Raman user to quickly assign labels (e.g., “good”, “bad”, or “maybe”) to the spectra serialized in the NetCDF files. In the last stage, model training, the binary files are loaded into NumPy arrays to train and test various ML models.
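
The last stage described in the caption can be illustrated with a minimal sketch. It is not the authors' code: the helper names `encode_labels` and `train_test_split`, the label ordering, the 25% test fraction, and the synthetic spectra are all assumptions made for illustration. Reading the NetCDF files is deliberately left out, so a small synthetic array stands in for the loaded spectra.

```python
import numpy as np

# Hypothetical label vocabulary, taken from the examples in the caption.
LABELS = ("good", "bad", "maybe")


def encode_labels(labels):
    """Map label strings to integer class indices (position in LABELS)."""
    index = {name: i for i, name in enumerate(LABELS)}
    return np.array([index[name] for name in labels], dtype=np.int64)


def train_test_split(spectra, targets, test_fraction=0.25, seed=0):
    """Shuffle the spectra and split them into training and test subsets."""
    if spectra.shape[0] != targets.shape[0]:
        raise ValueError("spectra and targets must have the same length")
    rng = np.random.default_rng(seed)
    order = rng.permutation(spectra.shape[0])
    n_test = int(round(test_fraction * spectra.shape[0]))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return (spectra[train_idx], targets[train_idx],
            spectra[test_idx], targets[test_idx])


# Synthetic stand-in for spectra that would be read from the NetCDF files:
# 8 spectra with 5 intensity values each (assumed shape, not from the paper).
spectra = np.arange(40, dtype=np.float64).reshape(8, 5)
labels = ["good", "bad", "maybe", "good", "good", "bad", "maybe", "good"]
targets = encode_labels(labels)

X_train, y_train, X_test, y_test = train_test_split(spectra, targets)
print(X_train.shape, X_test.shape)
```

The resulting `X_train`/`y_train` and `X_test`/`y_test` arrays are in the row-per-sample layout that most Python ML libraries accept, which is presumably why the pipeline ends in NumPy arrays.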