Table 1.
Category | Method | Description | Characteristic |
---|---|---|---|
Denoise | Kernel smoothing | Smooths the spectra using a normal (Gaussian) kernel function. | Parameter-free. |
Denoise | Savitzky-Golay differentiation | Estimates the derivative by fitting a polynomial of user-specified degree to successive windows of adjacent data points using linear least squares. | Parameter-free; can be used for both baseline correction and smoothing/noise reduction. |
Baseline removal | MPLS | Finds a rough background based on a penalized least squares function. | Relatively time-consuming; competitive results; insensitive to the parameters. |
Baseline removal | SNV | Transposes and then auto-scales the data. | Parameter-free; scales the data. |
Baseline removal | MSC | Each input spectrum is regressed against a reference (e.g. the mean spectrum), and the regression results are used to correct that spectrum. | Reference-dependent; scales the data. |
Cosmic ray removal | Sharp spike detection | Detects spikes that are significantly narrower than the peaks in the spectrum. | Insensitive to relatively wide spikes; threshold-dependent. |
Cosmic ray removal | Abnormal spike detection | Compares a series of replicate spectra. A spike is detected and removed because a spike is unlikely to occur at the same point in multiple spectra. | Time-consuming, since multiple spectra must be compared. |
Cosmic ray removal | Image curvature correction | Optimizes optical systems by comparing spectra from different rows of pixels on the detector. | Requires user intervention; parameter-based; time-consuming. |
Cosmic ray removal | Mapping-based technique | Detects abnormal spikes by comparing neighboring spectra from the map. | Requires a relatively large number of pixels for accurate detection. |
Scaling method | Normalization by a peak (e.g. maximal peak) | Divides every row (spectrum) by the value at the selected peak of that row (e.g. maximal peak). | Emphasizes the variation of the Raman bands relative to the selected peak. |
Scaling method | Auto-scaling | Subtracts the mean of each row and then divides by the standard deviation of that row. | The shape of the spectra may be lost; reduces the variation in the objects and gathers the objects towards the center. |
Scaling method | Row normalization (length/area) | Divides every row (object) by the length (Manhattan distance) or area (Euclidean distance) of that row. | Reduces the variation of the objects. |
Scaling method | Column normalization (length/area) | Divides every column (variable) by the length (Manhattan distance) or area (Euclidean distance) of that column. | The shape of the spectra may be lost; reduces the variation among variables. |
Scaling method | Mean-center | Subtracts the mean of each row from every element of that row. | Reduces the deviation of the data from its center; gathers the objects towards the center. |
n and w denote the nth row and the wth column, respectively, of the spectral matrix X used for scaling. All X blocks in the paper are arranged so that objects are stored in rows and variables are stored in columns.
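The scaling methods listed in Table 1 can be illustrated with a minimal sketch, assuming X is held as a list of rows with one spectrum (object) per row, as in the arrangement described above. All function names here are hypothetical and chosen for illustration only. The sketch names the norms directly (L1/Manhattan, L2/Euclidean) rather than by "length" or "area", and its use of the sample standard deviation in auto-scaling is an assumption, not something the table specifies.

```python
import math
import statistics

# Illustrative helpers (hypothetical names). X is a list of rows, one spectrum per row.
# No guard against zero denominators is included; real data may need one.

def normalize_by_peak(X, peak_index=None):
    """Divide each row by its value at peak_index (the row maximum if None)."""
    out = []
    for row in X:
        ref = max(row) if peak_index is None else row[peak_index]
        out.append([v / ref for v in row])
    return out

def auto_scale_rows(X):
    """Subtract the row mean, then divide by the row standard deviation."""
    out = []
    for row in X:
        mu = statistics.fmean(row)
        sd = statistics.stdev(row)  # sample standard deviation (assumption)
        out.append([(v - mu) / sd for v in row])
    return out

def normalize_rows(X, norm="l2"):
    """Divide each row by its L1 (Manhattan) or L2 (Euclidean) norm."""
    out = []
    for row in X:
        if norm == "l1":
            size = sum(abs(v) for v in row)
        else:
            size = math.sqrt(sum(v * v for v in row))
        out.append([v / size for v in row])
    return out

def transpose(X):
    """Swap rows and columns."""
    return [list(col) for col in zip(*X)]

def normalize_columns(X, norm="l2"):
    """Column normalization: the row routine applied to the transposed matrix."""
    return transpose(normalize_rows(transpose(X), norm))

def mean_center_rows(X):
    """Subtract the mean of each row from every element of that row."""
    out = []
    for row in X:
        mu = statistics.fmean(row)
        out.append([v - mu for v in row])
    return out
```

For example, with X = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]], mean_center_rows(X) returns [[-1.0, 0.0, 1.0], [-2.0, 0.0, 2.0]], and auto_scale_rows(X) maps both rows to the same profile, which illustrates how auto-scaling removes intensity differences between objects.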