2026 Jan 12;130(3):602–613. doi: 10.1021/acs.jpca.5c06253

Analyzing Spectral Similarities for Structural Identification Using a New Benchmark Database

Rami Rahimi 1, Noga Saban 1, Ilana Bar 1,*
PMCID: PMC12833857  PMID: 41521716

Abstract

Vibrational spectra, characterized by structurally sensitive features, offer critical insights into molecular structures, bonding, and dynamics. Yet, interpreting measured spectra and identifying corresponding structures require theoretical equivalents and quantitative analysis. Here, we introduce a new experimental database that includes broad-range ionization-detected stimulated Raman scattering signatures besides harmonic Raman frequencies calculated with widely used density functional methods/basis sets. By comparing experimental fundamental bands and computed data, we derive single global and multiple range- and mode-dependent scaling factors and analyze the resulting error distributions, showing that mode-dependent scaling provides the greatest accuracy. Additionally, we explore various methods for evaluating similarities between measured fundamental spectra of different compounds and calculated data sets of conformers. Our findings indicate that Euclidean and Manhattan distance metrics for frequencies and intensities uncover subtle structural variations, yielding spectral similarity rankings that facilitate structural identifications. This new database and methodology address key challenges in spectral assignment, and we anticipate that they will serve as benchmarks for future predictive models and foster the development of advanced strategies.



Introduction

Vibrational spectroscopies, including spontaneous Raman and infrared (IR) spectroscopies and their various enhanced and nonlinear derivatives, provide signatures that reflect the oscillatory motion of compounds. Identifying compounds with these powerful techniques requires matching unknown experimental spectra with corresponding signatures in databases of measured reference spectra, an approach that plays a significant role in characterizing compounds across scientific disciplines. To manage the complexities of rich spectra, spectral matching is often performed with machine-learning algorithms and chemometric models.

Moreover, there is a growing interest in the observable features and patterns revealed by these methods, whose molecular and structural interpretation requires support from quantum chemical methods. This support frequently comes from IR and Raman spectra calculations within the harmonic approximation, which balance computational cost and tolerable accuracy. These calculations involve evaluating the second derivatives of the energy with respect to nuclear coordinates at the optimized structure and determining intensities from changes in the dipole moment or polarizability along the respective normal-mode vectors. Typically, they are performed using density functional theory (DFT) or more computationally demanding ab initio methods, depending on the chosen approximations.

To correct the systematic overestimation of harmonic frequencies, single or region-dependent empirical scaling factors are commonly applied, improving agreement with experimental data. Alternatively, theoretical models such as vibrational self-consistent field and second-order vibrational perturbation theory (VPT2) enable the calculation of anharmonic frequencies, providing a more comprehensive description of the anharmonic nature of vibrations near a well-defined equilibrium structure. In addition, IR or Raman spectra can be calculated by ab initio molecular dynamics, which is particularly suited for capturing molecular flexibility, temperature effects, and their influence on the overall spectral profile. By comparing results from different computational methods with experimental vibrational spectra, specifically subsets of spectral peaks corresponding to fundamental transitions, one can assess their agreement. This comparison ultimately identifies the calculated spectrum that best matches the experimental data, providing detailed structural insights.

DFT is a standard and widely used method for vibrational spectra calculations and is also applied in specialized areas such as double-resonance hole-burn techniques. These techniques include infrared-ion dip spectroscopy (IR-IDS) and the more complex ionization-detected stimulated Raman spectroscopy, which we pioneered for structural analysis. They use a single (IR) or two (pump, ωp, and tunable Stokes, ωS) laser beams to deplete the vibrational ground state population resonantly and probe it by a delayed ultraviolet (UV) laser beam, enabling resonance-enhanced two-photon ionization (R2PI). The stimulated Raman scattering (SRS) can induce depletion or gain of the background ionization signal (Figure S1 of the Supporting Information). These approaches provide species- and structure-specific jet-cooled vibrational signatures at low concentrations, offering mass selectivity, high sensitivity, and spectral resolution higher than that attainable in condensed phases.

Nevertheless, IR-IDS measurements for various gas-phase compounds are mainly limited to hydride stretching spectral ranges. Yet, efforts to capture low-frequency vibrational signatures, either by IR-IDS using radiation from the free electron laser for IR experiments (FELIX) or by ionization-loss stimulated Raman spectroscopy (ILSRS; Figure S1a of the Supporting Information) using table-top visible lasers, have proven effective. Consequently, the resulting vibrational bands are well-suited for comparison with theoretical predictions derived from harmonic approaches with empirical scaling, emphasizing visual agreement between the measured fundamental bands in the spectra and the calculated vibrational spectra. A notable example of this challenge is the spectral identification and assignment of conformers of neurotransmitters, analogs, and their hydrates. These species consist of a rigid skeleton with a flexible ethylamino (ethyl alcohol) side chain, which can undergo internal rotations about the C–N(O) and C–C single bonds or attach water differently, leading to different conformers.

As highlighted by Mata and Suhm, benchmark experiments, specifically those conducted under gas-phase isolation and at low temperatures, are essential for validating theoretical predictions. A recently introduced resource, VIBFREQ1295, offers a valuable data set for benchmarking vibrational frequency calculations and training machine-learning models. It includes 1295 experimental fundamental frequencies compiled from a thorough review of recent literature alongside CCSD(T)(F12*)/cc-pVDZ-F12 ab initio harmonic frequencies for 141 small-to-medium-sized gas-phase molecules. While this database is highly useful, we propose an additional data set based on high-quality, internally consistent data we have collected over the years. Although not exhaustive across all molecular types or environments, our data set includes jet-cooled ionization-loss and -gain SRS features of the aforementioned conformers (400–3755 cm–1) along with harmonic vibrational frequencies calculated using various density functionals. Based on visual similarities between the signatures, obtained by comparing patterns and wavenumbers of the measured fundamental features to those of the predicted ones, we derive single global or multiple frequency region- and mode-dependent scaling factors and analyze the resulting error distributions.

Since different computational methods and basis sets introduce systematic errors in harmonic vibrational frequencies, tailored scaling factors are needed to correct these biases. Moreover, as higher-quality experimental data become available, they enable the extraction of updated global and multiple scaling factors, which in turn help extend the predictive accuracy, particularly for flexible conformers and complex molecular systems.

In the next phase, we evaluate the similarities between the measured fundamental bands in the spectra of specific conformers and the predicted scaled harmonic Raman spectra for possible conformers. To achieve this, we use multiple approaches common in machine learning, including cross-correlation, Euclidean and Manhattan distances, optimal transport, and the Kuhn-Munkres algorithm (Hungarian method). The cross-correlation technique identifies the patterns and relationships within data sets by quantifying the degree of similarity between the spectral features. To measure discrete differences between spectra, Euclidean and Manhattan distances as well as optimal transport metrics are employed. The Kuhn-Munkres algorithm is then applied to construct minimal cost matrixes, enabling the identification of the closest matches between fundamental spectral features in the measured spectra and the scaled predicted data.

The quantitative results from spectral matching led to the generation of spectral barcodes that represent the measured fundamental signatures of specific species and the calculated features of potential structures across various theoretical levels. These barcodes enable the computation of average Euclidean and Manhattan distances, as well as optimal transport distances, to rank the similarities between a specific measured conformer and candidate structures, ultimately identifying the best match that corresponds to the experimentally observed conformer.

To demonstrate the performance of the employed approaches, we apply them to analyze 2-phenylethyl alcohol (PEAL) in both mono- and dihydrated forms, 2-(2-fluorophenyl)ethyl alcohol (2-FPEAL) conformers, and 2-(2-fluorophenyl)-ethylamine (2-FPEA) monohydrate (see Figure S2 of the Supporting Information). This evaluation provides a rigorous test of the spectral matching methods. By using Euclidean/Manhattan distances and Kuhn-Munkres assignment, we achieve the highest predictive accuracy, highlighting the ability of the proposed methodology to differentiate the spectra of closely related structures.

Results and Discussion

Database and Scaling Factors

Considering that the observed features in the vibrational spectra are Q-branches with a relatively low bandwidth and that rotational fine structures could not be resolved due to limited spectral resolution, we used the band maxima as estimated fundamental frequencies. The use of these maxima is sufficient for direct comparison with calculated harmonic frequencies.

Spectral features were selected through careful visual inspection, supported by vibrational calculations to ensure their consistency. Only distinct peaks with a signal-to-noise ratio >2.5 and full-width at half-maximum (fwhm) > 4 cm–1 were included, to exclude noise spikes and spurious features. Based on these fundamentals and their intensities, we compiled a database, Supporting Information Data S1.

The database contains 824 jet-cooled loss and gain Q-branches of neurotransmitter conformers, analogs, and their hydrates. The number of features varies for each conformer due to factors like population in the molecular beam and overlap of loss lines of a specific conformer with gain lines from another conformer. For example, only eight spectral features are identified for the A2 conformer of 2-(4-fluorophenyl)-ethylamine (4-FPEA) because of significant overlaps with ionization gain lines from the most stable conformer, which indicates a low A2 population and makes the ILSR spectrum challenging to measure. Additionally, the database includes harmonic frequencies that we calculated for the corresponding species at selected DFT levels [M06-2X(-D3), B3LYP(-D3), and ωB97X-D] with the 6-311++G(d,p) and cc-pVTZ basis sets and with D and D3 dispersion corrections.

While there are comprehensive benchmarking studies evaluating the performance of density functional approximations and basis sets across thermochemistry, kinetics, noncovalent interactions, barrier heights, and isomerization energies, our scope is more focused, referring to our database and a small number of methods. We selected M06-2X(-D3), B3LYP(-D3), and ωB97X-D because they are widely used and have proven effective for vibrational frequency calculations. Upon pairing them with appropriate basis sets, they offer a favorable balance between computational efficiency and accuracy. Their reliability is further supported by the availability of scaling factors derived from benchmark data sets, making them well suited for vibrational spectral analysis.

Our database, though limited to neurotransmitters (NTs), analogs, and hydrates, consists of various groups and supports the retrieval of scaling factors across different spectral ranges, including global (full spectral range), mode-dependent [O–H, N–H, C–H ring, C–H ethyl, mid- (1000–2000 cm–1), and low-frequencies (<1000 cm–1)] and range-dependent [high- (>2000 cm–1), mid-, and low-frequencies]. The O–H, N–H, and C–H vibrations refer to pure stretching modes that are well-isolated and exhibit minimal coupling with other vibrational motions, allowing the retrieval of mode-dependent scaling factors. Also, hydrogen bonding interactions influence most frequencies of these modes.

The collective motions in mid- and low-frequency ranges involve vibrations of large parts of the molecules, making a unique group classification challenging. Although a 1000 cm–1 threshold may seem arbitrary, it has been considered an optimal cutoff and applied in previous studies for calculating low-frequency scaling factors, supporting the validity of this approach. Thus, we adopt the previously used spectral regions for determining range-dependent scaling factors. While this approach improves accuracy for numerous molecular systems, further research is required to evaluate its effectiveness for complex structures. Global scaling factors, commonly used for converting calculated harmonic frequencies into experimental fundamental frequencies, are widely adopted but generally less accurate. In contrast, range-dependent scaling factors, though more precise, are rarely used and limited to specific methods and frequency ranges. Mode-dependent scaling factors offer even greater specificity but have so far only been reported for select cases like C–O stretching.

It is worth noting that anharmonicities affect both the energies and spectral intensities of fundamental vibrations, as shown by VPT2 results. Intensity scaling factors would therefore also be helpful, but this work focuses only on frequency scaling factors because they can be determined with greater accuracy. Comparing experimental and calculated Raman intensities is challenging because of the limitations of Placzek’s polarizability theory and experimental uncertainties. Retrieving intensity scaling factors by comparing measured ILSRS intensities with calculated harmonic intensities is even more challenging and remains an issue to be addressed later.

We derived scaling factors (λ) and their associated uncertainties (Δλ), as shown in Table 1, based on a visual comparison of the measured fundamental bands in the spectra with the calculated vibrational spectra. The scaling factors were determined by comparing experimental fundamental frequencies, ν_i, with calculated harmonic frequencies, ω_i. We evaluated scaling accuracy by calculating minimal and maximal frequency differences (error bars) along with statistical measures, including root-mean-square (RMS), mean absolute deviation, and standard deviation (σ), marked by rhombus, circle, and square, respectively, in Figure 1. These metrics enable the comparison of scaled harmonic and experimental fundamental frequencies.
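The comparison of ν_i with ω_i can be illustrated with a minimal sketch. It assumes the widely used least-squares definition λ = Σ ν_i ω_i / Σ ω_i² and a weighted spread of the ratios ν_i/ω_i as Δλ; the text above does not spell out the exact expressions, so both are assumptions, and the frequencies below are hypothetical values chosen only for illustration.

```python
import math

def scaling_factor(nu_exp, omega_calc):
    """Least-squares scaling factor: lambda = sum(nu*omega) / sum(omega^2)."""
    num = sum(n * w for n, w in zip(nu_exp, omega_calc))
    den = sum(w * w for w in omega_calc)
    return num / den

def scaling_uncertainty(nu_exp, omega_calc, lam):
    """Omega^2-weighted spread of the per-mode ratios nu/omega around lambda (assumed form)."""
    num = sum(w * w * (lam - n / w) ** 2 for n, w in zip(nu_exp, omega_calc))
    den = sum(w * w for w in omega_calc)
    return math.sqrt(num / den)

def error_stats(nu_exp, omega_calc, lam):
    """RMS and mean absolute difference (cm^-1) between scaled and experimental frequencies."""
    diffs = [lam * w - n for n, w in zip(nu_exp, omega_calc)]
    rms = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    mean_abs = sum(abs(d) for d in diffs) / len(diffs)
    return rms, mean_abs

# Hypothetical experimental fundamentals and harmonic frequencies (cm^-1).
nu = [1002.0, 1605.0, 2940.0, 3065.0, 3650.0]
omega = [1030.0, 1655.0, 3090.0, 3215.0, 3880.0]

lam = scaling_factor(nu, omega)
dlam = scaling_uncertainty(nu, omega, lam)
rms, mean_abs = error_stats(nu, omega, lam)
print(f"lambda = {lam:.4f} +/- {dlam:.4f}, RMS = {rms:.1f}, mean |error| = {mean_abs:.1f}")
```

Because the same two functions can be applied to any subset of modes, the global, range-dependent, and mode-dependent factors differ only in which (ν_i, ω_i) pairs are passed in.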

Table 1. Derived Frequency Scaling Factors and Uncertainties (in Parentheses) at Various Theoretical Levels.

Scaling factor/Theory level global O–H N–H C–Hring C–Hethyl mid low high
M06-2X/6-311++G(d,p) 0.9523 (108) 0.9402 (31) 0.9454 (31) 0.9533 (20) 0.9493 (52) 0.9735 (83) 0.9727 (107) 0.9486 (57)
M06-2X-D3/6-311++G(d,p) 0.9523 (108) 0.9402 (30) 0.9454 (31) 0.9533 (21) 0.9493 (52) 0.9735 (82) 0.9728 (107) 0.9486 (57)
M06-2X-D3/cc-pVTZ 0.9534 (94) 0.9454 (38) 0.9485 (28) 0.9535 (21) 0.9499 (51) 0.9723 (83) 0.9672 (112) 0.9502 (46)
B3LYP/6-311++G(d,p) 0.9648 (98) 0.9583 (61) 0.9564 (26) 0.9640 (22) 0.9631 (55) 0.9833 (62) 0.9837 (123) 0.9615 (53)
B3LYP-D3/6-311++G(d,p) 0.9649 (93) 0.9599 (74) 0.9566 (25) 0.9642 (21) 0.9634 (55) 0.9818 (65) 0.9790 (134) 0.9619 (54)
B3LYP-D3/cc-pVTZ 0.9653 (79) 0.9641 (79) 0.9590 (22) 0.9637 (21) 0.9640 (56) 0.9790 (58) 0.9731 (147) 0.9630 (50)
ωB97X-D/6-311++G(d,p) 0.9541 (109) 0.9399 (67) 0.9435 (35) 0.9566 (27) 0.9543 (56) 0.9712 (8) 0.9699 (15) 0.9510 (79)
ωB97X-D/cc-pVTZ 0.9539 (90) 0.9435 (65) 0.9453 (23) 0.9556 (22) 0.9537 (54) 0.9688 (66) 0.9655 (115) 0.9513 (65)
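As an illustration of how the tabulated values might be applied, the sketch below scales harmonic frequencies with the high-, mid-, and low-range factors listed for M06-2X/6-311++G(d,p). The function names are hypothetical, and deciding the range from the unscaled harmonic frequency (rather than from the experimental band position) is an assumption of this sketch.

```python
# Range-dependent and global factors for M06-2X/6-311++G(d,p), copied from the table above.
RANGE_FACTORS = {"high": 0.9486, "mid": 0.9735, "low": 0.9727}
GLOBAL_FACTOR = 0.9523

def frequency_range(omega):
    """Classify a harmonic frequency (cm^-1): high >2000, mid 1000-2000, low <1000."""
    if omega > 2000.0:
        return "high"
    if omega >= 1000.0:
        return "mid"
    return "low"

def scale_by_range(omega):
    """Scale a harmonic frequency with the factor of its frequency range."""
    return RANGE_FACTORS[frequency_range(omega)] * omega

def scale_globally(omega):
    """Scale a harmonic frequency with the single global factor."""
    return GLOBAL_FACTOR * omega

# Hypothetical harmonic frequencies (cm^-1) used only to show the difference.
for omega in (650.0, 1500.0, 3100.0):
    print(omega, frequency_range(omega),
          round(scale_by_range(omega), 1), round(scale_globally(omega), 1))
```

Mode-dependent scaling would follow the same pattern, with the lookup keyed on the assigned mode type (O–H, N–H, C–H ring, C–H ethyl, mid, low) instead of the frequency range.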

Figure 1. Different frequency measures between scaled harmonic and fundamental frequencies for global and mode- and range-dependent scaling factors appear in the blue-shaded sections. The right section includes high-, mid-, and low-frequency regions, where the last two also participate in the mode-dependent scaling factors.

Figure 1 shows that the global scaling factors and statistical measures for M06-2X and B3LYP, without and with dispersion corrections (D3), yield close values with the 6-311++G(d,p) basis set, while the cc-pVTZ basis set improves accuracy. The same trend holds for the ωB97X-D functional with the two basis sets. Comparisons of our derived global scaling factors with those from other databases (Table 2) show no significant discrepancies, with differences within the absolute uncertainties, indicating the reliability of our scaling factors. The stability of the scaling factors was confirmed by recalculating them using only the top 20% of the most intense experimental fundamental features. The results showed minimal variations and stayed within the uncertainty margins of the full data set, demonstrating consistent derivation.

Table 2. Comparative Global Scaling Factors (λ) and Uncertainties (Δλ) for Different Levels of Theory and Benchmark Data Sets.

M06-2X B3LYP ωB97X-D
6-311++G(d,p) cc-pVTZ 6-311++G(d,p) cc-pVTZ 6-311++G(d,p) cc-pVTZ
0.9523 ± 0.0108 0.9534 ± 0.0094 0.9644 ± 0.0091 0.9653 ± 0.0079 0.9539 ± 0.0090 0.9545 ± 0.0098
0.944 0.948 ± 0.027 0.959 0.965 ± 0.020   0.956
0.947 ± 0.028 0.955 0.963 ± 0.020 0.967    
0.9567 ± 0.0371     0.9691    
a This work.
b ref .
c ref .
d ref .
e ref .
f ref .

Also, mode-dependent scaling factors demonstrate superior performance compared with global and range-dependent ones. For instance, using M06-2X(-D3) with the 6-311++G(d,p) basis set reduces errors by ∼50% relative to global scaling factors, except in the case of the C–H ethyl stretches, which exhibit larger deviations. These discrepancies are likely due to anharmonic effects, such as Fermi resonance, which are not captured by scaling factors. Such interactions can shift apparent frequencies, alter intensities, and complicate direct comparisons to harmonic predictions. Among the tested methods, M06-2X(-D3)/6-311++G(d,p) shows a better performance. Our observations align with those of Zapata Trujillo and McKemmish, who evaluated both range-dependent and global scaling approaches and recommended the former for improved frequency predictions.

It is interesting to note that variations between global and specific discrepancies arise due to differences in vibrational frequencies and the number of vibrations in each subgroup. Collective modes in mid- and low-frequency ranges have a stronger influence on global scaling, making it less suitable for higher-frequency stretching modes. This observation supports the use of mode-dependent scaling factors.

In addition, the scatterplots of the frequencies demonstrate a strong correlation between mode-dependent scaled harmonic frequencies and measured fundamental ionization-loss and -gain SRS frequencies, with Pearson correlation coefficients close to one [e.g., 0.99994 for M06-2X-D3/6-311++G(d,p); see Figure 2] and slightly lower values for other theory levels. This highlights the utility of our database for benchmarking, particularly given the limited availability of scaling factors for various methods and basis sets. However, this database is specific to neurotransmitters and their analogs and hydrates and does not represent all functional groups.
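For reference, the Pearson correlation coefficient between scaled and measured frequencies can be evaluated as in the short sketch below. The two frequency lists are hypothetical and serve only to show the calculation; they are not taken from the database.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical measured fundamentals and scaled harmonic frequencies (cm^-1).
measured = [620.0, 1004.0, 1210.0, 1605.0, 2935.0, 3068.0, 3655.0]
scaled = [623.5, 1001.2, 1214.8, 1600.9, 2941.0, 3064.5, 3651.7]

r = pearson_r(scaled, measured)
print(f"Pearson r = {r:.5f}")
```

Because the frequencies span several thousand cm–1 while the residuals are a few cm–1, r is close to one almost by construction; the error statistics remain the more discriminating measure.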

Figure 2. Scatter plot of mode-dependent scaled harmonic frequencies against measured fundamental ionization-detected stimulated Raman frequencies and the resulting fitting line. The scaled frequencies are at the M06-2X-D3/6-311++G(d,p) level of theory.

It is worth noting that mode-dependent scaling factors also suffer from significant drawbacks, primarily poor portability, due to their higher specificity. Since these factors are tailored to individual vibrational modes within a given system, their general applicability remains to be tested for different molecules, limiting their adoption. Additionally, they can obscure deficiencies in the underlying electronic structure method. If scaling factors are applied when high accuracy is not the primary goal, then global scaling factors may provide a practical and more transferable compromise between simplicity and reliability.

To further assess the reliability of the scaling approaches, we analyzed the distribution of prediction errors, defined as the differences between scaled and experimental fundamental frequencies, across various theoretical levels and scaling schemes. We computed the mean absolute percentage error (MAPE; see Methods in Supporting Information) as a normalized metric of accuracy. Additionally, we calculated the first (Q1), second (median, Q2), and third (Q3) quartiles for each classification, representing the 25th, 50th, and 75th percentiles of the error distribution, respectively. These metrics were obtained from 100 cycles of 70%/30% training/test partitioning, with results averaged across runs and standard deviations calculated accordingly. The average scaling factors, MAPE, and quartile values obtained for each data classification are presented in Table 3.
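The resampling procedure can be sketched as follows. The sketch assumes that the scaling factor is fitted on the training part by least squares and that MAPE is the mean of |λω_i − ν_i|/ν_i × 100 on the test part; the actual definitions are given in the Supporting Information, so these forms, the function names, and the synthetic data are assumptions made for illustration.

```python
import random
import statistics

def fit_lambda(pairs):
    """Least-squares scaling factor from (nu_exp, omega_calc) pairs."""
    return sum(n * w for n, w in pairs) / sum(w * w for _, w in pairs)

def mape(pairs, lam):
    """Mean absolute percentage error of scaled frequencies (assumed definition)."""
    return 100.0 * statistics.fmean(abs(lam * w - n) / n for n, w in pairs)

def abs_error_quartiles(pairs, lam):
    """Q1, Q2 (median), and Q3 of the absolute errors in cm^-1."""
    errors = [abs(lam * w - n) for n, w in pairs]
    return statistics.quantiles(errors, n=4)

def repeated_splits(pairs, cycles=100, train_fraction=0.7, seed=0):
    """Average lambda and test-set MAPE over random train/test partitions."""
    rng = random.Random(seed)
    n_train = int(round(train_fraction * len(pairs)))
    lams, mapes = [], []
    for _ in range(cycles):
        shuffled = pairs[:]
        rng.shuffle(shuffled)
        train, test = shuffled[:n_train], shuffled[n_train:]
        lam = fit_lambda(train)
        lams.append(lam)
        mapes.append(mape(test, lam))
    return (statistics.fmean(lams), statistics.stdev(lams),
            statistics.fmean(mapes), statistics.stdev(mapes))

# Synthetic data: nu = 0.95 * omega plus Gaussian noise (illustration only).
data_rng = random.Random(1)
pairs = []
for _ in range(50):
    w = data_rng.uniform(400.0, 3900.0)
    pairs.append((0.95 * w + data_rng.gauss(0.0, 10.0), w))

mean_lam, std_lam, mean_mape, std_mape = repeated_splits(pairs)
print(f"lambda = {mean_lam:.4f} ({std_lam:.4f}), MAPE = {mean_mape:.2f}% ({std_mape:.2f})")
```

Evaluating the error metrics on the held-out 30% is a design choice of this sketch; it keeps the reported errors from being flattered by the data used to fit λ.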

Table 3. Scaling Factors, MAPE, and First (Q1), Second (Median, Q2), and Third (Q3) Quartiles of the Error Distribution, Calculated Over the Full Dataset.

Theory level/Scaling factors type   M06-2X/6-311++G(d,p) M06-2X-D3/6-311++G(d,p) M06-2X-D3/cc-pVTZ B3LYP/6-311++G(d,p) B3LYP-D3/6-311++G(d,p) B3LYP-D3/cc-pVTZ ωB97X-D/6-311++G(d,p) ωB97X-D/cc-pVTZ
Global {824} global scaling factor 0.9523(4) 0.9523(3) 0.9534(3) 0.9649(3) 0.9650(3) 0.9654(3) 0.9541(5) 0.9539(4)
  MAPE (%) 1.57(5) 1.57(5) 1.32(4) 1.43(5) 1.31(4) 1.07(3) 1.44(4) 1.19(3)
  Q1 (cm–1) 9.7(7) 9.6(8) 7.5(5) 9.7(9) 9.0(7) 7.5(4) 11.2(10) 8.4(5)
  Q2 (cm–1) 20.1(6) 20.1(6) 15.4(8) 19.4(5) 17.4(5) 13.7(4) 17.9(7) 15.1(6)
  Q3 (cm–1) 31.1(7) 31.1(7) 27.4(8) 25.9(7) 24.8(6) 20.9(5) 27.1(4) 23.4(4)
Mode-dependent {O–H [34] + N–H [54] + C–H ring [114] + C–H ethyl [108] + mid [309] + low [175] = 824} O–H scaling factor 0.9403(3) 0.9402(3) 0.9455(4) 0.9584(7) 0.9598(8) 0.9642(9) 0.9400(7) 0.9437(7)
  N–H scaling factor 0.9454(2) 0.9454(3) 0.9485(2) 0.9565(2) 0.9566(3) 0.9590(2) 0.9434(3) 0.9453(2)
  C–H ring scaling factor 0.9533(1) 0.9533(1) 0.9535(1) 0.9640(1) 0.9642(1) 0.9637(2) 0.9566(2) 0.9556(1)
  C–H ethyl scaling factor 0.9493(3) 0.9494(3) 0.9499(3) 0.9631(3) 0.9634(3) 0.9640(3) 0.9543(3) 0.9537(3)
  mid-range scaling factor 0.9735(6) 0.9735(7) 0.9723(5) 0.9833(4) 0.9818(5) 0.9791(4) 0.9712(7) 0.9688(6)
  low-range scaling factor 0.9725(24) 0.9729(21) 0.9671(19) 0.9837(25) 0.9791(23) 0.9731(24) 0.9700(19) 0.9659(22)
  MAPE (%) 0.62(3) 0.62(2) 0.65(2) 0.59(3) 0.64(3) 0.60(3) 0.67(3) 0.58(2)
  Q1 (cm–1) 2.6(3) 2.5(3) 3.0(2) 2.3(4) 2.7(3) 2.2(2) 2.6(5) 2.3(3)
  Q2 (cm–1) 5.5(6) 5.4(5) 6.0(4) 5.0(6) 5.7(6) 5.1(5) 5.5(6) 5.2(4)
  Q3 (cm–1) 10.7(7) 10.5(6) 10.8(5) 9.9(6) 10.6(6) 9.9(5) 12.3(7) 9.7(6)
Range-dependent {high [340] + mid [309] + low [175] = 824} high-range scaling factor 0.9486(2) 0.9486(2) 0.9502(1) 0.9615(2) 0.9619(2) 0.9630(2) 0.9510(3) 0.9513(2)
  mid-range scaling factor 0.9735(6) 0.9735(7) 0.9723(5) 0.9833(4) 0.9818(5) 0.9791(4) 0.9712(7) 0.9688(6)
  low-range scaling factor 0.9725(24) 0.9729(21) 0.9671(19) 0.9837(25) 0.9791(23) 0.9731(24) 0.9700(19) 0.9659(22)
  MAPE (%) 0.68(3) 0.67(2) 0.67(2) 0.62(3) 0.67(3) 0.61(2) 0.76(3) 0.64(2)
  Q1 (cm–1) 3.2(4) 3.2(4) 3.3(2) 2.8(4) 3.2(4) 2.4(2) 3.3(6) 2.8(3)
  Q2 (cm–1) 6.7(7) 6.6(6) 6.9(5) 5.9(7) 6.6(6) 5.4(6) 7.0(6) 6.4(5)
  Q3 (cm–1) 14.4(7) 14.3(8) 12.1(6) 12.1(6) 12.7(6) 10.8(5) 17.6(8) 14.2(5)
a Scaling factors are categorized as global, range-dependent, or mode-dependent and are specified for each subgroup within the dataset (as indicated in braces). Values in parentheses denote the standard deviation in the last digit, calculated over 100 iterations of random 70%/30% training/test splits.

b For mode- and range-related vibrations, the values in square brackets indicate the number of frequencies associated with each type, while curly braces show the total number of vibrations present in each data set used for the partitioning.

Comparison of these values reveals key trends. The mean scaling factors from randomized runs match or fall within the uncertainty bounds of the deterministic values in Table 1. Conversely, each deterministic value lies within one standard deviation of the corresponding values obtained from the randomized runs. The relatively low MAPE values across all classifications confirm that the scaling factors effectively reduce relative errors and improve predictive accuracy. Quartile statistics further characterize the error distribution, with the median (Q2) serving as a robust indicator of the typical discrepancy between scaled harmonic predictions and experimental fundamentals. Calculations using range- and mode-dependent scaling factors yield lower absolute wavenumber differences than those obtained with global scaling, with mode-dependent scaling achieving the greatest reduction. The small standard deviations from 100 iterations of random 70%/30% train–test splits confirm the stability of these results. Collectively, these findings demonstrate that improvements are systematic and robust across the data set.

Modeling and Structure Identification

We present the process for spectral matching using the selected methods: cross-correlation and distance metrics (Euclidean, Manhattan, and optimal transport) (see below). Figure 3 shows a flowchart for evaluating similarity between features in measured and predicted spectra for different structures (see Methods in Supporting Information).

Figure 3. Workflow for matching measured and calculated spectra to identify structures. It involves preprocessing the spectra, calculating similarities using various methods, and finding the best match to identify a structure.

The procedure starts with preprocessing of the measured ILSR spectrum and predicted Raman signatures for various conformers. This involves normalizing the signal, scaling harmonic frequencies, and extracting representative frequencies and intensities, as shown in Figure 3. For cross-correlation, we use the entire measured spectrum and create a full spectrum vector for each computed spectrum, aligning feature intensities with corresponding indices based on frequency matching. It is important to note that scaling harmonic frequencies is essential, as comparative tests have demonstrated that unscaled frequencies often lead to inaccurate identification of spectral features and, consequently, incorrect structural assignments.
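A minimal sketch of the cross-correlation step is given below. It assumes that both spectra are placed on a common, regularly spaced frequency grid and that similarity is the maximum of the normalized cross-correlation over a small window of shifts; the grid spacing, the shift window, and the function names are assumptions of this sketch, and the peak lists are hypothetical.

```python
import math

def stick_to_vector(peaks, start, stop, step=1.0):
    """Place (frequency, intensity) features on a regular grid from start to stop."""
    n = int(round((stop - start) / step)) + 1
    vector = [0.0] * n
    for freq, intensity in peaks:
        idx = int(round((freq - start) / step))
        if 0 <= idx < n:
            vector[idx] += intensity
    return vector

def normalized_cross_correlation(a, b, max_lag):
    """Return (best value, best lag) of sum_i a[i] * b[i + lag], normalized by |a||b|."""
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    if norm == 0.0:
        return 0.0, 0
    best_value, best_lag = -math.inf, 0
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for i, x in enumerate(a):
            j = i + lag
            if 0 <= j < len(b):
                total += x * b[j]
        value = total / norm
        if value > best_value:
            best_value, best_lag = value, lag
    return best_value, best_lag

# Hypothetical measured and scaled calculated features (cm^-1, relative intensity).
measured_peaks = [(1000.0, 1.0), (1030.0, 0.5)]
calculated_peaks = [(1002.0, 1.0), (1032.0, 0.5)]

measured_vec = stick_to_vector(measured_peaks, 990.0, 1040.0)
calculated_vec = stick_to_vector(calculated_peaks, 990.0, 1040.0)
best_value, best_lag = normalized_cross_correlation(measured_vec, calculated_vec, max_lag=5)
print(best_value, best_lag)
```

In this toy case the calculated pattern is the measured one displaced by 2 cm–1, so the correlation peaks at a lag of two grid points. A wide shift window can mask poor scaling, which is one reason to scale the harmonic frequencies before comparing.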

In contrast, the distance metric approaches focus on discrete spectra consisting of features extracted manually from the measured data. After preprocessing, we either assess the cross-correlation of the measured spectrum and the predicted spectra of the different conformers (Figure 4a) or calculate Euclidean and Manhattan distances to generate cost matrixes (Figure 4b). The Kuhn-Munkres assignment is then applied to identify the minimal-cost assignment within each cost matrix (Figure 4c).
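The cost-matrix and assignment steps can be sketched as below. For brevity, the sketch replaces the Kuhn-Munkres algorithm with a brute-force search over all one-to-one assignments, which returns the same minimal-cost assignment for small feature lists but does not scale to full spectra. The intensity weight `w` (in cm–1 per unit of relative intensity), the function names, and the feature values are hypothetical.

```python
import itertools
import math

def euclidean(p, q, w):
    """Two-dimensional Euclidean distance over frequency and weighted intensity."""
    return math.hypot(p[0] - q[0], w * (p[1] - q[1]))

def manhattan(p, q, w):
    """Manhattan distance over frequency and weighted intensity."""
    return abs(p[0] - q[0]) + w * abs(p[1] - q[1])

def cost_matrix(measured, calculated, metric, w):
    """Rows: measured features; columns: calculated features."""
    return [[metric(m, c, w) for c in calculated] for m in measured]

def minimal_assignment(cost):
    """Brute-force stand-in for Kuhn-Munkres: best one-to-one row-to-column assignment."""
    n_rows, n_cols = len(cost), len(cost[0])
    best_cols, best_total = None, math.inf
    for cols in itertools.permutations(range(n_cols), n_rows):
        total = sum(cost[i][c] for i, c in enumerate(cols))
        if total < best_total:
            best_cols, best_total = list(cols), total
    return best_cols, best_total

# Hypothetical features: (frequency in cm^-1, relative intensity).
measured = [(1004.0, 1.0), (1030.0, 0.2)]
calculated = [(1003.0, 0.05), (1007.0, 0.95), (1031.0, 0.25)]

# Frequency-only assignment (w = 0) versus frequency plus weighted intensity.
assignment_freq_only, _ = minimal_assignment(cost_matrix(measured, calculated, euclidean, 0.0))
assignment_weighted, total_weighted = minimal_assignment(
    cost_matrix(measured, calculated, euclidean, 20.0))
average_distance = total_weighted / len(measured)
print(assignment_freq_only, assignment_weighted, round(average_distance, 2))
```

With frequencies alone, the strong measured band at 1004 cm–1 is paired with a weak calculated neighbor at 1003 cm–1; including the weighted intensity moves it to the strong calculated band at 1007 cm–1. For realistic feature counts, a polynomial-time solver such as `scipy.optimize.linear_sum_assignment` would take the place of the enumeration.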

Figure 4. Schematic of similarity metrics utilized in spectral analysis, including: (a) cross-correlation for entire spectra, which measures similarity by displacing the calculated spectrum (green) relative to the measured one (blue), and (b) Euclidean and Manhattan distances, which evaluate distances between extracted features from the measured spectrum (blue dots) and calculated spectrum (red dots), considering frequencies and weighted intensities. The resulting distances generate cost matrixes, and (c) the Kuhn-Munkres algorithm assigns features to minimize the total cost, thereby generating spectral barcodes for specific compounds. The measured barcode is then compared to predicted barcodes of different conformers using average Euclidean and Manhattan distances to rank similarities. (d) Optimal transport classification links the measured spectrum (top) with the calculated Euclidean distance-Kuhn-Munkres barcode (center) through an optimal transport plan (bottom) that minimizes the transport cost. This plan redistributes intensity using Euclidean distance for frequency alignment, ranking similarity to find the best match.

Although previous analyses using Pearson correlation, RMS, and Euclidean and Manhattan distances determined spectral matches by comparing frequency vectors alone, we chose methods that consider both frequency and intensity vectors to better capture spectral patterns. Figure 5 presents a comparison between a small portion of the measured spectrum [panel (a)] and the scaled harmonic spectrum [panel (b)] of the monohydrate, PEAL-H2O C*. The C* denotes the feature at 37,669 cm–1 in the R2PI spectrum. This feature allowed the measurement of the ILSR spectrum by setting the UV laser at this frequency and monitoring the m/z = 122 mass channel. Assigning features based on frequency alone is challenging because the calculated spectrum contains more features than the measured one, which leads to problematic assignments when using only the frequency-based Euclidean distance [panel (c)].

Figure 5. Small portions of the measured and calculated spectra and the resulting assignments. (a) Measured ionization-loss mass spectrum of 2-phenylethyl alcohol-H2O C* (ref ) and (b) mode-dependent scaled harmonic Raman spectrum for the global minimum structure at the M06-2X-D3/6-311++G(d,p) level, convolved with Lorentzian lines [full width at half-maximum of 4 cm–1]. (c,d) Resulting assignments from the minimal cost matrixes of the Kuhn-Munkres algorithm, using one-dimensional Euclidean distances for frequencies and two-dimensional Euclidean distances for frequencies and weighted intensities, respectively. The marked frequency values are given in cm–1.
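The Lorentzian convolution mentioned in the caption can be illustrated with the sketch below, which broadens a stick spectrum with lines of 4 cm–1 full width at half-maximum. Normalizing each line to its peak height (rather than to unit area) is an assumption of this sketch, and the stick positions and intensities are hypothetical.

```python
def lorentzian_spectrum(sticks, grid, fwhm=4.0):
    """Sum of peak-height-normalized Lorentzians centered on each (frequency, intensity) stick."""
    hwhm = fwhm / 2.0
    spectrum = []
    for x in grid:
        value = sum(intensity * hwhm ** 2 / ((x - freq) ** 2 + hwhm ** 2)
                    for freq, intensity in sticks)
        spectrum.append(value)
    return spectrum

# Hypothetical scaled harmonic sticks: (frequency in cm^-1, Raman activity).
sticks = [(1003.0, 0.05), (1007.0, 0.95), (1031.0, 0.25)]
grid = [990.0 + 0.5 * k for k in range(101)]  # 990 to 1040 cm^-1 in 0.5 cm^-1 steps
profile = lorentzian_spectrum(sticks, grid)
peak_position = grid[profile.index(max(profile))]
print(peak_position)
```

Closely spaced sticks merge into a single broadened band, which is why the assignment step works on the extracted features rather than on the convolved profile.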

In contrast, the Euclidean distance that incorporates weighted intensities [panel (d)] achieves a close match between calculated and measured spectra, enabling more reasonable assignment. Therefore, in subsequent tests, we employ the selected methods, which also account for intensities, to assign each measured feature to its corresponding calculated feature in a potential matching spectrum.

Given that the choice of similarity approach can significantly influence model performance and prediction accuracy, we evaluate the selected methods, including cross-correlation, Euclidean, Manhattan, and optimal transport distances (Figure 4a,b,d). Optimal transport calculates the cost of transforming intensities from their positions in the measured spectrum (top panel, Figure 4d) to those generating the Raman spectrum (central panel, Figure 4d) (see Methods in Supporting Information). This cost is determined by minimizing the work required to transfer intensity units from the features in the measured spectrum to those forming the bottom spectrum, as shown in Figure 4d. The process accounts for the Manhattan distance along the frequency axis required to transport intensity between features.
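For discrete spectra compared along the frequency axis only, the minimal transport cost has a closed form: the integrated absolute difference between the two cumulative intensity distributions. The sketch below uses that closed form after normalizing both spectra to unit total intensity. It is a generic one-dimensional illustration, not the implementation described in the Supporting Information, and the feature values are hypothetical.

```python
def transport_cost_1d(spec_a, spec_b):
    """1D optimal transport (Wasserstein-1) cost between two discrete spectra.

    Each spectrum is a list of (frequency, intensity); intensities are
    normalized to unit total so that both carry the same mass.
    """
    def normalize(spec):
        total = sum(intensity for _, intensity in spec)
        return [(freq, intensity / total) for freq, intensity in spec]

    a, b = normalize(spec_a), normalize(spec_b)
    # Mass enters with a plus sign from spectrum a and a minus sign from spectrum b.
    events = sorted([(f, m) for f, m in a] + [(f, -m) for f, m in b])
    cost, cdf_difference, previous = 0.0, 0.0, events[0][0]
    for freq, mass in events:
        # Between two consecutive positions, |CDF_a - CDF_b| of mass must travel this far.
        cost += abs(cdf_difference) * (freq - previous)
        cdf_difference += mass
        previous = freq
    return cost

# Hypothetical measured and calculated features (cm^-1, relative intensity).
measured = [(1000.0, 0.5), (1010.0, 0.5)]
calculated = [(1000.0, 0.5), (1020.0, 0.5)]
cost = transport_cost_1d(measured, calculated)
print(cost)
```

Here half of the intensity has to travel 10 cm–1, giving a cost of 5 cm–1. In one dimension the Manhattan and Euclidean distances along the frequency axis coincide, so either description of the ground distance leads to the same cost.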

We investigate the potential to associate a measured ILSR spectrum with a specific structure by analyzing single and double PEAL hydrates and 2-FPEAL conformers, using the M06-2X-D3/6-311++G(d,p) theory level. These systems are suitable for testing our approach due to the diverse three-dimensional structures that they form. These variations arise from side chain folding and the formation of numerous intra- and intermolecular hydrogen bonds, leading to slight differences in energies and spectra. Supporting Information Figure S3A–C shows the structures of PEAL-H2O, PEAL-(H2O)2, and 2-FPEAL, ranked by relative energies that include zero-point vibrational energy corrections, following full structural optimizations at the M06-2X-D3/6-311++G(d,p) level. While energy rankings might differ at other theory levels, our goal here is to match measured spectra with calculated spectra (Supporting Information Figure S4) to determine species structures.

Our approach was evaluated by plotting cross-correlations and distance values using the Kuhn-Munkres assignment with Euclidean, Manhattan, and optimal transport distance metrics. Figure 6a–c shows results for PEAL-H2O C* and F, PEAL-(H2O)2 C* and G, and 2-FPEAL G1h and A1f, with their geometries shown in Figures S2 and S3 of the Supporting Information. The nomenclature used for the identified geometries is consistent with refs and .
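The assignment step can be sketched as follows. This is a minimal illustration and not the authors' implementation: an exhaustive search over permutations stands in for the Kuhn-Munkres algorithm (it returns the same minimum-cost assignment but is practical only for a handful of features, so a dedicated solver such as SciPy's `linear_sum_assignment` would normally replace it). The `intensity_weight` parameter, which puts intensities on a scale comparable to frequencies, and all numerical values are hypothetical.

```python
import math
from itertools import permutations


def feature_distance(measured, calculated, metric="euclidean", intensity_weight=1.0):
    """Distance between two (frequency, intensity) features."""
    d_freq = measured[0] - calculated[0]
    d_int = intensity_weight * (measured[1] - calculated[1])
    if metric == "euclidean":
        return math.hypot(d_freq, d_int)
    if metric == "manhattan":
        return abs(d_freq) + abs(d_int)
    raise ValueError(f"unknown metric: {metric}")


def assign_features(measured, calculated, metric="euclidean", intensity_weight=1.0):
    """Match every measured feature to a distinct calculated feature.

    Returns (assignment, average_distance), where assignment[i] is the index
    of the calculated feature paired with measured feature i.
    Brute-force stand-in for Kuhn-Munkres: minimizes the summed distance.
    """
    if not measured or len(measured) > len(calculated):
        raise ValueError("need 1 <= len(measured) <= len(calculated)")
    best_cost, best_perm = math.inf, None
    for perm in permutations(range(len(calculated)), len(measured)):
        cost = sum(
            feature_distance(measured[i], calculated[j], metric, intensity_weight)
            for i, j in enumerate(perm)
        )
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return list(best_perm), best_cost / len(measured)


# Hypothetical features: (frequency in cm^-1, relative intensity).
measured = [(1003.0, 0.9), (1205.0, 0.4)]
calculated = [(1200.0, 0.5), (1000.0, 1.0), (1450.0, 0.2)]

for metric in ("euclidean", "manhattan"):
    assignment, average = assign_features(measured, calculated, metric)
    print(metric, assignment, round(average, 3))
```

The average distance returned for each candidate structure is the kind of quantity that can then be compared across conformers.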

Figure 6. Cross-correlations and distance metrics representing similarities between mode-dependent scaled Raman spectra at the M06-2X-D3/6-311++G(d,p) level and experimental ionization-loss stimulated Raman signatures of specific species. These values are derived from the optimized geometries and their corresponding spectra as shown in Figures S3 and S4 of the Supporting Information. The color maps show results for 2-phenylethyl alcohol (PEAL) (a) mono- and (b) dihydrates, and of (c) 2-(2-fluorophenyl)ethyl alcohol (2-FPEAL) conformers. These maps use cross-correlation and Kuhn-Munkres assignment, including average Euclidean and Manhattan distances and optimal transport distances. Within each cell, cross-correlation values and distances are visualized using gradients with color bar on the right indicating the best matches. The horizontal arrows mark the structures identified through visual comparisons.

As shown by the color bar on the right of Figure 6, where blue denotes maximal similarity and red denotes minimal similarity for each method, the highest cross-correlations and lowest distances, indicated by the bluest color, are expected to identify the species structures. Using these criteria for conformer identification, we find that the structures selected by the Euclidean and Manhattan distances, and in most cases by the cross-correlations (the exception being PEAL-(H2O)2 G), correspond closely with those determined through visual comparisons (indicated by horizontal arrows) of measured and calculated spectra, effectively identifying the structures. In contrast, optimal transport fails at the structural identification of the species. Supporting Information Data 2 shows that visual matching of predicted data at the M06-2X-D3/6-311++G(d,p) level of theory strongly agrees with the quantitatively assigned data at the same level.
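To make the ranking criterion concrete, the following Python sketch scores candidate spectra by a normalized zero-lag cross-correlation and sorts them from most to least similar. Several choices here are assumptions rather than details from the paper: the stick spectra are broadened with Lorentzian line shapes of an arbitrary 5 cm⁻¹ width, the correlation is evaluated on a 1 cm⁻¹ grid, and the candidate labels "A" and "B" and their features are hypothetical.

```python
import math


def broaden(features, grid, fwhm=5.0):
    """Sum of Lorentzians (peak height = intensity) sampled on a frequency grid."""
    half = fwhm / 2.0
    return [
        sum(inten * half**2 / ((x - freq) ** 2 + half**2) for freq, inten in features)
        for x in grid
    ]


def cross_correlation(spec_a, spec_b):
    """Normalized zero-lag cross-correlation; 1.0 for identical spectral shapes."""
    dot = sum(a * b for a, b in zip(spec_a, spec_b))
    norm = math.sqrt(sum(a * a for a in spec_a) * sum(b * b for b in spec_b))
    return dot / norm


def rank_candidates(measured, candidates, grid):
    """Return (label, score) pairs sorted from highest to lowest similarity."""
    reference = broaden(measured, grid)
    scores = {
        label: cross_correlation(reference, broaden(features, grid))
        for label, features in candidates.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical data: (frequency in cm^-1, relative intensity).
grid = [900.0 + i for i in range(501)]
measured = [(1000.0, 1.0), (1200.0, 0.5)]
candidates = {
    "A": [(1002.0, 1.0), (1199.0, 0.5)],  # close to the measured features
    "B": [(1050.0, 1.0), (1300.0, 0.5)],  # clearly displaced features
}

for label, score in rank_candidates(measured, candidates, grid):
    print(label, round(score, 3))
```

For distance-based metrics the same ranking logic applies with the sort order reversed, since the lowest distance marks the best match.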

It is important to note that the uncertainties in the Euclidean and Manhattan distances, determined through error propagation (see Methods in Supporting Information, eqs S13 and S14), range from 0.10 to 0.72, depending on the specific molecule and similarity method used. Given these metrics and their uncertainties, some configurations may present comparable Euclidean and Manhattan distances, suggesting that the similarity criterion alone might not be sufficient. Therefore, when the metric differences are minimal, an additional criterion based on the relative energies and interconversion barriers can be considered for identifying the correct structure.
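As an illustration of how such uncertainties can arise, the block below gives a generic first-order error propagation for the per-feature and average distances. It is a sketch under stated assumptions and is not a reproduction of eqs S13 and S14: the frequency and intensity differences of matched pair i, Δν_i and ΔI_i, are taken to have independent uncertainties σ_{ν,i} and σ_{I,i}, and N is the number of matched features. All symbols are introduced here for illustration.

```latex
% Per-feature Euclidean and Manhattan distances (Euclidean case assumes d_i^E > 0)
d_i^{\mathrm{E}} = \sqrt{\Delta\nu_i^{2} + \Delta I_i^{2}}, \qquad
\sigma\!\left(d_i^{\mathrm{E}}\right) = \frac{1}{d_i^{\mathrm{E}}}
  \sqrt{\Delta\nu_i^{2}\,\sigma_{\nu,i}^{2} + \Delta I_i^{2}\,\sigma_{I,i}^{2}}

d_i^{\mathrm{M}} = \left|\Delta\nu_i\right| + \left|\Delta I_i\right|, \qquad
\sigma\!\left(d_i^{\mathrm{M}}\right) = \sqrt{\sigma_{\nu,i}^{2} + \sigma_{I,i}^{2}}

% Average over the N matched features
\bar{d} = \frac{1}{N}\sum_{i=1}^{N} d_i, \qquad
\sigma\!\left(\bar{d}\right) = \frac{1}{N}\sqrt{\sum_{i=1}^{N} \sigma\!\left(d_i\right)^{2}}
```

Under these assumptions, two candidate structures whose average distances differ by less than the combined uncertainty cannot be separated by the metric alone.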

Indeed, the structures identified by the similarity criterion for PEAL-H2O C*, PEAL-(H2O)2 C*, and 2-FPEAL G1h are ranked 1 in energy and correspond to the global minima. On the other hand, PEAL-H2O F, PEAL-(H2O)2 G, and 2-FPEAL A1f are associated with local minima, with the hydrates ranked 2 and the 2-FPEAL conformer ranked 5. This ranking is consistent with the tendency of molecular beams to primarily contain lower-energy species. For 2-FPEAL, the low potential energy barriers separating wells on the potential energy surface allow relaxation and the observation of a higher energy conformer.

These findings confirm that the cross-correlation and the Euclidean/Manhattan distance-based Kuhn-Munkres assignment effectively recognize spectral data patterns through similarity evaluations. In particular, these methods match measured spectra to closely related calculated spectra while distinguishing dissimilar ones, enabling informative structural determinations. Conversely, the optimal transport approach, which may redistribute the intensity of one feature to additional ones, proves ineffective for identifying similarities between measured and calculated spectra.

Nevertheless, it is important to note that our structural identification of the PEAL hydrates and 2-FPEAL conformers using the Euclidean/Manhattan distance-based Kuhn-Munkres assignment differs from previous assignments based on low- and high-resolution electronic spectroscopy. For the PEAL hydrates, the earlier study attributed the C* R2PI feature to the global minimum structure, PEAL-H2O C*, consistent with our findings but without recognizing the contribution of PEAL-(H2O)2 C*. Interestingly, the F band could not be resolved there because of its low intensity, while the G band, despite its higher intensity, appeared superimposed on a high background and exhibited a nonresolved rotational microstructure, precluding a rotational fit. Therefore, this band was tentatively assigned to a structure with a sideways-bound water molecule rather than to the PEAL-(H2O)2 G structure identified in our study. Similarly, for 2-FPEAL, while our identification of the global minimum structure agrees with the earlier one, the lack of a convincing fit of the rotational structure for the other band led to its tentative assignment to the G1f conformer. In contrast, our findings, supported by visual comparison as well as cross-correlation and Euclidean/Manhattan distance-based Kuhn-Munkres assignment, point to the A1f conformer as the correct assignment.

While the Euclidean and Manhattan distance metrics effectively identify structurally isolated conformers using mass-resolved ILSR spectra in comparison to predicted spectra, several factors can influence their success. For example, experimental systems may not provide isolated spectra or adequate spectral resolution, leading to overlapping spectral features and noise that can complicate the analysis. Additionally, computational predictions depend on the selected theoretical model and basis set, which may result in discrepancies in predicted vibrational frequencies or intensities due to harmonic approximations and inaccuracies in polarizabilities. These factors affect the alignment between measured and calculated spectra.

Our current methodology does not address some of these issues, including overlapping measured spectral features and computational errors. To enhance robustness, incorporating deconvolution methods, data preprocessing techniques, and error-correcting protocols could be beneficial. Extended case studies with complex data sets could further refine the metrics and broaden their applicability across diverse chemical systems.

Summary

We have developed a new database from our previously measured broad-range ionization-loss and -gain SRS spectra of compounds containing phenyl or indole rings attached to ethylamino (or ethyl alcohol) side chains along with their hydrates. These compounds exhibit high structural complexity due to their intricate conformational landscapes. By comparing the experimental fundamental bands with the predicted Raman spectra of visually identified structures across different levels of theory, we derived global, range-dependent, and mode-dependent scaling factors. The global scaling factors are close to those previously reported for similar theoretical levels, affirming the reliability of our database. Statistical analyses indicate that mode- and range-dependent scaling factors outperform global scaling approaches, showing higher accuracy and offering promising applications for benchmarking future predictions.
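The difference between global and group-specific scaling can be illustrated with a short Python sketch. It uses the standard least-squares scaling factor λ = Σ ω_i ν_i / Σ ω_i², where ω_i are calculated harmonic frequencies and ν_i the measured fundamentals; whether the database was fitted in exactly this way is an assumption here. The grouping labels can represent frequency ranges or mode types, and the labels and frequencies below are hypothetical.

```python
import math
from collections import defaultdict


def scaling_factor(calculated, measured):
    """Least-squares factor minimizing sum((factor * calculated - measured)^2)."""
    numerator = sum(w * v for w, v in zip(calculated, measured))
    denominator = sum(w * w for w in calculated)
    return numerator / denominator


def grouped_scaling_factors(calculated, measured, labels):
    """One scaling factor per label (for example per frequency range or mode type)."""
    groups = defaultdict(lambda: ([], []))
    for w, v, label in zip(calculated, measured, labels):
        groups[label][0].append(w)
        groups[label][1].append(v)
    return {label: scaling_factor(ws, vs) for label, (ws, vs) in groups.items()}


def rms_error(calculated, measured, factors):
    """Root-mean-square deviation after applying one factor per feature."""
    residuals = [f * w - v for w, v, f in zip(calculated, measured, factors)]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


# Hypothetical harmonic and measured frequencies in cm^-1.
calculated = [1030.0, 1650.0, 3100.0, 3800.0]
measured = [1005.0, 1605.0, 2950.0, 3610.0]
labels = ["low", "low", "high", "high"]

global_factor = scaling_factor(calculated, measured)
by_range = grouped_scaling_factors(calculated, measured, labels)

rms_global = rms_error(calculated, measured, [global_factor] * len(calculated))
rms_range = rms_error(calculated, measured, [by_range[label] for label in labels])
print(round(global_factor, 4), {k: round(v, 4) for k, v in by_range.items()})
print(round(rms_global, 2), round(rms_range, 2))
```

Because each group is fitted separately, the grouped residual can never exceed the global one on the fitting data, which is why an independent error analysis is needed to judge the real gain.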

Further testing of our approach involves assigning measured ILSR spectra that reflect isolated individual structures by selective ion detection through various mass channels. We use cross-correlation for the entire spectra and the Euclidean/Manhattan distance-based Kuhn-Munkres assignment to generate spectral barcodes, and we determine the average Euclidean/Manhattan or optimal transport distances by comparing measured spectra with mode-dependent scaled spectra. This strategy demonstrated that both cross-correlation and, even more effectively, the Euclidean/Manhattan distance-based Kuhn-Munkres approaches successfully capture spectral data patterns by evaluating similarities through these metrics, providing a basis for spectral rankings. These evaluations enable matching measured with closely related calculated spectra, informative structural identifications, and classification of unrelated ones. These findings represent a new frontier of spectral identifications, and we anticipate a new window opening to structural recognition.

Supplementary Material

jp5c06253_si_002.xlsx (76.9KB, xlsx)
jp5c06253_si_003.xlsx (76.9KB, xlsx)
jp5c06253_si_004.txt (504KB, txt)
jp5c06253_si_005.txt (86.4KB, txt)

Acknowledgments

We would like to acknowledge the financial support of this research by the Israel Science Foundation (ISF) founded by the Israel Academy of Sciences and Humanities.

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jpca.5c06253.

  • Schematics of ionization-loss and -gain stimulated Raman spectroscopies, geometries and labels of conformers and hydrates of the various species, and Methods; geometries of global and local minima of conformers of 2-phenylethyl alcohol mono- and dihydrates and of 2-(2-fluorophenyl)ethyl alcohol; measured ionization-loss stimulated Raman spectra pointing downward and scaled harmonic Raman spectra upward of these conformers; and spreadsheets of measured and calculated features for different compounds and Kuhn-Munkres-Euclidean distances and visual spectral assignments (PDF)


The authors declare no competing financial interest.

References

  1. Weber A., Hoplight B., Ogilvie R., Muro C., Khandasammy S. R., Pérez-Almodóvar L., Sears S., Lednev I. K. Innovative Vibrational Spectroscopy Research for Forensic Application. Anal. Chem. 2023;95:167–205. doi: 10.1021/acs.analchem.2c05094.
  2. Yang D., Ying Y. Applications of Raman Spectroscopy in Agricultural Products and Food Analysis: A Review. Appl. Spectrosc. Rev. 2011;46:539–560. doi: 10.1080/05704928.2011.593216.
  3. Roggo Y., Chalus P., Maurer L., Lema-Martinez C., Edmond A., Jent N. A Review of Near Infrared Spectroscopy and Chemometrics in Pharmaceutical Technologies. J. Pharm. Biomed. Anal. 2007;44:683–700. doi: 10.1016/j.jpba.2007.03.023.
  4. Lussier F., Thibault V., Charron B., Wallace G. Q., Masson J.-F. Deep Learning and Artificial Intelligence Methods for Raman and Surface-Enhanced Raman Scattering. Trends Anal. Chem. 2020;124:115796. doi: 10.1016/j.trac.2019.115796.
  5. Morais C. L. M., Lima K. M. G., Singh M., Martin F. L. Tutorial: Multivariate Classification for Vibrational Spectroscopy in Biological Samples. Nat. Protoc. 2020;15:2143–2162. doi: 10.1038/s41596-020-0322-8.
  6. Guo S., Popp J., Bocklitz T. Chemometric Analysis in Raman Spectroscopy from Experimental Design to Machine Learning–Based Modelling. Nat. Protoc. 2021;16:5426–5459. doi: 10.1038/s41596-021-00620-3.
  7. Pribble R. N., Zwier T. S. Size-Specific Infrared Spectra of Benzene-(H2O)n Clusters (n = 1 Through 7): Evidence For Noncyclic (H2O)n Structures. Science. 1994;265:75–79. doi: 10.1126/science.265.5168.75.
  8. Li G., Zhang Y.-Y., Li Q., Wang C., Yu Y., Zhang B., Hu H.-S., Zhang W., Dai D., Wu G., Zhang D. H., Li J., Yang X., Jiang L. Infrared Spectroscopic Study of Hydrogen Bonding Topologies in the Smallest Ice Cube. Nat. Commun. 2020;11:5449. doi: 10.1038/s41467-020-19226-6.
  9. Hohenberg P., Kohn W. Inhomogeneous Electron Gas. Phys. Rev. 1964;136:B864. doi: 10.1103/PhysRev.136.B864.
  10. Kohn W., Sham L. J. Self-Consistent Equations Including Exchange and Correlation Effects. Phys. Rev. 1965;140:A1133–A1138. doi: 10.1103/physrev.140.a1133.
  11. Schermann, J.-P. Spectroscopy and Modelling of Biomolecular Building Blocks; Elsevier: Amsterdam, The Netherlands, 2008.
  12. Breidung, J.; Thiel, W. Prediction of Vibrational Spectra from Ab Initio Theory. In Handbook of High-Resolution Spectroscopy; Quack, M., Merkt, F., Eds.; Wiley: Chichester, 2011.
  13. Hess B. A. Jr., Schaad L. J., Carsky P., Zahradnik R. Ab Initio Calculations of Vibrational Spectra and their Use in Identification of Unusual Molecules. Chem. Rev. 1986;86:709–730. doi: 10.1021/cr00074a004.
  14. Scott A. P., Radom L. Harmonic Vibrational Frequencies: An Evaluation of Hartree-Fock, Møller-Plesset, Quadratic Configuration Interaction, Density Functional Theory, and Semiempirical Scale Factors. J. Phys. Chem. 1996;100:16502–16513. doi: 10.1021/jp960976r.
  15. Alecu I. M., Zheng J., Zhao Y., Truhlar D. G. Computational Thermochemistry: Scale Factor Databases and Scale Factors for Vibrational Frequencies Obtained from Electronic Model Chemistries. J. Chem. Theory Comput. 2010;6:2872–2887. doi: 10.1021/ct100326h.
  16. Laury M. L., Boesch S. E., Haken I., Sinha P., Wheeler R. A., Wilson A. K. Harmonic Vibrational Frequencies: Scale Factors for Pure, Hybrid, Hybrid Meta, and Double-Hybrid Functionals in Conjunction with Correlation Consistent Basis Sets. J. Comput. Chem. 2011;32:2339–2347. doi: 10.1002/jcc.21811.
  17. Kashinski D. O., Chase G. M., Nelson R. G., Di Nallo O. E., Scales A. N., VanderLey D. L., Byrd E. F. C. Harmonic Vibrational Frequencies: Approximate Global Scaling Factors for TPSS, M06, and M11 Functional Families Using Several Common Basis Sets. J. Phys. Chem. A. 2017;121:2265–2273. doi: 10.1021/acs.jpca.6b12147.
  18. Palafox M. A. DFT Computations on Vibrational Spectra: Scaling Procedures to Improve the Wavenumbers. Phys. Sci. Rev. 2018;3:20170184. doi: 10.1515/psr-2017-0184.
  19. Zapata Trujillo J. C., McKemmish L. K. Meta-Analysis of Uniform Scaling Factors for Harmonic Frequency Calculations. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 2022;12:e1584. doi: 10.1002/wcms.1584.
  20. Zapata Trujillo J. C., McKemmish L. K. VIBFREQ1295: A New Database for Vibrational Frequency Calculations. J. Phys. Chem. A. 2022;126:4100–4122. doi: 10.1021/acs.jpca.2c01438.
  21. Bowman J. M. The Self-Consistent-Field Approach to Polyatomic Vibrations. Acc. Chem. Res. 1986;19:202–208. doi: 10.1021/ar00127a002.
  22. Gerber R., Ratner M. A. Self-Consistent-Field Methods for Vibrational Excitations in Polyatomic Systems. Adv. Chem. Phys. 1988;70:97–132. doi: 10.1002/9780470141199.ch4.
  23. Brauer B., Pincu M., Buch V., Bar I., Simons J. P., Gerber R. B. Vibrational Spectra of α-Glucose, β-Glucose, and Sucrose: Anharmonic Calculations and Experiment. J. Phys. Chem. A. 2011;115:5859–5872. doi: 10.1021/jp110043k.
  24. Xie H.-b., Pincu M., Brauer B., Gerber R. B., Bar I. Raman and Infrared Spectra of Cellobiose in the Solid State: What Can be Learned from Single-Molecule Calculations? Chem. Phys. Lett. 2011;514:284–290. doi: 10.1016/j.cplett.2011.08.082.
  25. Barone V. Anharmonic Vibrational Properties by a Fully Automated Second-Order Perturbative Approach. J. Chem. Phys. 2005;122:014108. doi: 10.1063/1.1824881.
  26. Yang Q., Mendolicchio M., Barone V., Bloino J. Accuracy and Reliability in the Simulation of Vibrational Spectra: A Comprehensive Benchmark of Energies and Intensities Issuing from Generalized Vibrational Perturbation Theory to Second Order (GVPT2). Front. Astron. Space Sci. 2021;8:665232. doi: 10.3389/fspas.2021.665232.
  27. Futrelle R. P., McGinty D. J. Calculation of Spectra and Correlation Functions from Molecular Dynamics Data using the Fast Fourier Transform. Chem. Phys. Lett. 1971;12:285–287. doi: 10.1016/0009-2614(71)85065-0.
  28. Gaigeot M.-P., Sprik M. Ab Initio Molecular Dynamics Computation of the Infrared Spectrum of Aqueous Uracil. J. Phys. Chem. B. 2003;107:10344–10358. doi: 10.1021/jp034788u.
  29. Putrino A., Parrinello M. Anharmonic Raman Spectra in High-Pressure Ice from Ab Initio Simulations. Phys. Rev. Lett. 2002;88:176401. doi: 10.1103/PhysRevLett.88.176401.
  30. Thomas M., Brehm M., Fligg R., Vöhringer P., Kirchner B. Computing Vibrational Spectra from Ab Initio Molecular Dynamics. Phys. Chem. Chem. Phys. 2013;15:6608–6622. doi: 10.1039/c3cp44302g.
  31. Robertson E. G., Simons J. P. Getting into Shape: Conformational and Supramolecular Landscapes in Small Biomolecules and their Hydrated Clusters. Phys. Chem. Chem. Phys. 2001;3:1–18. doi: 10.1039/b008225m.
  32. de Vries M. S. Gas-Phase IR Spectroscopy of Nucleobases. Top. Curr. Chem. 2014;364:271–297. doi: 10.1007/128_2014_577.
  33. Gloaguen E., Mons M., Schwing K., Gerhards M. Neutral Peptides in the Gas Phase: Conformation and Aggregation Issues. Chem. Rev. 2020;120:12490–12562. doi: 10.1021/acs.chemrev.0c00168.
  34. Mayorkas N., Izbitski S., Bernat A., Bar I. Simultaneous Ionization-Detected Stimulated Raman and Visible–Visible–Ultraviolet Hole-Burning Spectra of Two Tryptamine Conformers. J. Phys. Chem. Lett. 2012;3:603–607. doi: 10.1021/jz300026a.
  35. Mayorkas N., Bernat A., Izbitski S., Bar I. Vibrational and Vibronic Spectra of Tryptamine Conformers. J. Chem. Phys. 2013;138:124312. doi: 10.1063/1.4798218.
  36. Bakker J. M., Aleese L. M., Meijer G., von Helden G. Fingerprint IR Spectroscopy to Probe Amino Acid Conformations in the Gas Phase. Phys. Rev. Lett. 2003;91:203003. doi: 10.1103/PhysRevLett.91.203003.
  37. Bakels S., Gaigeot M.-P., Rijs A. M. Gas-Phase Infrared Spectroscopy of Neutral Peptides: Insights from the Far-IR and THz Domain. Chem. Rev. 2020;120:3233–3260. doi: 10.1021/acs.chemrev.9b00547.
  38. Golan A., Mayorkas N., Rosenwaks S., Bar I. Raman Spectral Signatures as Conformational Probes of Gas Phase Flexible Molecules. J. Chem. Phys. 2009;131:024305. doi: 10.1063/1.3168402.
  39. Mayorkas N., Malka I., Bar I. Ionization-Loss Stimulated Raman Spectroscopy for Conformational Probing of Flexible Molecules. Phys. Chem. Chem. Phys. 2011;13:6808–6815. doi: 10.1039/c0cp02334e.
  40. Mayorkas N., Cohen S., Sachs H., Bar I. Photofragment Ionization-Loss Stimulated Raman Spectroscopy of a Hydrated Neurotransmitter: 2-Phenylethylamine–Water. RSC Adv. 2014;4:58752–58757. doi: 10.1039/C4RA10493E.
  41. Rahimi R., Shachar A., Bar I. Experimental/Computational Study on the Impact of Fluorine on the Structure and Noncovalent Interactions in the Monohydrated Cluster of ortho-Fluorinated 2-Phenylethylamine. J. Am. Chem. Soc. 2022;144:8337–8346. doi: 10.1021/jacs.2c02480.
  42. Mayorkas N., Sachs H., Schütz M., Ishiuchi S.-I., Fujii M., Dopfer O., Bar I. Structural Motifs of 2-(2-Fluoro-Phenyl)-Ethylamine Conformers. Phys. Chem. Chem. Phys. 2016;18:1191–1201. doi: 10.1039/C5CP06131H.
  43. Shachar A., Mayorkas N., Sachs H., Bar I. The Conformational Landscape of 2-(4-Fluorophenyl)-Ethylamine: Consequences of Fluorine Substitution at The Para Position. Phys. Chem. Chem. Phys. 2017;19:510–522. doi: 10.1039/C6CP06456F.
  44. Shachar A., Mayorkas N., Bar I. Structural Features of Monohydrated 2-(4-Fluorophenyl) Ethylamine: A Combined Spectroscopic and Computational Study. Phys. Chem. Chem. Phys. 2017;19:23999–24008. doi: 10.1039/C7CP04195K.
  45. Rahimi R., Saban N., Bar I. Synergistic Spectroscopic and Computational Characterization Evidencing the Preservation or Flipping of the Hydroxyl Group of 2-Phenylethyl Alcohol upon Single and Double Hydration. J. Am. Chem. Soc. 2023;145:18455–18467. doi: 10.1021/jacs.3c04762.
  46. Rahimi R., Saban N., Bar I. Conformational Mapping, Interactions, and Fluorine Impact by Combined Spectroscopic Approaches and Quantum Chemical Calculations. J. Phys. Chem. Lett. 2024;15:3658–3667. doi: 10.1021/acs.jpclett.4c00301.
  47. Mata R. A., Suhm M. A. Benchmarking Quantum Chemical Methods: Are We Heading in the Right Direction? Angew. Chem., Int. Ed. 2017;56:11011–11018. doi: 10.1002/anie.201611308.
  48. Seifert N. A., Prozument K., Davis M. J. Computational Optimal Transport for Molecular Spectra: The Fully Discrete Case. J. Chem. Phys. 2021;155:184101. doi: 10.1063/5.0069681.
  49. Kuhn H. W. The Hungarian Method for the Assignment Problem. Nav. Res. Logist. Q. 1955;2:83–97. doi: 10.1002/nav.3800020109.
  50. Munkres J. Algorithms for the Assignment and Transportation Problems. J. Soc. Ind. Appl. Math. 1957;5:32–38. doi: 10.1137/0105003.
  51. Zhao Y., Truhlar D. G. The M06 Suite of Density Functionals for Main Group Thermochemistry, Thermochemical Kinetics, Noncovalent Interactions, Excited States, and Transition Elements: Two New Functionals and Systematic Testing of Four M06-Class Functionals and 12 Other Functionals. Theor. Chem. Acc. 2008;120:215–241. doi: 10.1007/s00214-007-0310-x.
  52. Becke A. D. Density-Functional Thermochemistry. III. The Role of Exact Exchange. J. Chem. Phys. 1993;98:5648–5652. doi: 10.1063/1.464913.
  53. Lee C., Yang W., Parr R. G. Development of the Colle-Salvetti Correlation-Energy Formula into a Functional of the Electron Density. Phys. Rev. B. 1988;37:785–789. doi: 10.1103/PhysRevB.37.785.
  54. Chai J. D., Head-Gordon M. Systematic Optimization of Long-Range Corrected Hybrid Density Functionals. J. Chem. Phys. 2008;128:084106. doi: 10.1063/1.2834918.
  55. Hariharan P. C., Pople J. A. The Influence of Polarization Functions on Molecular Orbital Hydrogenation Energies. Theor. Chim. Acta. 1973;28:213–222. doi: 10.1007/BF00533485.
  56. Clark T., Chandrasekhar J., Spitznagel G. W., Schleyer P. V. R. Efficient Diffuse Function-Augmented Basis-Sets for Anion Calculations. 3. The 3-21+G Basis Set for 1st-Row Elements, Li-F. J. Comput. Chem. 1983;4:294–301. doi: 10.1002/jcc.540040303.
  57. Kendall R. A., Dunning T. H. Jr., Harrison R. J. Electron Affinities of the First-Row Atoms Revisited. Systematic Basis Sets and Wave Functions. J. Chem. Phys. 1992;96:6796–6806. doi: 10.1063/1.462569.
  58. Grimme S., Ehrlich S., Goerigk L. Effect of the Damping Function in Dispersion Corrected Density Functional Theory. J. Comput. Chem. 2011;32:1456–1465. doi: 10.1002/jcc.21759.
  59. Hanson-Heine M. W. D. Benchmarking DFT-D Dispersion Corrections for Anharmonic Vibrational Frequencies and Harmonic Scaling Factors. J. Phys. Chem. A. 2019;123:9800–9808. doi: 10.1021/acs.jpca.9b07886.
  60. Goerigk L., Hansen A., Bauer C., Ehrlich S., Najibi A., Grimme S. A Look at the Density Functional Theory Zoo with the Advanced GMTKN55 Database for General Main Group Thermochemistry, Kinetics and Noncovalent Interactions. Phys. Chem. Chem. Phys. 2017;19:32184–32215. doi: 10.1039/c7cp04913g.
  61. Mardirossian N., Head-Gordon M. Thirty Years of Density Functional Theory in Computational Chemistry: An Overview and Extensive Assessment of 200 Density Functionals. Mol. Phys. 2017;115:2315–2372. doi: 10.1080/00268976.2017.1333644.
  62. Pitman S. J., Evans A. K., Ireland R. T., Lempriere F., McKemmish L. K. Benchmarking Basis Sets for Density Functional Theory Thermochemistry Calculations: Why Unpolarized Basis Sets and the Polarized 6-311G Family Should Be Avoided. J. Phys. Chem. A. 2023;127:10295–10306. doi: 10.1021/acs.jpca.3c05573.
  63. Zapata Trujillo J. C., McKemmish L. K. Model Chemistry Recommendations for Scaled Harmonic Frequency Calculations: A Benchmark Study. J. Phys. Chem. A. 2023;127:1715–1735. doi: 10.1021/acs.jpca.2c06908.
  64. Sinha P., Boesch S. E., Gu C., Wheeler R. A., Wilson A. K. Harmonic Vibrational Frequencies: Scaling Factors for HF, B3LYP, and MP2 Methods in Combination with Correlation Consistent Basis Sets. J. Phys. Chem. A. 2004;108:9213–9217. doi: 10.1021/jp048233q.
  65. Assefa M. K., Devera J. L., Brathwaite A. D., Mosley J. D., Duncan M. A. Vibrational Scaling Factors for Transition Metal Carbonyls. Chem. Phys. Lett. 2015;640:175–179. doi: 10.1016/j.cplett.2015.10.031.
  66. Yu L., Srinivas G. N., Schwartz M. Scale Factors for C–O Vibrational Frequencies in Organometallic Complexes. J. Mol. Struct. (THEOCHEM) 2003;625:215–220. doi: 10.1016/S0166-1280(03)00024-1.
  67. Bloino J., Biczysko M., Barone V. Anharmonic Effects on Vibrational Spectra Intensities: Infrared, Raman, Vibrational Circular Dichroism, and Raman Optical Activity. J. Phys. Chem. A. 2015;119:11862–11874. doi: 10.1021/acs.jpca.5b10067.
  68. Barone V., Biczysko M., Bloino J. Fully Anharmonic IR and Raman Spectra of Medium-Size Molecular Systems: Accuracy and Interpretation. Phys. Chem. Chem. Phys. 2014;16:1759–1787. doi: 10.1039/C3CP53413H.
  69. Neugebauer J., Reiher M., Hess B. A. Coupled-Cluster Raman Intensities: Assessment and Comparison with Multiconfiguration and Density Functional Methods. J. Chem. Phys. 2002;117:8623–8633. doi: 10.1063/1.1506919.
  70. Kanchanakungwankul, S.; Bao, J. L.; Zheng, J.; Alecu, I. M.; Lynch, B. J.; Zhao, Y.; Truhlar, D. G. Database of Frequency Scale Factors for Electronic Model Chemistries, Version 5. https://comp.chem.umn.edu/freqscale/ (accessed 09 05, 2025).
  71. Ünal Y., Nassif W., Özaydin B. C., Sayin K. Scale Factor Database for the Vibration Frequencies Calculated in M06-2X, one of the DFT Methods. Vib. Spectrosc. 2021;112:103189. doi: 10.1016/j.vibspec.2020.103189.
  72. Precomputed Vibrational Scaling Factors within Computational Chemistry Comparison and Benchmark Database; National Institute of Standards and Technology (NIST), Release 22, May 2022, https://cccbdb.nist.gov/vibscalejust.asp (accessed 09 05, 2025).
  73. Keating M. E., Bonnier F., Byrne H. J. Spectral Cross-Correlation as a Supervised Approach for the Analysis of Complex Raman Datasets: The Case of Nanoparticles in Biological Cells. Analyst. 2012;137:5792–5802. doi: 10.1039/c2an36169h.
  74. Samuel A. Z., Mukojima R., Horii S., Ando M., Egashira S., Nakashima T., Iwatsuki M., Takeyama H. On Selecting a Suitable Spectral Matching Method for Automated Analytical Applications of Raman Spectroscopy. ACS Omega. 2021;6:2060–2065. doi: 10.1021/acsomega.0c05041.
  75. Karaminkov R., Chervenkov S., Neusser H. J. Identification of Conformational Structures of 2-Phenylethanol and its Singly Hydrated Complex by Mass Selective High-Resolution Spectroscopy and Ab Initio Calculations. J. Phys. Chem. A. 2008;112:839–848. doi: 10.1021/jp076541t.
  76. Karaminkov R., Chervenkov S., Neusser H. J., Ramanathan V., Chakraborty T. Competition between σ and π Hydrogen Bonds and Conformational Probing of 2-Ortho Fluorophenylethanol by Low- and High-Resolution Electronic Spectroscopy. J. Chem. Phys. 2009;130:034301. doi: 10.1063/1.3058483.

Articles from The Journal of Physical Chemistry A are provided here courtesy of American Chemical Society