Skip to main content
ACS AuthorChoice logoLink to ACS AuthorChoice
. 2025 Jul 7;36(8):1677–1685. doi: 10.1021/jasms.5c00078

Assigning Peptide Structure from Ion-Mobility Mass Spectrometry Collision Cross Section Data

Mithony Keng 1, Kenneth M Merz Jr 1,2,*
PMCID: PMC12333332  PMID: 40623297

Abstract

The ion mobility (IM) technique coupled with traditional mass spectrometry (IM-MS) has introduced a practical tool for the characterization of peptide analyte ions by exploiting the difference in their collision cross section (CCS) values. CCS holds molecular level information that can be used to computationally assign the conformation(s) of a molecular system. However, a reliable and accurate method for peptide structure prediction remains a challenge because peptides exhibit dynamic gas-phase intramolecular interactions, and hence, the methods used will need to account for this. In this work, we systematically assessed the performance of a computational workflow involving both classical and density functional theory (DFT) steps to elucidate the peptide structure. Not unexpectedly, extensive enumeration of available peptide conformations was critical to obtain high-quality results. Due to the size of the systems studied and the large numbers of conformers that needed to be optimized, we initially chose the D3-B3LYP/6-31G­(d) level of theory and obtained good agreement between experimental and computed CCS values. However, in several cases, suboptimal accuracies were observed, but we found that increasing the basis set used to 6-31G­(d,p) was able to improve our agreement with experiment. Altogether, we demonstrated that accurate peptide structure assignment is achievable with adequate sampling of the conformational space and using the appropriate quantum mechanical level of theory to account for intramolecular interactions in the gas phase.


graphic file with name js5c00078_0008.jpg


graphic file with name js5c00078_0007.jpg

Introduction

Trypsin digestion of a protein analyte in combination with mass spectrometry (MS) is one of the most common and practical methods for proteomic profiling of the metabolome. In most protein sequencing workflows, soft-ionization techniques such as matrix-assisted laser desorption/ionization and electrospray ionization (ESI) are utilized to generate precursor protein ions and fragment peptide ions, respectively, that are populated in unknown gas-phase conformations for MS analyses. The emergence of the ion mobility (IM) technique coupled with traditional MS (IM-MS) has added a practical quantitative and qualitative dimension to the characterization of biologically relevant peptide structures by exploiting the difference in the ions rotationally averaged collision cross section (CCS) values. , This is realized because ions with larger conformations have a greater frequency of collision with neutral gases in the IM-MS drift region, resulting in differentiated drift velocities between analytes with identical m/z. Within the confines of IM-MS, the CCS can be derived from ion mobility (K) and thus is affected by both the ion drift velocity (v d) and the strength of the electric field, E (see eq ). Depending on the IM technique, E can be applied as an electrodynamic field (e.g., TWIMS) or a uniform field (e.g., DTIMS).

K=vdE 1

We note that there have been several computational works relevant to CCS measurements like using machine learning, molecular dynamics (MD), and other approaches to predict CCS values for peptides; however, the outcomes of these works did not result in any experimentally validated molecular geometry or a reliable high throughput method for gas-phase structure assignment. Although works by Kim et al. (2017) and Wu et al. (2023) were able to produce final gas-phase peptide structures, their investigations were limited to resolving one and two peptide species, respectively. , Moreover, molecular modeling methods that rely solely on pairwise potentials for finding an equilibrium (ground state) peptide conformation are prone to mis-assigned structures.

To advance proteomics and support unambiguous MS based structure assignment, it is essential to have reliable gas-phase peptide theoretical and experimental standards for evaluating experimental results or for the query of physiochemical properties. The development of a theoretical reference database is an ongoing challenge due to the richness of peptide conformations in the gas phase. Unlike carbohydrates or lipids, which are structurally rigid polymers or have minimal titratable sites, respectively, the conformational space of peptides is more expansive since they can be constructed from 20 common amino acids that have flexible side chains with different degrees of acidity/basicity. Consequently, this makes finding equilibrium charge sites on peptides for modeling IM-MS ions a major challenge and, accordingly, requires an accurate and efficient quantum mechanical theory to generate reliable gas-phase conformations.

Moreover, the challenge posed by peptide conformational dynamics in replicating experimental geometry was, for example, demonstrated by the interconversion of polyalanine from a 310 helix to an α-helix conformation with increasing alanine residues using MP2/6-31G­(d) or the B97-D/def2-TZVP level of theory (interconversion onset began at alanine-8 for MP2 and at alanine-10 for B97-D). Importantly, when the B3LYP/6-311++G­(2d,2p) level with no long-range dispersion corrections was employed, the 310 helix remained the more stable and, thus, the preferred conformation for alanine-4 through alanine-16. Based on these results for the different QM theories, it is inferred that a significant amount of the intramolecular interaction is attributable to the weakly attractive, but in aggregate significant, London dispersion force. Classically, the dispersion interaction is represented in the van der Waal (E vdw) energy component of the total noncovalent energy (E nc) of a system (eq ). The energy attributed to nonbonded interaction is defined as

Enc=Evdw+Ees 2

E es is the electrostatic term. The van der Waals energy and electrostatic energy for an interacting pair of atoms a and b are described by eq and eq , respectively.

Evdw=ϵab[(rminrab)122(rminrab)6] 3
Ees=qaqb4πε0rab 4

From our experience, for relatively small molecules where the contribution of nonbonded intramolecular interactions is very small, using the popular density functional theory (DFT) functional B3LYP with a small basis set such as 6-31G­(d) is often sufficient for obtaining reasonably accurate structures in a timely manner. However, when peptide ions are large enough that distant nonbonding atoms loop back and are sterically allowed to interact, the long-range intramolecular interactions can become a significant factor in stabilizing the preferred equilibrium geometry.

In this work, we assess and validate an efficient and reliable method for the structural assignment of gas-phase peptide ions from IM-MS experimental CCS data. For the electronic structure theory (e.g., QM calculation step), we have elected to use DFT for geometry optimization and single-point energy calculation, which has shown reasonably good generalizability in prior works for small molecule and biomolecule CCS modeling. Moving forward, to prevent confusion regarding the different DFT implementations, the labels D3, D3(0), and D3­(BJ) represent a general designation of the Grimme dispersion correction, the dispersion correction with the zero-damping scheme, and the dispersion correction with the Becke–Johnson damping scheme, respectively. Additionally, the workflow also involves the use of ensembles for peptide ions produced for charge modeling and extensive conformational sampling coupled with downstream CCS computation. Carrying out CCS computations will allow us to quantify our structure assignment accuracy by comparing our results with the available experimental IM-MS reference CCS data. A detailed layout of our workflow is shown in Figure .

1.

1

Schematic overview of the workflow for a successful designation of structure assignment for a gas-phase peptide ion from CCS data.

Computational Methods

Experimental CCS Values

The experimental CCS values for the 23 tryptic peptide systems with a range of 3–14 amino acids were retrieved from the Mclean research group Unified CCS Compendium database. , Furthermore, the peptide analytes were produced via trypsin digestion and their arrival times (or drift time) were determined through a standardized interlaboratory protocol. To generate peptide ions in the [M + H]+ charge mode, electrospray ionization (ESI) was used as the ionization method. The averaged drift times from IM-MS experiments were used to calculate the CCS values through the Mason–Schamp relationship for low-field ion mobility (eq ),

Ω=(18π)1/216ze(kbT)1/2[1mi+1mb]1/2tAEL760PT273.151N 5

where z is the ion charge, k b is the Boltzmann constant, T is the temperature of the drift tube, e is the electron charge, m i is the ion mass, m b is the buffer gas mass, t A is the arrival time of the ion analyte, E is the electric field strength, P is the drift tube pressure, and N is the number density of the buffer gas. For this work, N2 is used as the drift gas; therefore, only the data and results related to N2 are considered.

Peptide Seed Generation

The peptide starting structures were generated using the protein structure prediction software AlphaFold2. To prevent structural violations or clashes, the Amber99SB force field with harmonic constraints was used to minimize and relax the predicted peptide structures. Other parameters that were used in the AlphaFold2 structure generation step were the “MMseq2­(UniRef+Enivoronmental)” option for the msa mode; “unpaired+paired” option for the pair mode; the model type was set at auto; and the number of recycles was set at 3. The AlphaFold2 run for each peptide query sequence produced five model conformations with a structure prediction quality grading for model one being the most accurate and model five being the least. Moreover, for all the peptide systems predicted by AlphaFold2 only the model one structure was chosen as the seed for the conformation generation step.

Conformation Generation

From the seed structure, models representing unique protonated +1 charge (i.e., [M + H]+) peptide species were generated according to their protonation behaviors with the aid of Maestro pK a prediction program from Schrodinger. Next, 1000 conformers were then generated from the parent protomer model utilizing the Schrodinger conformation generation software ConfGen. , We chose not to enable the OPLS3 force-field option for structure minimization because this would significantly reduce the number and diversity of conformations generated, which would have limited the amount of conformational space sampled. The steps that followed utilized the workflow established in our previous work on QM aided CCS calculation for metabolite annotation.

Conformation Optimization and Structure Similarity Reduction

Next, the 1000 conformers of each peptide protomer model were preoptimized with the neural network potential ANI. The resultant peptide conformers were clustered into ensembles of peptides with similar conformations (as measured by RMSD values), and from each ensemble, we extracted the centroid peptide (as a representative of the ensemble) for further processing. This was accomplished using the open-source program Autograph. Generally, the number of initial conformers per peptide was reduced to 14–25 conformer centers after clustering. The Autograph step is necessary to reduce the computational workload in the QM optimization step.

QM Geometry Optimization, Single-Point Energy, and Charge Calculations

The QM processing for the ensembles was accomplished using the Gaussian16 software package. For geometry optimization of the conformer centers and the subsequent single-point energy and (Mulliken) charge calculation steps, we utilized the DFT hybrid-GGA functional B3LYP with Pople’s split-valence basis set of 6-31G­(d). Additionally, to better account for contributions from dispersion effects on the ground state geometry, dispersion corrections using the D3 variants D3(0) and D3­(BJ) with this DFT level of theory were applied to the same conformer centers produced from Autograph in two independent runs (one for each dispersion method).

Additionally, to improve the representation of the gas-phase geometry, the 6-31G­(d) basis set was expanded to 6-31G (d,p) to include a polarization function for all hydrogen atoms. We again used the B3LYP functional. The protonation state that initially produced the energy minimum candidate conformation(s) at the D3-B3LYP/6-31G­(d) level of theory was carried forward to the D3-B3LYP/6-31G­(d,p) level of theory. Again, we employed both D3 variants in independent runs. The XYZ files containing the centers produced from the initial Autograph clustering (raw ensemble) step were recycled and used as the starting structure for this run. The ensuing steps used to produce this batch of CCS values followed the same workflow and software parameters as those used in the initial run. Finally, the D3 variant (D3(0) or D3­(BJ)) that produced a lower CCS error was then used to select the new candidate conformation with the lowest relative energy.

Collision Cross Section Calculation

The CCS values were calculated using the peptide DFT optimized geometry and corresponding partial charge value with the aid of the open-source software High Performance Collision Cross Section (HPCCS). This software uses the trajectory method, which requires solving both an orientationally averaged collision integral and an intermolecular interaction potential between the ion analyte and the buffer gas. The HPCCS configuration parameters were set to 1 for the number of conformations per input, 10 for average mobility cycles, 20 for velocity integration, 500 for Monte Carlos integrations, 1000 for the number of rotations in average potential, 298.0 for temperature (kelvin), and 2 for N2 buffer gas. The final computed CCS value for an ensemble is a Boltzmann weighted average CCS using the relative electronic energies of conformers that are within ≤3 kcal mol–1 of the energy minimum conformer. The final peptide structures were visually inspected using PyMol.

To be consistent with our previous works on CCS, we judged the accuracy performance of our prediction by using a 3% error threshold between the experimental and computed CCS values. This threshold allows us to account for the uncertainty that is historically encountered from experimental CCS values calculated using the Mason–Schamp equation and for laboratory IM-MS instrument calibration error. ,

Results and Discussion

Structure Prediction Performance of B3LYP/6-31G­(d) and D3-B3LYP/6-31G­(d)

As stated previously, we started the DFT calculation step in our workflow using the smaller 6-31­(d) basis set initially and proceeded accordingly in response to how reliable this level of theory was in assigning experimentally viable structures. This helped us to explicitly determine whether a larger basis set with auxiliary functions (hence greater computational cost) is necessary to produce successful gas-phase structure assignment. Detailed results containing the ensemble relative energetics and CCS prediction performance for the 23 test peptide ions using B3LYP/6-31G­(d), D3(0)-B3LYP/6-31G­(d), and D3­(BJ)-B3LYP/6-31G­(d) are tabulated in Tables S1–S23 in the Supporting Information.

With the implementation of the smallest DFT level of theory for this work (i.e., B3LYP/6-31G­(d)), we obtained a structure prediction accuracy success rate of ∼22% (5 out of 23 systems achieve ≤3% CCS error) with an average CCS error of ∼9% at a standard deviation of ±6% (Table ). We observe that the 18 peptide systems with suboptimal predictions (i.e., >3% error) all have computed CCS values that overestimate experiment, which translates to predicted ground state structures that exhibited a more extended conformation, and for isolated molecular ions in the gas-phase, this hints that the intramolecular interactions were underestimated during the DFT geometry optimization process.

1. Computed Boltzmann-Weighted CCS Values for Singly Protonated, [M + H]+, Peptide Structures Processed at the DFT B3LYP/6-31G­(d) Level of Theory with Different Dispersion Models.

  peptide ions exp CCS (Å2) B3LYP/6-31G(d) CCS (Å2) D3(0)-B3LYP/6-31G(d) CCS (Å2) D3(BJ)-B3LYP/6-31G(d) CCS (Å2)
1 ANELLINVK 322.50 347.62 (7.8) 322.93 (0.1) 342.93 (6.3)
2 AWEVTVK 278.20 333.91 (20.1) 295.78 (6.3) 296.11 (6.4)
3 AWSVAR 255.70 290.00 (11.8) 264.29 (3.4) 274.98 (7.5)
4 DYYFALAHTVR 374.80 410.42 (9.5) 412.60 (10.1) 422.64 (12.8)
5 ELR 200.30 197.81 (1.2) 203.43 (1.6) 201.62 (0.7)
6 EWTR 229.70 250.20 (8.9) 241.93 (5.3) 243.40 (6.0)
7 EYK 203.20 213.81 (5.2) 210.64 (3.7) 209.78 (3.2)
8 FAAYLER 293.10 335.07 (14.3) 302.10 (3.1) 307.17 (4.8)
9 FLNR 231.60 250.13 (8.0) 231.08 (0.2) 238.48 (3.0)
10 FPK 184.70 193.09 (4.5) 188.53 (2.1) 188.59 (2.1)
11 FSSDR 234.50 250.05 (6.6) 250.86 (7.0) 240.25 (2.5)
12 GLVK 205.30 210.80 (2.7) 210.69 (2.6) 210.63 (2.6)
13 LWSAK 237.70 267.85 (12.7) 243.40 (2.4) 243.63 (2.5)
14 NFNR 224.00 252.51 (12.7) 247.61 (10.5) 239.92 (7.1)
15 NIATGSK 255.30 298.74 (17.0) 257.95 (1.0) 248.67 (2.6)
16 poly-6-glycine 171.04 176.67 (3.3) 174.29 (1.9) 170.03 (0.6)
17 poly-8-glycine 206.52 203.34 (1.5) 201.43 (2.5) 197.56 (4.3)
18 poly-10-glycine 216.15 232.7 (7.7) 222.58 (3.0) 222.02 (2.7)
19 poly-14-glycine 254.51 277.82 (9.2) 266.42 (4.7) 266.33 (4.6)
20 TFAEALR 281.80 339.29 (20.4) 304.43 (8.0) 295.93 (5.0)
21 TIAQYAR 280.00 315.95 (12.8) 282.10 (0.8) 300.94 (7.5)
22 VASLR 232.00 230.65 (0.6) 223.39 (3.7) 232.05 (0.0)
23 WIR 213.20 210.41 (1.3) 217.68 (2.1) 208.50 (2.2)
average % error   8.8 ± 6 3.7 ± 3 4.2 ± 3
a

The average of values in parentheses.

Although energetics obtained at the MP2 level using HF/6-31­(d,p) geometries showed good agreement with experiment for small peptides, extending this to peptides of the size studied herein is still computationally intensive. With that consideration, we added a computationally inexpensive empirical dispersion correction to the initial B3LYP/6-31G­(d) level of theory for geometry optimization, single-point energy calculation, and charge calculation. Since there is no available metric or consensus in the computational chemistry community on which damping function for the D3 dispersion correction works best for peptides, we used both D3(0) and D3­(BJ) independently, and thus, we also report in this manuscript the effect of a damping function choice on the quality of structure prediction for peptide ions. Even though D3(0) and D3­(BJ) give rise to more repulsion and attraction at short-range interatomic distances, respectively, on the potential energy surface, there is no trend where one damping function favors a smaller or larger peptide conformation over another. Overall, we found that accounting for long-range dispersion reduces the average CCS error to ∼4% ± 3 (success rates of ∼52% for D3(0) and ∼47% for D3­(BJ)), which indicates a significant improvement in both the accuracy and transferability of our method.

Additionally, we measured the distance between our reference points N-terminal amine and the C-terminal carboxylic acid Cα for the peptide ions since it has been previously shown that conformation compactness is decreased with side chain–side chain interactions. This is especially relevant here since most of the charge sites assigned to the test systems are located on the terminal residues (for the location of assigned amino acid charge site(s), see Figure S1). Thus, this measurement provides a good quantitative indication of structural compactness since the distance between N to C-terminal is related to the packing efficiency of amino acid residues, driven by intramolecular interactions. For the 23 candidate peptides measured, the average N to C-terminal distance is ∼7 Å with dispersion correction compared to ∼8.7 Å without dispersion (Figure A). Moreover, Figure B shows an example of the DFT theory dependent major and minor shifts in the N to C distance for the equilibrium structures of AWSVAR and FSSDR that correlate well with their computed CCS values, which we can attribute to the overall packing efficiency of the amino acids. It should be remarked that generally our computed CCS values for the peptide systems overestimate the reference CCS; therefore we believe that equilibrium structures that are assigned using the competing DFT levels of theory that yield the smaller N to C-terminal will on average have better agreement with experiment. When we averaged the CCS values for the D3(0) and D3­(BJ), the difference in N to C-terminal distance translates to a ∼7.5% and ∼1.8% larger CCS values for the dispersion uncorrected equilibrium conformations for AWSVAR and FSSDR, respectively.

2.

2

(A) Individual and pooled average N to C-terminal distances for 23 peptide ion equilibrium structures at different levels of DFT theories. The two distinctions here are geometry optimization with unsupported (blue bar) and supported (green bar) long-range dispersion. (B) N to C-terminal (green highlight group) distance measurements (yellow dash) and computed CCS values for AWSVAR and FSSDR at three different DFT dispersion statuses.

Addition of a p-Type Polarization Function

The residual overestimation of experimental CCS using D3-B3LYP/6-31G­(d) is evidence that further improvement in the level of theory might have some benefits. Much of the main chain and side chain of peptides have hydrogen bond donors and/or acceptors, and their propensity to form hydrogen bond interactions require that polar contacts be ideally within a distance of approximately 2.8 Å. , Moreover, the B3LYP functional with the D95­(d,p) augmented basis set was routinely used in numerous works by Dannenberg and co-workers for modeling peptide in the gas-phase. ,− Thus, following their well-established precedent, we added an additional p-type polarization function for hydrogen atoms (i.e., 6-31G­(d,p)) with the aim to improve upon our initial results. Although previous works , comparing 6-31G­(d) and 6-31G­(d,p) have revealed negligible differences in resulting peptide conformation energetics (i.e., cistrans conversion barrier) and geometric parameters (i.e., bond length and torsion angle), we found that the noted differences in the atomic charges produced in these works was worth pursuing further as a way to improve structure assignment success. As already mentioned, Mulliken atomic partial charges are used in HPCCS calculations employing the trajectory method, which makes these calculations sensitive to atomic charges. For example, a survey (Figure A) of atomic charges for the peptide ELR shows a subtle difference from atom to atom between the equilibrium conformations assigned by 6-31G­(d) and 6-31­(d,p). Nevertheless, in concert with the partial charge changes and their effect on CCS calculations, we obtained slightly more compact conformations for 6-31­(d,p) that better align with experimental data (see Figure B,C).

3.

3

Effect of p-type polarization function on the ground-state electronic peptide structure. (A) ELR, [M + H]+, ion atomic partial charge distribution for heavy atoms (hydrogen summed) using the Mulliken charge scheme for 6-31G­(d) and 6-31G­(d,p). The value in red and blue indicates negative and positive partial charge, respectively; 6-31G­(d) and 6-31G­(d,p) are represented on the right of partial charge values by * and **, respectively. (B) N to C-terminal distance and computed CCS values for equilibrium ELR conformation obtained from 6-31G­(d) and (C) 6-31G­(d,p) basis sets. Both conformations are protonated at the same atomic site.

To further probe the significance of the choice of atomic point charges, we recomputed the CCS for eight arbitrarily selected peptides (ranging from 3 to 9 residues) using atomic partial charges calculated with D3BJ-B3LYP/6-31G­(d) on D3BJ-B3LYP/6-31G­(d,p) optimized geometries (i.e., D3BJ-B3LYP/6-31G­(d)//D3BJ-B3LYP/6-31G­(d,p)). Since the geometries between the two computed CCS values are identical, any difference in their CCS is entirely due to their respective partial charge distributions. As Figure shows, CCS calculated using 6-31G­(d,p) charges is significantly larger than for 6-31­(d) for large peptides (i.e., ANELLINVK, AWEVTVK, TFAEALR). We hypothesize that this is because 6-31G­(d,p) produces a greater partial charge separation. For a given ion, this ultimately resulted in a stronger dipole-induced dipole effect on the colliding N2 buffer gas and therefore a decreased in ion mobility (stronger ion–neutral interaction). On the other hand, 6-31G­(d) significantly underestimated CCS values for the D3BJ-B3LYP/6-31G­(d,p) optimized geometries for the same large peptides. For the tested peptides, we found that the CCS changes with the 6-31G­(d,p) partial charges are favorable for achieving a much better agreement with experiment.

4.

4

Percent change in CCS going from partial charges calculated with 6-31G­(d) to the 6-31­(d,p) basis set while maintaining the 6-31­(d,p) geometry. The DFT levels of theory for charge calculation are D3BJ-B3LYP/6-31G­(d)//D3BJ-B3LYP/6-31G­(d,p).

The complete performance results for the assigned equilibrium structures generated with this larger basis set are displayed in Table (for DFT results at the conformer level per ensemble charge model, see Tables S24–S46 in the Supporting Information). The addition of the p-type function significantly increases the prediction success rate to ∼74% (17 out of 23 peptide systems achieved ≤3% CCS error) at a lower average CCS error of ∼2.2% ± 1.3. Furthermore, the average CCS overestimation for D3-B3LYP/6-31G­(d,p) is ∼1% ± 2.4 compared to 2% for D3-B3LYP/6-31G­(d). It should be noted that the lower standard deviation achieved with this level of theory increases the transferability of this method to larger peptides or even other biomolecule classes. Given the average N to C-terminal distance measurements shown in Figure , which shows only an average marginal difference (∼0.2 Å), we believe that the general improved accuracy in structure assignment at this point is still a combination of a more precise description of both intramolecular and intermolecular effects.

2. Computed Boltzmann-Weighted CCS Values for Singly Protonated, [M + H]+, Peptide Structures Processed at the DFT B3LYP/6-31G­(d,p) Level of Theory with Different Dispersion Models.

  peptide ions D3(0)-B3LYP/6 31G(d,p) CCS (Å2) D3(BJ)-B3LYP/6-31G(d,p) CCS (Å2) D3-B3LYP/6-31G(d,p) CCS (Å2)
1 ANELLINVK 335.91 (4.2) 332.21 (3.0) 332.21 (3.0)
2 AWEVTVK 286.04 (2.8) 294.42 (5.8) 286.04 (2.8)
3 AWSVAR 274.57 (7.4) 265.11 (3.7) 265.11 (3.7)
4 DYYFALAHTVR 389.49 (3.9) 392.12 (4.6) 389.49 (3.9)
5 ELR 207.93 (3.8) 199.88 (0.2) 199.88 (0.2)
6 EWTR 237.44 (3.4) 237.03 (3.2) 237.03 (3.2)
7 EYK 209.74 (3.2) 209.02 (2.9) 209.02 (2.9)
8 FAAYLER 292.77 (0.1) 287.53 (1.9) 292.77 (0.1)
9 FLNR 237.89 (2.7) 237.60 (2.6) 237.60 (2.6)
10 FPK 188.09 (1.8) 187.84 (1.7) 187.84 (1.7)
11 FSSDR 241.62 (3.0) 241.31 (2.9) 241.31 (2.9)
12 GLVK 208.71 (1.7) 211.01 (2.8) 208.71 (1.7)
13 LWSAK 236.19 (0.6) 250.71 (5.5) 236.19 (0.6)
14 NFNR 219.20 (2.1) 217.09 (3.1) 219.20 (2.1)
15 NIATGSK 260.08 (1.9) 253.95 (0.5) 253.95 (0.5)
16 Poly-6-glycine 172.39 (0.8) 175.29 (2.5) 172.39 (0.8)
17 Poly-8-glycine 197.76 (4.2) 197.50 (4.4) 197.76 (4.2)
18 Poly-10-glycine 225.89 (4.5) 219.44 (1.5) 219.44 (1.5)
19 Poly-14-glycine 264.76 (4.0) 263.74 (3.6) 263.74 (3.6)
20 TFAEALR 295.01 (4.7) 290.69 (3.2) 290.69 (3.2)
21 TIAQYAR 280.33 (0.1) 277.79 (0.8) 280.33 (0.1)
22 VASLR 228.70 (1.4) 228.3 (1.6) 228.70 (1.4)
23 WIR 205.63 (3.6) 210.41 (1.3) 210.41 (1.3)
average % error 2.98 ± 1.8 2.75 ± 1.5 2.19 ± 1.3
a

The average of the values in parentheses.

b

Column contains CCS values from D3(0) or D3­(BJ) method that better agree with experiment.

5.

5

Pooled measurements of N to C-terminal distances for 23 peptides at the DFT D3-B3LYP level with 6-31G­(d) basis set (blue) or 6-31G­(d,p) basis set (orange). Average change in distance of (d ) → (d,p) is ≈−0.2 Å.

While the inclusion of dispersion , and polarization functions have been noted to improve peptide equilibrium geometry optimization in the past, this work is notable for showing the significant effect on the CCS of gas-phase peptides. Figure A summarizes the CCS response to the different levels of theory, which again reinforces the need to account for long-range dispersion for sizable molecular systems. That said, the performance of the dispersion correction with different damping function can vary on a system basis; however, we found that increasing the basis set size (i.e., 6-31G­(d) → 6-31G­(d,p)) can reduce the disparity between D3(0) and D3­(BJ) CCS values (Figure B).

6.

6

Trend for (A) average % CCS overestimation at three different DFT level implementations and for (B) effect of basis set auxiliary function on absolute % CCS difference between two dispersion correction methods.

Conclusion

When peptide analytes in solution are converted and transferred to gas-phase ions for IM-MS analysis, there are intramolecular interactions that actively and significantly contribute to the equilibrium geometry. In this work, we have established a practical method that utilizes extensive conformational modeling coupled with CCS computation to account for and quantify gas-phase intramolecular interactions for the accurate elucidation of peptide ion structures. For the DFT geometry optimization step, we demonstrated that the B3LYP/6-31G­(d) level of theory with dispersion correction significantly improved the agreement between experimental and calculated CCS values. However, the performance for this DFT level of theory is still unsatisfactory, since only marginally half of the predicted structures generated achieved an acceptable CCS error below the limit set for this work. Accordingly, we found that increasing the basis set used to 6-31G­(d,p) substantially improved accuracy, success rate, and transferability of our structure assignment method by better recovering the ion–neutral collision interaction. Altogether, we have systematically demonstrated that significant improvement in structure elucidation for peptide ions of the [M + H]+ type of 3–14 amino acids is achievable with the appropriate quantum mechanical level of theory that accounts for intramolecular and intermolecular interactions in the gas phase.

Supplementary Material

js5c00078_si_001.pdf (239.3KB, pdf)

Acknowledgments

The authors thank the Institute for Cyber-Enabled Research (ICER) at Michigan State University High-Performance Computing Center (HPCC) for computational resources. M.K. and K.M.M. acknowledge support from NIH 1U2CES030167-01.

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/jasms.5c00078.

  • Peptide ensemble DFT energies per dispersion status, assigned peptide amino acid charge site(s), and peptide ensemble DFT energies per basis set type (PDF)

The authors declare no competing financial interest.

References

  1. Hiraoka, K. , Ed. Fundamentals of Mass Spectrometry; Springer New York, 2013. [Google Scholar]
  2. Gabelica V., Marklund E.. Fundamentals of ion mobility spectrometry. Curr. Opin. Chem. Biol. 2018;42:51–59. doi: 10.1016/j.cbpa.2017.10.022. [DOI] [PubMed] [Google Scholar]
  3. Jurneczko E., Barran P. E.. How useful is ion mobility mass spectrometry for structural biology? The relationship between protein crystal structures and their collision cross sections in the gas phase. Analyst. 2011;136(1):20–28. doi: 10.1039/C0AN00373E. [DOI] [PubMed] [Google Scholar]
  4. Dodds J. N., Baker E. S.. Ion Mobility Spectrometry: Fundamental Concepts, Instrumentation, Applications, and the Road Ahead. Journal of The American Society for Mass Spectrometry. 2019;30(11):2185–2195. doi: 10.1007/s13361-019-02288-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Mosier P. D., Counterman A. E., Jurs P. C., Clemmer D. E.. Prediction of Peptide Ion Collision Cross Sections from Topological Molecular Structure and Amino Acid Parameters. Anal. Chem. 2002;74(6):1360–1370. doi: 10.1021/ac0112059. [DOI] [PubMed] [Google Scholar]
  6. Meier F., Köhler N. D., Brunner A.-D., Wanka J.-M. H., Voytik E., Strauss M. T., Theis F. J., Mann M.. Deep learning the collisional cross sections of the peptide universe from a million experimental values. Nat. Commun. 2021;12(1):1185. doi: 10.1038/s41467-021-21352-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Song X.-C., Dreolin N., Damiani T., Canellas E., Nerin C.. Prediction of Collision Cross Section Values: Application to Non-Intentionally Added Substance Identification in Food Contact Materials. J. Agric. Food Chem. 2022;70(4):1272–1281. doi: 10.1021/acs.jafc.1c06989. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Tao L., McLean J. R., McLean J. A., Russell D. H.. A Collision Cross-Section Database of Singly-Charged Peptide Ions. J. Am. Soc. Mass Spectrom. 2007;18(7):1232–1238. doi: 10.1016/j.jasms.2007.04.003. [DOI] [PubMed] [Google Scholar]
  9. Chang C.-H., Yeung D., Spicer V., Ogata K., Krokhin O., Ishihama Y.. Sequence-Specific Model for Predicting Peptide Collision Cross Section Values in Proteomic Ion Mobility Spectrometry. J. Proteome Res. 2021;20(7):3600–3610. doi: 10.1021/acs.jproteome.1c00185. [DOI] [PubMed] [Google Scholar]
  10. Ruotolo B. T.. Collision Cross Sections for Native Proteomics: Challenges and Opportunities. J. Proteome Res. 2022;21(1):2–8. doi: 10.1021/acs.jproteome.1c00686. [DOI] [PMC free article] [PubMed] [Google Scholar]; From NLM.
  11. Ghassabi Kondalaji S., Khakinejad M., Tafreshian A., Valentine S. J.. Comprehensive Peptide Ion Structure Studies Using Ion Mobility Techniques: Part 1. An Advanced Protocol for Molecular Dynamics Simulations and Collision Cross-Section Calculation. J. Am. Soc. Mass Spectrom. 2017;28(5):947–959. doi: 10.1007/s13361-017-1599-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Wu R., Metternich J. B., Kamenik A. S., Tiwari P., Harrison J. A., Kessen D., Akay H., Benzenberg L. R., Chan T. W. D., Riniker S., Zenobi R.. Determining the gas-phase structures of α-helical peptides from shape, microsolvation, and intramolecular distance data. Nat. Commun. 2023;14(1):2913. doi: 10.1038/s41467-023-38463-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Kim D., Wagner N., Wooding K., Clemmer D. E., Russell D. H.. Ions from Solution to the Gas Phase: A Molecular Dynamics Simulation of the Structural Evolution of Substance P during Desolvation of Charged Nanodroplets Generated by Electrospray Ionization. J. Am. Chem. Soc. 2017;139(8):2981–2988. doi: 10.1021/jacs.6b10731. [DOI] [PubMed] [Google Scholar]
  14. Wieczorek R., Dannenberg J. J.. H-Bonding Cooperativity and Energetics of α-Helix Formation of Five 17-Amino Acid Peptides. J. Am. Chem. Soc. 2003;125(27):8124–8129. doi: 10.1021/ja035302q. [DOI] [PubMed] [Google Scholar]
  15. Fadda E., Woods R. J.. Contribution of the empirical dispersion correction on the conformation of short alanine peptides obtained by gas-phase QM calculations. Can. J. Chem. 2013;91(9):859–865. doi: 10.1139/cjc-2012-0542. [DOI] [PMC free article] [PubMed] [Google Scholar]; Accessed Feb, 2, 2023.
  16. Man V. H., Wu X., He X., Xie X.-Q., Brooks B. R., Wang J.. Determination of van der Waals Parameters Using a Double Exponential Potential for Nonbonded Divalent Metal Cations in TIP3P Solvent. J. Chem. Theory Comput. 2021;17(2):1086–1097. doi: 10.1021/acs.jctc.0c01267. [DOI] [PubMed] [Google Scholar]
  17. Keng M., Merz K. M. Jr.. In Silico Characterization of Glycan Ions from IM-MS Collision Cross Section. J. Am. Soc. Mass Spectrom. 2025;36(3):504–513. doi: 10.1021/jasms.4c00370. [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Keng M., Merz K. M. Jr.. Eliminating the Deadwood: A Machine Learning Model for CCS Knowledge-Based Conformational Focusing for Lipids. J. Chem. Inf. Model. 2024;64:7864. doi: 10.1021/acs.jcim.4c01051. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Das S., Tanemura K. A., Dinpazhoh L., Keng M., Schumm C., Leahy L., Asef C. K., Rainey M., Edison A. S., Fernández F. M., Merz K. M. Jr.. In Silico Collision Cross Section Calculations to Aid Metabolite Annotation. J. Am. Soc. Mass Spectrom. 2022;33(5):750–759. doi: 10.1021/jasms.1c00315. [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Goerigk, L. Chapter 6 - A Comprehensive Overview of the DFT-D3 London-Dispersion Correction. In Non-Covalent Interactions in Quantum Chemistry and Physics; Otero de la Roza, A. , DiLabio, G. A. , Eds.; Elsevier, 2017; pp 195–219. [Google Scholar]
  21. Picache J. A., Rose B. S., Balinski A., Leaptrot K. L., Sherrod S. D., May J. C., McLean J. A.. Collision cross section compendium to annotate and predict multi-omic compound identities. Chem. Sci. 2019;10(4):983–993. doi: 10.1039/C8SC04396E. [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. May J. C., Goodwin C. R., Lareau N. M., Leaptrot K. L., Morris C. B., Kurulugama R. T., Mordehai A., Klein C., Barry W., Darland E., Overney G., Imatani K., Stafford G. C., Fjeldsted J. C., McLean J. A.. Conformational Ordering of Biomolecules in the Gas Phase: Nitrogen Collision Cross Sections Measured on a Prototype High Resolution Drift Tube Ion Mobility-Mass Spectrometer. Anal. Chem. 2014;86(4):2107–2116. doi: 10.1021/ac4038448. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Stow S. M., Causon T. J., Zheng X., Kurulugama R. T., Mairinger T., May J. C., Rennie E. E., Baker E. S., Smith R. D., McLean J. A., Hann S., Fjeldsted J. C.. An Interlaboratory Evaluation of Drift Tube Ion Mobility–Mass Spectrometry Collision Cross Section Measurements. Anal. Chem. 2017;89(17):9048–9055. doi: 10.1021/acs.analchem.7b01729. [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. Bryant P., Pozzati G., Elofsson A.. Improved prediction of protein-protein interactions using AlphaFold2. Nat. Commun. 2022;13(1):1265. doi: 10.1038/s41467-022-28865-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Schrödinger Release 2023-3: Maestro; Schrödinger, LLC, New York, NY, 2023. [Google Scholar]
  26. Watts K. S., Dalal P., Murphy R. B., Sherman W., Friesner R. A., Shelley J. C.. ConfGen: A Conformational Search Method for Efficient Generation of Bioactive Conformers. J. Chem. Inf. Model. 2010;50(4):534–546. doi: 10.1021/ci100015j. [DOI] [PubMed] [Google Scholar]
  27. Schrödinger Release 2023-3: ConfGen; Schrödinger, LLC: New York, NY, 2023. [Google Scholar]
  28. Smith J. S., Isayev O., Roitberg A. E.. ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost. Chemical Science. 2017;8(4):3192–3203. doi: 10.1039/C6SC05720A. [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Tanemura K. A., Das S., Merz K. M. Jr.. AutoGraph: Autonomous Graph-Based Clustering of Small-Molecule Conformations. J. Chem. Inf. Model. 2021;61(4):1647–1656. doi: 10.1021/acs.jcim.0c01492. [DOI] [PubMed] [Google Scholar]
  30. Gaussian 16, revision C.01; Gaussian, Inc.: Wallingford, CT, 2016. [Google Scholar]
  31. Zanotto L., Heerdt G., Souza P. C. T., Araujo G., Skaf M. S.. High performance collision cross section calculationHPCCS. J. Comput. Chem. 2018;39(21):1675–1681. doi: 10.1002/jcc.25199. [DOI] [PubMed] [Google Scholar]; Accessed Feb 16, 2023.
  32. The PyMOL Molecular Graphics System, version 2.0; Schrödinger, LLC. [Google Scholar]
  33. Bush M. F., Campuzano I. D. G., Robinson C. V.. Ion Mobility Mass Spectrometry of Peptide Ions: Effects of Drift Gas and Calibration Strategies. Anal. Chem. 2012;84(16):7124–7130. doi: 10.1021/ac3014498. [DOI] [PubMed] [Google Scholar]
  34. Bush M. F., Hall Z., Giles K., Hoyes J., Robinson C. V., Ruotolo B. T.. Collision Cross Sections of Proteins and Their Complexes: A Calibration Framework and Database for Gas-Phase Structural Biology. Anal. Chem. 2010;82(22):9557–9565. doi: 10.1021/ac1022953. [DOI] [PubMed] [Google Scholar]
  35. Beachy M. D., Chasman D., Murphy R. B., Halgren T. A., Friesner R. A.. Accurate ab Initio Quantum Chemical Determination of the Relative Energetics of Peptide Conformations and Assessment of Empirical Force Fields. J. Am. Chem. Soc. 1997;119(25):5908–5920. doi: 10.1021/ja962310g. [DOI] [Google Scholar]
  36. Grimme S.. Density functional theory with London dispersion corrections. WIREs Computational Molecular Science. 2011;1(2):211–228. doi: 10.1002/wcms.30. [DOI] [Google Scholar]; Accessed Feb 16, 2023.
  37. Warnke S., von Helden G., Pagel K.. Protein Structure in the Gas Phase: The Influence of Side-Chain Microsolvation. J. Am. Chem. Soc. 2013;135(4):1177–1180. doi: 10.1021/ja308528d. [DOI] [PubMed] [Google Scholar]
  38. Nagarajan S., Rajadas J., Malar E. J. P.. Density functional theory analysis and spectral studies on amyloid peptide Aβ(28–35) and its mutants A30G and A30I. J. Struct. Biol. 2010;170(3):439–450. doi: 10.1016/j.jsb.2010.02.017. [DOI] [PubMed] [Google Scholar]
  39. Counterman A. E., Clemmer D. E.. Volumes of Individual Amino Acid Residues in Gas-Phase Peptide Ions. J. Am. Chem. Soc. 1999;121(16):4031–4039. doi: 10.1021/ja984344p. [DOI] [Google Scholar]
  40. Taylor R.. Short Nonbonded Contact Distances in Organic Molecules and Their Use as Atom-Clash Criteria in Conformer Validation and Searching. J. Chem. Inf. Model. 2011;51(4):897–908. doi: 10.1021/ci100466h. [DOI] [PubMed] [Google Scholar]
  41. Poudel H., Reid K. M., Yamato T., Leitner D. M.. Energy Transfer across Nonpolar and Polar Contacts in Proteins: Role of Contact Fluctuations. J. Phys. Chem. B. 2020;124(44):9852–9861. doi: 10.1021/acs.jpcb.0c08091. [DOI] [PubMed] [Google Scholar]
  42. Wieczorek R., Dannenberg J. J.. Amide I Vibrational Frequencies of α-Helical Peptides Based upon ONIOM and Density Functional Theory (DFT) Studies. J. Phys. Chem. B. 2008;112(4):1320–1328. doi: 10.1021/jp077527j. [DOI] [PubMed] [Google Scholar]
  43. Viswanathan R., Dannenberg J. J.. A Density Functional Theory Study of Vibrational Coupling in the Amide I Band of β-Sheet Models. J. Phys. Chem. B. 2008;112(16):5199–5208. doi: 10.1021/jp8001004. [DOI] [PubMed] [Google Scholar]
  44. Wieczorek R., Dannenberg J. J.. Hydrogen-Bond Cooperativity, Vibrational Coupling, and Dependence of Helix Stability on Changes in Amino Acid Sequence in Small 310-Helical Peptides. A Density Functional Theory Study. J. Am. Chem. Soc. 2003;125(46):14065–14071. doi: 10.1021/ja034034t. [DOI] [PubMed] [Google Scholar]
  45. Wieczorek R., Dannenberg J. J.. The Energetic and Structural Effects of Single Amino Acid Substitutions upon Capped α-Helical Peptides Containing 17 Amino Acid Residues. An ONIOM DFT/AM1 Study. J. Am. Chem. Soc. 2005;127(49):17216–17223. doi: 10.1021/ja052689j. [DOI] [PubMed] [Google Scholar]
  46. Kang Y. K.. Ab initio and DFT conformational study of proline dipeptide. Journal of Molecular Structure: THEOCHEM. 2004;675(1):37–45. doi: 10.1016/j.theochem.2003.12.031. [DOI] [Google Scholar]
  47. Morgon N. H., Custodio R., Tostes J. G. R., Taft C. A.. Influence of polarization functions on atomic properties of bridge N and H atoms of HCN ··· HCN. Journal of Molecular Structure: THEOCHEM. 1995;335(1):11–23. doi: 10.1016/0166-1280(94)03977-S. [DOI] [Google Scholar]
  48. Goerigk L., Collyer C. A., Reimers J. R.. Recommending Hartree–Fock Theory with London-Dispersion and Basis-Set-Superposition Corrections for the Optimization or Quantum Refinement of Protein Structures. J. Phys. Chem. B. 2014;118(50):14612–14626. doi: 10.1021/jp510148h. [DOI] [PubMed] [Google Scholar]
  49. Hameed R., Khan A., van Mourik T.. Intramolecular BSSE and dispersion affect the structure of a dipeptide conformer. Mol. Phys. 2018;116(9):1236–1244. doi: 10.1080/00268976.2017.1418029. [DOI] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

js5c00078_si_001.pdf (239.3KB, pdf)

Articles from Journal of the American Society for Mass Spectrometry are provided here courtesy of American Chemical Society

RESOURCES