Author manuscript; available in PMC: 2011 Jul 15.
Published in final edited form as: Anal Chem. 2010 Jul 15;82(14):6056–6065. doi: 10.1021/ac1006415

Evaluation of the Identification Power of RPLC Analyses in the Screening for Drug Compounds

Melanie Dumarey 1,2, Yvan Vander Heyden 1, Sarah C Rutan 2,*
PMCID: PMC2914508  NIHMSID: NIHMS217095  PMID: 20578680

Abstract

The identification of drugs of abuse is an important issue in forensic science. The main goal is to trace and identify as many drugs as possible in the shortest possible time, preferably with a simple analysis method. One possibility is to screen samples using a Liquid Chromatography – Diode Array Detection (LC-DAD) system. However, when a second analysis is performed simultaneously on a chromatographic column exhibiting selectivity differences from the first one, i.e., on orthogonal or dissimilar columns, a greater number of drugs can potentially be identified without investing much extra time or money. The primary difficulty is then selecting the most appropriate columns. In this paper, it is demonstrated that selecting the most dissimilar columns based on measures such as correlation or Snyder’s Fs value is not optimal, because these measures do not take into account the identification power of the individual systems. This implies that a large number of drugs may not necessarily be identified on the systems selected using these criteria. Therefore, three other measures are tested to evaluate the identification power obtained by parallel screening on two columns or by comprehensive two-dimensional LC (LC×LC). The simplest approach is counting the number of compounds separable with a difference in retention time greater than a predefined critical value. However, this measure reflects neither the co-elution pattern of the unidentified drugs nor the degree of separation of all compounds. The second tested measure, information, enables differentiation between systems that identify the same number of compounds but produce different co-elution patterns. Multivariate selectivity, the third tested parameter, takes into account the degree of separation of all compounds and has the advantage that it reflects the gain in identification power achieved by introducing DAD data.
All three proposed measures also enable evaluation of whether the corresponding LC×LC method will result in a greater identification power.

Keywords: information theory, two-dimensional separations, Snyder Fs, correlation, multivariate selectivity


In forensic laboratories, powder samples, liquids or biological fluids often need to be screened for a large number of drugs in a short time. For this purpose, the simplest analysis method with a high identification power is desired. The analyst should thus find a compromise between simplicity and efficiency. One-dimensional (1D) reversed phase liquid chromatography (LC) coupled to diode-array detection (DAD) is one of several possible screening procedures.1,2 Although LC coupled with mass spectrometry (MS) offers more identification power than DAD detection, the cost of MS as a screening tool means that LC-DAD analyses are still quite useful. In ref. (3) it is even stated that the identification power of LC-DAD is similar to that of low resolution LC-MS-MS. Another means of improving the identification power is to carry out simultaneous screening on two dissimilar columns, i.e., columns showing significant selectivity differences. Simultaneous LC analyses can improve the identification power of the analysis with only a small increase in analysis time and only a moderate increase in cost.4 If this approach does not result in a sufficient separation, one can consider applying more complex two-dimensional liquid chromatography (LC×LC) methods that can provide higher peak capacities than one-dimensional separations.5 If the number of studied drugs is large, it can be difficult for the analyst to decide whether the more complex analysis will yield a gain in information about the drugs. The primary issue, however, in using either a parallel 1D-LC approach or an LC×LC method, is the selection of the two columns that will give maximal identification power. The term “identification power” is used frequently in the field of toxicological analysis.6-10
Here, the term identification power is used to indicate a measure of the number of compounds that can be identified unambiguously; in the best case, this metric will also reflect the degree of separation of the unidentified compounds. In this paper, several approaches are discussed to assess, in a quantitative way, the identification power of (1) a single 1D-LC separation, (2) two parallel 1D-LC separations and (3) the corresponding LC×LC method, based on the measured retention times of a set of drugs of abuse. The most common way to find a combination of two columns that will provide maximal information is to choose the two most dissimilar columns (i.e., columns showing significant selectivity differences). Dissimilarity can be evaluated based on the correlation between the retention times of the drug compounds11 or based on Snyder’s Fs values.12 In addition, other research groups have developed tools to select (dis)similar columns. Hoogmartens et al.,13 for example, introduced an F value based on four test parameters characterizing the column. The group of Vander Heyden applied chemometric techniques such as hierarchical clustering,14 auto-associative multivariate regression trees15 or an orthogonal projection approach16 to the retention times of a representative set of pharmaceutical compounds in order to select the most dissimilar chromatographic systems. Jandera has described the degree of column orthogonality in terms of the differences in functional group selectivities.17,18 However, the fact that two columns are very dissimilar does not imply that the separations obtained on both columns exhibit high efficiency and/or selectivity. Moreover, it is also not guaranteed that all compounds are retained for a reasonable time. Therefore, it would be preferable to use measures that accurately assess the identification power of the individual separations, as well as the dissimilarity between the two separation systems.
A very simple, straightforward approach is to count the number of the compounds of interest (N) that can be separated on one column by a difference in retention time greater than a predefined critical value. When two parallel 1D separations are considered, one can first consider the first column and then check whether the initially unresolved peaks are separated on the second column with a difference in retention time higher than a predefined value. For LC×LC, a critical resolution, i.e., the minimal resolution between two peaks still enabling their identification, must be defined for both separation dimensions. It should be pointed out that this approach is distinctly different from simply reporting the peak capacity, which deals only with hypothetical, equally spaced peaks, and not real compounds with measured retention times. Two more advanced tools, i.e., information theory19 and multivariate selectivity,20-22 can also be used to determine the quality of a given separation that uses either one or two columns. Multivariate selectivity has the advantage that the increase in identification power resulting from the use of spectral information (for instance by the use of DAD instead of single wavelength detection) can be evaluated.

In this paper some well established approaches, i.e., Fs and R2, will be compared with more innovative approaches (number of compounds separated (N), information theory and multivariate selectivity) regarding column selection in screening for unknown drugs using LC-DAD.

THEORY

Correlation Coefficient (R2) or Snyder’s Fs Value

The dissimilarity between two chromatographic columns can be assessed by calculating the Pearson’s correlation23 (R2) between the retention times of a set of compounds measured on the considered columns.11 If the retention times for the same compounds on the two columns are very different (i.e., they show a low correlation), it implies that the columns exhibit selectivity differences, and the columns are concluded to be dissimilar. The chance is then very high that the second column delivers complementary information resulting in an improved identification power.
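As an illustration of this dissimilarity check, the sketch below computes the squared Pearson correlation between two sets of retention times with NumPy. The function name `squared_correlation` and the five retention times per column are hypothetical and are not taken from the measured data set.

```python
import numpy as np

def squared_correlation(t_col1, t_col2):
    """Squared Pearson correlation (R2) between retention times on two columns."""
    r = np.corrcoef(np.asarray(t_col1, float), np.asarray(t_col2, float))[0, 1]
    return r ** 2

# Hypothetical retention times (min) of five compounds, for illustration only
t_a = [1.2, 2.5, 3.1, 4.8, 6.0]
t_b = [1.4, 2.4, 3.3, 4.9, 6.2]   # similar elution order: high R2 expected
t_c = [5.0, 1.1, 6.2, 2.0, 3.9]   # scrambled elution order: low R2 expected

r2_similar = squared_correlation(t_a, t_b)
r2_dissimilar = squared_correlation(t_a, t_c)
```

A low value only indicates that the two elution orders differ; it says nothing about how well either column separates the compounds.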

Another method to assess the selectivity differences between two columns is the use of the well-established column selectivity function (Fs) of Snyder.12 It is based on Snyder’s hydrophobic subtraction model, which involves five column properties that define the column selectivity and that will determine the retention (more specifically the selectivity factor) of a solute. These parameters are hydrophobicity (H), steric interaction (S*), hydrogen-bond acidity (A), hydrogen-bond basicity (B) and the cation-exchange activity (C).24 The Fs value between two columns can then be calculated as follows (eq 1):

$$F_s = \left\{[12.5(H_2 - H_1)]^2 + [100(S^*_2 - S^*_1)]^2 + [30(A_2 - A_1)]^2 + [143(B_2 - B_1)]^2 + [83(C_2 - C_1)]^2\right\}^{1/2} \tag{1}$$

Each term in this equation is weighted according to its relative importance to the overall selectivity.12 The greater the obtained Fs value, the more dissimilar the columns. The weighting factors in eq 1 are based on Snyder’s solutes, and it might be questioned if these are also applicable for gradient analysis of drugs. However, the appropriate experimental information to calculate new, reliable weighting factors was not available to us. Fan et al. have shown that the weighting factors in eq 1 are appropriate for the evaluation of column similarities for gradient elution separations of these drug compounds.25
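Eq 1 can be evaluated directly once the five hydrophobic subtraction parameters of both columns are known. The sketch below is a minimal illustration: the weights are those of eq 1, while the function name `snyder_fs`, the dictionary layout (with key "S" standing for S*) and the two parameter sets are hypothetical and do not correspond to any column in this study.

```python
import math

# Weighting factors from eq 1; the key "S" stands for the steric term S*
WEIGHTS = {"H": 12.5, "S": 100.0, "A": 30.0, "B": 143.0, "C": 83.0}

def snyder_fs(col1, col2):
    """Column selectivity function Fs between two columns (eq 1)."""
    return math.sqrt(sum((w * (col2[k] - col1[k])) ** 2
                         for k, w in WEIGHTS.items()))

# Hypothetical parameter sets, for illustration only
col_a = {"H": 1.00, "S": 0.00, "A": 0.10, "B": 0.00, "C": 0.10}
col_b = {"H": 0.90, "S": 0.02, "A": 0.00, "B": 0.01, "C": 0.30}

fs_ab = snyder_fs(col_a, col_b)
```

Because each difference is squared, Fs is symmetric in the two columns and equals zero only when all five parameters coincide.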

Number of Separated Compounds

A very simple and logical measure to assess the separation power of an analytical method is to count the number of compounds that are separated by more than a preset critical difference in retention time (Δtcrit). In the present work, two peaks separated with a resolution (Rs) of at least 0.7 are considered distinguishable, so that the corresponding drugs can be told apart. The Δtcrit is then calculated as follows (eq 2):

$$\Delta t_{crit} = R_s \times w = 0.7 \times w \tag{2}$$

where w is the average base width of the peaks, defined as 4σ.

When two 1D separations are performed in parallel, one can first determine which compounds are not separated on the first column with a difference in retention time greater than Δtcrit. Then it can be determined whether these compounds are separated on the second column. For LC×LC, two Δtcrit values must be considered, one for each dimension of the LC×LC method. If either of the Δtcrit values is exceeded, then the corresponding compounds are deemed to be separated.
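The counting rule can be sketched as follows. This is one possible reading of the procedure, assumed here for illustration: a compound counts as separated when every other compound differs from it by more than Δtcrit on at least one of the columns considered. The function name `count_separated` and the retention times are hypothetical.

```python
import numpy as np

def count_separated(times, dt_crit):
    """Count compounds resolved from all others on at least one column.

    times   : (n_compounds,) or (n_compounds, n_columns) retention times
    dt_crit : scalar, or one critical difference per column (e.g. for LCxLC)
    """
    t = np.asarray(times, float)
    if t.ndim == 1:
        t = t[:, None]
    crit = np.broadcast_to(np.asarray(dt_crit, float), (t.shape[1],))
    diff = np.abs(t[:, None, :] - t[None, :, :])   # pairwise differences per column
    resolved = (diff > crit).any(axis=2)           # pair resolved on >= 1 column
    np.fill_diagonal(resolved, True)               # ignore self-comparison
    return int(resolved.all(axis=1).sum())

# Hypothetical normalized retention times of five compounds on two columns
col1 = [0.00, 0.01, 0.50, 0.60, 1.00]
col2 = [0.00, 0.30, 0.60, 0.605, 1.00]

n_col1 = count_separated(col1, 0.0135)
n_col2 = count_separated(col2, 0.0135)
n_parallel = count_separated(np.column_stack([col1, col2]), 0.0135)
n_lcxlc = count_separated(np.column_stack([col1, col2]), (0.024, 0.040))
```

In this toy example each column alone leaves one pair unresolved, but the pairs differ, so the parallel combination resolves all five compounds.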

Information Theory

When an unknown compound is considered to be possibly one of a number of different compounds (in our case, the compound is one of 25 drugs), additional information is required to narrow the list of possible candidate compounds.16 This information can be produced by an analytical separation technique that can identify all or a number of the possible compounds. Information theory allows for calculation of the reduction of uncertainty achieved by an analytical procedure and thus makes it possible to evaluate qualitative analyses in a quantitative way.

Before this technique can be applied, the retention times are normalized according to eq 3:

$$t_n = \frac{t_R - t_{min}}{t_{max} - t_{min}} \tag{3}$$

where tR is the measured retention time, tmin is the retention time of the earliest eluting compound and tmax the maximal retention time obtained on the considered column. This results in normalized retention times with values between zero and one. This normalization effectively takes into account that when a method is developed, the mobile phase gradient will be optimized such that the retention times of the compounds will be spread across the gradient time.26

The information content of a 1D analysis method, I,10 can be defined as (eq 4)

$$I = -\sum_k p_k \log_2 p_k = -\sum_k \frac{n_k}{N} \log_2 \frac{n_k}{N} \tag{4}$$

where pk is the probability of the incidence of a single possible result, k, out of N possible results. In this case nk is the number of compounds eluting during the kth retention time class, and N is the number of compounds. The size of the retention time classes is based on the Δtcrit calculated using eq 2.
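Eqs 3 and 4 can be combined into a short calculation. The sketch below assumes fixed-width retention time classes of size Δtcrit starting at zero, which is only one way to define the classes; the function names and retention times are hypothetical.

```python
import numpy as np

def normalize(t_r):
    """Scale retention times to the range [0, 1] (eq 3)."""
    t = np.asarray(t_r, float)
    return (t - t.min()) / (t.max() - t.min())

def bin_index(t_norm, dt_crit):
    """Assign each normalized retention time to a class of width dt_crit."""
    return np.floor(np.asarray(t_norm, float) / dt_crit).astype(int)

def information(labels):
    """Information content in bits (eq 4) from class labels."""
    _, counts = np.unique(labels, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

# Hypothetical retention times (min) of four compounds
all_resolved = information(bin_index(normalize([1.0, 2.0, 3.0, 5.0]), 0.0135))
one_pair_coeluting = information(bin_index(normalize([1.0, 1.01, 3.0, 5.0]), 0.0135))
```

When every compound falls in its own class, the information equals log2 N (2 bits for four compounds); each co-eluting pair lowers the value.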

The information content of the combination of two chromatographic systems in a comprehensive 2D format, I(1,2), can also be calculated using eq 4.27,28 In the case of LC×LC, the retention time classes are two-dimensional squares defined by two ranges of retention times, one range for the first dimension separation and one range for the second dimension separation. The information can again be calculated with eq 4 and now the probability is determined by counting the number of drugs located in each square. The mutual information of two systems, I(1;2), can be determined as follows (eq 5):

$$I(1;2) = I(1) + I(2) - I(1,2) \tag{5}$$

where I(1) and I(2) indicate the information obtained from the two systems independent of one another, and I(1,2) represents the information obtained from the LC×LC technique. The mutual information reflects the fact that some of the information obtained by the two separations is redundant.

The similarity coefficient, R(1,2), is another measure of the qualitative information yield. When its value approaches one, it means that the two combined systems do not yield any more information than the better of the two systems individually. It can be calculated using the formulas below (eqs 6 and 7):19,28

$$d(1,2) = 1 - \frac{I(1;2)}{I(1,2)} \tag{6}$$

$$R(1,2) = \sqrt{1 - d(1,2)^2} \tag{7}$$
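The quantities in eqs 5 to 7 follow from three evaluations of eq 4. The sketch below is illustrative only: it takes class labels for each dimension as input, treats each pair of labels as one two-dimensional class, and assumes that eq 7 is the square-root form R = (1 − d²)^(1/2). The function names and labels are hypothetical.

```python
import numpy as np

def entropy(labels):
    """Information content in bits (eq 4) from class labels (1D or row-wise 2D)."""
    _, counts = np.unique(labels, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def similarity(bins1, bins2):
    """Return I(1,2), I(1;2), d(1,2) and R(1,2) for two binned separations."""
    i1, i2 = entropy(bins1), entropy(bins2)
    i12 = entropy(np.column_stack([bins1, bins2]))  # joint 2D classes
    mutual = i1 + i2 - i12                          # eq 5
    d = 1.0 - mutual / i12                          # eq 6 (undefined if i12 == 0)
    r = float(np.sqrt(1.0 - d ** 2))                # eq 7, assumed square-root form
    return i12, mutual, d, r

# Hypothetical class labels of four compounds on two systems
complementary = similarity([0, 0, 1, 1], [0, 1, 0, 1])
redundant = similarity([0, 0, 1, 1], [0, 0, 1, 1])
```

In the complementary case the second system resolves exactly the pairs the first one leaves unresolved (R = 0); in the redundant case it adds nothing (R = 1).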

In the case of two parallel 1D separations, several simplified procedures were used to attempt to calculate I(1,2). The most successful approach was to assume that all non-separated compounds are co-eluting with only one compound. This resulted in realistic values for the information yield of the method, but does have a risk of overestimating the information yield of the parallel separations.

Multivariate selectivity

The evaluation of the identification power of chromatographic systems can also be based on a multivariate selectivity parameter (SEL).21,22,29,30 This parameter quantifies the precision of quantification for a compound present in a mixture, as compared to the precision of quantification when analyzed as a pure component. This implies that a compound with a high selectivity can be unambiguously identified. The average of the multivariate selectivities of all compounds has been used as an estimator for the identification power of the analysis.21 Another option is to count the number of compounds with a multivariate selectivity greater than 0.98, because it can be shown that a compound separated from all others by a resolution of at least 0.7 has a multivariate selectivity of at least 0.98. An advantage of the multivariate selectivity measure is that spectral information can also be included, and consequently, the gain in identification power obtained by adding spectral information will be reflected in the SEL values.

For the case where the identification is based on the retention times measured on one chromatographic system, SEL according to Messick, Kalivas and Lang (MKL) is given by eq 8:20,29

$$\mathrm{SEL}_i = \left[\left(\mathbf{A}_1^{T}\mathbf{A}_1\right)^{-1}\right]_{ii}^{-1/2} \tag{8}$$

where A1 is a matrix containing the pure component chromatographic profiles in the columns of the matrix. When the chromatographic profiles on one system are combined with the spectra during identification (i.e., LC-DAD analysis) this equation is adapted as follows (eq 9)20,29:

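$$\mathrm{SEL}_i = \left[\left(\left(\mathbf{A}_1^{T}\mathbf{A}_1\right)\cdot\left(\mathbf{A}_3^{T}\mathbf{A}_3\right)\right)^{-1}\right]_{i,i}^{-1/2} \tag{9}$$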

where A3 is a matrix containing the spectra of the individual components of the candidate compounds, and the inner dot represents the element-wise Hadamard product. The latter equation can also be used when LC×LC is used, and no spectra are used for identification. In that case, A2, a matrix containing the pure component chromatographic profiles on the second dimension column, is substituted in place of A3. A last possibility is that LC×LC is combined with DAD detection. The multivariate selectivity is then calculated as follows (eq 10)20,29:

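$$\mathrm{SEL}_i = \left[\left(\left(\mathbf{A}_1^{T}\mathbf{A}_1\right)\cdot\left(\mathbf{A}_2^{T}\mathbf{A}_2\right)\cdot\left(\mathbf{A}_3^{T}\mathbf{A}_3\right)\right)^{-1}\right]_{i,i}^{-1/2} \tag{10}$$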

The chromatographic profiles used to calculate the multivariate selectivity are Gaussian peaks simulated with Matlab based on the measured retention times and peak widths (4σ). The retention times are first normalized to values between zero and one and the σ values are adjusted correspondingly. In LC×LC the σ values for the first and second dimension peaks can be derived from the observed peak capacity values.31 In the first dimension, the interval between the simulated points in the profile also must be increased, to reflect the low sampling rate in this dimension.
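The simulation and selectivity calculation can be sketched in Python with NumPy (the original calculations were done in Matlab). The sketch rests on two assumptions: every profile is scaled to unit length, and the multi-way selectivity is taken as the inverse of the element-wise product of the Gram matrices, which is the commonly cited MKL form. The function names, retention times and the two-wavelength "spectra" are hypothetical.

```python
import numpy as np

def gaussian_profiles(t_norm, sigma, step):
    """Unit-length Gaussian profiles (columns) on an axis from -4*sigma to 1+4*sigma."""
    t_norm = np.asarray(t_norm, float)
    axis = np.arange(-4 * sigma, 1 + 4 * sigma + step, step)
    prof = np.exp(-0.5 * ((axis[:, None] - t_norm[None, :]) / sigma) ** 2)
    return prof / np.linalg.norm(prof, axis=0)

def selectivity(*profile_sets):
    """Multivariate selectivity per compound from one or more profile matrices."""
    n = profile_sets[0].shape[1]
    gram = np.ones((n, n))
    for a in profile_sets:
        gram = gram * (a.T @ a)          # element-wise product of Gram matrices
    return 1.0 / np.sqrt(np.diag(np.linalg.inv(gram)))

sigma = 0.0048
# Two peaks separated by Rs = 0.7, i.e. 0.7 * 4 * sigma
sel_rs07 = selectivity(gaussian_profiles([0.5, 0.5 + 2.8 * sigma], sigma, sigma / 10))

# Two strongly overlapping peaks, without and with hypothetical unit-length spectra
close = gaussian_profiles([0.5, 0.5 + 0.5 * sigma], sigma, sigma / 10)
spectra = np.array([[1.0, 0.5], [0.0, np.sqrt(0.75)]])   # columns = compounds
sel_lc = selectivity(close)
sel_lc_dad = selectivity(close, spectra)
```

For an isolated pair at Rs = 0.7 this sketch gives a selectivity of about 0.99; a compound with neighbors at that distance on both sides comes out near 0.98. Adding the spectral matrix raises the selectivity of the overlapping pair, which is the gain from DAD data described above.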

EXPERIMENTAL SECTION

The retention times of 25 drug compounds (methapyrilene, zolpidem, aminoflunitrazepam, pyrilamine, chlorpheniramine, chlordiazepoxide, bromazepam, desipramine, oxazepam, nitrazepam, amitriptyline, estazolam, perphenazine, clonazepam, desalkylflurazepam, nordazepam, triazolam, flunitrazepam, temazepam, clobazam, lormetazepam, diazepam, halazepam, prazepam and buclizine) were determined on 13 different columns.25 Four different groups can be distinguished among the tested columns.25 The first group contains Type B silica-based C18 columns (Discovery C18, StableBond C18, ACE C18 and Alltima C18), which show very similar selectivity. The silica-based phenyl columns (StableBond Phenyl, Betabasic Phenyl and Prontosil Phenyl) form a second group. They are similar to each other, but differ from the C18 columns. The third group consists of two polar embedded silica-based columns: the Prism RP and the Bonus RP. A variety of dissimilar columns are included in the last group: two cyano columns (Novapak CN and Inertsil CN) and two in-house prepared columns (HC-OH and HC-SO3).32,33 For all columns, a gradient of 30-70% ACN over 10 minutes was applied, except for the HC-SO3 column, for which a 20-70% ACN gradient was performed. For the StableBond C18 column, both gradients were tested. This resulted in 14 different chromatographic systems.

All calculations were performed using Matlab, v. 7.5 (Mathworks, Inc., Natick, MA).

RESULTS AND DISCUSSION

Data Pretreatment

Before using the proposed methods to determine the identification power of a considered RPLC system, all retention times measured on each stationary phase are normalized according to eq 3, resulting in values ranging from zero to one. The normalization is applied to correct for the fact that the same gradient was applied on almost all columns. A consequence of the latter practice is that on some columns the peaks cover a large portion of the available separation area, such as the SB C18 column, while on other columns, such as the Novapak CN column, all compounds elute during the last three minutes of the analysis. When performing this normalization, the peaks will cover the whole separation space, which would also be achieved when optimizing the gradient for each stationary phase.

Parameters for Characterization of the Identification Power of Separation Methods

The parameters used for the calculations of the identification power of the methods for drug analysis are summarized in Table 1. For the correlation calculations, no specific parameters besides the observed retention times are required. The relationship between retention on two different columns is calculated, and is independent of the scale or spread of the retention times. For the Fs calculations, the results depend solely on the values obtained from the Snyder database of column selectivity parameters.12 For determining the number of compounds separated, as well as for the information calculations, we normalize retention times between zero and one, but we also need a Δtcrit value to determine at what degree of separation a pair of peaks will be deemed to be separated. The average σ for the experimental data was found to be 0.032 min, and was consistent across the different columns. From this σ value, we obtain a Δtcrit value of 0.091 min using eq 2. The normalized σ value is 0.0048, and then, using eq 2, the Δtcrit is found as 0.0135 (both dimensionless quantities). We use these σ and Δtcrit values for the normalized data for all columns when evaluating the single and parallel column separations. This gives rise to a peak capacity of 52 for each column.

Table 1.

Parameters for calculations of the identification power

Separation system | σ (min) | σ (norm) a | Δtcrit (norm) a | Sampling interval (norm)
1D separation | 0.032 b | 0.0048 | 0.0135 | 0.00048 f
Parallel 1D separations | 0.032 b | 0.0048 | 0.0135 | 0.00048 f
2D separation, 1st dimension | | 0.0050 c / 0.0086 d | 0.024 | 0.0152 g
2D separation, 2nd dimension | | 0.0144 e | 0.040 | 0.00144 f

a The normalized σ and Δtcrit values are obtained by normalizing the retention range from zero to one. The Δtcrit values are calculated using eq 2.

b The value of σ for the 1D separations was estimated from the average peak width of peaks in the experimental data.25

c The first dimension σ value, 1σ, was estimated from an observed peak capacity in LC×LC separations of 50,31 such that σ = 1/(50×4) = 0.0050. This value was used in the multivariate selectivity calculations.

d The effective first dimension σ value, 1σeff, was calculated from 1σ using eq 11. This value was used to calculate Δtcrit.

e The second dimension σ value was estimated from an observed peak capacity in LC×LC separations of 17.4,31 such that σ = 1/(17.4×4) = 0.0144.

f The sampling interval was calculated as σ/10.

g The sampling interval for the first dimension separation was determined from the typical sampling rate of 0.048 Hz used in actual LC×LC experiments.31

For the determination of the appropriate σ and Δtcrit values for LC×LC separations, we use the reported peak capacities for a fast LC×LC separation, 50 and 17.4 for the first and second dimension separations, respectively.31 The first dimension peak capacity is very similar to the value found above. However, the reported peak capacity on the first dimension column does not take into account the peak broadening caused by undersampling. Davis et al.34 have given the following expression for the effective peak width, 1σeff (eq 11):

$${}^{1}\sigma_{\mathrm{eff}} = \sqrt{{}^{1}\sigma^{2} + 0.21\,t_s^{2}} \tag{11}$$

where 1σ is for the peak as it elutes from the first dimension column and ts is the second dimension cycle time. The observed peak capacity on the first dimension column was found using a first dimension gradient time of 23 min and using a second dimension cycle time (sampling time) of 0.35 min. We scale the sampling time to obtain a dimensionless ts value of 0.35/23 = 0.0152. Then we calculate an appropriate dimensionless 1σ value for our data based on the observed peak capacity19 of 50 over the first dimension gradient (1σ = 1/(50×4) = 0.0050). When introducing the calculated values of 1σ and ts in eq 11, a dimensionless 1σeff value of 0.0086 is obtained. For the second dimension separation, the peak capacity of 17.4 was achieved over a 17.4 s gradient. Therefore, an appropriate dimensionless 2σ value is 1/(17.4×4) = 0.0144. The corresponding (dimensionless) Δtcrit values (calculated using eq 2) then become 0.024 and 0.040, for the first and second dimension columns, respectively.
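The arithmetic in this paragraph can be retraced with a few lines of Python. The sketch uses only the values quoted in the text (peak capacities of 50 and 17.4, a 23 min gradient, a 0.35 min cycle time, Rs = 0.7 and w = 4σ); the function and variable names are hypothetical.

```python
import math

def sigma_eff(sigma_1, t_s):
    """Effective first dimension peak width after undersampling (eq 11)."""
    return math.sqrt(sigma_1 ** 2 + 0.21 * t_s ** 2)

def dt_crit(sigma, rs=0.7):
    """Critical retention time difference (eq 2) with base width w = 4*sigma."""
    return rs * 4.0 * sigma

t_s = 0.35 / 23.0                 # dimensionless sampling time
sigma_1 = 1.0 / (50 * 4)          # first dimension, peak capacity 50
sigma_2 = 1.0 / (17.4 * 4)        # second dimension, peak capacity 17.4

s_eff = sigma_eff(sigma_1, t_s)
dt_crit_1 = dt_crit(s_eff)        # first dimension critical difference
dt_crit_2 = dt_crit(sigma_2)      # second dimension critical difference
```

Rounded to the precision used in Table 1, these evaluate to 0.0086, 0.024 and 0.040, so undersampling nearly doubles the effective first dimension peak width.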

The bin size applied in information theory was based on the Δtcrit values calculated above. Practically, this means that for 1D and parallel 1D-separations the bin size is set to 0.0135 grid units. In LC×LC, the grid size is 0.024 in the first dimension and 0.040 in the second dimension. Different grid sizes are required, because the peak capacity and peak widths are different under the different conditions used for the first and second dimension separations. This also implies that a change in the order of the columns in the LC×LC system will result in a different information yield.

For the multivariate selectivity calculations, we also elected to use normalized retention time values. Gaussian peak simulations were carried out for an x-axis range from −4σ to 1 + 4σ, so that the peaks eluting at normalized retention times of zero and one were simulated with a full range of data points describing a Gaussian peak. For the single and parallel column separations, the dimensionless sampling interval was σ/10 = 0.00048. For the LC×LC calculations, the first dimension sampling interval (ts, determined as described above) was 0.0152. The second dimension sampling interval was 2σ/10 = 0.00144.

Correlation Coefficient (R2) and Snyder’s Fs Value

From the squared correlation coefficients calculated between the retention times of the 25 drugs on the 14 chromatographic systems shown in Table 2, it can be observed that the HC-SO3 column is the most dissimilar to all other columns, because it exhibits low squared correlation coefficients with all other columns. The most dissimilar combinations according to this metric (R2 = 0.01, shown in bold on Table 2) are the HC-SO3 column and the Discovery C18, the BetaBasic Phenyl, the HC-OH or the StableBond C18 (SB C18* with the 20-70% gradient). In Figures 1A-C examples are shown for two highly correlated columns, two columns with an intermediate correlation and two very dissimilar columns, respectively.

Table 2.

Squared Correlation Coefficients Calculated Between the Retention Times of 25 Drugs Measured on 14 Chromatographic Systems.

R2 | SB C18 | ACE C18 | Disc C18 | Allt C18 | SB Ph | BB Ph | Pront Ph | Prism RP | Bonus RP | Nova CN | Inertsil CN | HC-OH | SB C18* | HC-SO3*
SB C18 | 1.00
ACE C18 | 0.99 | 1.00
Disc C18 | 0.99 | 1.00 | 1.00
Allt C18 | 0.99 | 1.00 | 1.00 | 1.00
SB Ph | 0.96 | 0.95 | 0.94 | 0.94 | 1.00
BB Ph | 0.97 | 0.97 | 0.98 | 0.98 | 0.96 | 1.00
Pront Ph | 0.97 | 0.96 | 0.95 | 0.95 | 0.98 | 0.96 | 1.00
Prism RP | 0.83 | 0.80 | 0.77 | 0.78 | 0.83 | 0.74 | 0.86 | 1.00
Bonus RP | 0.90 | 0.87 | 0.85 | 0.85 | 0.89 | 0.82 | 0.92 | 0.99 | 1.00
Nova CN | 0.47 | 0.50 | 0.51 | 0.51 | 0.43 | 0.51 | 0.40 | 0.26 | 0.32 | 1.00
Inertsil CN | 0.80 | 0.80 | 0.79 | 0.79 | 0.77 | 0.76 | 0.78 | 0.71 | 0.76 | 0.29 | 1.00
HC-OH | 0.94 | 0.94 | 0.94 | 0.94 | 0.97 | 0.96 | 0.95 | 0.77 | 0.83 | 0.49 | 0.74 | 1.00
SB C18* | 0.93 | 0.92 | 0.92 | 0.91 | 0.90 | 0.91 | 0.92 | 0.72 | 0.80 | 0.45 | 0.73 | 0.88 | 1.00
HC-SO3* | 0.03 | 0.02 | 0.01 | 0.02 | 0.03 | 0.01 | 0.04 | 0.19 | 0.13 | 0.06 | 0.07 | 0.01 | 0.01 | 1.00

Calculated from retention times measured using a 30-70% ACN gradient, except for the columns denoted by the *, which were measured using a 20-70% ACN gradient.25 The least correlated columns are indicated in bold.

Figure 1. tR of 25 drugs on (A) Alltima C18 plotted vs tR of the drugs on Discovery C18 (very highly correlated); y = 1.0028(±0.0020)x + 0.004(±0.010); sE = 0.0184; R2 = 0.9999; N = 25; (B) Inertsil CN plotted vs tR of the drugs on ACE C18 (moderately correlated); y = 0.595(±0.006)x − 0.43(±0.34); sE = 0.586; R2 = 0.7968; N = 25; (C) HC-SO3 plotted vs tR of the drugs on the ACE C18 (poorly correlated); y = −0.17(±0.25)x + 6.5(±1.4); sE = 2.38; R2 = −0.0246; N = 25.

Snyder’s Fs values, shown in Table 3, do not totally agree with the results indicated by the correlation coefficients. The HC-SO3 column is considered dissimilar to the others (high Fs values), as was concluded from the correlation coefficients; however, the polar embedded RP columns (Prism RP and Bonus RP) and the Novapak CN exhibit similarly high Fs values with respect to the other columns, while they exhibit relatively higher correlation coefficients. According to the Fs values, the most dissimilar columns are the HC-SO3 column and the Bonus RP column (Fs = 469, shown in bold in Table 3). The HC-SO3 column paired with the Novapak CN or the Prism RP column exhibits similarly high Fs values. Both of those column combinations showed low correlation coefficients, though they were not considered the most dissimilar based on correlation.

Table 3.

Fs Values Calculated Between the Columns Determined Using Snyder’s H, S, A, B, C values.

Fs | SB C18 | ACE C18 | Disc C18 | Allt C18 | SB Ph | BB Ph | Pront Ph | Prism RP | Bonus RP | Nova CN | Inertsil CN | HC-OH | SB C18* | HC-SO3*
SB C18 | 0
ACE C18 | 12 | 0
Disc C18 | 14 | 3 | 0
Allt C18 | 8 | 7 | 10 | 0
SB Ph | 18 | 23 | 24 | 18 | 0
BB Ph | 26 | 23 | 23 | 22 | 16 | 0
Pront Ph | 29 | 32 | 30 | 32 | 32 | 30 | 0
Prism RP | 250 | 250 | 252 | 246 | 241 | 246 | 273 | 0
Bonus RP | 267 | 266 | 268 | 262 | 257 | 261 | 289 | 24 | 0
Nova CN | 231 | 230 | 232 | 226 | 222 | 226 | 253 | 38 | 52 | 0
Inertsil CN | 49 | 50 | 48 | 53 | 55 | 52 | 24 | 296 | 312 | 276 | 0
HC-OH | 30 | 25 | 24 | 29 | 41 | 41 | 47 | 253 | 269 | 235 | 57 | 0
SB C18* | 0 | 12 | 14 | 8 | 18 | 26 | 29 | 250 | 267 | 231 | 49 | 30 | 0
HC-SO3* | 207 | 208 | 205 | 211 | 214 | 209 | 182 | 454 | 469 | 435 | 160 | 208 | 207 | 0

Calculated from retention times measured using a 30-70% ACN gradient, except for the columns denoted by the *, which were measured using a 20-70% ACN gradient.25 The most dissimilar column pair is indicated in bold.

Number of Separated Compounds

In Table 4 it can be seen that in a single 1D-separation, the highest number of drugs (N = 15) could be identified using the NovaPak CN column. When combining the NovaPak CN column with the HC-SO3 column or with the Discovery C18 column in parallel 1D-analyses, this number was significantly increased (N = 23), and only two compounds remained indistinguishable. The first column combination also exhibits low correlation and a large Fs value. However, this is not the case for the Novapak CN column and the Discovery C18 column. The selection of this pair can be explained by the fact that the latter columns exhibit large individual N values. A large number of compounds (N = 20) could also be separated when the poorly correlated Discovery C18 and HC-SO3 columns were combined. However, this was not the case for the other three combinations indicated by the correlation coefficients: the HC-SO3 column paired with the BetaBasic Phenyl column, the StableBond C18 column (with the 20-70% ACN gradient) or the HC-OH column. This observation can be explained by the fact that relatively poor separations were obtained on these three partner columns (seven, eight and eight compounds resolved by at least 0.7, respectively). This remark is also applicable to the column combinations indicated by the Fs values to be the most dissimilar. This confirms the consideration in the introduction: the most dissimilar columns as identified by the R2 or Fs values will not always result in the highest identification power. This result arises because the correlation and Fs calculations do not reflect the quality of the individual separations, only the differences between the two separations. This is consistent with the conclusions of Gilar et al.,35 who emphasized that the percent utilization of the available separation space was key in providing the best indicator of “orthogonality” of separations.
From Table 4 it can also be concluded that the analysis of the drugs on two different columns in parallel (instead of only one column) results in significantly better identification power.

Table 4.

Number of Drugs Separated with a Difference in Normalized Retention Times Greater Than 0.0135 (Rs > 0.7).

N separated | SB C18 | ACE C18 | Disc C18 | Allt C18 | SB Ph | BB Ph | Pront Ph | Prism RP | Bonus RP | Nova CN | Inertsil CN | HC-OH | SB C18* | HC-SO3*
SB C18 | 10
ACE C18 | 15 | 9
Disc C18 | 17 | 14 | 13
Allt C18 | 15 | 13 | 13 | 11
SB Ph | 21 | 16 | 22 | 20 | 11
BB Ph | 15 | 13 | 17 | 15 | 17 | 7
Pront Ph | 18 | 15 | 19 | 17 | 14 | 15 | 9
Prism RP | 17 | 15 | 19 | 18 | 15 | 13 | 14 | 9
Bonus RP | 18 | 17 | 20 | 19 | 17 | 14 | 14 | 12 | 8
Nova CN | 18 | 21 | 23 | 21 | 17 | 20 | 19 | 17 | 17 | 15
Inertsil CN | 15 | 17 | 22 | 20 | 18 | 14 | 17 | 16 | 16 | 19 | 10
HC-OH | 12 | 16 | 18 | 14 | 16 | 13 | 18 | 17 | 17 | 17 | 15 | 8
SB C18* | 12 | 15 | 18 | 16 | 18 | 11 | 15 | 16 | 15 | 19 | 13 | 11 | 8
HC-SO3* | 18 | 17 | 20 | 17 | 18 | 15 | 18 | 18 | 18 | 23 | 17 | 12 | 17 | 10

Calculated for the 14 individual chromatographic systems (italics; the diagonal entries) and for all their possible combinations in simultaneous 1D-separations from retention times measured using a 30-70% ACN gradient, except for the columns denoted by the *, which were measured using a 20-70% ACN gradient.25

When considering comprehensive LC×LC, the highest number of drugs (19) can be separated when the Discovery C18 is placed in the first dimension and the Novapak CN in the second dimension (Table S-1). These findings are also not consistent with the intermediate R2 and Fs values calculated for this column combination. At first glance, it is also surprising to note that the LC×LC separations almost invariably lead to fewer compounds being separated than even the single 1D separations. This observation arises from the Δtcrit value in the first dimension being increased from 0.0135 to 0.024 and the Δtcrit value in the second dimension being increased from 0.0135 to 0.040. These changes in Δtcrit indicate that there is a considerable decrease in resolving power (and therefore peak capacity) upon going from a 1D-LC method to an LC×LC method, specifically for the case of fast LC×LC. In the first dimension, this loss of resolution is due to undersampling, while in the second dimension, the loss of resolution is due to the requirements of the very fast gradient. These results underscore the importance of carefully considering whether an LC×LC method will really improve the resolving power of an analysis. However, a slow LC×LC method would require different Δtcrit values, which could result in a higher resolving power.

Information Theory

Considering a 1D-LC analysis, the Novapak CN column exhibits the largest information value (4.40), as shown in Table S-2, and thus the most informative separation can be obtained on this column. The same column was selected based on the number of compounds separable with a difference in retention time larger than the Δtcrit (Table 4). When relating the N values to the information value, this means that an information content of −(15/25) log2(1/25) = 2.79 is contributed by the 15 separated drugs and 4.40 – 2.79 = 1.61 is the information yield for the non-separated drugs. This demonstrates that in information theory not only the number of totally separated compounds is important, but also the degree of co-elution of the non-separated compounds has a considerable influence on this metric.
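The arithmetic above can be checked with a short Python sketch. It assumes that the information measure has the usual Shannon form, summed over groups of co-eluting compounds, which reproduces the 2.79 value quoted in the text. The function names and the grouping examples are illustrative only.

```python
import math


def information(group_sizes):
    """Shannon information (bits) for compounds partitioned into groups of co-eluting compounds."""
    total = sum(group_sizes)
    return -sum((n / total) * math.log2(n / total) for n in group_sizes)


def separated_contribution(n_separated, total):
    """Contribution of n_separated fully resolved compounds, each alone in its group."""
    return -n_separated * (1 / total) * math.log2(1 / total)


contribution = separated_contribution(15, 25)
print(round(contribution, 2))         # 2.79
print(round(4.40 - contribution, 2))  # 1.61

# Illustrative co-elution patterns with the same number of resolved compounds (21 of 25):
two_pairs = information([1] * 21 + [2, 2])   # two pairs of co-eluting compounds
one_quartet = information([1] * 21 + [4])    # four compounds co-eluting together
print(two_pairs > one_quartet)               # True
```

Under this assumed form, the upper limit for 25 compounds is log2(25), about 4.64, which corresponds to every compound falling in its own group.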

When performing the information calculations for the analysis on two stationary phases in parallel as described previously, it is confirmed that combining two independent 1D-separations increases the information content and thus also the identification power (Table S-2). Information theory indicates one column combination that yields the maximal information (I = 4.64): the Discovery C18 combined with the Novapak CN column. This column pair is also indicated as having maximal identification power based on the number of separated compounds (N). The HC-SO3 column and Novapak CN, resulting in the same number of identifiable compounds, exhibit a slightly lower I(1,2) value (4.48). This is caused by the fact that in a situation of maximal information yield two compounds can be located in different bins, although they are not separated with a difference in retention time larger than Δtcrit or vice versa. The number of separated compounds, N, is moderately correlated with I(1,2) (r2 = 0.523, Figure S-1). However, one N value corresponds with many different information values, which is caused by the fact that for the same number of identifiable drugs different I(1,2) values can be obtained due to different co-elution patterns.

The most dissimilar systems according to information theory (lowest R(1,2) values in Table S-2) are the Discovery C18 and the BetaBasic Phenyl. This is not consistent with the results obtained with the correlation coefficients or the Fs values. The Pearson’s correlation coefficient is unrelated to the similarity values (r2 = 0.0042, Figure S-2). Secondly, it is noticeable that the similarity (R(1,2)) between all columns is very high. This is unexpected, since columns with different substituents were included, which should provide some selectivity differences. A possible explanation for the high R(1,2) values is that many compounds can be separated on one column, and thus, there is not much room for improvement by adding a second separation.

When applying information theory for comprehensive LC×LC analyses (Table S-3), in some cases the information is lower than when a 1D separation was performed. This is caused by the fact that in LC×LC the size of the bins was larger and thus two compounds will need a greater difference in retention time to be considered separated. The highest information could, amongst others, be obtained with the column combination also selected based on the N value: the Discovery C18 column in the first dimension and the Novapak CN in the second dimension. In addition, several other combinations of columns, denoted in bold in Table S-3, yielded maximal information. According to the information results for LC×LC, the highest dissimilarity occurs between the HC-SO3 column and the following two stationary phases: StableBond C18 and Prism C18. This confirms the earlier findings (correlation coefficients and Fs values) that HC-SO3 is most dissimilar to the other columns. Although the HC-SO3 column provides a rather low information yield when used in a single 1D separation, used in the first dimension of a 2D separation the broader peaks do not seem to degrade the separation as much as for the better performing columns.

Multivariate Selectivity

In case of single 1D-LC separations, the addition of spectra resulted in a higher average multivariate selectivity for all columns (Table 5). However, the number of identifiable compounds (number of drugs with SEL > 0.98) could not be increased for most columns. The highest average selectivity based on the retention times without spectral information was obtained on the StableBond phenyl column, but the highest number of compounds could be separated on the Novapak CN column. When adding the spectral information, the same columns are indicated as being most useful for identification. However, based on the average selectivity the Novapak CN was also selected. The above results are consistent with those based on the critical difference in retention time or based on information theory, since these techniques also indicated the Novapak CN as the column on which the highest number of compounds can be identified. When adding spectral information, the average selectivity on the Novapak CN column increased modestly (0.85 vs. 0.90), but no increase in the number of compounds was observed (17 compounds identified either with or without spectral information considered). When comparing the number of separable compounds determined based on Δtcrit (Table 4) and based on SEL without spectral information (Table 5), the values are similar for all columns. This is not surprising as we based our selectivity threshold on a target resolution of 0.7 that was used in deriving the Δtcrit values. It can also be seen that adding spectral information benefits those separations with lower initial average selectivity more than those separations with already high average selectivity. This can be seen by comparing the HC-SO3 column, where the average selectivity increases by 27% (from 0.54 to 0.68), with the StableBond phenyl column, where the average selectivity increases by only 4.7% (from 0.86 to 0.90).
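How spectral information can raise the selectivity of co-eluting compounds can be sketched in Python. The sketch assumes a net-analyte-signal style definition of multivariate selectivity: the norm of the part of a compound's signal that is orthogonal to the signals of all other compounds, divided by the norm of the full signal. Whether this matches the exact computation behind Table 5 is an assumption. The Gaussian peak shape, peak width, retention positions and two-wavelength spectra are all hypothetical.

```python
import math


def gaussian_peak(center, width, grid):
    """Gaussian elution profile sampled on a retention grid."""
    return [math.exp(-0.5 * ((t - center) / width) ** 2) for t in grid]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def remove_projection(vector, basis):
    """Subtract from vector its components along each orthonormal basis vector."""
    residual = list(vector)
    for b in basis:
        coeff = dot(residual, b)
        residual = [x - coeff * y for x, y in zip(residual, b)]
    return residual


def selectivity(signals, k):
    """Norm of the part of signal k orthogonal to all other signals, over the norm of signal k."""
    basis = []
    for j, s in enumerate(signals):
        if j == k:
            continue
        residual = remove_projection(s, basis)
        norm = math.sqrt(dot(residual, residual))
        if norm > 1e-12:  # skip signals already spanned by the basis
            basis.append([x / norm for x in residual])
    net = remove_projection(signals[k], basis)
    return math.sqrt(dot(net, net)) / math.sqrt(dot(signals[k], signals[k]))


def with_spectrum(profile, spectrum):
    """Flattened outer product of an elution profile and a spectrum (bilinear LC-DAD signal)."""
    return [c * a for c in profile for a in spectrum]


grid = [i * 0.001 for i in range(1000)]
width = 0.01  # hypothetical peak standard deviation
# Hypothetical positions: compounds 0 and 1 overlap strongly, compound 2 is well resolved.
profiles = [gaussian_peak(c, width, grid) for c in (0.300, 0.305, 0.600)]
# Hypothetical two-wavelength spectra; compounds 0 and 1 differ spectrally.
spectra = [[1.0, 0.2], [0.2, 1.0], [0.6, 0.6]]

sel_time = [selectivity(profiles, k) for k in range(3)]
signals = [with_spectrum(p, s) for p, s in zip(profiles, spectra)]
sel_dad = [selectivity(signals, k) for k in range(3)]
print([round(s, 2) for s in sel_time])
print([round(s, 2) for s in sel_dad])
```

In this made-up case the resolved compound already has a selectivity close to one, so the spectra add little for it, while the two overlapping compounds gain substantially yet still stay below the 0.98 threshold. This mirrors the pattern in the text, where the average selectivity rises but the number of identifiable compounds often does not.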

Table 5.

Average Multivariate Selectivity and Number of Compounds with a Multivariate Selectivity Greater Than 0.98.

Without spectra With spectra

Av. SEL SEL>0.98 Av. SEL SEL>0.98
SB C18 0.65 10 0.74 10
ACE C18 0.65 9 0.76 9
Disc C18 0.63 13 0.74 13
Allt C18 0.64 13 0.75 13
SB Ph 0.86 11 0.90 11
BB Ph 0.57 9 0.73 10
Pront Ph 0.69 9 0.77 9
Prism RP 0.55 9 0.67 9
Bonus RP 0.73 10 0.80 10
Nova CN 0.85 17 0.90 17
Inertsil CN 0.77 12 0.83 12
HC-OH 0.57 8 0.71 9
SB C18* 0.56 8 0.71 8
HC-SO3* 0.54 10 0.68 10

Calculated from 1D-LC retention times measured using a 30-70% ACN gradient, except for the columns denoted by the *, which were measured using a 20-70% ACN gradient.25 Values in bold indicate the maximal values.
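The relative gains in average selectivity discussed in the text can be recomputed from the rounded Table 5 entries with a few lines of Python. This is only a check on the rounded values; small differences from the percentages quoted in the text (for example about 26% rather than 27% for HC-SO3*) may reflect the use of unrounded selectivities there.

```python
# Average SEL (without spectra, with spectra), copied from Table 5.
table5 = {
    "SB C18": (0.65, 0.74), "ACE C18": (0.65, 0.76), "Disc C18": (0.63, 0.74),
    "Allt C18": (0.64, 0.75), "SB Ph": (0.86, 0.90), "BB Ph": (0.57, 0.73),
    "Pront Ph": (0.69, 0.77), "Prism RP": (0.55, 0.67), "Bonus RP": (0.73, 0.80),
    "Nova CN": (0.85, 0.90), "Inertsil CN": (0.77, 0.83), "HC-OH": (0.57, 0.71),
    "SB C18*": (0.56, 0.71), "HC-SO3*": (0.54, 0.68),
}

# Relative increase in average SEL (percent) when spectra are added.
gains = {name: 100 * (w - wo) / wo for name, (wo, w) in table5.items()}

# List columns from lowest to highest initial average SEL.
for name in sorted(table5, key=lambda c: table5[c][0]):
    print(f"{name:12s} initial SEL {table5[name][0]:.2f}  gain {gains[name]:.1f}%")
```

Sorted this way, the columns with the lowest initial average selectivity show the largest relative gains, and the StableBond phenyl and Novapak CN columns show the smallest.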

When performing independent 1D-LC separations on two columns in parallel (Table S-4), higher average selectivity values are obtained than with simple 1D-LC, and the number of identifiable compounds increases substantially. The maximal average SEL (0.99) without including spectral information was obtained with the Novapak CN column combined with either the Discovery C18 or the Alltima C18 columns. The highest number of drugs (22) could be separated using these same combinations. Information theory and the Δtcrit approach also identified the Discovery C18/Novapak CN combination as a top choice for the maximum identification power. The addition of the spectral information resulted, as expected, in greater average multivariate selectivities, but not a greater number of identifiable drugs in most cases. Whether or not the spectral information is included, the combinations of the Novapak CN column with either the Discovery C18 or the Alltima C18 columns are selected as providing the best identification power for parallel 1D separations.

The average multivariate selectivity values obtained with LC×LC (Table S-5) are greater than those obtained for 1D-separations and, in most cases, greater than those obtained for two parallel 1D-analyses. The LC×LC results are not always better than the parallel 1D-analyses because a larger Δtcrit must be used for both the first and second dimension separations. In the first dimension, this is due to the undersampling inherent in the LC×LC method, and in the second dimension, this is due to the decrease in peak capacity for a very fast gradient run. An average selectivity of one was obtained for analyses on several column combinations indicated in boldface in Table S-5 and given in Table 6. However, this does not imply that all drugs are identifiable when using these columns. The maximal number of drugs (25) could only be identified when placing the StableBond C18 in the first dimension and the Novapak CN in the second dimension, and, when using spectral information, the combination of the Discovery C18 with the Novapak CN. The other column combinations exhibiting high average multivariate selectivity result in 21 to 23 identifiable drugs. The latter column combinations also exhibited maximal information. However, with information theory the number of column combinations giving maximal information was larger. This is possibly caused by the fact that the scale of information values is not continuous and similar separations will result in the same value of I(1,2). For most column combinations the addition of spectral information resulted in higher average selectivities and a higher number of compounds with a multivariate selectivity higher than 0.98. However, this was not always the case when the selectivity was already high without spectral information.

Table 6.

Summary of Results for Identification Power of Separations.

Columns: R2 (a), Fs (b), N (c), Information (d), Similarity (e), Average Selectivity (f), N (SEL>0.98) (g)

Single 1D-LC
R2: NA (h)
Fs: NA (h)
N: N=15 (Nova CN)
Information: I=4.40 (Nova CN)
Similarity: NA (h)
Average Selectivity: SEL=0.86 (SB Phenyl)
N (SEL>0.98): N=17 (Nova CN)

Single 1D-LC w/DAD
R2, Fs, N, Information, Similarity: NA (h)
Average Selectivity: SEL=0.90 (SB Phenyl; Novapak CN)
N (SEL>0.98): N=17 (Nova CN)

Parallel 1D-LC
R2: R2 = 0.01 (HC-SO3*&Disc C18 or BB Phenyl or HC-OH or SB C18*)
Fs: Fs=469 (HC-SO3*&Bonus RP)
N: N=23 (HC-SO3*&Nova CN; Disc C18&Nova CN)
Information: I(1,2)=4.64 (Disc C18&Nova CN)
Similarity: R(1,2)=0.971 (Disc C18&BB Phenyl)
Average Selectivity: SEL=0.99 (Disc C18&Nova CN; Allt C18&Nova CN)
N (SEL>0.98): N=22 (Disc C18&Nova CN; Allt C18&Nova CN)

Parallel 1D-LC w/DAD
R2, Fs, N, Information, Similarity: NA (h)
Average Selectivity: SEL=0.99 (Disc C18&SB Phenyl; Allt C18&SB Phenyl; Disc C18&Nova CN; Allt C18&Nova CN)
N (SEL>0.98): N=22 (Disc C18&Nova CN; Allt C18&Nova CN)

Comprehensive LC×LC
R2: NA (h)
Fs: NA (h)
N: N=19 (Disc C18&Novapak CN)
Information: I(1,2)=4.64 (SB C18&Nova CN; ACE C18&Nova CN; Disc C18&Nova CN or Inertsil CN; Allt C18&Nova CN or Inertsil CN or HC-SO3*; Pront Ph&ACE C18; Nova CN&SB C18 or ACE C18 or Disc C18 or Allt C18; Inertsil CN&ACE C18; HC-SO3*&Disc C18 or Allt C18)
Similarity: R(1,2)=0.969 (SB C18&HC-SO3; Prism C18&HC-SO3; HC-SO3&SB C18)
Average Selectivity: SEL=1.00 (SB C18&Nova CN; ACE C18&Nova CN or Inertsil CN; Disc C18&Nova CN or Inertsil CN; Allt C18&Nova CN or Inertsil CN; Nova CN&SB C18 or ACE C18 or Disc C18 or Allt C18)
N (SEL>0.98): N=25 (SB C18&Nova CN)

Comprehensive LC×LC w/DAD
R2, Fs, N, Information, Similarity: NA (h)
Average Selectivity: SEL=1.00 (SB C18&Nova CN; ACE C18&Nova CN or Inertsil CN; Disc C18&Nova CN or Inertsil CN; Allt C18&Nova CN or Inertsil CN; Nova CN&SB C18 or ACE C18 or Disc C18 or Allt C18)
N (SEL>0.98): N=25 (SB C18&Nova CN; Disc C18&Nova CN)

Comparison of Information Theory and Multivariate Selectivity

The measures to evaluate the identification power of the separation of the 25 drugs with information theory and with multivariate selectivity (SEL) should be correlated if both approaches evaluate the same situation (in this case the same combination of parallel columns) well. The information measure I(1,2) is moderately correlated with both the average SEL (R2=0.47, Figure S-3) and the number of compounds with SEL>0.98 (R2=0.60, Figure S-4). For LC×LC methods, I(1,2) correlates somewhat better with the selectivity measures: I(1,2) vs. the average selectivity gives R2 = 0.715 (Figure S-5), and I(1,2) vs. the number of compounds with selectivity >0.98 gives R2 = 0.708 (Figure S-6). The weaker correlation for the parallel separations may arise because, when estimating the information yield for two chromatographic systems applied in parallel, we always assume that a non-identifiable drug co-elutes with only one other drug. However, in reality this is not always the case. Consequently, the obtained information value is only an approximation and not as well correlated with the multivariate selectivity, which takes into account the degree of co-elution correctly.

CONCLUSIONS

The column choices with the most identification power indicated by each of the studied methods are summarized in Table 6. We have demonstrated that (dis)similarity measures such as Fs and R2 do not always indicate the two dissimilar chromatographic systems with the highest identification power. In addition, these measures do not allow distinction between parallel 1D separations and comprehensive 2D separations. An advantage of the Fs metric, however, is that columns can be selected without carrying out actual separations of the target compounds. This tool allows us to choose which columns to select for experimental measurements, which allow for the calculation of the more informative metrics. The measures proposed in this paper enable the comparison between (1) a single 1D-LC method, (2) two 1D-LC separations performed in parallel and (3) the corresponding comprehensive LC×LC method. The number of identifiable compounds (N) can be determined by counting the compounds with a difference in retention time greater than a critical value, which depends on the efficiency of the LC analysis (the peak width). However, N does not take into account the degree of overlap. Information theory enables distinction between two separations with the same number of identifiable compounds, but with a difference in co-elution pattern. Two pairs of two co-eluting compounds, for instance, will yield more information than four co-eluting compounds. The only measure capable of quantifying the degree of co-elution as well as the degree of separation is multivariate selectivity. The latter can evaluate the number of compounds additionally identified by applying multivariate curve resolution methods when incorporating spectral information obtained with a DAD. Consequently, multivariate selectivity takes into account the degree of resolution between nominally unresolved peaks. We feel that this metric offers the most potential for comparing the performance of different analytical methods.
It should be mentioned that only a specific set of compounds was considered, and that the results can only be generalized to the extent that these drug solutes serve as general probes of the separation characteristics of the system.

Supplementary Material

1_si_001

ACKNOWLEDGEMENTS

The Ph.D. studies of M. Dumarey are funded with a specialization grant from the institute for the Promotion of Innovation by Science and Technology in Flanders (IWT). The Research Institute Flanders is thanked for supporting her stay at the Virginia Commonwealth University. This work was supported by a grant (GM054585) from the National Institutes of Health as well as by a grant from the U.S. Department of Justice and the National Institute of Justice (Interagency Agreement 2008-DN-R-038) through the Ames Laboratory under Contract No. DE-AC02-07CH11358.

Footnotes

SUPPORTING INFORMATION AVAILABLE Additional Tables S-1–S-5 and Figures S-1–S-6 as noted in text. This material is available free of charge via the Internet at http://pubs.acs.org.

REFERENCES

  • (1) Maurer HH. J. Chromatogr. B. 1999;733:3–25. doi: 10.1016/s0378-4347(99)00266-2.
  • (2) Maurer HH. Clin. Chem. Lab. Med. 2004;42:1310–1324. doi: 10.1515/CCLM.2004.250.
  • (3) Stoev G, Stoyanov A. J. Chromatogr. A. 2007;1145:141–148. doi: 10.1016/j.chroma.2007.01.071.
  • (4) Pellet J, Lukulay P, Mao Y, Bowen W, Reed R, Ma M, Munger RC, Dolan JW, Wresley L, Medwid K, Toltl NP, Chan CC, Skibic M, Biswas K, Wells KA, Snyder LR. J. Chromatogr. A. 2006;1101:122–135. doi: 10.1016/j.chroma.2005.09.080.
  • (5) Stoll DR, Wang X, Carr PW. Anal. Chem. 2008;80:268–278. doi: 10.1021/ac701676b.
  • (6) Boone CM, Jonkers EZ, Franke JP, de Zeeuw RA, Ensing K. J. Chromatogr. A. 2001;927:203–210. doi: 10.1016/s0021-9673(01)01100-1.
  • (7) de Zeeuw RA, Witte DT, Franke JP. J. Chromatogr. A. 1990;500:661–671.
  • (8) Maier RD, Bogusz M. J. Anal. Toxicol. 1995;19:79–83. doi: 10.1093/jat/19.2.79.
  • (9) Hegge HFJ, Franke JP, de Zeeuw RA. J. Forens. Sci. 1991;36:1094–1101.
  • (10) Schepers PGAM, Franke JP, de Zeeuw RA. J. Anal. Toxicol. 1983;7:272–278. doi: 10.1093/jat/7.6.272.
  • (11) Van Gyseghem E, Van Hemelryck S, Daszykowski M, Questier F, Massart DL, Vander Heyden Y. J. Chromatogr. 2003;988:77–93. doi: 10.1016/s0021-9673(02)02012-5.
  • (12) Snyder LR, Dolan JW, Carr PW. J. Chromatogr. A. 2004;1060:77–116.
  • (13) Dragovic S, Haghedooren E, Németh T, Palabiyik IM, Hoogmartens J, Adams E. J. Chromatogr. A. 2009;1216:3210–3216. doi: 10.1016/j.chroma.2009.02.023.
  • (14) Van Gyseghem E, Dejaegher B, Put R, Forlay-Frick P, Elkihel A, Daszykowski M, Héberger K, Massart DL, Vander Heyden Y. J. Pharmaceut. Biomed. 2006;41:141–151. doi: 10.1016/j.jpba.2005.11.007.
  • (15) Put R, Van Gyseghem E, Coomans D, Vander Heyden Y. J. Chromatogr. A. 2005;1096:187–198. doi: 10.1016/j.chroma.2005.03.138.
  • (16) Dumarey M, Put R, Van Gyseghem E, Vander Heyden Y. Anal. Chim. Acta. 2008;609:223–234. doi: 10.1016/j.aca.2007.12.047.
  • (17) Jandera P. J. Sep. Sci. 2006;29:1763–1783. doi: 10.1002/jssc.200600202.
  • (18) Jandera P, Halama M, Kolárová L, Fischer J, Novotná K. J. Chromatogr. A. 2005;1087:112–123. doi: 10.1016/j.chroma.2005.01.061.
  • (19) Massart DL, Kaufman L. The Interpretation of Analytical Chemical Data by the Use of Cluster Analysis. John Wiley & Sons; New York: 1983. pp. 25–26.
  • (20) Messick NJ, Kalivas JH, Lang PM. Anal. Chem. 1996;68:1572–1579. doi: 10.1021/ac951212v.
  • (21) Cantwell MT, Porter SEG, Rutan SC. J. Chemom. 2007;21:335–345.
  • (22) Sinha AE, Hope JL, Prazen BJ, Fraga CG, Nilsson EJ, Synovec RE. J. Chromatogr. A. 2004;1056(1-2):145–154.
  • (23) Massart DL, Vandeginste BGM, Buydens LMC, De Jong S, Lewi PJ, Smeyers-Verbeke J. Handbook of Chemometrics and Qualimetrics: Part A. Elsevier; Amsterdam: 1997. pp. 221–223.
  • (24) Gilroy JJ, Dolan JW, Snyder LR. J. Chromatogr. A. 2003;1000:757–778. doi: 10.1016/s0021-9673(03)00512-0.
  • (25) Fan W, Zhang Y, Carr PW, Rutan SC, Dumarey M, Schellinger AP, Pritts W. J. Chromatogr. A. 2009;1216:6587–6599. doi: 10.1016/j.chroma.2009.07.048.
  • (26) Wang X, Stoll DR, Schellinger AP, Carr PW. Anal. Chem. 2006;78:3406–3416. doi: 10.1021/ac0600149.
  • (27) Massart DL, Kaufman L. The Interpretation of Analytical Chemical Data by the Use of Cluster Analysis. John Wiley & Sons; New York: 1983. pp. 162–163.
  • (28) Slonecker PJ, Li XD, Ridgway TH, Dorsey JG. Anal. Chem. 1996;68:682–689. doi: 10.1021/ac950852v.
  • (29) Olivieri AC. Anal. Chem. 2005;77:4936–4946. doi: 10.1021/ac050146m.
  • (30) Vivó-Truyols G, Torres-Lapasió JR, García-Alvarez-Coque MC. J. Chromatogr. A. 2003;991:47–59. doi: 10.1016/s0021-9673(03)00172-9.
  • (31) Stoll DR, Cohen JD, Carr PW. J. Chromatogr. A. 2006;1122(1-2):123–137. doi: 10.1016/j.chroma.2006.04.058.
  • (32) Zhang Y. Ph.D. Dissertation. University of Minnesota; 2009.
  • (33) Luo H, Ma L, Zhang Y, Carr PW. J. Chromatogr. A. 2008;1182:41–55. doi: 10.1016/j.chroma.2007.11.104.
  • (34) Davis JM, Stoll DW, Carr PW. Anal. Chem. 2008;80:461–473. doi: 10.1021/ac071504j.
  • (35) Gilar M, Olivova P, Daly AE, Gebler JC. Anal. Chem. 2005;77(19):6426–6434. doi: 10.1021/ac050923i.
