Abstract
In this study, the predictive COSMO-SAC model was investigated for solid/liquid equilibria of pharmaceutical compounds. The examined properties were drug solubility in pure and mixed solvents, the octanol/water partition coefficient, and cocrystal formation. The results of the original COSMO-SAC model (COSMO-SAC (2002)) were compared with those of a semi-predictive model, the Flory–Huggins model, and of a revised version of COSMO-SAC (COSMO-SAC (2010)). The results indicated acceptable accuracy of COSMO-SAC (2002) within the considered scope. They also showed that the COSMO-SAC model is best suited to simple molecules containing C, H, and O that interact through covalent and hydrogen bonding. Applying COSMO-SAC to more complicated molecules with various functional groups, such as COO and COOH, requires further modification of the model.
Subject terms: Biomedical engineering, Chemical engineering
Introduction
Knowledge of phase equilibria and of thermodynamic properties such as solubility and partition coefficient for pharmaceutical compounds has wide application in the design, development, and optimization of their manufacture at laboratory or industrial scale. Besides the experimental approach, which is time-consuming and expensive, mathematical modeling has attracted attention because of its lower cost and wide working range, without further limitations from the substance type or ambient conditions. Generally, three classes of thermodynamic models are reported: (1) semi-empirical models, (2) semi-predictive models, and (3) predictive models, which differ in accuracy and range of reliability. The use of theoretical quantum chemistry in formulating the model and the need for experimental data are the most significant differences between groups (2) and (3), whereas semi-empirical models are often correlations without theoretical meaning, obtained from experiments for specific species.
Among these, predictive models estimate the desired properties entirely from the molecular structure, without any further need for experimental data. The UNIFAC1, the NRTL-SAC2, the COSMO-RS3–5, and the COSMO-SAC6 are a few examples. Predictive models such as UNIFAC are defined primarily in terms of functional groups and several adjustable parameters. In contrast, COSMO-RS and COSMO-SAC are conductor-like screening models that compute the activity coefficient from quantum mechanical calculations, using only the molecular structure and fewer adjustable parameters than UNIFAC. COSMO-RS was developed first, by extending a dielectric continuum-solvation model to liquid-phase thermodynamics, and COSMO-SAC is a modified version of COSMO-RS7.
Several researchers have studied COSMO-SAC and COSMO-RS. Tung et al.8 compared NRTL-SAC and COSMO-SAC for predicting the solubilities of lovastatin, simvastatin, rofecoxib, and etoricoxib. Zhou et al.9 applied COSMO-SAC to the separation of thioglycolic acid from its aqueous solution by ionic liquids. Paese et al.10 used COSMO-SAC to predict phase equilibria of aqueous sugar solutions and industrial juices. Xavier et al.11 studied vapor–liquid equilibria (VLE) of systems containing fragrances using COSMO-SAC. Bouillot et al.12 investigated drug solubilities with COSMO-SAC. Shu and Lin13 predicted drug solubility in mixed-solvent systems using the COSMO-SAC activity coefficient model. Buggert et al.14 applied COSMO-RS to partition coefficient calculations. Hsieh et al.15 evaluated the original (COSMO-SAC 2002) and revised (COSMO-SAC 2010) models for the solubility and octanol/water partition coefficient of pharmaceutical compounds, and reported a 388% error for solubility prediction with the original COSMO-SAC (2002).
In contrast to researchers who focused on the predictive ability of COSMO-SAC for different systems, some authors studied the underlying quantum mechanical calculations and developed databases. Mullins et al.16 developed a database consisting of 1432 COSMO files and provided FORTRAN code for sigma profile and activity coefficient computations. Bell et al.17 assembled an extensive database of COSMO files for 2261 compounds. Ferrarini et al.18 distributed a sigma-profile database for a wide range of molecules using the GAMESS software. They also tested different quantum chemistry theories for the calculation of the electronic structure. Mu et al.19 examined the performance of COSMO-RS with sigma profiles from different theories.
Some authors modified the COSMO-SAC model to increase its accuracy. Lee and Lin20 added the Peng–Robinson EOS to COSMO-SAC. Lin et al.21 first introduced the concept of modifying the sigma profile to enhance model precision. Hsieh et al.22 improved COSMO-SAC for vapor–liquid and liquid–liquid equilibrium calculations by separating the sigma profile into HB-OH, HB-nonOH, and non-HB parts. Afterward, Paulechka et al.23 revised the COSMO-SAC model by splitting the sigma profile into OH and non-OH parts, and Islam and Chen24 proposed a method for generating the sigma profile used as input to COSMO-SAC.
The objective of this study is to investigate the performance of two existing predictive models based on COSMO calculations, COSMO-SAC (2002) and COSMO-SAC (2010), for pharmaceutical compounds and to compare them with the widely applicable semi-predictive Flory–Huggins model25. Comparing COSMO-SAC with the Flory–Huggins model shows where it stands among models of this kind. The examined pharmaceutical compounds contain H, C, O, N, S, F, and Cl atoms and include at least one hydrogen bond or double bond between atoms. The solubility in binary and ternary systems, the octanol/water partition coefficient, and cocrystal formation are of interest in the current study. For solubility in binary systems, 918 data points for 110 systems of 35 pharmaceutical compounds are considered, covering temperatures of 262–360 K and mole fractions up to 0.7. Afterward, two cocrystal-forming systems that have not previously been studied with this model, sulfamethazine–salicylic acid in methanol and carbamazepine–acetyl salicylic acid in ethanol, are investigated with COSMO-SAC (2002).
Methods
COSMO file and sigma profile
As described above, the COSMO-SAC model is based on quantum mechanics through density functional theory calculations. Several commercial and free software packages provide the preliminary information for COSMO-SAC in the form of a text file called a COSMO file. The DMol3 module in Materials Studio and the free academic software GAMESS are two examples. In a COSMO calculation, the molecular surface is divided into several parts called segments, and the charge distribution over all segments is calculated so as to neutralize the whole molecule. The locations of the segments, the segment areas, and the charge densities are the properties reported in the COSMO file. To perform COSMO-SAC calculations, the following data must be obtained from the COSMO file: (1) the surface area ($A_i$) and cavity volume ($V_i$) of the molecule, and (2) the location of each segment (a vector of x, y, and z coordinates), its charge density ($\sigma^*_n$), and its area ($a_n$). This information is processed to produce the sigma profile ($p(\sigma)$) required for COSMO-SAC calculations. Klamt et al.4 introduced the following equation to average the charge densities from the COSMO file:
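$$\sigma_m = \frac{\sum_n \sigma^*_n \dfrac{r_n^2 r_{av}^2}{r_n^2 + r_{av}^2} \exp\left(-\dfrac{d_{mn}^2}{r_n^2 + r_{av}^2}\right)}{\sum_n \dfrac{r_n^2 r_{av}^2}{r_n^2 + r_{av}^2} \exp\left(-\dfrac{d_{mn}^2}{r_n^2 + r_{av}^2}\right)} \tag{1}$$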
In the above equation, $d_{mn}$ is the distance between segments $m$ and $n$. The segment radius $r_n$ is obtained from the segment area $a_n$ as follows:
$$r_n = \sqrt{\frac{a_n}{\pi}} \tag{2}$$
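The averaging step can be illustrated with a short Python sketch. It assumes the Klamt-type weighting in which each raw charge density is weighted by $r_n^2 r_{av}^2/(r_n^2 + r_{av}^2)\,\exp[-d_{mn}^2/(r_n^2 + r_{av}^2)]$. The function name, the argument layout, and the averaging radius used in the toy call are illustrative assumptions; they are not taken from the COSMO file format or from Mullins et al.16.

```python
import math


def average_sigma(positions, areas, sigmas, r_av):
    """Average raw segment charge densities over neighbouring segments.

    positions: list of (x, y, z) tuples in angstrom
    areas:     list of segment areas a_n in angstrom^2
    sigmas:    list of raw charge densities in e/angstrom^2
    r_av:      averaging radius in angstrom (must be supplied by the user)
    """
    radii_sq = [a / math.pi for a in areas]  # r_n^2 = a_n / pi
    r_av_sq = r_av ** 2
    averaged = []
    for pm in positions:
        num = 0.0
        den = 0.0
        for pn, rn_sq, sn in zip(positions, radii_sq, sigmas):
            d_sq = sum((u - v) ** 2 for u, v in zip(pm, pn))
            weight = (rn_sq * r_av_sq / (rn_sq + r_av_sq)) * math.exp(
                -d_sq / (rn_sq + r_av_sq)
            )
            num += sn * weight
            den += weight
        averaged.append(num / den)
    return averaged


# Two-segment toy example; every number here is made up for illustration.
toy = average_sigma(
    positions=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    areas=[0.5, 0.5],
    sigmas=[0.010, -0.010],
    r_av=0.8,
)
```

Because each averaged value is a weighted mean, it always lies between the smallest and largest raw charge density, which is a convenient sanity check on any implementation.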
Mullins et al.16 reported the value of $r_{av}$. The sigma profile is defined as the probability of finding a segment with charge density $\sigma$:
$$p_i(\sigma) = \frac{n_i(\sigma)}{n_i} = \frac{A_i(\sigma)}{A_i} \tag{3}$$
where $n_i(\sigma)$ is the number of segments with charge density $\sigma$ and $A_i(\sigma)$ is the total surface area of the segments with charge density $\sigma$.
Generally, for most molecules, charge density values range between − 0.025 and 0.025 e/Å2. The four steps for generating the sigma profile are as follows:
- Consider 50 intervals with increments of 0.001 e/Å2 over the charge density range − 0.025 to 0.025 e/Å2.
- Each interval i is defined by its lower and upper bounds. First, find the charge densities that fall in interval i and calculate their contributions according to:
| 4 |
- Afterward, calculate the probabilities at the lower and upper bounds of interval i as follows:
| 5 |
| 6 |
- The sigma profile is generated by plotting the calculated probabilities versus the sigma values.
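The four steps above can be sketched in Python as follows. This is a minimal sketch that assumes a linear split of each segment's area between the lower and upper bounds of its interval, in proportion to how close the charge density lies to each bound; the exact weighting used by the authors is not recoverable from the text, so the split rule and all names here are assumptions.

```python
def sigma_profile(sigmas, areas, lower=-0.025, step=0.001, n_points=51):
    """Bin averaged charge densities onto a fixed sigma grid.

    Returns (grid, profile) where profile sums to 1 (area-weighted probability).
    """
    grid = [lower + k * step for k in range(n_points)]
    binned_area = [0.0] * n_points
    for sigma, area in zip(sigmas, areas):
        # Index of the lower bound of the interval containing sigma.
        k = int((sigma - lower) / step)
        k = max(0, min(k, n_points - 2))
        # Fraction of the area assigned to the upper bound (assumed linear rule).
        w_upper = (sigma - grid[k]) / step
        w_upper = max(0.0, min(w_upper, 1.0))
        binned_area[k] += area * (1.0 - w_upper)
        binned_area[k + 1] += area * w_upper
    total_area = sum(areas)
    return grid, [a / total_area for a in binned_area]
```

A segment that sits exactly halfway between two grid points contributes half of its area to each of them, and the profile always sums to one because every segment's area is fully distributed.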
As described in the literature review, some authors divided the sigma profile into parts to better describe hydrogen-bonding (hb) interactions. Hsieh et al.22 proposed separating the sigma profile into non-hydrogen-bonding, hydroxyl (OH), and other (non-hydroxyl) hydrogen-bonding contributions as follows (COSMO-SAC (2010)):
$$p(\sigma) = p^{NHB}(\sigma) + p^{OH}(\sigma) + p^{OT}(\sigma) \tag{7}$$
where $p^{NHB}(\sigma)$ denotes the probabilities of all non-hydrogen-bonding atoms, $p^{OH}(\sigma)$ the probabilities of the hydroxyl (OH) groups, and $p^{OT}(\sigma)$ those of F, N, and hydrogen atoms connected to F and N atoms. These contributions were determined as follows:
| 8 |
| 9 |
where $\sigma_0$ is the threshold for hydrogen-bonding determination, with a value of 0.007 e/Å2.
COSMO-SAC model
COSMO-SAC (2002)
In the COSMO-SAC model, activity coefficients are computed from the solvation energy, which is obtained from an ab initio solvation calculation in two steps: (1) dissolution of the solute in the conductor and (2) conversion of the conductor into a real solvent. The activity coefficient of component i in solvent S ($\gamma_{i/S}$) is obtained from two contributions, a combinatorial part ($\gamma^{comb}_{i/S}$) and a residual part ($\gamma^{res}_{i/S}$), as follows6:
$$\ln\gamma_{i/S} = \ln\gamma^{comb}_{i/S} + \ln\gamma^{res}_{i/S} \tag{10}$$
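The size and shape differences of the molecules are accounted for in the combinatorial part, which is calculated with the Staverman–Guggenheim term as follows26:
$$\ln\gamma^{comb}_{i/S} = \ln\frac{\phi_i}{x_i} + \frac{z}{2}\,q_i \ln\frac{\theta_i}{\phi_i} + l_i - \frac{\phi_i}{x_i}\sum_j x_j l_j \tag{11}$$
where $z$ is the coordination number and $\theta_i$, $\phi_i$, and $l_i$ are defined as follows:
$$\theta_i = \frac{x_i q_i}{\sum_j x_j q_j}, \quad \phi_i = \frac{x_i r_i}{\sum_j x_j r_j}, \quad l_i = \frac{z}{2}(r_i - q_i) - (r_i - 1) \tag{12}$$
In the above expressions, $r_i$ and $q_i$ are related to the cavity volume of component i ($V_i$) and the total surface area of molecule i ($A_i$) obtained from the COSMO file and are defined as follows:
$$r_i = \frac{V_i}{r}, \quad q_i = \frac{A_i}{q} \tag{13}$$
where $r$ and $q$ are the normalization volume and normalization surface area. The residual part of COSMO-SAC (2002) is defined as follows6,17: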
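$$\ln\gamma^{res}_{i/S} = n_i \sum_{\sigma_m} p_i(\sigma_m)\left[\ln\Gamma_S(\sigma_m) - \ln\Gamma_i(\sigma_m)\right] \tag{14}$$
where $n_i$, the effective number of segments of molecule i, is related to the effective segment surface area ($a_{eff}$) and the surface area of molecule i ($A_i$) by:
$$n_i = \frac{A_i}{a_{eff}} \tag{15}$$
and $\Gamma(\sigma_m)$ is the segment activity coefficient, calculated for the mixture S and for the pure component i from:
$$\ln\Gamma_S(\sigma_m) = -\ln\left\{\sum_{\sigma_n} p_S(\sigma_n)\,\Gamma_S(\sigma_n)\exp\left[\frac{-\Delta W(\sigma_m,\sigma_n)}{RT}\right]\right\} \tag{16}$$
$$\ln\Gamma_i(\sigma_m) = -\ln\left\{\sum_{\sigma_n} p_i(\sigma_n)\,\Gamma_i(\sigma_n)\exp\left[\frac{-\Delta W(\sigma_m,\sigma_n)}{RT}\right]\right\} \tag{17}$$
The exchange energy is defined as:
$$\Delta W(\sigma_m,\sigma_n) = \frac{\alpha'}{2}(\sigma_m + \sigma_n)^2 + c_{hb}\,\max\left[0,\ \sigma_{acc} - \sigma_{hb}\right]\min\left[0,\ \sigma_{don} + \sigma_{hb}\right] \tag{18}$$
Here $c_{hb}$ and $\sigma_{hb}$ are the energy-type constant and the cutoff value for hydrogen-bonding interactions16. $\sigma_{acc}$ and $\sigma_{don}$ are the larger and smaller of $\sigma_m$ and $\sigma_n$. $\alpha'$ accounts for the misfit energy, and $T$ and $R$ are the system temperature and the universal gas constant. The values of these parameters are reported by Mullins et al.16. In Eq. (16), the sigma profile of the mixture ($p_S(\sigma)$) is obtained from:
$$p_S(\sigma) = \frac{\sum_i x_i n_i\,p_i(\sigma)}{\sum_i x_i n_i} = \frac{\sum_i x_i A_i\,p_i(\sigma)}{\sum_i x_i A_i} \tag{19}$$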
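The COSMO-SAC (2002) calculation described in this subsection can be sketched in Python. The sketch assumes the standard structure of the model: a Staverman–Guggenheim combinatorial term, an exchange energy made of a misfit and a hydrogen-bonding part, and segment activity coefficients solved by damped successive substitution. The numerical constants are the values commonly quoted for the 2002 parameterization and should be treated as assumptions to be checked against Mullins et al.16; the function names are hypothetical.

```python
import math

# Assumed universal constants (check against Mullins et al. before real use).
R_GAS = 0.001987         # kcal/(mol K)
ALPHA_PRIME = 16466.72   # misfit constant, (kcal/mol) A^4/e^2
C_HB = 85580.0           # hydrogen-bonding constant, (kcal/mol) A^4/e^2
SIGMA_HB = 0.0084        # hydrogen-bonding cutoff, e/A^2
A_EFF = 7.5              # effective segment area, A^2
R_NORM, Q_NORM, Z = 66.69, 79.53, 10.0  # normalization volume, area, coordination number


def exchange_energy(sigma_m, sigma_n):
    """Misfit plus hydrogen-bonding energy of a segment pair, in kcal/mol."""
    acc, don = max(sigma_m, sigma_n), min(sigma_m, sigma_n)
    misfit = 0.5 * ALPHA_PRIME * (sigma_m + sigma_n) ** 2
    hbond = C_HB * max(0.0, acc - SIGMA_HB) * min(0.0, don + SIGMA_HB)
    return misfit + hbond


def segment_gammas(grid, profile, temperature, tol=1e-10, max_iter=2000):
    """Solve the implicit segment activity coefficient equation by damped iteration."""
    n = len(grid)
    boltz = [[math.exp(-exchange_energy(grid[m], grid[k]) / (R_GAS * temperature))
              for k in range(n)] for m in range(n)]
    gam = [1.0] * n
    for _ in range(max_iter):
        raw = [1.0 / sum(profile[k] * gam[k] * boltz[m][k] for k in range(n))
               for m in range(n)]
        new = [0.5 * (g + r) for g, r in zip(gam, raw)]  # damping factor 0.5
        if max(abs(a - b) / a for a, b in zip(new, gam)) < tol:
            return new
        gam = new
    raise RuntimeError("segment activity coefficients did not converge")


def ln_gamma_residual(x, areas, profiles, grid, temperature):
    """Residual ln(gamma) of every component; profiles[i][k] is p_i on grid[k]."""
    denom = sum(xi * ai for xi, ai in zip(x, areas))
    mixture = [sum(xi * ai * p[k] for xi, ai, p in zip(x, areas, profiles)) / denom
               for k in range(len(grid))]
    ln_mix = [math.log(g) for g in segment_gammas(grid, mixture, temperature)]
    result = []
    for ai, p in zip(areas, profiles):
        ln_pure = [math.log(g) for g in segment_gammas(grid, p, temperature)]
        result.append((ai / A_EFF) * sum(pk * (lm - lp)
                                         for pk, lm, lp in zip(p, ln_mix, ln_pure)))
    return result


def ln_gamma_combinatorial(x, volumes, areas):
    """Staverman-Guggenheim ln(gamma), written with phi_i/x_i so x_i = 0 is allowed."""
    r = [v / R_NORM for v in volumes]
    q = [a / Q_NORM for a in areas]
    sum_r = sum(xi * ri for xi, ri in zip(x, r))
    sum_q = sum(xi * qi for xi, qi in zip(x, q))
    l = [0.5 * Z * (ri - qi) - (ri - 1.0) for ri, qi in zip(r, q)]
    sum_xl = sum(xi * li for xi, li in zip(x, l))
    return [math.log(ri / sum_r) + 0.5 * Z * qi * math.log((qi / sum_q) / (ri / sum_r))
            + li - (ri / sum_r) * sum_xl
            for ri, qi, li in zip(r, q, l)]
```

The total $\ln\gamma_{i/S}$ is the sum of the two returned contributions for each component. A useful check is the pure-component limit, where both contributions vanish for the component whose mole fraction is one.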
COSMO-SAC (2010)
After the NHB, OH, and OT sigma profiles are established, the segment activity coefficient is calculated as follows:
| 20 |
where the subscript denotes the pure liquid or the mixture and the superscript denotes the NHB, OH, or OT segment type. The exchange energy is defined in terms of the interaction between segments of different types and is given by:
| 21 |
In contrast to COSMO-SAC (2002), the hydrogen-bonding interaction parameter chb takes different values for the OH and OT contributions:
| 22 |
The three hydrogen-bonding interaction parameters (cOH-OH, cOT-OT, and cOH-OT), AES, and BES are adjustable parameters whose values are given by Hsieh et al.22. Afterward, the activity coefficient of component i in mixture S is determined from:
| 23 |
Flory–Huggins theory
In this study, a semi-predictive version of the Flory–Huggins model based on the Hansen solubility parameters was used. In Flory–Huggins theory, the activity coefficient of component i in the mixture is obtained from25:
| 24 |
In the above equation, $\phi$ is the volume fraction and $V$ is the molar volume. $\chi$ is the Flory–Huggins interaction parameter, obtained from the Hansen solubility parameter ($\delta$) contributions of non-polar (dispersion) forces (d), polar forces (p), and hydrogen-bonding effects (h) as follows27:
| 25 |
The Hansen solubility parameters and their contributions were obtained by group contribution methods according to the following equations28:
| 26 |
The group contribution values were extracted from Barton28.
Solid–liquid equilibria
In solid–liquid equilibria, the solubility of a solid in the liquid phase is calculated from the following expression:
$$\ln(x_i\gamma_i) = \frac{\Delta H_{fus}}{R}\left(\frac{1}{T_m} - \frac{1}{T}\right) + \frac{\Delta C_p}{R}\left(\frac{T_m}{T} - 1 - \ln\frac{T_m}{T}\right) \tag{27}$$
where $x_i$ and $\gamma_i$ are the solubility (mole fraction) and activity coefficient of compound i. The activity coefficient in the above expression is computed from the considered models as described above. $\Delta H_{fus}$, $\Delta C_p$, and $T_m$ represent the fusion enthalpy, the heat capacity change between the solid and liquid phases, and the melting temperature, respectively. In the current study, the second term of Eq. (27) was neglected ($\Delta C_p = 0$).
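Because the activity coefficient itself depends on the solubility, the solid–liquid equilibrium condition is usually solved iteratively. The Python sketch below assumes the heat-capacity term is neglected, as stated above, and takes the activity coefficient model as a user-supplied function `gamma_of_x`. That callable, the function names, and the property values in the usage note are hypothetical.

```python
import math

R_J = 8.314  # universal gas constant, J/(mol K)


def ln_ideal_solubility(temperature, t_melt, dh_fus):
    """ln(x*gamma) from the fusion term only (heat-capacity term neglected)."""
    return (dh_fus / R_J) * (1.0 / t_melt - 1.0 / temperature)


def solubility(temperature, t_melt, dh_fus, gamma_of_x, tol=1e-12, max_iter=500):
    """Solve x = x_ideal / gamma(x) by damped successive substitution."""
    x_ideal = math.exp(ln_ideal_solubility(temperature, t_melt, dh_fus))
    x = x_ideal
    for _ in range(max_iter):
        x_new = min(1.0, x_ideal / gamma_of_x(x))
        if abs(x_new - x) < tol:
            return x_new
        x = 0.5 * (x + x_new)  # damping keeps the iteration stable
    raise RuntimeError("solubility iteration did not converge")
```

For example, `solubility(298.15, 400.0, 25000.0, lambda x: 1.0)` returns the ideal solubility for a made-up solid with a melting point of 400 K and a fusion enthalpy of 25 kJ/mol. In practice `gamma_of_x` would wrap COSMO-SAC or Flory–Huggins evaluated at the current composition.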
Partition coefficient
When equilibrium is established between two immiscible liquid phases, each component distributes between the two phases. The distribution of component i between phases α and β is measured by the partition coefficient as follows15:
$$K_i = \frac{x_i^{\alpha}}{x_i^{\beta}} = \frac{\gamma_i^{\beta}}{\gamma_i^{\alpha}} \tag{28}$$
where $x_i^{\alpha}$ and $x_i^{\beta}$ are the mole fractions of component i in phases α and β, and $\gamma_i^{\alpha}$ and $\gamma_i^{\beta}$ are the corresponding activity coefficients. Therefore, the octanol/water partition coefficient of component i ($K_{OW,i}$) is calculated from15:
$$K_{OW,i} = \frac{C^{O}\,\gamma_i^{W,\infty}}{C^{W}\,\gamma_i^{O,\infty}} \tag{29}$$
where $C^{O}$ and $C^{W}$ are the total concentrations of the octanol-rich and water-rich phases, and $\gamma_i^{O,\infty}$ and $\gamma_i^{W,\infty}$ are the activity coefficients of component i in the octanol-rich and water-rich phases at infinite dilution. The default value of $C^{O}/C^{W}$ is 0.151. The octanol-rich phase is composed of 27.5 mol% water and 72.5 mol% octanol. The water-rich phase is free of octanol.
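The partition coefficient calculation reduces to a ratio of two infinite-dilution activity coefficients scaled by the concentration ratio of 0.151. A minimal Python sketch, with a hypothetical function name and made-up activity coefficients, is:

```python
import math


def log_kow(gamma_inf_water, gamma_inf_octanol, conc_ratio=0.151):
    """log10 of the octanol/water partition coefficient.

    gamma_inf_water:   infinite-dilution activity coefficient in the water-rich phase
    gamma_inf_octanol: infinite-dilution activity coefficient in the octanol-rich phase
    conc_ratio:        total concentration ratio C_O / C_W (0.151 by default)
    """
    return math.log10(conc_ratio * gamma_inf_water / gamma_inf_octanol)


# Made-up values: a solute that strongly dislikes water and mildly dislikes octanol.
example = log_kow(gamma_inf_water=1.0e4, gamma_inf_octanol=10.0)
```

With these illustrative inputs the result is $\log_{10}(151) \approx 2.18$. Since only the ratio of the two activity coefficients enters, a common multiplicative error in both phases cancels exactly.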
Cocrystal formation
The ternary phase diagram for a drug and an API that form a cocrystal (CC) contains three lines: the two solubility lines (API/solvent and drug/solvent) and the cocrystal line. The solubility lines of the drug and the API in the solvent are determined from solubility calculations for the drug or API in the mixture according to Eq. (27), in combination with the considered models. Cocrystal formation is described by a chemical reaction between the drug (A) and the API (B) as follows29,30:
$$a\,\mathrm{A} + b\,\mathrm{B} \rightleftharpoons \mathrm{A}_a\mathrm{B}_b \tag{30}$$
where a and b are the stoichiometric coefficients of substances A and B in the cocrystal. The solubility product $K_{CC}$ is computed from the following equation:
$$K_{CC} = (x_A\gamma_A)^a\,(x_B\gamma_B)^b \tag{31}$$
The activity coefficients in Eq. (31) are computed from the examined model. The solubility product (KCC) depends only on temperature and is independent of the solvent type. Once the solubility product is known at a single point, it can be applied to other conditions. After obtaining the solubility product for the desired system, the invariant points, which are the intersections of the cocrystal line and the solubility lines, are computed by simultaneous solution of Eqs. (27) and (31). Afterward, the cocrystal region is determined by varying the drug mole fraction between the two invariant points and obtaining the API mole fraction from Eq. (31).
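The procedure for the cocrystal line can be sketched in Python. The sketch assumes the solubility product has the form $K_{CC} = (x_A\gamma_A)^a (x_B\gamma_B)^b$ and, for brevity, takes the activity coefficients as given numbers. In a real calculation they depend on composition, so the last step would be repeated until the composition stops changing. All names and numbers are illustrative.

```python
def solubility_product(x_a, x_b, gamma_a, gamma_b, a=1, b=1):
    """Solubility product from one known point on the cocrystal line."""
    return (x_a * gamma_a) ** a * (x_b * gamma_b) ** b


def cocrystal_line_xb(k_cc, x_a, gamma_a, gamma_b, a=1, b=1):
    """API mole fraction on the cocrystal line for a chosen drug mole fraction."""
    return (k_cc / (x_a * gamma_a) ** a) ** (1.0 / b) / gamma_b


# Fit K_CC at a single (made-up) reference point, then reuse it elsewhere.
k_ref = solubility_product(x_a=0.010, x_b=0.020, gamma_a=1.5, gamma_b=2.0)
x_b_new = cocrystal_line_xb(k_ref, x_a=0.015, gamma_a=1.5, gamma_b=2.0)
```

Sweeping `x_a` between the two invariant points and calling `cocrystal_line_xb` at each value traces the cocrystal line.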
Statistical analysis
To assess model precision against experimental data, several statistics were applied: the absolute average percentage deviation (%AAD), root mean square error (RMSE), mean square error (MSE), normalized root mean square error (NRMSE), and normalized mean square error (NMSE). MSE, NRMSE, and NMSE were obtained from the goodnessOfFit function in MATLAB. The absolute average percentage deviation was calculated as follows:
$$\%AAD = \frac{100}{n}\sum_{k=1}^{n}\left|\frac{x_k^{calc} - x_k^{exp}}{x_k^{exp}}\right| \tag{33}$$
where $x_k^{calc}$ and $x_k^{exp}$ are the calculated and experimental values of the desired property and n is the number of experimental data points. The root mean square error (RMSE) was obtained as follows:
$$RMSE = \sqrt{\frac{1}{n}\sum_{k=1}^{n}\left(x_k^{calc} - x_k^{exp}\right)^2} \tag{34}$$
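The two statistics defined explicitly in this section can be written in a few lines of Python. The function names and the two-point data set are illustrative only.

```python
import math


def aad_percent(calculated, experimental):
    """Absolute average percentage deviation between two equal-length sequences."""
    n = len(experimental)
    return 100.0 / n * sum(abs((c - e) / e) for c, e in zip(calculated, experimental))


def rmse(calculated, experimental):
    """Root mean square error between two equal-length sequences."""
    n = len(experimental)
    return math.sqrt(sum((c - e) ** 2 for c, e in zip(calculated, experimental)) / n)


# Made-up data: each point is off by 10 % of its experimental value.
calc = [1.1, 1.8]
expt = [1.0, 2.0]
toy_aad = aad_percent(calc, expt)
toy_rmse = rmse(calc, expt)
```

Note that %AAD weights every point by its relative error, so very small experimental solubilities can dominate it, whereas RMSE is dominated by the points with the largest absolute values.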
Results and discussion
The objective of this section is to evaluate the performance of COSMO-SAC (2002), COSMO-SAC (2010), and the Flory–Huggins model for pharmaceutical compounds, which are mostly large, complicated molecules containing electronegative atoms such as N, O, and S and complex interactions such as hydrogen bonding. The considered properties are the solubilities of pharmaceutical compounds in pure solvents and solvent mixtures, the octanol/water partition coefficient, and cocrystal formation. To conduct the study, COSMO files were first required. These were prepared for 15 solvents and 35 pharmaceutical compounds with the DMol3 module in Materials Studio 2017. In generating the COSMO files, the density functional was set to GGA (VWN-BP) with fine quality. In the electronic options, the multipolar expansion was set to octupole. The calculations were run on four parallel cores. Other options were left at their default values in DMol3.
After generating the COSMO files, the sigma profiles obtained in the current study were tested against sigma profiles reported in other studies. Figures 1 and 2 compare the sigma profiles generated in this study for acetyl salicylic acid and ibuprofen with those in the database provided by Mullins et al.16. Based on Figs. 1 and 2, the same trends were observed in this study and in Mullins et al.16. The small departures between the two curves originate from differences in the software version and the sigma profile generation program.
Figure 1.
Generated sigma profiles for acetyl salicylic acid in comparison to Mullins et al.16.
Figure 2.
Generated sigma profiles for Ibuprofen in comparison to Mullins et al.16.
After generating the sigma profiles and preparing the COSMO-SAC program for computing the activity coefficient, the solubilities in the binary and ternary systems were calculated and compared with experimental data obtained from the literature.
Figure 3 shows parity plots of the experimental solubilities in pure solvents against the solubilities calculated with COSMO-SAC (2002), COSMO-SAC (2010), and the Flory–Huggins model. The mean square error (MSE), normalized root mean square error (NRMSE), and normalized mean square error (NMSE) for COSMO-SAC (2002) are 0.0136, 0.0349, and 0.0685. The MSE, NMSE, and NRMSE for COSMO-SAC (2010) are 0.0187, − 0.2718, and − 0.1277, while those for the Flory–Huggins model are 0.0360, − 1.2337, and − 0.4946. According to Fig. 3, the Flory–Huggins model underpredicts the solubility data. The examined pharmaceutical compounds cover a wide variety of molecules, from small to long-chain. They are composed of C, H, N, O, S, F, and Cl atoms joined by covalent bonds and interacting through hydrogen bonding. The reported statistics indicate that COSMO-SAC (2002) performs acceptably relative to COSMO-SAC (2010). This ranking of COSMO-SAC (2002) and COSMO-SAC (2010) appears inconsistent with that reported in the literature15. The accuracy of these two COSMO-SAC models has been comprehensively examined with a very large dataset containing 29,173 infinite-dilution activity coefficient data points and 139,921 VLE data points for 6940 binary mixtures31. One reason for the inconsistency is the different universal constants used in sigma profile generation. Differences in the investigated systems are a second reason.
Figure 3.
Parity plot of solubility in pure solvent (mole fraction) from the COSMO-SAC (2002) (dot symbol), the COSMO-SAC (2010) (plus symbol) and Flory–Huggins model (circle symbol) in comparison to experimental data.
It is noteworthy that the COSMO-SAC (2002) results were obtained with only eight universal constants, without any further modification. A list of the considered pharmaceutical compounds, their physical properties, and the references for the experimental data is presented in the supplementary materials (Table S1).
The Hansen solubility parameters and molar volumes for the Flory–Huggins model and the COSMO molar volumes of the examined pharmaceutical compounds and solvents are presented in Table 1. According to Table 1, the molar volumes obtained from the group contribution method of Barton28 and from the COSMO calculations differ somewhat.
Table 1.
Hansen solubility parameters and molar volumes from group contribution method in comparison to molar volumes obtained from the COSMO calculations.
| Substance | δ | δd | δp | δh | V (cm3/mol) Flory–Huggins | V (cm3/mol) COSMO |
|---|---|---|---|---|---|---|
| 1-Propanol | 24.5 | 16 | 6.8 | 17.4 | 75.2 | 52.64 |
| 2-Propanol | 23.5 | 15.8 | 6.1 | 16.4 | 76.8 | 52.64 |
| Acetic Acid | 21.4 | 14.5 | 8 | 13.5 | 57.1 | 43.38 |
| Acetone | 20 | 15.5 | 10.4 | 7 | 74.0 | 49.92 |
| Acetonitrile | 24.4 | 15.3 | 18 | 6.1 | 52.6 | 38.41 |
| Ethanol | 26.5 | 15.8 | 8.8 | 19.4 | 58.5 | 40.33 |
| Ethyl Acetate | 18.1 | 15.8 | 5.3 | 7.2 | 98.5 | 68.14 |
| Heptane | 15.3 | 15.3 | 0 | 0 | 147.4 | 94.24 |
| Hexane | 14.9 | 14.9 | 0 | 0 | 131.6 | 82.11 |
| Methanol | 29.6 | 15.1 | 12.3 | 22.3 | 40.7 | 29.09 |
| Methyl Acetate | 18.7 | 15.5 | 7.2 | 7.6 | 79.7 | 56.13 |
| Octanol | 21 | 17 | 3.3 | 11.9 | 157.7 | 112.43 |
| Water | 47.8 | 15.6 | 16 | 42.3 | 18.0 | 15.24 |
| 2-Phenylacetamide | 27.89 | 22.54 | 16.38 | 1.2 | 53.63 | 98.25 |
| 4-Methylphthalic anhydride | 32.45 | 27.18 | 17.7 | 1.02 | 66.0 | 103.50 |
| Aceclofenac | 28.02 | 26.64 | 8.65 | 0.79 | 121.83 | 213.03 |
| Acetaminophen | 28.24 | 23.37 | 15.75 | 1.85 | 60.07 | 104.32 |
| Acetylsalicylic acid | 29.06 | 27.45 | 9.46 | 1.15 | 70.63 | 116.83 |
| Atenolol | 21.38 | 20.23 | 6.88 | 0.81 | 161.3 | 190.11 |
| Atropine | 27.78 | 26.84 | 7.11 | 1.03 | 103.07 | 195.58 |
| Benzamide | 35.25 | 25.7 | 24.05 | 1.76 | 36.53 | 86.76 |
| Camphor | 20.59 | 19.29 | 7.21 | 0.29 | 106.7 | 111.53 |
| Capecitabine | 24.65 | 22.83 | 9.26 | 0.83 | 205.94 | 225.47 |
| Cefixime | 30.18 | 27.82 | 11.67 | 0.86 | 190.56 | 264.81 |
| Cephalexin | 33.22 | 29.67 | 14.91 | 0.99 | 111.46 | 219.59 |
| Cimetidine | 27.39 | 22.85 | 15.1 | 0.61 | 162.3 | 176.03 |
| Deferiprone | 22.59 | 19.64 | 11.09 | 1.25 | 83.97 | 95.13 |
| Flurbiprofen | 24.56 | 23.03 | 8.49 | 0.86 | 82.33 | 164.03 |
| Hydroquinone | 43.64 | 31.25 | 29.89 | 5.87 | 23.84 | 75.95 |
| Isoniazid | 44.85 | 27.74 | 35.19 | 1.89 | 45.53 | 93.62 |
| Lamotrigine | 40.17 | 29.09 | 27.67 | 1.23 | 90.26 | 147.47 |
| Meclofenamic acid | 25.08 | 23.52 | 8.69 | 0.78 | 106.43 | 177.91 |
| Pentoxifylline | 23.76 | 21.49 | 10.14 | 0.4 | 187.2 | 184.38 |
| Pindolol | 23.46 | 21.41 | 9.55 | 0.92 | 129.37 | 177.38 |
| p-Nitrobenzamide | 36.71 | 26.8 | 25.05 | 1.28 | 55.33 | 104.25 |
| Vinpocetine | 32.28 | 31.84 | 5.28 | 0.43 | 108.8 | 235.71 |
| Benzocaine | 27.88 | 26.5 | 8.64 | 0.91 | 77.33 | 115.91 |
| Borneol | 19.15 | 18.54 | 4.68 | 0.93 | 106.67 | 115.13 |
| Carvedilol | 25.27 | 24.45 | 6.32 | 0.9 | 146.27 | 274.61 |
| Ibuprofen | 18.35 | 18.09 | 3.07 | 0.5 | 140.43 | 154.87 |
| Isoborneol | 20.06 | 19.48 | 4.72 | 0.94 | 105.67 | 114.97 |
| Salicylic acid | 30.52 | 25.83 | 16 | 2.94 | 41.2 | 90.21 |
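As a quick consistency check, the first four numeric columns of Table 1 can be read as the total Hansen parameter followed by its dispersion, polar, and hydrogen-bonding contributions, which should satisfy $\delta^2 = \delta_d^2 + \delta_p^2 + \delta_h^2$. The Python sketch below tests this reading on a few rows copied from the table; the column interpretation is an assumption inferred from the numbers, not something stated in the table header.

```python
import math

# (listed total, dispersion, polar, hydrogen-bonding) copied from Table 1.
TABLE_1_ROWS = {
    "1-Propanol": (24.5, 16.0, 6.8, 17.4),
    "Ethanol": (26.5, 15.8, 8.8, 19.4),
    "Water": (47.8, 15.6, 16.0, 42.3),
    "2-Phenylacetamide": (27.89, 22.54, 16.38, 1.2),
    "Ibuprofen": (18.35, 18.09, 3.07, 0.5),
}


def total_hansen(d, p, h):
    """Total Hansen solubility parameter from its three contributions."""
    return math.sqrt(d * d + p * p + h * h)


# Absolute difference between the recomputed and the listed total for each row.
deviations = {
    name: abs(total_hansen(d, p, h) - total)
    for name, (total, d, p, h) in TABLE_1_ROWS.items()
}
```

For the rows sampled here the recomputed totals agree with the listed ones to within rounding, which supports reading the columns in that order.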
Table 2 reports the COSMO-SAC (2002), COSMO-SAC (2010), and Flory–Huggins results for some pharmaceutical compounds, categorized by solvent type and sorted by absolute average deviation (AAD%). The RMSE results for the three models are also reported in Table 2. According to Table 2, COSMO-SAC (2002) shows a wide range of errors, in agreement with the errors reported by Hsieh et al.15. COSMO-SAC (2010) and the Flory–Huggins model have larger errors than COSMO-SAC (2002).
Table 2.
The results of solubility from the COSMO-SAC model (2002) for some considered pharmaceutical compounds in comparison to Flory–Huggins model and the COSMO-SAC (2010).
| Drug | Solvent | RMSE COSMO-SAC (2002) model | RMSE Flory–Huggins model | RMSE COSMO-SAC (2010) model | AAD% COSMO-SAC model |
|---|---|---|---|---|---|
| 4-Methylphthalic anhydride | Methyl acetate | 0.0017 | 0.3106 | 0.0215 | 0.47 |
| Atropine | Ethanol | 0.0072 | 0.0769 | 0.0252 | 7.25 |
| Acetyl salicylic acid | Ethanol | 0.0050 | 0.1254 | 0.0324 | 8.13 |
| Camphor | Ethanol | 0.0832 | 0.0675 | 0.0625 | 10.40 |
| Isoborneol | Acetone | 0.0529 | 0.2311 | 0.1252 | 12.02 |
| Vinpocetine | Ethyl acetate | 0.0021 | 0.0108 | 0.0027 | 12.16 |
| Salicylic acid | Methanol | 0.0285 | 0.1375 | 0.0796 | 20.21 |
| Acetyl salicylic acid | Octanol | 0.011 | 0.0815 | 0.0400 | 21.35 |
| Atenolol | Octanol | 0.0011 | 0.0004 | 0.0046 | 21.40 |
| 4-Methylphthalic anhydride | Acetonitrile | 0.0475 | 0.2115 | 0.0310 | 23.92 |
| Atropine | Octanol | 0.0245 | 0.0916 | 0.0482 | 25.5 |
| Salicylic acid | Acetic Acid | 0.0156 | 0.0643 | 0.0271 | 26.26 |
| Camphor | Acetone | 0.1655 | 0.2131 | 0.1633 | 28.05 |
| Isoborneol | Ethanol | 0.1210 | 0.1359 | 0.1037 | 28.81 |
| Ibuprofen | Octanol | 0.1145 | 0.0349 | 0.0332 | 30.52 |
| Dapsone | Methyl Acetate | 0.0126 | – | 0.0162 | 30.82 |
| 4-Methylphthalic anhydride | Acetone | 0.0732 | 0.2585 | 0.0411 | 34.39 |
| Pindolol | Octanol | 0.0011 | 0.0023 | 0.0001 | 46.76 |
| Flurbiprofen | Octanol | 0.0769 | 0.113 | 0.0388 | 51.91 |
| Acetaminophen | Ethanol | 0.0352 | 0.0563 | 0.0112 | 51.96 |
| Pindolol | Hexane | 2.50E−07 | 1.00E−04 | 0.0000 | 63.05 |
| Ibuprofen | Ethanol | 0.1918 | 0.0741 | 0.1246 | 67.76 |
| Aceclofenac | Acetone | 0.0549 | 0.0812 | 0.0609 | 71.11 |
| Aceclofenac | Methanol | 0.019 | 0.0437 | 0.0105 | 72.28 |
| Lamotrigine | Acetonitrile | 0.002 | 0.0018 | – | 78.78 |
| Atenolol | Hexane | 3.28E−07 | 0.0011 | 0.0000 | 91.5 |
| Acetyl salicylic acid | 2-Propanol | 0.0423 | 0.0536 | 0.0189 | 93.87 |
| Pentoxifylline | Octanol | 0.1132 | 0.1246 | 0.1193 | 96.2 |
| Benzamide | Methanol | 0.1024 | 0.0969 | 0.1393 | 97.43 |
| Meclofenamic acid | Water | 0.0361 | 0.2142 | 0.1632 | 99.33 |
| p-Nitrobenzamide | Water | 0.0013 | 0.0007 | 0.0478 | 99.43 |
| Borneol | Acetone | 0.1248 | 0.0922 | 0.1250 | 101.77 |
| Sulfamethazine | Water | 4.86E−05 | – | 0.1250 | 116.51 |
| Probenecid | Acetone | 0.0298 | – | 0.0192 | 116.99 |
| Dapsone | Methanol | 0.0165 | 0.0118 | 0.0979 | 119.69 |
| Flurbiprofen | Ethanol | 0.1533 | 0.1038 | 0.0789 | 124.04 |
| Acetaminophen | Octanol | 0.0528 | 0.0091 | 0.0205 | 135.9 |
| Meclofenamic acid | Ethanol | 0.1813 | 0.1428 | 0.0608 | 142.16 |
| Benzamide | Acetonitrile | 0.0631 | 0.0037 | 0.1367 | 145.76 |
| Acetaminophen | 2-Propanol | 0.1029 | 0.0606 | 0.0160 | 157.2 |
According to Table 2, pharmaceutical compounds containing only H, C, and O with the fewest hydrogen bonds have the lower errors. In addition, the structure of the molecule has a marked influence on accuracy. For acetaminophen and acetyl salicylic acid, replacing ethanol with acetone as the solvent led to a deterioration in model prediction. The effect of removing the F atom from flurbiprofen is seen in the lower error reported for ibuprofen. Although borneol and isoborneol have the same chemical formula, the accuracy of COSMO-SAC (2002) for them is entirely different. These observations imply that molecular structure, constituent atoms, and intermolecular interactions must be more fully incorporated into the COSMO-SAC model. Since COSMO-SAC (2002) provides better approximations of solubility in the examined systems, the original COSMO-SAC (2002) is used in the further investigation of the binary and ternary systems. Afterward, two models, COSMO-SAC (2002) and the Flory–Huggins model, were considered for the octanol/water partition coefficient and cocrystal formation.
Afterward, ternary systems of pharmaceutical compounds in binary solvents were also examined. On the basis of Table 2, two pharmaceutical compounds, acetaminophen and salicylic acid, were selected. Acetaminophen consists of 20 atoms (H, C, N, and O) and has two functional groups, OH and NH. Salicylic acid consists of 16 atoms (H, C, and O) and has two functional groups, OH and COOH. Figure 4 compares the experimental and calculated solubilities of acetaminophen in ethanol/water mixtures as a function of ethanol mole fraction at two temperatures, 293.15 and 303.15 K. According to Fig. 4, good agreement between the experimental data and the COSMO-SAC calculations is observed. The trends predicted by COSMO-SAC as a function of temperature match the reported experiments.
Figure 4.
The experimental (symbol) and calculated (line) solubility of acetaminophen in ethanol/water mixtures at 293.15 K (triangular symbol) and 303.15 K (circle symbol)32.
Figure 5 shows the calculated solubility of salicylic acid in ethanol/ethyl acetate mixtures compared with experimental data. A departure from the experimental data is observed at higher ethyl acetate mole fractions. Ethyl acetate has a COO functional group whose interaction with the COOH group of salicylic acid is ignored in COSMO-SAC (2002).
Figure 5.
The experimental (symbol) and calculated (line) solubility of salicylic acid in ethanol/ethyl acetate mixture33.
The octanol/water partition coefficients of some pharmaceutical compounds were obtained from the COSMO-SAC model. In Table 3, the octanol/water partition coefficients from the COSMO-SAC model are compared with experimental data from the National Library of Medicine34. The MSE, NMSE, and NRMSE are 2.36, 0.1416, and 0.0735. The RMSEs for the COSMO-SAC and Flory–Huggins models are 1.25 and 4.45. Table 3 shows that the accuracy varies from compound to compound, which reflects the ratio of activity coefficients in the octanol/water partition coefficient. If the errors in the numerator and denominator of this ratio cancel each other, good agreement between the COSMO-SAC computation and experiment is obtained; otherwise, discrepancies appear. The COSMO-SAC model may therefore fail in solubility prediction (for dapsone, for example) yet give a reasonable estimate of the octanol/water partition coefficient. As observed from Table 3, simple molecules made of H, C, and O with only hydrogen bonding are better predicted by COSMO-SAC. Table 3 also shows that the octanol/water partition coefficients obtained from the Flory–Huggins model are far from the experimental data.
Table 3.
The calculated and experimental octanol/water partition coefficient for some pharmaceutical compounds.
| Substance | log KOW,COSMO-SAC | log KOW,Flory–Huggins | log KOW,exp |
|---|---|---|---|
| Aceclofenac | 1.57 | − 1.31 | 2.17 |
| Acetaminophen | 0.02 | −1.56 | 0.46 |
| Atropine | 0.65 | − 1.74 | 1.83 |
| Camphor | 1.16 | − 1.4 | 2.38 |
| Cefixime | − 2.22 | − 1.04 | − 0.40 |
| Celecoxib | 2.83 | − 2.48 | 3.53 |
| Dapsone | 0.33 | − 1.29 | 0.97 |
| Deferiprone | − 0.94 | − 1.14 | − 0.77 |
| Flurbiprofen | 1.34 | − 1.73 | 4.16 |
| Hydroquinone | 0.57 | − 3.4 | 0.59 |
| Isoniazid | − 0.98 | − 3.22 | − 0.70 |
| Lamotrigine | 0.82 | − 2.13 | 2.57 |
| Meclofenamic acid | 2.83 | − 2.17 | 5.00 |
| Pindolol | 1.43 | − 3.70 | 1.75 |
| p-Nitrobenzamide | 0.07 | − 0.89 | 0.82 |
| Sulfamethazine | 0.99 | − 1.69 | 0.89 |
| Borneol | 1.78 | − 1.3 | 3.24 |
| Carvedilol | 2.66 | − 0.83 | 4.19 |
| Ibuprofen | 2.03 | − 1.91 | 3.97 |
| Isoborneol | 2.35 | − 3.00 | 3.24 |
| Sulfacetamide | − 0.04 | − 1.68 | − 0.96 |
| Trifloxystrobin | 3.86 | − 3.4 | 4.50 |
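The RMSE values quoted for the two models can be cross-checked directly from Table 3. The Python sketch below copies the three numeric columns of the table and applies the RMSE definition from the "Statistical analysis" section; the variable names are illustrative, and the result should come out close to the quoted 1.25 and 4.45.

```python
import math

# (COSMO-SAC, Flory-Huggins, experimental) log K_OW values copied from Table 3.
TABLE_3 = [
    (1.57, -1.31, 2.17), (0.02, -1.56, 0.46), (0.65, -1.74, 1.83),
    (1.16, -1.40, 2.38), (-2.22, -1.04, -0.40), (2.83, -2.48, 3.53),
    (0.33, -1.29, 0.97), (-0.94, -1.14, -0.77), (1.34, -1.73, 4.16),
    (0.57, -3.40, 0.59), (-0.98, -3.22, -0.70), (0.82, -2.13, 2.57),
    (2.83, -2.17, 5.00), (1.43, -3.70, 1.75), (0.07, -0.89, 0.82),
    (0.99, -1.69, 0.89), (1.78, -1.30, 3.24), (2.66, -0.83, 4.19),
    (2.03, -1.91, 3.97), (2.35, -3.00, 3.24), (-0.04, -1.68, -0.96),
    (3.86, -3.40, 4.50),
]


def table_rmse(column):
    """RMSE of one model column (0 = COSMO-SAC, 1 = Flory-Huggins) against experiment."""
    squared = [(row[column] - row[2]) ** 2 for row in TABLE_3]
    return math.sqrt(sum(squared) / len(TABLE_3))


rmse_cosmo_sac = table_rmse(0)
rmse_flory_huggins = table_rmse(1)
```

Almost all COSMO-SAC residuals in the table are negative, so the model tends to underestimate log KOW for this set, while the Flory–Huggins values are negative for every compound regardless of the experimental sign.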
To investigate a more complex system, the ternary phase diagram was explored for sulfamethazine (SM)/salicylic acid (SA) cocrystal formation in methanol (ME) at 283.15 K, which was studied by Ahuja et al.35. Details of the calculation and methods are described in the “Cocrystal formation” section. After performing the computation with COSMO-SAC (2002), a triangular diagram of the system was plotted with the free software ProSim Ternary Diagram. Comparing Fig. 6 with the experimental plots of Ahuja et al.35 reveals some differences between the experiments and the COSMO-SAC calculations. The cocrystal region for SM/SA predicted by COSMO-SAC is wider, whereas the experimental data indicate a narrow region. The solubility line of SM in the SA + ME mixture is more extended in the COSMO-SAC model than in the experiments, which reflects the limited ability of COSMO-SAC for this system. The predicted solubility line of SA in the SM + ME mixture is closer to the reported experimental data, which indicates good performance of COSMO-SAC for SA. The inconsistencies originate from the molecular structure, the constituent atoms, and their interactions. The electronegative atoms S and N in sulfamethazine give rise to the observed discrepancies, since their contributions are not considered in the COSMO-SAC (2002) model.
The ternary phase diagram of carbamazepine (CBZ)/acetylsalicylic acid (ASA) in ethanol (ET) at 298.15 K was computed with COSMO-SAC (2002) and is plotted in Fig. 7. Veith et al.29 studied the CBZ/ASA/ET system with the PC-SAFT EOS. According to Veith et al.29, the PC-SAFT EOS without binary interaction parameters predicted a narrow cocrystal region and low solubilities, whereas COSMO-SAC (2002) predicts higher solubilities and a wider cocrystal region. Comparison of the COSMO-SAC (2002) calculations with the PC-SAFT EOS results that include binary interaction parameters and with the experimental data of Veith et al.29 shows reasonable agreement between COSMO-SAC (2002) and the reported data.
Figure 6.
Ternary phase diagram of the system sulfamethazine (SM)/salicylic acid (SA)/methanol (ME) in mass fraction obtained by the COSMO-SAC (2002) model at 283.15 K. The solid lines represent solubility lines and the highlighted area shows the cocrystal region.
Figure 7.
Ternary phase diagram of CBZ/ASA/ET in mole fraction at 298.15 K. Solid lines represent solubility lines by the COSMO-SAC (2002). The highlighted region shows cocrystal formation by the COSMO-SAC (2010).
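Because Fig. 6 is drawn in mass fraction and Fig. 7 in mole fraction, comparing the two diagrams, or replotting model output against experimental data, requires a change of composition basis. The following is a minimal sketch of that conversion and is not part of the original calculations. The function names are hypothetical, and the molar masses are standard values computed from the molecular formulas rather than values taken from this study.

```python
# Composition-basis conversion for a ternary mixture (illustrative sketch).
# Molar masses in g/mol are standard values, NOT taken from the paper.
MOLAR_MASS = {
    "SM": 278.33,  # sulfamethazine, C12H14N4O2S
    "SA": 138.12,  # salicylic acid, C7H6O3
    "ME": 32.04,   # methanol, CH4O
}


def _check_normalized(fractions, tol=1e-8):
    """Raise if the fractions are negative or do not sum to one."""
    if any(value < 0.0 for value in fractions.values()):
        raise ValueError("fractions must be non-negative")
    if abs(sum(fractions.values()) - 1.0) > tol:
        raise ValueError("fractions must sum to 1")


def mole_to_mass_fractions(x, molar_mass=MOLAR_MASS):
    """Convert mole fractions {name: x_i} to mass fractions {name: w_i}."""
    _check_normalized(x)
    # w_i = x_i * M_i / sum_j (x_j * M_j)
    weighted = {name: x[name] * molar_mass[name] for name in x}
    total = sum(weighted.values())
    return {name: value / total for name, value in weighted.items()}


def mass_to_mole_fractions(w, molar_mass=MOLAR_MASS):
    """Convert mass fractions {name: w_i} to mole fractions {name: x_i}."""
    _check_normalized(w)
    # x_i = (w_i / M_i) / sum_j (w_j / M_j)
    moles = {name: w[name] / molar_mass[name] for name in w}
    total = sum(moles.values())
    return {name: value / total for name, value in moles.items()}


if __name__ == "__main__":
    # Hypothetical composition, chosen only to exercise the functions.
    x = {"SM": 0.05, "SA": 0.05, "ME": 0.90}
    w = mole_to_mass_fractions(x)
    print({name: round(value, 4) for name, value in w.items()})
```

A point computed in mole fraction can be passed through `mole_to_mass_fractions` before being placed on a mass-fraction triangle, and `mass_to_mole_fractions` performs the reverse step for data reported in mass fraction.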
Conclusions
The COSMO-SAC, as a predictive model, has gained great attention in thermodynamic modeling and phase equilibrium studies. Eight universal parameters and predefined atomic radii for C, H, O, S, N, F, and Cl form the general basis of the COSMO-SAC model. In the current study, the COSMO-SAC model was applied to solid–liquid phase equilibria in the form of solubility data in binary and ternary systems, octanol/water partition coefficients, and cocrystal studies. For further comparison, the COSMO-SAC model was also compared with the Flory–Huggins model. The results implied that molecular structure, constituent atoms, functional groups, and their interactions have a remarkable impact on the predictions. In general, simple molecules made of H, C, and O atoms and, under special conditions, N atoms, with simple covalent and hydrogen bonding interactions, can be treated by the COSMO-SAC model. The presence of other atoms such as F and S and of functional groups such as COO and COOH makes the systems more complex. This complexity provides opportunities to modify the original COSMO-SAC model.
Acknowledgements
The authors gratefully acknowledge financial support (grant number: 98017343) from the Iran National Science Foundation (INSF).
Author contributions
S.Z.M.: Conceptualization, Methodology, Software, Writing. G.P.: Writing, Methodology, Supervision.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information is available for this paper at 10.1038/s41598-020-76986-3.
References
- 1. Jakob A, Grensemann H, Lohmann J, Gmehling J. Further development of modified UNIFAC (Dortmund): revision and extension 5. Ind. Eng. Chem. Res. 2006;45(23):7924–7933. doi: 10.1021/ie060355c.
- 2. Chen C-C, Song Y. Solubility modeling with a nonrandom two-liquid segment activity coefficient model. Ind. Eng. Chem. Res. 2004;43(26):8354–8362. doi: 10.1021/ie049463u.
- 3. Klamt A. Conductor-like screening model for real solvents: a new approach to the quantitative calculation of solvation phenomena. J. Phys. Chem. 1995;99(7):2224–2235. doi: 10.1021/j100007a062.
- 4. Klamt A, Jonas V, Bürger T, Lohrenz JC. Refinement and parametrization of COSMO-RS. J. Phys. Chem. A. 1998;102(26):5074–5085. doi: 10.1021/jp980017s.
- 5. Klamt A, Eckert F. COSMO-RS: a novel and efficient method for the a priori prediction of thermophysical data of liquids. Fluid Phase Equilib. 2000;172(1):43–72. doi: 10.1016/S0378-3812(00)00357-5.
- 6. Lin S-T, Sandler SI. A priori phase equilibrium prediction from a segment contribution solvation model. Ind. Eng. Chem. Res. 2002;41(5):899–913. doi: 10.1021/ie001047w.
- 7. Mullins E, Oldland R, Liu Y, Wang S, Sandler SI, Chen C-C, Zwolak M, Seavey KC. Sigma-profile database for using COSMO-based thermodynamic methods. Ind. Eng. Chem. Res. 2006;45(12):4389–4415. doi: 10.1021/ie060370h.
- 8. Tung HH, Tabora J, Variankaval N, Bakken D, Chen CC. Prediction of pharmaceutical solubility via NRTL-SAC and COSMO-SAC. J. Pharm. Sci. 2008;97(5):1813–1820. doi: 10.1002/jps.21032.
- 9. Zhou Y, Xu D, Zhang L, Ma Y, Ma X, Gao J, Wang Y. Separation of thioglycolic acid from its aqueous solution by ionic liquids: ionic liquids selection by the COSMO-SAC model and liquid-liquid phase equilibrium. J. Chem. Thermodyn. 2018;118:263–273. doi: 10.1016/j.jct.2017.12.007.
- 10. Paese LT, Spengler RL, Soares RDP, Staudt PB. Predicting phase equilibrium of aqueous sugar solutions and industrial juices using COSMO-SAC. J. Food Eng. 2020. doi: 10.1016/j.jfoodeng.2019.109836.
- 11. Xavier VB, Staudt PB, de Soares RP. Predicting VLE and odor intensity of mixtures containing fragrances with COSMO-SAC. Ind. Eng. Chem. Res. 2020;59(5):2145–2154. doi: 10.1021/acs.iecr.9b05474.
- 12. Bouillot B, Teychené S, Biscans B. An evaluation of COSMO-SAC model and its evolutions for the prediction of drug-like molecule solubility: part 1. Ind. Eng. Chem. Res. 2013;52(26):9276–9284. doi: 10.1021/ie3015318.
- 13. Shu C-C, Lin S-T. Prediction of drug solubility in mixed solvent systems using the COSMO-SAC activity coefficient model. Ind. Eng. Chem. Res. 2011;50(1):142–147. doi: 10.1021/ie100409y.
- 14. Buggert M, Cadena C, Mokrushina L, Smirnova I, Maginn EJ, Arlt W. COSMO-RS calculations of partition coefficients: different tools for conformation search. Chem. Eng. Technol. Ind. Chem.-Plant Equip.-Process Eng.-Biotechnol. 2009;32(6):977–986.
- 15. Hsieh C-M, Wang S, Lin S-T, Sandler SI. A predictive model for the solubility and octanol–water partition coefficient of pharmaceuticals. J. Chem. Eng. Data. 2011;56(4):936–945. doi: 10.1021/je1008872.
- 16. Mullins E, Liu Y, Ghaderi A, Fast SD. Sigma profile database for predicting solid solubility in pure and mixed solvent mixtures for organic pharmacological compounds with COSMO-based thermodynamic methods. Ind. Eng. Chem. Res. 2008;47(5):1707–1725. doi: 10.1021/ie0711022.
- 17. Bell IH, Mickoleit E, Hsieh C-M, Lin S-T, Vrabec J, Breitkopf C, Jäger A. A benchmark open-source implementation of COSMO-SAC. J. Chem. Theory Comput. 2020;16(4):2635–2646. doi: 10.1021/acs.jctc.9b01016.
- 18. Ferrarini F, Flôres G, Muniz A, de Soares R. An open and extensible sigma-profile database for COSMO-based models. AIChE J. 2018;64(9):3443–3455. doi: 10.1002/aic.16194.
- 19. Mu T, Rarey J, Gmehling J. Performance of COSMO-RS with sigma profiles from different model chemistries. Ind. Eng. Chem. Res. 2007;46(20):6612–6629. doi: 10.1021/ie0702126.
- 20. Lee M-T, Lin S-T. Prediction of mixture vapor–liquid equilibrium from the combined use of Peng–Robinson equation of state and COSMO-SAC activity coefficient model through the Wong-Sandler mixing rule. Fluid Phase Equilib. 2007;254(1–2):28–34. doi: 10.1016/j.fluid.2007.02.012.
- 21. Lin S-T, Chang J, Wang S, Goddard WA, Sandler SI. Prediction of vapor pressures and enthalpies of vaporization using a COSMO solvation model. J. Phys. Chem. A. 2004;108(36):7429–7439. doi: 10.1021/jp048813n.
- 22. Hsieh C-M, Sandler SI, Lin S-T. Improvements of COSMO-SAC for vapor–liquid and liquid–liquid equilibrium predictions. Fluid Phase Equilib. 2010;297(1):90–97. doi: 10.1016/j.fluid.2010.06.011.
- 23. Paulechka E, Diky V, Kazakov A, Kroenlein K, Frenkel M. Reparameterization of COSMO-SAC for phase equilibrium properties based on critically evaluated data. J. Chem. Eng. Data. 2015;60(12):3554–3561. doi: 10.1021/acs.jced.5b00483.
- 24. Islam MR, Chen C-C. COSMO-SAC sigma profile generation with conceptual segment concept. Ind. Eng. Chem. Res. 2015;54(16):4441–4454. doi: 10.1021/ie503829b.
- 25. Lindvig T, Michelsen ML, Kontogeorgis GM. A Flory-Huggins model based on the Hansen solubility parameters. Fluid Phase Equilib. 2002;203(1–2):247–260. doi: 10.1016/S0378-3812(02)00184-X.
- 26. Staverman A. The entropy of high polymer solutions. Generalization of formulae. Recl. Trav. Chim. Pays-Bas. 1950;69(2):163–174. doi: 10.1002/recl.19500690203.
- 27. Kurada KV, De S. Modeling of solution thermodynamics: A method for tuning the properties of blend polymeric membranes. J. Membr. Sci. 2017;540:485–495. doi: 10.1016/j.memsci.2017.06.049.
- 28. Barton AF. Handbook of Polymer–Liquid Interaction Parameters and Solubility Parameters. New York: CRC Press; 1990.
- 29. Veith H, Schleinitz M, Schauerte C, Sadowski G. Thermodynamic approach for co-crystal screening. Cryst. Growth Des. 2019;19(6):3253–3264. doi: 10.1021/acs.cgd.9b00103.
- 30. Ainouz A, Authelin JR, Billot P, Lieberman H. Modeling and prediction of cocrystal phase diagrams. Int. J. Pharm. 2009;374(1–2):82–89. doi: 10.1016/j.ijpharm.2009.03.016.
- 31. Fingerhut R, Chen W-L, Schedemann A, Cordes W, Rarey J, Hsieh C-M, Vrabec J, Lin S-T. Comprehensive assessment of COSMO-SAC models for predictions of fluid-phase equilibria. Ind. Eng. Chem. Res. 2017;56(35):9868–9884. doi: 10.1021/acs.iecr.7b01360.
- 32. Jiménez JA, Martínez F. Thermodynamic magnitudes of mixing and solvation of acetaminophen in ethanol+ water cosolvent mixtures. Rev Acad Colomb Cienc. 2006;30(114):87–99.
- 33. Matsuda H, Kaburagi K, Matsumoto S, Kurihara K, Tochigi K, Tomono K. Solubilities of salicylic acid in pure solvents and binary mixtures containing cosolvent. J. Chem. Eng. Data. 2009;54(2):480–484. doi: 10.1021/je800475d.
- 34. National Library of Medicine, National Center for Biotechnology Information. Accessed 15 July 2020. https://pubchem.ncbi.nlm.nih.gov/.
- 35. Ahuja D, Svärd M, Rasmuson ÅC. Investigation of solid–liquid phase diagrams of the sulfamethazine–salicylic acid co-crystal. CrystEngComm. 2019;21(18):2863–2874. doi: 10.1039/C9CE00124G.