Abstract
To quantify Pericarpium Citri Reticulatae samples accurately, the trilinear structure of stacked fingerprints was restored to allow more robust modeling. First, liquid chromatography-diode array detector-mass spectrometry (LC-DAD-MS) and headspace solid-phase microextraction coupled to gas chromatography-mass spectrometry (HS-SPME/GC-MS) were used to analyze Pericarpium Citri Reticulatae. To correct the time shifts observed in two-dimensional (2D) matrices across samples, three algorithms were developed to synchronize them. Bilinear and trilinear models, which rest on different principles, were then used for quantification. Real cases based on LC-DAD data clarified the advantages and disadvantages of trilinear decomposition relative to multivariate curve resolution-alternating least-squares (MCR-ALS) in the quantification of raw or synchronized fingerprints. In the data processing, a modified version of multi-scale peak alignment (mMSPA) proved more suitable for restoring trilinearity than the other two algorithms. On this basis, trilinearity restoration was applied to obtain more robust models of Pericarpium Citri Reticulatae fingerprints from different storage periods. The upward or downward trends of 13 flavonoids were determined, and the flavor components with the highest contribution rates during storage were identified. In conclusion, more robust modeling can be achieved with trilinear data synchronized by an appropriate algorithm, leading to accurate quantification in herbal quality research.
Keywords: Trilinearity restoring, Advanced modeling, Fingerprints, Pericarpium Citri Reticulatae, Storage
1. Introduction
Storage is an important factor affecting the quality of functional foods and herbs [1, 2], such as Pericarpium Citri Reticulatae (Chenpi in Mandarin) [3, 4]. Significant differences may arise in the fingerprints of Pericarpium Citri Reticulatae stored for different periods [5, 6]. A nondestructive approach based on a near-infrared device has been developed for the rapid classification of Citri Reticulatae Pericarpium of different ages [7]. Other analytical instruments are nevertheless needed to investigate the underlying molecular variations. At present, second-order analytical instruments are widely used in the qualitative and quantitative analysis of different species of Pericarpium Citri Reticulatae, such as gas chromatography (GC) combined with mass spectrometry (MS) [8], and liquid chromatography (LC) coupled to a diode array detector (DAD) or MS [9]. However, multiple interferences are usually observed in the second-order tensor (matrix) of each sample. Multivariate curve resolution and multi-way calibration methodologies [10] should therefore be developed to achieve green and effective analysis in the herbal field. A core question in applying these methods is how to choose the parameters that make herbal quality control most efficient.
At present, there are two main solutions for the qualitative and quantitative analysis of complex second-order data. Bilinear modeling of an augmented data matrix, such as multivariate curve resolution-alternating least-squares (MCR-ALS) [11], has been used extensively for mixture analysis. Compared with trilinear modeling, it has the advantage of allowing elution-time profiles to vary across samples, which is especially relevant for chromatographic data. To reduce the number of possible solutions, different kinds of constraints have been proposed for curve resolution of chromatographic data sets, including non-negativity, unimodality, trilinearity and local rank constraints. The trilinearity constraint is considered the best choice for limiting rotation ambiguities, the main source of uncertainty. With such a constraint, the concentration and/or spectral profiles of all species can be recovered correctly more easily by the MCR-ALS algorithm. Trilinear modeling [12] of a three-way data array is an alternative that is often used to resolve true physical factors and obtain a unique solution. These methods include parallel factor analysis (PARAFAC) [3] and alternating trilinear decomposition (ATLD) [13]. Both MCR modeling with a trilinearity constraint and trilinear modeling therefore give good results but place strict requirements on the data structure.
In practice, various factors may intervene during herbal analysis, leading to subtle or serious differences among a series of instrument outputs. Time shifts are usually observed in two-dimensional (2D) matrices across different samples, and they make the stacked data array non-trilinear. In response, MCR-ALS [14] and PARAFAC2 [15] have been used for non-trilinear second-order chromatographic data with non-reproducible elution-time profiles, and ATLD [16] can tolerate very slight time drifts. Another way to solve this problem is to synchronize the time-shifted fingerprints [17, 18]. At present, the MCR-COW-MCR strategy does not solve the rotational ambiguity issue, and it artificially modifies the raw data without significant improvement in analytical performance [19, 20]. Mazivila et al. performed three-way calibration using PARAFAC and MCR-ALS after synchronizing second-order chromatographic data through functional alignment of pure vectors (FAPV) [21]. Many algorithms can perform synchronization, such as a modified version of multi-scale peak alignment (mMSPA) [22], looping of correlation optimized warping (LCOW) [23] and looping of icoshift (Licoshift) [24]. They deserve detailed discussion so that the most suitable approach can be identified and herbal analysis can be performed more accurately.
In this paper, three algorithms were first developed for restoring the trilinearity of different second-order data. As shown in Figure 1, MCR-ALS (A), PARAFAC2 (B), and ATLD and MCR-ALS with a trilinearity constraint (C) were used for the quantitative analysis of raw or synchronized fingerprints, respectively. More robust modeling was then applied to the quantitative analysis of real fingerprints of Pericarpium Citri Reticulatae from different storage periods.
2. Experimental
2.1. Materials and chemical reagents
The pericarps of Citrus reticulata 'Chachi' (Pericarpium Citri Reticulatae in this paper) were collected at a farm in Xinhui district (N 22°5′15″-22°35′01″, E 112°46′55″-113°15′43″), Guangdong province. Activated carbon/polydimethylsiloxane/divinylbenzene fiber (40/60 μm, CAR/PDMS/DVB) was purchased from Zhengzheng Analytical Instrument Co., Ltd. (Qingdao, China). HPLC-grade n-hexane was purchased from Merck (Germany). C8–C20 and the D-limonene standard were purchased from Sigma-Aldrich (St. Louis, Missouri, USA). Standards of Hesperidin (97%), Nobiletin (98%) and Tangeretin (98%) were purchased from Chengdu Ruifensi Biotechnology Co., Ltd. (Chengdu, China). Acetonitrile, methanol and formic acid were of chromatographic grade, and other reagents were of analytical grade.
2.2. LC-DAD-MS determination for flavonoids in Pericarpium Citri Reticulatae
2.2.1. Preparation of samples
Ten pericarp samples aged 3 months, ten aged 39 months and ten aged 63 months were prepared. First, 0.3 g of pericarp was weighed and mixed with 1 mL of benzoic acid (1 mg/mL, internal standard, IS) and 10 mL of methanol. After ultrasonic extraction for 30 min, the supernatant was filtered through a 0.22 μm membrane, diluted twice and subjected to LC analysis. For the reference solutions, the final concentrations of Hesperidin, Nobiletin and Tangeretin were adjusted to appropriate ranges.
2.2.2. LC-DAD-MS determination
An RP-C18 column (4.6 mm × 250 mm, 5 μm) was used to separate flavonoids on an LC-DAD-MS 8045 system (Shimadzu, Japan). The mobile phase consisted of A (0.1% formic acid in acetonitrile) and B (0.1% formic acid in water). The gradient program was as follows: 0–12 min, 83% B to 73% B; 12–24 min, 73% B to 50% B; 24–29 min, 50% B to 30% B; 29–43 min, 30% B; 43–48 min, 30% B to 83% B. The flow rate was 0.4 mL/min and the injection volume was 5 μL. DAD signals were acquired for each sample. MS data were collected in positive ion mode with a nebulizing gas flow of 3 L/min, a heating gas flow of 10 L/min, an interface temperature of 300 °C, a DL temperature of 250 °C, a heating block temperature of 400 °C and a drying gas (N2) flow of 10 L/min. The mass range was set to m/z 100–1000, and the collision energies were set to 15 V, 25 V and 35 V. Samples were analyzed by LC-DAD-MS in a randomized order of sample numbers.
2.3. HS-SPME/GC-MS detection for flavors in Pericarpium Citri Reticulatae
2.3.1. Head space-solid phase micro extraction (HS-SPME)
Thirty pericarp samples (storage periods of 3, 39 and 63 months) were pretreated as follows: 0.1 g of Chenpi was placed in a 20 mL vial, and the fiber was then exposed at 50 °C for 40 min.
2.3.2. GC-MS analysis
An HP-5MS column (30 m × 0.25 mm i.d., 0.25 μm film thickness, Agilent) was installed in a Shimadzu GC-MS 2010 system (Kyoto, Japan). The HS-SPME fiber was desorbed in the injection port for 5 min at a split ratio of 1:50. The temperature program was as follows: 40 °C held for 6 min, raised to 90 °C at 4 °C/min, then to 140 °C at 10 °C/min, and finally to 280 °C at 25 °C/min and held for 1 min. The scan range was 20–500 amu, the ionization energy was 70 eV, and the scan interval was 0.50 s. The interface, injector and ion source temperatures were maintained at 260 °C, 250 °C and 200 °C, respectively.
2.4. Data processing and statistical analysis
The full spectral data were collected from the LC-DAD or GC-MS instrument at a fixed sampling frequency and visualized in MZmine 2. The raw data were converted into mat files in which each sample is recorded as a matrix (retention time versus response values). The mMSPA [22], LCOW [23], Licoshift [24], MCR-ALS GUI 2.0 [11, 14] and ATLD [12] algorithms were run in the MATLAB environment, and their theory is described in detail in the Supporting Material. Finally, libPLS was used for multivariate statistical analysis of the data sets.
3. Results and discussion
3.1. Building trilinear-structure for LC-DAD matrices with slight misalignment
3.1.1. LC-DAD-MS/MS determination for Pericarpium Citri Reticulatae samples
Pericarpium Citri Reticulatae is regarded as regulating qi and strengthening the spleen ('Liqi-Jianpi' in Mandarin) and as drying dampness and resolving phlegm ('Zaoshi-Huatan' in Mandarin) after a storage period of 3 years or more. In particular, the aged peel of Citrus reticulata 'Chachi' is deemed a 'Daodi' material with stronger activities. In this research, the peels of Citrus reticulata 'Chachi' were collected as the Pericarpium Citri Reticulatae samples. Building on previous research on Citri varieties [25, 26], LC-DAD-MS/MS analysis was carried out for the known flavone molecules with pharmacological activities. Good separation of Hesperidin, Nobiletin, Tangeretin and unknown peaks was observed in the test chromatogram. The ultraviolet spectrum alone is not sufficient to distinguish the specific structures of unassigned peaks, so the mass spectrum, whose set of mass peaks is the fingerprint of a small molecule, must also be used. The procedure was as follows: 1) an in-house database was built from the chemical information of all known flavone molecules in citrus species; 2) adducts of molecular ions were searched against the hundreds of structures in this database; 3) experimental MS/MS fragments were used to deduce and confirm the target molecules. The peak annotations of ten flavonoids are listed in Tab S1 (Supplemental Material).
3.1.2. Quantitative comparison of bilinear and trilinear modeling
In this section, each analyte was quantified using models based on bilinear or trilinear decomposition, and different cases were used to compare the characteristics of the two approaches. In the first example, two concentrations of Hesperidin (0.34 mg/mL and 0.19 mg/mL) were used to produce 26 simulated matrices with large artificial time drifts. An augmented matrix of the data recorded over time, Daug (IK×J), and a three-way data array X (I×J×K) were then built, as shown in Fig. S1 (A) and (B) (Supplemental Material), respectively. Here I, J and K represent the number of elution points, the number of spectral channels and the number of samples. As Fig. S1 (C) shows, the MCR-ALS algorithm quantified the augmented matrix well. That is, bilinear modeling can extract the information of the pure analytes by decomposing the data matrix Daug into the product of an elution-curve matrix Caug and a spectral-profile matrix ST (Supplemental Material). ATLD is known to be effective for second-order/three-way chromatographic data with very slight time shifts. In this process, the three-way data array X (I×J×K) is decomposed into elution-curve, spectral-profile and concentration matrices (A, B and C) (Supplemental Material). However, it offers no solution for stacked chromatographic data with large time drifts. As observed in Fig. S1 (C), a large deviation appeared when the raw data, whose trilinearity was broken, were modeled. To remedy this, the 26 matrices were synchronized and stacked into a trilinear structure. As observed in Fig. S1 (C), the ATLD predictions then stabilized at 102.0% and 101.5% of the real values for the two kinds of synchronized matrices. In other words, only a data array with strict trilinearity can be decomposed accurately by a trilinear model.
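The contrast between the two data arrangements can be illustrated with a minimal NumPy sketch. It is not the simulation used in this work: the Gaussian elution profile, the four-channel spectrum, the six samples and the shift values are all hypothetical. For a single analyte, the unfolded array I×(JK) has rank one only when every sample shares the same elution profile, whereas the column-wise augmented matrix (IK×J) keeps rank one even when the peak drifts.

```python
import numpy as np

def gaussian_peak(t, center, width=0.05):
    """Hypothetical elution profile: a Gaussian peak on a zero baseline."""
    return np.exp(-0.5 * ((t - center) / width) ** 2)

t = np.linspace(0.0, 1.0, 101)               # I = 101 elution points
spectrum = np.array([1.0, 0.7, 0.4, 0.2])    # J = 4 spectral channels (assumed)
conc = np.linspace(0.19, 0.34, 6)            # K = 6 samples (assumed levels)

def build_cube(shifts):
    """Stack K sample matrices (I x J) into an I x J x K array."""
    slices = [c * np.outer(gaussian_peak(t, 0.5 + s), spectrum)
              for c, s in zip(conc, shifts)]
    return np.stack(slices, axis=2)

def unfolded_rank(cube):
    """Rank of the I x (J*K) unfolding; 1 for one trilinear component."""
    n_time = cube.shape[0]
    return np.linalg.matrix_rank(cube.reshape(n_time, -1), tol=1e-8)

def augmented_rank(cube):
    """Rank of the column-wise augmented matrix (I*K) x J."""
    aug = np.vstack([cube[:, :, k] for k in range(cube.shape[2])])
    return np.linalg.matrix_rank(aug, tol=1e-8)

aligned = build_cube(np.zeros(6))
shifted = build_cube(np.linspace(-0.1, 0.1, 6))

rank_aligned = unfolded_rank(aligned)      # one shared elution profile
rank_shifted = unfolded_rank(shifted)      # several distinct profiles
rank_aug_shifted = augmented_rank(shifted) # spectrum is still shared
```

In this toy case the drift inflates the rank of the unfolded array, which is the "trilinearity breaking" that a trilinear model cannot absorb, while the augmented matrix remains bilinear and is therefore still suitable for MCR-ALS.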
Is there any difference between bilinear decomposition of raw matrices and trilinear modeling of synchronized matrices for real LC-DAD data? As shown in Figure 2 (A), the 30 matrices of Nobiletin (37.4–38.4 min), which show slight drifts, were taken as an example. After mMSPA treatment, the synchronized matrices were reassembled into a trilinear structure. Because each matrix is moved as a whole, trilinearity is judged mainly from the newly assembled side of the cube. Here, the alignment of the λ profiles from the 30 samples (the ith channel in the cube) was used to evaluate the synchronization. Figure 2 (B) shows a slight misalignment in a raw slice and a precise alignment in a synchronized slice. To illustrate the restoration of trilinearity further, MCR models with and without the trilinearity constraint were compared. In this step, the data structure was first transformed into an augmented data matrix for decomposition. As observed in Figure 2 (C), MCR modeling with the trilinearity constraint (blue circles) is satisfactory for the synchronized structure and gives results similar to bilinear modeling (grey squares) of the raw data. That is, the mMSPA algorithm does not harm the peak shape (area), and it is reliable for restoring the trilinear structure. Figure 2 (C) also shows that a biased prediction (yellow triangles) inevitably appears when a trilinear model is applied to data whose trilinearity is broken. This explains why an appropriate model must be chosen to obtain satisfactory results: a trilinear structure must be constructed before trilinear modeling or MCR modeling with the trilinearity constraint is applied. The number of components should also be set reasonably in the MCR model. As observed in Figure 2 (D), the errors increase significantly with unreasonable n values (n > 4 or n = 1).
3.1.3. The necessity of building trilinear structure for LC-DAD data processing
PARAFAC2 and MCR modeling are the most widely used approaches for non-multilinear data. Why, then, develop models that require more constraints on synchronized data? The main reason is that flexible modeling has its own shortcomings. For example, bilinear modeling usually requires appropriate initial values and is limited by rotation ambiguities. Taking the segment of 22.9–24.1 min as an example, the elution profiles are not constant across samples. To obtain a more effective analysis, different models of the three components (1, 2 and C3) are described in detail. First, a PARAFAC2 model was built for the raw data array (thirty stacked matrices), but the result in Figure 3 (A) was disappointing. Second, a bilinear decomposition based on MCR modeling was applied to the column-wise augmented data matrix (thirty samples), and an imperfect decomposition can be seen in the red circle in Figure 3 (B). This peak cluster can therefore only be resolved well by the MCR model with the trilinearity constraint or by the ATLD model. Worse still, a trilinear deviation ratio (TDR) of 0.2 suggested that the time shifts may destroy the trilinear structure required by these models, so that trilinear decomposition of the raw data would not provide accurate, chemically meaningful quantitative curves. Therefore, mMSPA was used to synchronize the multi-channel signals from the 30 samples. The 10th sample was set as the reference, and the others were treated as samples to be aligned. The curve at 270 nm was selected as the representative profile, and an SNR of 20 was set to ensure that peak 2 was detected by the Haar CWT. After synchronization, the peak shape retained its original state to the greatest possible extent. After data rearrangement, the column-wise augmented data matrix was analyzed by MCR with the trilinearity constraint, as shown in Figure 3 (C). As expected, the quantitative and qualitative information was calculated accurately even in the presence of overlapped peaks (1 and 2). After stacking, the data array was further analyzed by trilinear modeling. As observed in Figure 3 (D), the ATLD algorithm decomposed it into a normalized chromatogram matrix, a normalized spectra matrix and a relative concentration matrix. Notably, baseline effects in the experimental data can be modeled and removed easily, so no baseline correction or background subtraction of the raw data is required. More importantly, the relative concentrations of peak 1, peak 2 and peak C3 were calculated directly for the 30 samples. The first ten samples are dried pericarps (3 months), and the last twenty samples are Pericarpium Citri Reticulatae (39 and 63 months). Compounds 1 (Diosmin) and 2 (Hesperidin) show a downward trend, whereas C3 (unknown compound) shows an upward trend from pericarps to Pericarpium Citri Reticulatae.
3.1.4. Decomposition on LC-DAD data of Pericarpium Citri Reticulatae
Trilinear models allow simultaneous qualitative and quantitative analysis. The LC-DAD fingerprints were first normalized to the area of the internal standard (IS) peak. The data of the different samples were then stacked into a data array and divided into many sub-arrays along the retention time mode. Each sub-array contains fewer than 5 analytes of interest. After mMSPA synchronization, trilinear decomposition was carried out successfully for the target sub-array. In terms of method validation, the three standard curves (trilinear prediction versus real concentration) showed good linearity for Hesperidin, Nobiletin and Tangeretin, and the within-day variances, between-day variances and recovery tests met the requirements. The differences in composition between dried pericarps (3 months) and Pericarpium Citri Reticulatae (39 and 63 months) are displayed for 13 flavonoids in Figure 4. In summary, the contents of Diosmin, Hesperidin and Poncirin decreased significantly after 3–5 years. These flavonoid glycosides may be converted into other compounds during storage, resulting in a decreasing trend. In contrast, the contents of 4′,5,7,8-tetramethoxyflavone and Tangeretin did not change significantly after 3 years of storage but increased slowly after 5 years of storage. In the literature, Tangeretin is regarded as an important active ingredient of Pericarpium Citri Reticulatae. In addition, the contents of Isosinensetin, 5,6,7,4′-tetramethoxyflavone and Nobiletin did not change significantly after 3–5 years of storage. That is, the contents of polymethoxyflavones change only slightly during storage.
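The preprocessing steps described above (IS normalization, stacking, and division along the retention time mode) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the processing script used in this work: the function names, the IS elution window, the reference channel and the number of blocks are hypothetical, and the IS area is approximated by a simple sum over the window.

```python
import numpy as np

def is_area(matrix, is_window, channel):
    """Approximate IS peak area: sum over an assumed elution window
    (row indices) on one assumed wavelength channel."""
    start, stop = is_window
    return matrix[start:stop, channel].sum()

def stack_and_split(matrices, is_window, channel, n_blocks):
    """Normalize each sample matrix (time x wavelength) to its IS area,
    stack the samples into a time x wavelength x sample array, and
    divide the array into sub-arrays along the retention time mode."""
    normalized = [m / is_area(m, is_window, channel) for m in matrices]
    cube = np.stack(normalized, axis=2)
    return np.array_split(cube, n_blocks, axis=0)

# Synthetic stand-in data: 5 samples, 120 time points, 8 wavelengths.
rng = np.random.default_rng(0)
matrices = [rng.random((120, 8)) + 0.1 for _ in range(5)]
sub_arrays = stack_and_split(matrices, is_window=(10, 20), channel=3, n_blocks=4)
```

In practice the block boundaries would be chosen between peak clusters rather than at equal intervals, so that each sub-array holds only a few analytes and can be synchronized and decomposed on its own.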
3.2. Building trilinear-structure for GC-MS matrices with serious misalignment
3.2.1. HS-SPME/GC-MS test for Pericarpium Citri Reticulatae samples
Electronic nose responses of 18 sensors have been used to distinguish Pericarpium Citri Reticulatae species [27]. The delicate herbal fragrances produced in Pericarpium Citri Reticulatae stored for 3–5 years should also be investigated in detail.
The perceived flavor results from the volatile odorous compounds, each of which contributes to a different degree. The proportions of these components appear to differ significantly between samples at different stages. The volatile odorous compounds were therefore determined by HS-SPME/GC-MS in both dried pericarps (3 months) and Pericarpium Citri Reticulatae (39 and 63 months). For flavor preparation, HS-SPME offers easy operation and high selectivity, and it reduces oxidation, degradation and other reactions of the flavor components. Owing to the properties of the CAR/PDMS/DVB fiber, a strong enrichment of the volatile odorous compounds of Citrus was observed in the GC-MS data (Fig. S2, Supplementary Material). Compounds were identified by similarity searching in the NIST 14 library and confirmed with retention indices (RIs) from the NIST collection (https://webbook.nist.gov/chemistry/). The resulting 49 components are listed in Tab S2 (Supplementary Material); they consist mainly of terpenes, alcohols, esters and aldehydes. Among them, D-Limonene, γ-Terpinene, Methyl methylanthranilate, o-Cymene and α-Farnesene are the main volatile odorous compounds in Pericarpium Citri Reticulatae. The first three components accounted for 37.23%, 10.28% and 12.61% of the total peak area, respectively. HS-SPME and steam distillation (SD) were also compared in the GC-MS determination. Taking the dried pericarps (3 months) as an example, a total of 54 components were identified by the two methods, of which the 12 common components were odorous terpenoids (Tab S2, Supplementary Material). HS-SPME adsorbed far more types of compounds than SD extracted, but 5 compounds (3-Thujene, β-Phellandrene, β-cis-Ocimene, Terpinolene and Terpinen-4-ol) were detected only in the SD product.
3.2.2. Building trilinear structure for the shape-unchanging matrices
Unknown interferences and backgrounds commonly appear in multi-component odor mixtures. For co-eluted peaks, chemometric algorithms are used to achieve qualitative and quantitative analysis of the components of interest in complex samples. MCR modeling with the trilinearity constraint and ATLD modeling have proven effective in resolving distorted peaks. As an example, Figure 5 (A) shows a serious misalignment in the segment of 20.8–21.2 min (15 points per peak, unchanged shape) across the 30 samples. To obtain a strict trilinear structure, the LCOW, Licoshift and mMSPA algorithms were each used to synchronize the GC-MS matrices. As observed in Figure 5 (B), the LCOW algorithm (C/E model) can change the peak shape and is time-consuming. The reason is that COW alters the original pair of signals by selectively contracting and expanding the time axes to minimize the distance between them. It usually aligns the signals accurately, but in doing so it also changes the peak shape, and this change is bound to cause errors in the subsequent quantitative results. Variable penalty dynamic time warping (VPdtw, R language) based on the full signals was also tried. In this algorithm, non-diagonal moves are penalized to reduce the number of artificial features in the aligned signals. Unfortunately, it requires full signals and produces an excessive number of N/A values. Of the three kinds of models, the second, Licoshift (I/D model), and the third, mMSPA (based on peak detection), were then applied to the 30 matrices. Their synchronizations were much faster than that of the LCOW algorithm. As observed in Figure 5 (B), the slice of m/z 68 was used to inspect the linearity of the 30 matrices from the newly assembled side, and the sample matrix (No 25) was used to check whether the raw data structure (the other side of the cube) was damaged. Both sets of slices are good in the synchronized cubes. However, minor misalignment is observed for weak peaks with a small number of sampling points. This is because both methods rely on an efficient computation based on the fast Fourier transform, and the number of sampling points affects the quality of the synchronization. For example, a slight deviation can be observed across the 30 matrices in the m/z 161 profiles (slice) in Fig. S3 (A) (Supplemental Material). As shown in Fig. S3 (B), linear interpolation and smoothing can greatly reduce these detection errors.
Based on the above data, the qualitative and quantitative profile matrices, such as the elution-time profiles and mass spectra, were then obtained through MCR modeling with the trilinearity constraint. As seen in Figure 5 (D), the quantitative results were further compared for the three sets of synchronized data. A unique solution (gray line) was obtained from bilinear modeling of the non-multilinear data, whereas MCR modeling with the trilinearity constraint gave a poor result (dotted line) for the raw data. This contrast shows that a trilinear structure is the precondition for successful three-way calibration. The mMSPA and LCOW data were tried next, and the expected results appeared in the restored structure. The mMSPA group (98.3–102.7% of MCR bilinear) is significantly better than the LCOW group (93.5–111.5% of MCR bilinear). In other words, mMSPA is more suitable for sub-matrices with unchanged peak shape, whereas the LCOW algorithm can change the peak shape and thus lead to quantitative errors.
3.2.3. Building trilinear structure for the shape-changing matrices
In the previous example (Figure 5 A), the trilinear structure of the 30 sub-matrices was easily restored by the mMSPA algorithm. Even for a complex peak cluster, the sub-matrices can be synchronized successfully when the experimental conditions are good. An example is shown in Fig. S4 (A), where five compounds overlap in the segment of 13–14.4 min of the LC-DAD signals (Supplemental Material). For this complex structure, the mMSPA algorithm synchronized the 30 sub-matrices successfully, as shown in Fig. S4 (B). Finally, trilinear decomposition was achieved in Fig. S4 (C), and the variation trends of the compounds can be observed.
High-quality data are limited in the real world, and researchers often have to deal with imperfect data. As illustrated in Figure 6 (A), one peak cluster (23.27 min) of the HS-SPME/GC-MS data from Pericarpium Citri Reticulatae was taken as an example. Three overlapping compounds with large drifts are present in the 30 sub-matrices, namely Neryl acetate, β-Copaene and β-Elemene. Worse still, the relative position of peak 1 and peak 2 is uncertain in several matrices. In Figure 6 (A), the difference in distance between the two peaks can be seen clearly in the 27th and 28th samples. How can qualitative and quantitative analysis of this overlapping system be performed? The first option is bilinear decomposition by MCR-ALS with non-negativity and unimodality constraints, which allows non-multilinear chromatographic data to be modeled. However, the differences between the upper and lower bounds computed by the MCR-BANDS program indicate rotation ambiguities in the MCR solutions. In other words, a set of different solutions can fit the experimental data equally well, and they are not necessarily the true solution. To obtain satisfactory values, the trilinearity constraint in the simultaneous analysis of multiple data sets is the best choice for eliminating this type of ambiguity.
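Rotation ambiguity can be made concrete with a small numerical sketch. The two-component profiles, spectra and rotation matrix below are hypothetical and unrelated to the Pericarpium Citri Reticulatae data; the point is only that two different non-negative factor pairs reproduce the same data matrix, so a bilinear fit alone cannot tell them apart.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 50)

# Two overlapping, hypothetical elution profiles (columns of C).
c1 = np.exp(-0.5 * ((t - 0.45) / 0.08) ** 2)
c2 = np.exp(-0.5 * ((t - 0.55) / 0.08) ** 2)
C = np.column_stack([c1, c2])

# Two hypothetical spectra stored as rows, i.e. S plays the role of ST.
S = np.array([[1.0, 0.8, 0.6],
              [0.2, 0.4, 0.5]])

D = C @ S                      # bilinear model D = C ST

# An invertible matrix T mixes the factors without changing D:
# D = (C T)(T^-1 ST).
T = np.array([[1.0, 0.5],
              [0.0, 1.0]])
C_alt = C @ T
S_alt = np.linalg.inv(T) @ S
D_alt = C_alt @ S_alt

same_fit = np.allclose(D, D_alt)
still_nonnegative = bool((C_alt >= 0).all() and (S_alt >= 0).all())
```

Here the alternative profiles and spectra stay non-negative, so the non-negativity constraint alone does not exclude them. Constraints that tie the profiles across samples, such as trilinearity, narrow this set of feasible solutions.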
In brief, a strict trilinear structure should be established for the thirty matrices before MCR modeling with the trilinearity constraint or ATLD modeling. Several conditions must hold for trilinearity, e.g. constant peak shape and constant retention time. When one or both are not met, the data array stacked from the thirty matrices loses its trilinearity. In this research, the mMSPA algorithm was first used to synchronize the 30 sub-matrices in Figure 6 (B). In the key step, a representative profile (m/z 161) was used to calculate the optimal drift, and each of the 30 matrices was then moved as a whole. This synchronized the m/z profiles of peak 2 and peak 3 and, because the movement is applied to the whole matrix, preserved the initial structure of each sample. For the same reason, the m/z profiles of peak 1 (e.g. m/z 69) remained misaligned, because the distance between the two peaks varies between samples. As a result, the concentration and spectral matrices of peak 2 and peak 3 could be decomposed successfully by the ATLD or MCR models that require trilinear structures (Figure 6 (C)). How, then, can peak 1 be analyzed qualitatively and quantitatively? As an alternative, a representative ion of peak 1 (m/z 69) was selected to synchronize the 30 matrices and reveal the hidden information of peak 1. That is, a representative profile (m/z or λ) can be selected to quantify a specific target compound. In addition, if the MCR-BANDS test of the synchronized matrices returns to normal, bilinear modeling of an augmented data matrix can also be used.
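The idea of estimating the drift from one representative profile and then moving the whole matrix can be sketched as follows. This is a simplified stand-in, not the mMSPA algorithm itself: peak detection by Haar CWT is omitted, the drift is taken as the integer lag that maximizes a plain cross-correlation, and the function names and synthetic peaks are hypothetical.

```python
import numpy as np

def estimate_shift(reference, profile):
    """Integer lag (in points) by which `profile` is delayed relative to
    `reference`, taken at the maximum of their cross-correlation."""
    corr = np.correlate(profile, reference, mode="full")
    return int(np.argmax(corr)) - (len(reference) - 1)

def shift_rows(matrix, lag):
    """Move a whole time x channel matrix by -lag points along the time
    axis, padding with zeros, so that every channel moves together."""
    out = np.zeros_like(matrix)
    n = matrix.shape[0]
    if lag > 0:
        out[: n - lag] = matrix[lag:]
    elif lag < 0:
        out[-lag:] = matrix[: n + lag]
    else:
        out[:] = matrix
    return out

# Synthetic example: one peak seen on two channels, delayed by 7 points.
idx = np.arange(100)

def peak(center):
    return np.exp(-0.5 * ((idx - center) / 4.0) ** 2)

reference = np.column_stack([peak(50), 0.3 * peak(50)])
sample = np.column_stack([peak(57), 0.3 * peak(57)])

lag = estimate_shift(reference[:, 0], sample[:, 0])  # representative channel 0
aligned = shift_rows(sample, lag)
```

Because a single lag is applied to all channels, the internal structure of each sample matrix is preserved, which is also why compounds whose relative positions differ between samples cannot all be aligned at once with this kind of overall movement.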
As an alternative, Licoshift was used to synchronize multiple peaks simultaneously, as shown in Figure 6 (B). It applies a piece-wise linear correction function based on the I/D model. In this test, different m/z profiles were used to inspect the linearity of the 30 matrices from the assembled sides of the cubes. As can be seen from Figure 6 (B), both the slice (profiles) of m/z 69 and the slice (profiles) of m/z 161 were aligned across the 30 matrices. Unfortunately, a slight misalignment was observed in the sample slice (No 11), on another side of the cube. There are two reasons. First, only one peak top (blue zone) in the peak cluster is recognized and aligned with the other samples, e.g. the profiles of m/z 93 and the curves of m/z 68 in Figure 6. Second, deviations usually occur in the peak detection of coarse peaks with few sampling points. The assembled data were nevertheless subjected to MCR modeling with the trilinearity constraint. As observed in Figure 6 (C), the result ran contrary to expectation, and apparent errors were observed for all three peaks. In this respect, mMSPA is better than the Licoshift algorithm.
3.2.4. Decomposition on HS-SPME/GC-MS data of Pericarpium Citri Reticulatae
To maintain a strict trilinear structure, the mMSPA algorithm was used for synchronization. The assembled data were then resolved by trilinear decomposition to obtain the matrix of quantitative (sample-to-sample) curves. As shown in Table S3 (Supplementary Material), chemical conversion of some terpenes, alcohols and aldehydes must take place during storage. In certain proportions, these compounds give a unique fragrance at a specific storage period. Owing to its high content, limonene was considered to have a great impact on the overall aroma, and it showed no apparent change during the 3–5 years of aging. In contrast, trans-Sabinene hydrate, cis-Sabinene hydrate, α-Terpineol, cis-Citral, p-Mentha-1,8-dien-3-one, (+)-, Thymol and Antioxine showed a downward trend, whereas o-Cymene, p-Cymen-8-ol, cis-Carveol and Citronellol acetate showed an upward trend. The relative percentage contents in fresh pericarps (10 samples) and aged pericarps (20 samples) were then used for PLS-DA modeling [28] (Figure 7). In Figure 7 (A), the 30 samples are divided into two categories in the PLS-DA score plot. No outliers were found: the 10 points of dried pericarps (3 months) gather in the upper area, whereas the 20 points of Pericarpium Citri Reticulatae gather in the lower area. As shown in Figure 7 (B), the reliability of the PLS-DA model was verified by Monte Carlo cross validation (MCCV). As shown in Figure 7 (C), the phase diagram algorithm (PHADIA) was applied to find the key variables that distinguish fresh pericarps from Pericarpium Citri Reticulatae. Next, subwindow permutation analysis and random frog were used to detect which variables contributed most to the flavor formation of Pericarpium Citri Reticulatae. In Figure 7 (D), odorous components such as Sabinene hydrate, Antioxine, α-Terpineol and Thymol can be judged as having the highest contribution rate. Although the contents of these components are low, their flavors are considered to be closely involved in the aging process.
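The MCCV step can be illustrated with a short sketch: the samples are repeatedly split at random into training and test sets, a classifier is fitted on the training part, and the test errors are averaged. To keep the example dependency-free, a nearest-centroid classifier is used here as a stand-in for PLS-DA, so this is not the libPLS procedure cited above. The function names, the two features and all data values are hypothetical; only the class sizes (10 fresh, 20 aged) follow the text.

```python
import random


def nearest_centroid_fit(X, y):
    """Return the mean feature vector of each class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for k, value in enumerate(row):
            acc[k] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def nearest_centroid_predict(centroids, row):
    """Assign a sample to the class with the closest centroid."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def mccv_error(X, y, n_splits=200, test_fraction=0.2, seed=0):
    """Monte Carlo cross validation: average the misclassification rate
    over `n_splits` random training/test partitions."""
    rng = random.Random(seed)
    n = len(X)
    n_test = max(1, int(round(n * test_fraction)))
    errors = []
    for _ in range(n_splits):
        indices = list(range(n))
        rng.shuffle(indices)
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        centroids = nearest_centroid_fit([X[i] for i in train_idx],
                                         [y[i] for i in train_idx])
        wrong = sum(nearest_centroid_predict(centroids, X[i]) != y[i]
                    for i in test_idx)
        errors.append(wrong / n_test)
    return sum(errors) / len(errors), errors


# Synthetic, well-separated data (NOT measured values): 10 "fresh", 20 "aged".
X = [[10 + 0.1 * i, 1.0 + 0.05 * i] for i in range(10)] + \
    [[5 + 0.1 * i, 3.0 + 0.05 * i] for i in range(20)]
y = ["fresh"] * 10 + ["aged"] * 20

mean_error, errors = mccv_error(X, y, n_splits=50, test_fraction=0.2, seed=1)
print(mean_error)
```

In practice, the stand-in classifier would be replaced by the PLS-DA model, and the distribution of the per-split errors, not only their mean, indicates how stable the classification is.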
4. Conclusion
This paper demonstrates the need for advanced modeling of Pericarpium Citri Reticulatae fingerprints. In some cases, trilinear data were shown to yield more robust calibration models than non-trilinear data. How to restore trilinearity in second-order data with a series of algorithms is therefore worth noting. The comparison of three algorithms (LCOW, Licoshift, mMSPA) showed that their respective characteristics (advantages and disadvantages) are central to the advanced modeling of herbal fingerprints. For example, the mMSPA algorithm does not harm peak shape (area), and it is reliable for restoring the trilinear structure. Unfortunately, not all substances can be aligned at the same time in shape-changing matrices. Nevertheless, it can still reduce the serious drifts across different sample data, which are the main cause of MCR error. Faced with this problem, researchers can resort to other solutions in the quality control of Chinese medicines, e.g., the quantification of target compound(s). Recognizing these facts, trilinear structures were built to suit more robust calibration models in LC-DAD or GC-MS data sets. Multiple scenes with different unknown interferences were thereby disclosed in this test. Even for examples with rotation ambiguities in bilinear MCR, the synchronized structure shows more obvious advantages. Furthermore, as an application example, the downward or upward trends were disclosed for the flavone glycosides and polymethoxyflavones from Pericarpium Citri Reticulatae. Also, Sabinene hydrate, Antioxine, α-Terpineol and Thymol were judged to be the flavor components with large contributions during the 3–5 years of storage. By and large, more effective analysis can be achieved with MCR modeling (trilinear constraint) or trilinear modeling of qualitative/quantitative data in herbal quality research.
Declarations
Author contribution statement
Yaping Li: Performed the experiments; Analyzed and interpreted the data; Wrote the paper.
Qing Cao: Performed the experiments; Analyzed and interpreted the data.
Min He: Conceived and designed the experiments; Analyzed and interpreted the data; Wrote the paper.
Xinyue Yang, Pingping Zeng: Analyzed and interpreted the data.
Weiguo Cao: Analyzed and interpreted the data; Contributed reagents, materials and analysis tools.
Funding statement
This work was supported by the Hunan 2011 Collaborative Innovation Center of Chemical Engineering & Technology with Environmental Benignity and Effective Resource Utilization, the Hunan Province Natural Science Fund (no. 2020JJ4569), the key project of the Hunan Provincial Education Department (no. 18A055), the Open Research Funding of Chongqing Key Laboratory of Traditional Chinese Medicine for Prevention and Cure of Metabolic Diseases (no. 2021-1-4) and the Hunan Province College Students' Innovation and Entrepreneurship Training Program (no. S202110530044).
Data availability statement
Data included in article/supplementary material/referenced in article.
Declaration of interests statement
The authors declare no conflict of interest.
Additional information
No additional information is available for this paper.
Acknowledgements
We thank Xiang-Dong Qing (College of Materials and Chemical Engineering, Hunan City University) for his scientific assistance with the ATLD studies.
Contributor Information
Min He, Email: dahai8214813@gmail.com.
Weiguo Cao, Email: cwgzd2001@cqmu.edu.cn.
Appendix A. Supplementary data
The following is the supplementary data related to this article:
References
- 1. Ren F., Nian Y., Perussello C.A. Effect of storage, food processing and novel extraction technologies on onions flavonoid content: a review. Food Res. Int. 2020;132:108953. doi: 10.1016/j.foodres.2019.108953.
- 2. Wibowo S., Buvé C., Hendrickx M., Loey A.V., Grauwet T. Integrated science-based approach to study quality changes of shelf-stable food products during storage: a proof of concept on orange and mango juices. Trends Food Sci. Technol. 2018;73:76–86.
- 3. Lei D., Wu J., Leon C., Huang L., Hawkins J.A. Medicinal plants of Chinese pharmacopoeia and daodi: insights from phylogeny and biogeography. Chin. Herb Med. 2018;10:269–278.
- 4. Yu X., Sun S., Guo Y., Liu Y., Yang D., Li G., Lü S. Citri Reticulatae Pericarpium (Chenpi): botany, ethnopharmacology, phytochemistry, and pharmacology of a frequently used traditional Chinese medicine. J. Ethnopharmacol. 2018;22028:265–282. doi: 10.1016/j.jep.2018.03.031.
- 5. Choi M.Y., Chai C., Park J.H., Lim J., Kwon S.W. Effects of storage period and heat treatment on phenolic compound composition in dried Citrus peels (Chenpi) and discrimination of Chenpi with different storage periods through targeted metabolomic study using HPLC-DAD analysis. J. Pharmaceut Biomed. 2011;54:638–645. doi: 10.1016/j.jpba.2010.09.036.
- 6. Fu M., Xu Y., Chen Y., Wu J., Yu Y., Zou B., An K., Xiao G. Evaluation of bioactive flavonoids and antioxidant activity in Pericarpium Citri Reticulatae (Citrus reticulata ‘Chachi’) during storage. Food Chem. 2017;230:649–656. doi: 10.1016/j.foodchem.2017.03.098.
- 7. Li P., Zhang X., Li S., Du G., Jiang L., Liu X., Ding S., Shan Y. A rapid and nondestructive approach for the classification of different-age Citri Reticulatae Pericarpium using portable near infrared spectroscopy. Sensors. 2020;20:1586. doi: 10.3390/s20061586.
- 8. Luo M., Luo H., Hu P., Yang Y., Wu B., Zheng G. Evaluation of chemical components in Citri Reticulatae Pericarpium of different cultivars collected from different regions by GC-MS and HPLC. Food Sci. Nutr. 2017;6:400–416. doi: 10.1002/fsn3.569.
- 9. Yu X., Zhang Y., Wang D., Jiang L., Xu X. Identification of three kinds of Citri Reticulatae Pericarpium based on deoxyribonucleic acid barcoding and high-performance liquid chromatography-diode array detection-electrospray ionization/mass spectrometry/mass spectrometry combined with chemometric analysis. Phcog. Mag. 2018;14:64–69. doi: 10.4103/pm.pm_581_16.
- 10. Olivieri A.C., Escandar G.M. Analytical chemistry assisted by multi-way calibration: a contribution to green chemistry. Talanta. 2019;204:700–712. doi: 10.1016/j.talanta.2019.06.022.
- 11. Bayat M., Marín-García M., Ghasemi J.B., Tauler R. Application of the area correlation constraint in the MCR-ALS quantitative analysis of complex mixture samples. Anal. Chim. Acta. 2020;1113:52–65. doi: 10.1016/j.aca.2020.03.057.
- 12. Wu H.L., Wang T., Yu R.Q. Recent advances in chemical multi-way calibration with second-order or higher-order advantages: multilinear models, algorithms, related issues and applications. Trac. Trends Anal. Chem. 2020;130:115954.
- 13. Long W.J., Wu H.L., Wang T., Dong M.Y., Yu R.Q. Interference-free analysis of multi-class preservatives in cosmetic products using alternating trilinear decomposition modeling of liquid chromatography diode array detection data. Microchem. J. 2021;162:105847.
- 14. de Juan A., Tauler R. Multivariate Curve Resolution: 50 years addressing the mixture analysis problem - a review. Anal. Chim. Acta. 2021;1145:59–78. doi: 10.1016/j.aca.2020.10.051.
- 15. Yu H., Augustijn D., Bro R. Accelerating PARAFAC2 algorithms for non-negative complex tensor decomposition. Chemometr. Intell. Lab. 2021;214:104312.
- 16. Yin X.L., Gu H.W., Jalalvand A.R., Liu Y.J., Chen Y., Peng T.Q. Dealing with overlapped and unaligned chromatographic peaks by second-order multivariate calibration for complex sample analysis: fast and green quantification of eight selected preservatives in facial masks. J. Chromatogr. A. 2018;1573:18–27. doi: 10.1016/j.chroma.2018.09.019.
- 17. Bortolato S.A., Arancibia J.A., Escandar G.M., Olivieri A.C. Time-alignment of bidimensional chromatograms in the presence of uncalibrated interferences using parallel factor analysis: application to multi-component determinations using liquid-chromatography with spectrofluorimetric detection. Chemometr. Intell. Lab. Syst. 2010;101:30–37.
- 18. Yu Y.J., Wu H.L., Niu J.F., Zhao J., Li Y.N., Kang C., Yu R.Q. A novel chromatographic peak alignment method coupled with trilinear decomposition for three dimensional chromatographic data analysis to obtain the second-order advantage. Analyst. 2013;138:627–634. doi: 10.1039/c2an35931f.
- 19. Azimi F., Fatemi M.H. Multivariate curve resolution-correlation optimized warping applied to the complex GC-MS signals; toward comparative study of peel chemical variability of Citrus aurantium L. varieties. Microchem. J. 2018;143:99–109.
- 20. Pellegrino Vidal R.B., Olivieri A.C. Contribution to second-order calibration based on multivariate curve resolution with and without previous chromatographic synchronization. Anal. Chim. Acta. 2019;1078:8–15. doi: 10.1016/j.aca.2019.06.038.
- 21. Mazivila S.J., Lombardi J.M., Pascoa R.N.M.J., Bortolato S.A., Leitão J.M.M., Esteves da Silva J.C.G. Three-way calibration using PARAFAC and MCR-ALS with previous synchronization of second-order chromatographic data through a new functional alignment of pure vectors for the quantification in the presence of retention time shifts in peak position and shape. Anal. Chim. Acta. 2021;1146:98–108. doi: 10.1016/j.aca.2020.12.033.
- 22. He M., Yan P., Yang Z.Y., Zhang Z.M., Yang T.B., Hong L. A modified multiscale peak alignment method combined with trilinear decomposition to study the volatile/heat-labile components in Ligusticum chuanxiong hort - Cyperus rotundus rhizomes by HS-SPME-GC/MS. J. Chromatogr. B. 2018;1079:41–50. doi: 10.1016/j.jchromb.2018.01.040.
- 23. Vest Nielsen N.P., Carstensen J.M., Smedsgaard J. Aligning of single and multiple wavelength chromatographic profiles for chemometric data analysis using correlation optimised warping. J. Chromatogr. A. 1998;805:17–35.
- 24. Tomasi G., Savorani F., Engelsen S.B. icoshift: an effective tool for the alignment of chromatographic data. J. Chromatogr. A. 2011;1218:7832–7840. doi: 10.1016/j.chroma.2011.08.086.
- 25. Zheng G., Chao Y., Luo M., Xie B., Zhang D., Hu P., Yang X., Yang D., Wei M. Construction and chemical profile on "activity fingerprint" of Citri Reticulatae Pericarpium from different cultivars based on HPLC-UV, LC/MS-IT-TOF, and principal component analysis. Evid. Based Complement. Alternat. Med. 2020;2020:4736152. doi: 10.1155/2020/4736152.
- 26. Zheng Y.Y., Zeng X., Peng W., Wu Z., Su W.W. Characterisation and classification of Citri Reticulatae Pericarpium varieties based on UHPLC-Q-TOF-MS/MS combined with multivariate statistical analyses. Phytochem. Anal. 2019;30:278–291. doi: 10.1002/pca.2812.
- 27. Li S.Z., Zeng S.L., Wu Y., Zheng G.D., Chu C., Yin Q., Chen B.Z., Li P., Lu X., Liu E.H. Cultivar differentiation of Citri Reticulatae Pericarpium by a combination of hierarchical three-step filtering metabolomics analysis, DNA barcoding and electronic nose. Anal. Chim. Acta. 2019;1056:62–69. doi: 10.1016/j.aca.2019.01.004.
- 28. Li H.D., Xu Q.S., Liang Y.Z. libPLS: an integrated library for partial least squares regression and linear discriminant analysis. Chemometr. Intell. Lab. Syst. 2018;176:34–43.