Food Chemistry: X. 2025 Sep 12;31:103019. doi: 10.1016/j.fochx.2025.103019

Rapid evaluation of antioxidant activity of Rheum tanguticum: A synergistic strategy of near-infrared spectroscopy, chromatographic effects, and machine learning

Xiaoming Song a,1, Dan Feng a,b,1, Jiamin Li a, Liyan Zang a,b, Hongmei Li a, Jing Sun a,
PMCID: PMC12475870  PMID: 41017927

Keywords: Rheum tanguticum, Spectrum-effect relationship, NIR spectroscopy, Antioxidant activity

Abstract

This study integrates HPLC fingerprinting and NIR spectroscopy to evaluate the antioxidant capacity of Rheum tanguticum. HPLC fingerprinting comprehensively characterized the complex chemical composition, identifying 13 compounds. Meanwhile, NIR spectroscopy provided a rapid, non-destructive approach to predict antioxidant capacity. The combination of these techniques offers a holistic and efficient strategy for evaluating antioxidant activity. The spectrum-effect relationships were analyzed by correlating the results of three antioxidant assays (ABTS, DPPH, and FRAP) with the peak areas of the common peaks identified by HPLC. Glycosylated anthraquinones, particularly conjugated forms like rhein-8-O-glucoside, were found to synergistically enhance antioxidant activity with free aglycones. NIR models optimized using the PLS method with Python-based algorithms demonstrated strong predictive abilities, achieving RPD values of 2.43 for ABTS, 2.63 for DPPH, and 2.43 for FRAP. This research provides a rapid method to evaluate the antioxidant capacity of Rh. tanguticum, offering a reference for future applications in other fields.

Highlights

  • Established HPLC fingerprint for Rheum tanguticum extracts.

  • Identified key chemical compound driving antioxidant activity.

  • Developed rapid NIR spectroscopy models for antioxidant evaluation.

  • Achieved high prediction accuracy with the verification rate above 83 %.

  • Provided new insights for the antioxidant research of agricultural crops and foods.

1. Introduction

Rheum tanguticum is a tall perennial plant of the genus Rheum in the family Polygonaceae. It is one of the three authentic rhubarbs recorded in the 2020 edition of the Pharmacopoeia of the People's Republic of China (Commission, 2020). Recent studies (Jing et al., 2022; Zhuang et al., 2020) have demonstrated that the main chemical components of this plant include anthraquinones, flavonoids, and polysaccharides. Rh. tanguticum exhibits various pharmacological effects, including purgative, anti-inflammatory, antioxidant, anti-tumor, hepatoprotective, and cardiovascular-protective effects. Furthermore, owing to its multiple functional components, this plant can also be used in the food industry as a natural pigment, antioxidant, and antibacterial agent, and it can be employed to develop functional foods such as flavored beverages and health foods. Among these properties, antioxidant activity serves as the foundation for many beneficial health effects. Oxidative stress, primarily induced by free radicals, damages cells and extracellular molecules and contributes to the development of degenerative diseases such as peptic ulcer, cancer, Alzheimer's disease, and atherosclerosis (Shreyasi et al., 2022). Degenerative diseases mediated by oxidative stress are a major cause of rising global mortality rates, and scavenging free radicals is therefore a crucial route to preventing oxidative damage (Leonardo et al., 2021). Additionally, in the food industry, antioxidant activity research supports the development of efficient natural antioxidants that improve food stability and safety and prolong shelf life. Natural products derived from plants have attracted much attention due to their safety and synergistic advantages, especially in avoiding the potential toxic and side effects associated with synthetic antioxidants (Iqbal et al., 2025). Current research on Rh. tanguticum primarily focuses on the extraction and isolation of bioactive constituents (Bai et al., 2023; Lu, Chen, et al., 2024b) and on pharmacological characterization (Choi et al., 2024; Zhong et al., 2024). Regarding antioxidant research, existing studies mainly concentrate on assessing the free-radical scavenging ability of the crude rhubarb extract and exploring its antioxidant pharmacological activity (Liang et al., 2024). Despite these advances, critical gaps persist, particularly in identifying the specific compounds (or synergistic combinations) responsible for the antioxidant activity of Rh. tanguticum and in quantitatively defining their contribution to the overall effect. This lack of compound-level resolution calls for systematic investigation to establish robust evaluation protocols that integrate comprehensive chemical profiling with bioactivity assessment.

Traditional bioactivity evaluation, which follows the “extraction-isolation-purification-verification” workflow for single-compound screening, is labor-intensive, costly, and time-consuming (Chen et al., 2014). In contrast, HPLC fingerprinting has become a fundamental technique for the quality assessment and authentication of herbal medicines and foods because it characterizes complex matrices holistically (Yang et al., 2022). Building upon this foundation, the spectrum-effect relationship paradigm has emerged as a framework for discovering bioactive constituents by correlating chemical fingerprints with pharmacological outcomes through chemometrics (Yuan et al., 2022). This strategy helps to decipher the material basis of therapeutic actions in traditional Chinese medicine, agricultural crops, and functional foods, and recent applications underscore its potential (Cai et al., 2024; Xu et al., 2024). Traditional in vitro methods for determining the antioxidant capacity of plants face limitations such as complex operations, high technical requirements, high detection costs, and susceptibility to environmental factors. These limitations highlight critical gaps in antioxidant activity analysis and rapid assessment technologies, particularly the lack of high-throughput methods and inadequate compound-specific correlation. Near-infrared (NIR) spectroscopy, which detects molecular vibrations and functional group signatures, offers a solution (Li, Li, et al., 2025a). Phenolic hydroxyl and carboxyl groups, whose antioxidant potential stems from their electronic configurations and spatial arrangements, generate distinct NIR spectral patterns (Ma et al., 2019). Compared with other analytical techniques such as MS and UV–Vis, NIR spectroscopy is non-destructive, rapid, and environmentally sustainable, which makes it a suitable platform for antioxidant quantification (Cao et al., 2021). Therefore, based on the spectrum-effect correlation strategy, this study analyzed the material basis of the antioxidant activity of Rh. tanguticum. NIR spectroscopy combined with different chemometric methods was employed to construct a rapid detection model for the antioxidant activity of Rh. tanguticum. This integrated approach would enable the rapid and accurate detection of antioxidant activity, thereby providing technical support for the establishment of an intelligent evaluation system. The methodology demonstrates potential for adaptation to other medicinal plants and functional foods.

2. Materials and methods

2.1. Instruments

This study employed a Fourier transform infrared spectrometer (iS 50, Thermo Nicolet, USA), a high-performance liquid chromatograph (Infinity 1260, Agilent Technologies, USA), a microplate reader (Epoch 2, BioTek Instruments, USA), a pulverizer (Tianjin Taisite Co., Ltd., China), an electronic balance (ME104, 0.0001 g, Mettler Toledo, Switzerland), an Eclipse Plus C18 chromatographic column (4.6 × 250 mm, 5 μm, Agilent Technologies, USA), and an ultrapure water system (Milli-Q Integral 3, Merck Chemical Technology Co., Ltd., Germany).

2.2. Chemicals and reagents

We used a total antioxidant capacity (T-AOC) detection kit (FRAP microplate method) (A30IR224565, Shanghai Yuanye Biological Co., Ltd., China), a total antioxidant capacity (T-AOC) detection kit (ABTS microplate method) (A10IR222662, Shanghai Yuanye Biological Co., Ltd., China), acetonitrile (chromatographic grade, Supelco®, USA), methanol (analytical grade, Forton, China), and phosphoric acid (chromatographic grade, Aladdin Reagent Co., Ltd., Shanghai, China). Other reagents were of analytical grade.

Aloe-emodin (Batch No. II20683), chrysophanol (Batch No. MP00111), rhein (Batch No. MP00560), emodin (Batch No. 09H18Q), and physcion (Batch No. SA10924) were purchased from Henan Standard Substance Research and Development Center (China). Sennoside A (Batch No. N2303165401) and sennoside B (Batch No. N2303165508) were purchased from Sichuan Hengcheng Zhiyuan Biotechnology Co., Ltd. (China). Sennoside C (Batch No. AFCL0602) and emodin-8-O-glucoside (Batch No. AZDD0351) were purchased from Chengdu Elfa Biotechnology Co., Ltd. (China). Rhein-8-O-β-D-glucoside (Batch No. MUST-12111602) was purchased from Chengdu Mansite Biotechnology Co., Ltd., China. Chrysophanol-8-O-glucoside (Batch No. 141208) and physcion-8-O-glucoside (Batch No. 131012) were purchased from Chengdu Klomar Biotechnology Co., Ltd. (China). Aloe-emodin-8-O-glucoside (Batch No. DR010436) was purchased from Dingrui Chemical (Shanghai) Co., Ltd.

2.3. Sample collection

In this study, 18 batches of plant samples were collected from various growing regions in Qinghai Province between September and October. Each batch comprised 9 samples, giving 162 samples in total. The original plant specimens were identified as Rheum tanguticum Maxim. ex Balf. (Polygonaceae) by a plant expert from the Northwest Institute of Plateau Biology, Chinese Academy of Sciences. The specific sample information is shown in Table 1. Following standardized processing protocols, the harvested roots were cleaned, sliced, shade-dried, coarsely crushed, and then ground to pass through an 80-mesh sieve. The processed materials were stored in a drying oven until analysis.

Table 1.

Sample information of Rh. tanguticum.

Sample point number | Site | Batch number
1 | Dari County, Qinghai Province, China | S1–S2
2 | Banma County, Qinghai Province, China | S3
3 | Qilian County, Qinghai Province, China | S4
4 | Zeku County, Qinghai Province, China | S5–S7
5 | Tongren City, Qinghai Province, China | S8
6 | Huangzhong County, Qinghai Province, China | S9–S18

2.4. Sample preparation

For the spectrum-effect relationship analysis, the samples within each of the 18 batches were mixed to give 18 batch-level samples; for the NIR antioxidant detection model, the individual samples from the 18 batches were used. Powder (0.5000 ± 0.0001 g) was placed in a stoppered conical flask, and 25 mL of methanol was accurately added. The mixture was heated under reflux for 1 h, cooled, transferred to a centrifuge tube, centrifuged at 4000 rpm for 15 min, and then made up to volume in a volumetric flask. The weight lost was made up with methanol, the solution was shaken well, and the resulting test solution was taken for analysis.

Reference standards were accurately weighed and dissolved in methanol to prepare stock solutions, which were subsequently diluted to obtain mixed standard working solutions.

2.5. HPLC fingerprint analysis

2.5.1. Chromatographic condition

Column: Agilent Eclipse Plus C18 (4.6 mm × 250 mm, 5 μm). Mobile phase: (A) acetonitrile - (B) 0.1 % phosphoric acid (aqueous), with the following gradient elution program: 0–15 min: 2 %–15 % A; 15–25 min: 15 %–20 % A; 25–40 min: 20 % A (isocratic); 40–70 min: 20 %–30 % A; 70–90 min: 30 %–55 % A; 90–100 min: 55 %–75 % A. Injection volume: 10 μL; Detection wavelength: 254 nm; Column temperature: 30 °C.

2.5.2. Methodological examination

Precision assessment involved five consecutive injections of identical samples under standardized chromatographic conditions (Section 2.5.1), with retention time and peak area consistency evaluated through relative standard deviation (RSD) calculations. Repeatability testing employed five independently prepared samples, each analyzed via 10 μL injections under identical parameters. For stability evaluation, aliquots from a single extract underwent chromatographic analysis at 0 h, 2 h, 8 h, 12 h, and 24 h post-preparation, with temporal consistency monitored through RSD metrics.

2.5.3. Fingerprint similarity evaluation

The chromatograms of the 18 batches of samples were processed with the Similarity Evaluation System for Chromatographic Fingerprint of Traditional Chinese Medicine (2012 edition).

2.5.4. Hierarchical cluster analysis (HCA)

To gain a deeper understanding of the differences and similarities in the chemical composition of Rh. tanguticum samples from different producing areas, this study employed HCA. This method quantifies the similarities in chemical composition between samples and categorizes them into several groups, thereby intuitively illustrating the chemical characteristic differences of Rh. tanguticum from various origins. In the cluster analysis, the areas of the common peaks in the samples were used as variables, and the Squared Euclidean Distance was employed as the measure of interval between samples. The HCA was conducted using the within-group average linkage method on 18 batches of Rh. tanguticum samples from different producing areas.
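The clustering rule described above can be sketched in a few lines. This is a minimal illustration, assuming the common peak areas are arranged as a samples × peaks array; the function name `within_group_average_linkage` and the toy data are hypothetical and not taken from the study. Within-group average linkage is read here as "merge the two clusters whose union has the smallest mean pairwise distance among all of its members", and the sketch stops at a requested number of clusters rather than cutting a dendrogram at a distance threshold.

```python
from itertools import combinations

import numpy as np


def within_group_average_linkage(X, n_clusters):
    """Agglomerative clustering on squared Euclidean distances.

    At each step, merge the pair of clusters whose merged membership has the
    smallest average pairwise distance (within-group average linkage).
    """
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    D = np.sum(diff ** 2, axis=-1)           # squared Euclidean distance matrix
    clusters = [[i] for i in range(len(X))]  # start with one cluster per sample

    def mean_within(members):
        pairs = list(combinations(members, 2))
        return sum(D[a, b] for a, b in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: mean_within(clusters[ij[0]] + clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return [sorted(c) for c in clusters]


# Toy data (made up): two well-separated groups of three "samples" each.
toy = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
groups = within_group_average_linkage(toy, n_clusters=2)
```

The brute-force search is adequate for 18 batches; for large data sets a dedicated clustering library would be the more practical choice.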

2.5.5. Principal component analysis (PCA)

The fingerprint data exhibit high dimensionality, featuring numerous common peak areas and complex correlations among these data. PCA is a commonly used multivariate statistical analysis method. It aims to transform multiple related variables in the original data into a new set of unrelated principal components through linear transformation, so as to simplify the data structure and reduce the data complexity while retaining most of the key information (Hoseini et al., 2025). By calculating the data's covariance matrix and extracting its eigenvalues and eigenvectors, PCA selects the first few principal components with larger eigenvalues to retain the variation information of the data to the greatest extent. This method not only effectively reduces the dimension, but also removes the data noises. In this study, PCA was used to analyze the characteristic peak data of the chemical constituents of Rh. tanguticum, in order to extract the main variation information and reveal the intrinsic correlation between different components.
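The eigen-decomposition route described above can be sketched as follows. This is a minimal NumPy illustration, not the SIMCA workflow used in the study: the function name `pca_covariance` and the random toy matrix are assumptions, and no rotation of the components is applied.

```python
import numpy as np


def pca_covariance(X, n_components):
    """PCA via eigen-decomposition of the covariance matrix.

    Returns the leading eigenvalues, their share of the total variance,
    and the sample scores on the retained components.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                  # center each variable
    cov = np.cov(Xc, rowvar=False)           # variables are in columns
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh: symmetric matrices
    order = np.argsort(eigvals)[::-1]        # sort by decreasing eigenvalue
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratio = eigvals / eigvals.sum()          # variance contribution rates
    scores = Xc @ eigvecs[:, :n_components]
    return eigvals[:n_components], ratio[:n_components], scores


# Toy data (made up): 18 "batches" x 5 "peak areas".
rng = np.random.default_rng(0)
toy = rng.normal(size=(18, 5))
eigvals, ratio, scores = pca_covariance(toy, n_components=3)
cumulative = np.cumsum(ratio)                # cumulative contribution rate
```

If the variables are standardized to unit variance before this step, the covariance matrix becomes the correlation matrix and the eigenvalues sum to the number of variables.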

2.6. Determination of antioxidant activity

2.6.1. ABTS assay

The antioxidant activity of the 18 mixed-batch samples and the 162 individual samples of Rh. tanguticum extract was determined. Following the kit method, the test solution was diluted 10-fold to give the sample solution. Two types of wells were prepared: A0, in which 280 μL of ABTS working solution was mixed with 7 μL of methanol, and A1, in which 280 μL of ABTS working solution was mixed with 7 μL of the sample solution. The reaction proceeded in the dark for 5 min, and the absorbance was measured at 734 nm. A Trolox standard was diluted with distilled water to a concentration gradient of 0.1–2.5 mM to construct a standard curve, from which the total ABTS antioxidant capacity (mM) of each sample was calculated. Each sample was measured three times and the average was taken. The ABTS free radical clearance rate was calculated using the following formula:

$$S = \frac{A_0 - A_1}{A_0} \times 100\%$$

2.6.2. DPPH assay

The DPPH antioxidant activity of Rh. tanguticum extract was determined. A DPPH solution of 0.47 mg/L was prepared, and the test solution was diluted 200-fold to give the sample solution. Ascorbic acid (VC) standards (10, 30, and 100 μg/mL) served as positive controls. Three types of wells were prepared: A0 (100 μL methanol and 100 μL DPPH), A1 (100 μL sample and 100 μL DPPH), and A2 (100 μL sample and 100 μL absolute ethanol). The plates were kept in the dark for 30 min, and the absorbance was measured at 517 nm. Each sample was measured three times and the average was taken. The DPPH radical clearance rate was calculated as follows:

$$S = \frac{A_0 - (A_1 - A_2)}{A_0} \times 100\%$$
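The two clearance-rate calculations can be sketched as small helper functions. This is an illustration under stated assumptions: the ABTS rate is taken as (A0 − A1)/A0 × 100 %, and the DPPH rate as [A0 − (A1 − A2)]/A0 × 100 %, with A2 read as the sample colour blank. The function names and the absorbance readings are made up.

```python
def abts_clearance(a0, a1):
    """ABTS clearance rate (%), assuming S = (A0 - A1) / A0 * 100."""
    return (a0 - a1) / a0 * 100.0


def dpph_clearance(a0, a1, a2):
    """DPPH clearance rate (%), assuming S = (A0 - (A1 - A2)) / A0 * 100.

    a2 is the sample blank, subtracted so that the extract's own colour
    does not count as residual DPPH absorbance.
    """
    return (a0 - (a1 - a2)) / a0 * 100.0


def mean_of_replicates(values):
    """Average of replicate measurements (three per sample in the text)."""
    return sum(values) / len(values)


# Made-up absorbance readings for three replicates of one sample.
abts_rates = [abts_clearance(0.700, a1) for a1 in (0.350, 0.343, 0.357)]
abts_mean = mean_of_replicates(abts_rates)
dpph_rate = dpph_clearance(0.8, 0.3, 0.1)
```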

2.6.3. FRAP assay

The FRAP antioxidant activity of Rh. tanguticum extract was determined. Following the kit method, the test solution was diluted 20-fold to give the sample solution. The FeSO₄·7H₂O standard was diluted to a concentration gradient of 0.15–1.5 mM, and a standard curve was plotted with the ferrous ion concentration (mM) as the abscissa and the corresponding absorbance as the ordinate. In this method, the total antioxidant capacity is expressed as the Fe2+ concentration. The antioxidant capacity of a sample is therefore the Fe2+ concentration read from the standard curve at the absorbance of the extract. Each sample was measured three times and the average was taken.
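Reading a concentration off a linear standard curve can be sketched as below. The standard concentrations, absorbances, and function names are hypothetical, chosen only to fall inside the 0.15–1.5 mM range mentioned above. Whether the result is multiplied by the dilution factor is not stated in the text, so `dilution` is an optional assumption that defaults to 1.

```python
import numpy as np


def fit_standard_curve(conc_mM, absorbance):
    """Least-squares line: absorbance = slope * concentration + intercept."""
    slope, intercept = np.polyfit(conc_mM, absorbance, deg=1)
    return slope, intercept


def absorbance_to_conc(absorbance, slope, intercept, dilution=1.0):
    """Invert the standard curve; optionally scale by a dilution factor."""
    return (np.asarray(absorbance, dtype=float) - intercept) / slope * dilution


# Made-up standards lying exactly on a line, for illustration only.
std_conc = np.array([0.15, 0.30, 0.60, 0.90, 1.20, 1.50])  # mM Fe2+
std_abs = 0.52 * std_conc + 0.04
slope, intercept = fit_standard_curve(std_conc, std_abs)

# A hypothetical sample absorbance of 0.30, read back as an Fe2+ equivalent.
sample_conc = float(absorbance_to_conc(0.30, slope, intercept, dilution=20))
```

The same approach applies to the Trolox curve used for the ABTS assay.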

2.7. Antioxidant spectrum-effect correlation analysis

2.7.1. Grey relational analysis (GRA)

The GRA method was used to analyze the relationship between the common peak area data of the Rh. tanguticum fingerprints and the ABTS, DPPH, and FRAP antioxidant activity data. By calculating the correlation between each subsequence (sample fingerprint data) and the parent sequence (antioxidant activity index data), the degree of influence of each component on the antioxidant activity was determined, which provided a theoretical basis for related research.

The specific steps were as follows (Zhang et al., 2024). The three antioxidant index values served as the parent sequence X0(k), k = 1, 2, …, n (n is the number of samples), and the chromatographic fingerprint data of the samples were recorded as the subsequences X1(k), X2(k), …, Xm(k), k = 1, 2, …, n (m is the number of common peaks). The original data were mean-normalized, that is, each sequence element was divided by the mean of its sequence, as shown in Formula (1), to obtain the normalized sequences $Y_i = \{Y_i(k)\}_{k=1}^{n}$, i = 1, 2, …, m. The parent sequence was normalized in the same way to give Y0(k).

$$Y_i(k) = \frac{X_i(k)}{\frac{1}{n}\sum_{k=1}^{n} X_i(k)} = \frac{n\,X_i(k)}{\sum_{k=1}^{n} X_i(k)}; \quad i = 1, 2, \ldots, m;\ k = 1, 2, \ldots, n \tag{1}$$

The correlation coefficient was calculated by Formula (2):

$$\xi_{0i}(k) = \frac{\min_i \min_k \Delta_{0i}(k) + \rho \max_i \max_k \Delta_{0i}(k)}{\Delta_{0i}(k) + \rho \max_i \max_k \Delta_{0i}(k)} \tag{2}$$

The correlation degree was calculated by Formula (3):

$$\gamma_{0,i} = \frac{1}{n}\sum_{k=1}^{n} \xi_{0i}(k); \quad i = 1, 2, \ldots, m \tag{3}$$

Here, ρ is the resolution coefficient, 0 ≤ ρ ≤ 1, generally taken as ρ = 0.5, and $\Delta_{0i}(k) = |Y_0(k) - Y_i(k)|$; i = 1, 2, …, m; k = 1, 2, …, n.

A correlation degree of γ0,i ≥ 0.9 indicates that the subsequence has a significant effect on the parent sequence; 0.8 ≤ γ0,i < 0.9 indicates a relatively significant impact; 0.7 ≤ γ0,i < 0.8 indicates a significant impact; 0.6 ≤ γ0,i < 0.7 indicates a small impact; and γ0,i < 0.6 indicates a very small effect (Yang et al., 2024).
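The three GRA steps (mean normalization, correlation coefficients, correlation degree) can be sketched in NumPy as below. This is a minimal illustration that assumes the absolute differences are taken between the mean-normalized parent sequence and subsequences, and that the minimum and maximum run over all peaks and samples. The function name and the toy sequences are hypothetical.

```python
import numpy as np


def grey_relational_grades(parent, subs, rho=0.5):
    """Grey relational grade of each subsequence with respect to the parent.

    parent: length-n sequence (one antioxidant index across n samples).
    subs:   m x n array (m common peak areas across the same n samples).
    rho:    resolution coefficient, 0.5 by default.
    """
    x0 = np.asarray(parent, dtype=float)
    xi = np.asarray(subs, dtype=float)
    y0 = x0 / x0.mean()                          # mean normalization
    yi = xi / xi.mean(axis=1, keepdims=True)
    delta = np.abs(y0 - yi)                      # |Y0(k) - Yi(k)|, shape (m, n)
    d_min, d_max = delta.min(), delta.max()      # over all i and k
    coef = (d_min + rho * d_max) / (delta + rho * d_max)
    return coef.mean(axis=1)                     # average over the n samples


# Toy data (made up): peak 1 is proportional to the activity, peak 2 is not.
activity = [1.0, 2.0, 3.0, 4.0]
peaks = [[2.0, 4.0, 6.0, 8.0],
         [4.0, 3.0, 2.0, 1.0]]
grades = grey_relational_grades(activity, peaks)
```

A peak whose normalized profile coincides with the activity profile receives a grade of 1, which is the upper bound of the scale.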

2.7.2. Partial least squares regression analysis (PLSR)

PLSR modeling dissected the interplay between Rh. tanguticum's antioxidant capacity (ABTS, DPPH, FRAP assays) and phytochemical composition (common peak areas). This multivariate approach, particularly effective in resolving multicollinearity challenges, quantifies component-efficacy relationships through variable importance in projection (VIP) scoring. Components with VIP >1 (Nguyen et al., 2025) exert dominant influence on antioxidant outcomes, serving as key markers for predictive model construction.

The three antioxidant activity indexes (ABTS, DPPH, and FRAP) were used as reference sequences X0(k), k = 1, 2, …, n, and the m relative peak area values of the 18 batches of Rh. tanguticum were used as comparison sequences Xi(k), k = 1, 2, …, 18, i = 1, 2, …, m. All variables were made dimensionless to eliminate order-of-magnitude differences using the mean value method, in which each value in a sequence is divided by the mean of that sequence (see Formula (1)). In this study, each dependent variable has 18 observations {y1, …, yn}, and there are m independent variables {x1, …, xm}. Partial least squares regression extracts components t1 and u1 from x and y, respectively, such that they represent x and y in the original data table as fully as possible while the correlation between t1 and u1 is as large as possible. Further components are extracted successively, with x regressed on t and y regressed on u, until satisfactory accuracy is obtained. The result is finally expressed as a regression equation of y in terms of the original variables {x1, …, xm}. This process can be carried out in a variety of software packages; this study used SIMCA. The partial regression coefficients and VIP values obtained from the analysis were used to determine the explanatory power of the independent variables over the dependent variables.
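To make the VIP score concrete, the sketch below fits a single-response PLS model with the NIPALS algorithm in plain NumPy and computes VIP from the weights and the variance of y explained by each component. It is an illustration only: the study used SIMCA, whose scaling and cross-validation options are not reproduced here, the function name `pls1_vip` and the toy data are hypothetical, and the inputs are assumed to be already dimensionless.

```python
import numpy as np


def pls1_vip(X, y, n_components):
    """VIP scores from a NIPALS PLS1 fit (one response variable)."""
    Xr = np.asarray(X, dtype=float)
    yr = np.asarray(y, dtype=float)
    Xr = Xr - Xr.mean(axis=0)                # center predictors
    yr = yr - yr.mean()                      # center response
    n_vars = Xr.shape[1]
    W = np.zeros((n_vars, n_components))     # unit-norm weight vectors
    ss = np.zeros(n_components)              # y-variance explained per component
    for a in range(n_components):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)
        t = Xr @ w                           # scores
        tt = t @ t
        p = Xr.T @ t / tt                    # X loadings
        q = (yr @ t) / tt                    # y loading
        Xr = Xr - np.outer(t, p)             # deflate X
        yr = yr - q * t                      # deflate y
        W[:, a] = w
        ss[a] = q ** 2 * tt
    return np.sqrt(n_vars * (W ** 2 @ ss) / ss.sum())


# Toy data (made up): 18 "batches", 6 "peaks"; y depends mainly on peak 0.
rng = np.random.default_rng(1)
X_toy = rng.normal(size=(18, 6))
y_toy = 2.0 * X_toy[:, 0] + 0.5 * X_toy[:, 3] + rng.normal(scale=0.1, size=18)
vip = pls1_vip(X_toy, y_toy, n_components=2)
important = np.flatnonzero(vip > 1)          # variables passing the VIP > 1 rule
```

Because the squared VIP scores average to 1 across variables, the VIP > 1 threshold selects variables that contribute more than the average variable.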

2.7.3. Pearson correlation

Pearson correlation analysis was conducted in Origin to investigate the linear relationship, and its significance, between the common peaks of Rh. tanguticum and its antioxidant activity. The common peak areas were taken as independent variables, and the ABTS, DPPH, and FRAP antioxidant activity indexes were taken as dependent variables. For each pair of independent and dependent variables, the software calculated the Pearson correlation coefficient and the corresponding p-value from a two-tailed test. To reduce the risk of false positives caused by multiple comparisons, the original p-values were corrected with the Benjamini-Hochberg (BH) method, which adjusts the p-values by controlling the false discovery rate (FDR). A result with p > 0.05 was considered not significant, since there is insufficient evidence to reject the null hypothesis; p ≤ 0.05, p ≤ 0.01, and p ≤ 0.001 were considered statistically significant at the 95 %, 99 %, and 99.9 % confidence levels, respectively.
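The BH adjustment itself is short enough to sketch in plain Python. The function name and the example p-values are made up; the study performed the correlation analysis in Origin, so this only illustrates the standard step-up procedure.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    Each p-value is scaled by m / rank (rank 1 = smallest p), then a running
    minimum from the largest rank downward keeps the adjusted values
    monotone and capped at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up raw p-values for four peak/activity correlations.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
significant = [p <= 0.05 for p in adj_p]
```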

2.8. NIR spectroscopy acquisition

The NIR fiber module of the Fourier transform infrared spectrometer was used to collect the NIR spectra of the Rh. tanguticum extract samples. A total of 162 individual samples of Rh. tanguticum were analyzed. Each spectrum was the result of 32 scans at a resolution of 8 cm−1 over a spectral range of 10,000–4000 cm−1. Each batch of samples was divided into three equal parts and scanned three times. Before each scan, the air background was subtracted, and the three spectra and their average spectrum were recorded for each sample.

2.9. NIR spectroscopy antioxidant activity modeling

2.9.1. TQ analyst software modeling

The NIR spectral data of Rh. tanguticum collected as described in Section 2.8 (162 spectra in total) were partitioned into a modeling set and an external validation set. Specifically, 135 spectra were allocated to the modeling set, while 27 spectra from three batches from different locations (S2, S4, and S9) were reserved for external validation. The Mahalanobis distance (MD) and PCA were used to eliminate abnormal spectra from the modeling set. The three antioxidant activity values, sorted from high to low, and their corresponding modeling-set spectra were each imported into TQ Analyst. Calibration set to prediction set ratios of 2:1, 3:1, 4:1, and 5:1 were compared to optimize the partition of the modeling set, and quantitative NIR models for ABTS, DPPH, and FRAP were established. The modeling methods were partial least squares regression (PLS) and principal component regression (PCR); the preprocessing methods included multiplicative scatter correction (MSC), standard normal variate (SNV), first derivative (1D), second derivative (2D), Savitzky-Golay smoothing (SG smoothing; polynomial order: 3), and Norris derivative filtering (segment length: 5). A three-factor, three-level experimental design (Table S1) was employed, with each combination modeled based on single-factor test results. Norris smoothing was applied to the first-derivative (1D) spectra for derivative filtering, while the other preprocessing steps were performed on the raw spectral data. After the VIP modeling bands of the NIR spectra were screened in SIMCA, the bands with VIP > 1 and the full band were compared to optimize the modeling band. The NIR spectra of the external validation samples were substituted into the optimal model to obtain the values calculated by the model. The calculated values were compared with the measured values, and the accuracy of the model on the external validation samples was judged by the model prediction rate, calculated as:

$$F = \left(1 - \frac{M}{N}\right) \times 100\%$$

F: prediction rate; M: difference between the predicted value and the measured value; N: measured value.
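A one-function sketch of the prediction rate follows. It assumes that the difference M is taken as an absolute value, so that over- and under-prediction are penalized equally; the text does not state this explicitly. The function name and the example values are made up.

```python
def prediction_rate(predicted, measured):
    """Prediction rate F (%) = (1 - |predicted - measured| / measured) * 100.

    Using the absolute difference is an assumption of this sketch.
    """
    return (1.0 - abs(predicted - measured) / measured) * 100.0


# Made-up external validation pairs: (model value, reference value).
pairs = [(9.0, 10.0), (11.0, 10.0), (10.0, 10.0)]
rates = [prediction_rate(p, m) for p, m in pairs]
mean_rate = sum(rates) / len(rates)
```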

Model performance was quantitatively assessed using six key metrics: calibration set root mean square error (RMSEC), prediction set root mean square error (RMSEP), calibration correlation coefficient (Rcal), validation correlation coefficient (Rval), residual predictive deviation (RPD), and external validation accuracy rate. Superior predictive capability was characterized by low RMSEC and RMSEP values together with high Rcal and Rval coefficients, a high RPD, and a high external validation accuracy. All models were validated using 5-fold cross-validation. The dataset was divided into k equal subsets (folds), with k = 5. For each fold, the model was trained on the other k − 1 folds and validated on the remaining fold. This process was repeated k times, with each fold serving as the validation set exactly once. The final model performance metrics, including RMSECV and Rcv, were calculated as the average of the results from all folds.
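The error metrics and the fold bookkeeping can be sketched as below. RPD is taken here in its usual chemometric form, the standard deviation of the reference values divided by the RMSE of prediction; the text does not spell out the formula, so this is an assumption. Function names, the random seed, and the toy values are hypothetical.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def rpd(y_true, y_pred):
    """RPD, assumed here to be SD(reference) / RMSE of prediction."""
    return float(np.std(y_true, ddof=1) / rmse(y_true, y_pred))


def kfold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and return k (train, validation) index pairs."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    folds = np.array_split(idx, k)
    return [
        (np.concatenate(folds[:i] + folds[i + 1:]), folds[i])
        for i in range(k)
    ]


# Toy reference and predicted values (made up).
y_ref = [1.0, 2.0, 3.0, 4.0, 5.0]
y_hat = [1.5, 1.5, 3.5, 3.5, 5.5]
err = rmse(y_ref, y_hat)
ratio = rpd(y_ref, y_hat)
splits = kfold_indices(135, k=5)   # 135 modeling spectra, as in the text
```

Averaging the per-fold RMSE values over `splits` gives an RMSECV of the kind described above.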

2.9.2. Machine learning modeling in Python software

Five machine learning methods that are commonly used for quantitative regression and perform relatively well were implemented in Python: Bayesian ridge regression (BRR), elastic net regression (ENR), Gaussian process regression (GPR), PLS, and support vector machine regression (SVR). The NIR spectral data of the Rh. tanguticum samples were processed through a systematic optimization workflow. Dataset partitioning mirrored that used in TQ Analyst, with identical modeling and external validation sets. Calibration-to-prediction set ratios (2:1, 3:1, 4:1, and 5:1) were compared. Spectral feature selection compared VIP-selected bands (VIP > 1) with the full spectrum. Preprocessing optimization covered Norris derivative smoothing, multiplicative scatter correction (MSC) (window: 2), and 1D transformations (smooth window: 5). Each of the five algorithms was applied in turn to the antioxidant activity data, and the resulting models were compared using six key metrics: RMSEC, RMSEP, Rcal, Rval, RPD, and external validation accuracy rate. To ensure the robustness and generalizability of the machine learning models, 5-fold cross-validation was employed. The dataset was divided into five equal subsets (folds). For each fold, the model was trained on the remaining four folds and validated on the held-out fold. This process was repeated five times, with each fold serving as the validation set exactly once. The final model performance metrics, including RMSECV and Rcv, were calculated as the average of the results from all five folds.
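Some of the spectral preprocessing steps named in this section can be sketched in NumPy as follows. These are textbook forms written for illustration, not the authors' scripts: the function names are hypothetical, the window parameters quoted in the text are not reproduced, and `first_derivative` is a plain gap difference rather than a Norris gap-segment derivative.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row)."""
    s = np.asarray(spectra, dtype=float)
    mean = s.mean(axis=1, keepdims=True)
    std = s.std(axis=1, ddof=1, keepdims=True)
    return (s - mean) / std


def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference spectrum.

    Each spectrum is regressed on the reference (the mean spectrum by
    default) and corrected as (spectrum - intercept) / slope.
    """
    s = np.asarray(spectra, dtype=float)
    ref = s.mean(axis=0) if reference is None else np.asarray(reference, float)
    out = np.empty_like(s)
    for i, row in enumerate(s):
        slope, intercept = np.polyfit(ref, row, deg=1)
        out[i] = (row - intercept) / slope
    return out


def first_derivative(spectra, gap=1):
    """Simple finite-difference first derivative along the wavenumber axis."""
    s = np.asarray(spectra, dtype=float)
    return s[:, gap:] - s[:, :-gap]


# Toy spectra (made up): one base curve with different scatter effects.
base = np.linspace(0.0, 1.0, 50) ** 2 + 0.2
toy = np.vstack([a * base + b for a, b in [(1.0, 0.0), (1.3, 0.1), (0.8, -0.05)]])
toy_msc = msc(toy)
toy_snv = snv(toy)
toy_d1 = first_derivative(toy)
```

In the toy example the three spectra differ only by a scale and an offset, so MSC maps them onto the same curve; real spectra would retain their chemical differences.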

2.10. Statistical analysis

The similarity of the fingerprints was evaluated with the Similarity Evaluation System for Chromatographic Fingerprint of Traditional Chinese Medicine (2012 edition). TQ Analyst and machine learning in Python were used for modeling. SIMCA 14.1 was used for PCA and spectrum-effect analysis, and for screening modeling bands with VIP > 1. Origin 2021 was used for cluster analysis and plotting.

3. Results and discussion

3.1. Establishment and characteristic analysis of HPLC fingerprint

Method validation demonstrated robust analytical performance: all retention time and peak area RSD values for the characteristic chromatographic peaks remained below 2 % across the validation parameters. Precision testing gave relative standard deviations (RSDs) of 0.07 % for retention times and 0.98 % for peak areas across the 18 characteristic peaks. Stability assessment over 24 h gave RSDs of 0.50 % (retention time) and 1.27 % (peak area), while repeatability testing yielded RSDs of 0.65 % (retention time) and 1.15 % (peak area).

3.1.1. Similarity analysis

Under the chromatographic conditions described in Section 2.5.1, an HPLC fingerprint profile was established for the Rh. tanguticum samples (Fig. 1). Chromatographic similarity analysis was performed using the Chinese Pharmacopoeia-authorized Similarity Evaluation System for Traditional Chinese Medicine Chromatographic Fingerprints (Version 2012). After the chromatographic profiles of the 18 sample batches were imported into the system, a reference fingerprint (R) was generated through median vector normalization. Inter-batch similarity coefficients ranged from 0.884 to 0.974 relative to the reference fingerprint. The optimized analytical method identified 18 characteristic peaks that correlated strongly with the reference profile (r > 0.900 for all peaks).

Fig. 1. Fingerprints of Rh. tanguticum samples.

Note: R: Control fingerprint.

Fig. 2 presents the chromatograms of the mixed reference substances and the sample. By comparison, 13 chemical components corresponding to the common peaks were identified: Peak 1, aloe-emodin-8-O-glucoside; Peak 2, rhein-8-O-glucoside; Peak 3, sennoside B; Peak 5, sennoside C; Peak 6, sennoside A; Peak 8, chrysophanol-8-O-glucoside; Peak 9, emodin-8-O-glucoside; Peak 11, physcion-8-O-glucoside (emodin methyl ether-8-O-glucoside); Peak 13, aloe-emodin; Peak 14, rhein; Peak 16, emodin; Peak 17, chrysophanol; and Peak 18, physcion. To explore the characteristics of the peaks further, hierarchical cluster analysis (HCA) and PCA were conducted.

Fig. 2. The HPLC chromatogram of the mixed control (A) and the Rh. tanguticum extract sample (B).

Note: 1: Aloe-emodin-8-O-glucoside; 2: Rhein-8-O-glucoside; 3: sennoside B; 5: sennoside C; 6: sennoside A; 8: Chrysophanol-8-O-glucoside; 9: Emodin-8-O-glucoside; 11: physcion-8-O-glucoside; 13: Aloe-emodin; 14: Rhein; 16: Emodin; 17: chrysophanol; 18: Physcion.

3.1.2. HCA

HCA was employed to categorize the samples from different producing areas based on the similarity of their chemical composition. This approach illustrates the differences and similarities in the chemical composition of Rh. tanguticum from different producing areas, and thereby supports a better understanding of the key characteristics of this medicinal material across origins. Using the areas of the common peaks as variables and the squared Euclidean distance as the distance measure, the 18 batches of Rh. tanguticum samples from different production areas were clustered by within-group average linkage. At a distance of 0.12, the samples were clustered into two groups (see Fig. 3). Specifically, samples S1, S2, S3, S5, S6, S7, and S8 were clustered together, and their main producing areas were Dari County, Banma County, Zeku County, and Tongren City. The remaining 11 batches (S4, S9, S10, S11, S12, S13, S14, S15, S16, S17, and S18) were clustered into the second group, with the main producing areas being Qilian County and Huangzhong County. Geographically, the distribution of these two groups was demarcated by the Laji Mountains in Qinghai Province, with the first group located to the north and the second to the south of the mountains, indicating certain regional disparities.

Fig. 3.

Cluster results of different origins of Rh. tanguticum.

3.1.3. PCA

PCA was conducted using the common peak areas as variables. After rotation, the initial eigenvalues and the rotated sums of squared loadings were obtained. The first three principal components were extracted; their cumulative contribution rate reached 82.69 %, which represents the majority of the chemical information in the Rh. tanguticum samples. The eigenvalues and variance contribution rates are shown in Table 2.

Table 2.

Principal component analysis results.

Total variance explained
Columns: Component; Initial eigenvalues (Total, Variance %, Cumulative %); Rotated sums of squared loadings (Total, Variance %, Cumulative %)
1 7.99 44.44 44.44 7.27 40.37 40.37
2 4.09 22.74 67.18 4.43 24.62 64.99
3 2.79 15.51 82.69 3.19 17.70 82.69
Factor scores
Columns: Peak number; First principal component; Second principal component; Third principal component
1 0.270 0.163 −0.146
2 −0.080 0.306 0.351
3 0.102 0.220 0.367
4 −0.082 0.325 0.386
5 0.303 0.157 −0.090
6 0.284 0.051 0.252
7 0.309 0.128 0.040
8 0.291 0.201 −0.004
9 0.302 0.020 −0.244
10 0.323 0.001 −0.132
11 0.221 0.175 0.140
12 0.305 −0.018 −0.241
13 0.194 −0.373 0.166
14 −0.054 −0.303 0.378
15 0.237 0.156 0.257
16 0.138 −0.377 0.271
17 0.225 −0.355 0.089
18 0.217 −0.294 0.161

Table 2 shows that, after rotation, the first principal component had the highest contribution rate (40.37 %). Among the common peaks, Peak 5 (sennoside C), Peak 7, Peak 9 (emodin-8-O-glucoside), Peak 10 and Peak 12 had relatively high positive loadings on this component. Peak 5 (sennoside C) and Peak 9 (emodin-8-O-glucoside), in particular, are bound anthraquinone derivatives (glycosides), which are recognized as the primary active substances responsible for the laxative effect of rhubarb. This suggests that the first principal component is associated with the pharmacological activities of Rh. tanguticum, such as its purgative properties, and that it mainly reflects the level of the combined anthraquinone/dianthrone components related to the purgative effect. The contribution rate of the second principal component reached 24.62 %. Peak 2 (rhein-8-O-glucoside) and Peak 4 had relatively high positive loadings on this component, and both are bound anthraquinone active components. Conversely, Peak 13 (aloe-emodin), Peak 14 (rhein), Peak 16 (emodin), Peak 17 (chrysophanol) and Peak 18 (physcion) had relatively high negative loadings. These five components are free anthraquinones. It can be inferred that the second principal component mainly reflects the ratio of conjugated anthraquinones to free anthraquinones in Rh. tanguticum. Huang et al. (2019) reported that conjugated anthraquinones exhibit a more potent purgative effect than free anthraquinones. This ratio is closely associated with the processing method, the pharmacological activities of conjugated and free anthraquinones, and the clinical application of the medicinal material (for example, whether a mild or a strong laxative effect is desired).
Regarding the third principal component, the common peaks, including Peaks 2, 3 (sennoside B), 4, 14 (rhein) and 16 (emodin), showed higher scores. These components possess excellent antibacterial and antioxidant properties (Hassan et al., 2024; Li, Chen, et al., 2025b; Lu, Qin, et al., 2024a), reflecting the medicinal materials' potential in exerting antibacterial and antioxidant effects.
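The relationship between the eigenvalues and the contribution rates in Table 2 can be checked with simple arithmetic. The sketch below assumes that PCA was run on standardized variables (a correlation matrix), so that the total variance equals the number of variables (18 common peaks); this is an assumption, not something stated in the text. Because the eigenvalues in Table 2 are rounded to two decimals, the recomputed percentages agree with the table only to within about 0.1 percentage points.

```python
# Eigenvalues of the first three components, as listed in Table 2.
n_variables = 18  # number of common peaks (assumed total variance)
initial_eigenvalues = [7.99, 4.09, 2.79]
rotated_eigenvalues = [7.27, 4.43, 3.19]

def contribution_rates(eigenvalues, total_variance):
    """Return (variance %, cumulative %) for each component."""
    variance_pct = [100.0 * ev / total_variance for ev in eigenvalues]
    cumulative_pct = []
    running = 0.0
    for pct in variance_pct:
        running += pct
        cumulative_pct.append(running)
    return variance_pct, cumulative_pct

initial_pct, initial_cum = contribution_rates(initial_eigenvalues, n_variables)
rotated_pct, rotated_cum = contribution_rates(rotated_eigenvalues, n_variables)

for name, pct, cum in [("initial", initial_pct, initial_cum),
                       ("rotated", rotated_pct, rotated_cum)]:
    print(name, [round(p, 2) for p in pct], [round(c, 2) for c in cum])
```

Rotation redistributes variance among the retained components, which is why the per-component percentages differ between the two halves of the table while the cumulative value for three components is the same.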

3.2. Antioxidant activity assay

The results of the antioxidant activity assays are shown in Fig. 4. The ABTS free radical scavenging rate of Rh. tanguticum ranged from 1.681 to 2.484 mM (RSD: 0.002–0.046), the DPPH free radical scavenging rate from 69.57 % to 85.69 % (RSD: 0.13 %–1.14 %), and the FRAP antioxidant capacity from 0.752 to 1.231 mM (RSD: 0.001–0.042). Both the ABTS and DPPH free radical scavenging rates exceeded 60 %, and the FRAP antioxidant capacity values were above 0.700. Samples from different batches demonstrated satisfactory antioxidant activities, ranking at a medium to high level when compared with similar studies (Dai et al., 2022). Notably, the scavenging effects on ABTS and DPPH free radicals were particularly remarkable, outperforming many other rhubarb varieties, such as Rh. officinale. For samples from different origins, the three antioxidant activity indicators generally exhibited similar trends. However, Rh. tanguticum samples from different producing areas and growing environments clearly differed in antioxidant capacity. The antioxidant activity of batches No.1, No.3 and No.8, which were sourced from Dari County, Banma County and Tongren City respectively, was higher than that of samples from other regions. All three areas are located south of the Laji Mountains, and their average altitudes are relatively high. Studies (Yang et al., 2021) have indicated that the effective components of Rh. tanguticum exhibit distinct geographical variations. Specifically, regions characterized by low temperature, significant temperature fluctuations, abundant sunshine, and low precipitation are conducive to the formation and accumulation of anthraquinones and polyphenols. Given that conjugated anthraquinones and polyphenols have good antioxidant activities (Chang et al., 2025), these reports are consistent with the results of our antioxidant activity tests.
It is hypothesized that environmental factors such as altitude and temperature in these areas may potentially account for differences in components related to antioxidant activity, thereby possibly contributing to the relatively strong antioxidant capacity of Rh. tanguticum from these regions.

Fig. 4.

Results of ABTS (a), DPPH (b) and FRAP (c) assays for Rh. tanguticum.

ABTS = 2,2′-azinobis-3-ethylbenzothiazoline-6-sulfonic acid scavenging capacity; DPPH = 2,2-diphenyl-1-picrylhydrazyl scavenging capacity; FRAP = ferric reducing antioxidant power. These abbreviations apply to panels (a), (b), and (c).

One-way analysis of variance (ANOVA) was conducted to analyze the antioxidant activity of the samples from different producing areas (Table S2). The findings revealed that there were significant differences in the antioxidant activity among the samples from different producing areas (p < 0.001), which further indicated that the main antioxidant components of the samples from different producing areas varied. Consequently, it is essential to implement quality control measures for the antioxidant properties of Rh. tanguticum.
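For readers unfamiliar with the test, the F statistic behind a one-way ANOVA can be computed by hand as sketched below. The triplicate DPPH values are hypothetical placeholders (Table S2 is not reproduced here), and the function name `one_way_f` is illustrative. The p-value would then be read from the F distribution with the returned degrees of freedom, for example with `scipy.stats.f_oneway`, which is not called here to keep the sketch dependent on NumPy only.

```python
import numpy as np

def one_way_f(groups):
    """Return (F statistic, df between, df within) for a one-way ANOVA."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    all_values = np.concatenate(groups)
    grand_mean = all_values.mean()
    k, n = len(groups), all_values.size
    # Variation of group means around the grand mean.
    ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
    # Variation of replicates around their own group mean.
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return float(f_stat), df_between, df_within

# Hypothetical triplicate DPPH scavenging rates (%) for three origins.
dpph_by_origin = [
    [85.1, 85.6, 85.4],
    [70.2, 69.8, 70.5],
    [78.0, 77.6, 78.3],
]
f_stat, df_between, df_within = one_way_f(dpph_by_origin)
print(f"F({df_between}, {df_within}) = {f_stat:.1f}")
```

A large F value means that the differences between origins are large relative to the scatter among replicates, which is the situation the reported p < 0.001 describes.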

3.3. Analysis of the spectrum-effect relationship of antioxidant activity

3.3.1. GRA

GRA was applied to quantify the associations between the areas of the 18 common peaks and the three antioxidant capacity indices. The resulting correlation matrix (see Table 3) reveals diverse association patterns between specific phytochemical markers and the free radical scavenging activities.

Table 3.

Correlation between common peaks Yi (i = 0,1, …,18) and antioxidant activity of Rh. tanguticum.

Columns: Peak (i); ABTS clearance rate correlation degree (γ0, i); Rank; DPPH clearance rate correlation degree (γ0, i); Rank; FRAP antioxidant capacity correlation degree (γ0, i); Rank
1 0.857 3 0.862 3 0.855 2
2 0.876 1 0.865 2 0.889 1
3 0.822 7 0.826 6 0.827 6
4 0.761 17 0.761 17 0.766 16
5 0.805 10 0.809 10 0.801 10
6 0.813 8 0.818 7 0.811 8
7 0.845 4 0.853 4 0.842 4
8 0.863 2 0.865 1 0.855 3
9 0.796 12 0.807 11 0.793 13
10 0.762 16 0.769 16 0.765 17
11 0.834 5 0.829 5 0.838 5
12 0.729 18 0.736 18 0.732 18
13 0.804 11 0.806 12 0.811 9
14 0.807 9 0.812 9 0.801 11
15 0.793 13 0.800 13 0.798 12
16 0.824 6 0.816 8 0.814 7
17 0.783 15 0.789 14 0.792 14
18 0.787 14 0.787 15 0.785 15

The grey relational degrees between the area of each common peak and the ABTS, DPPH and FRAP antioxidant capacity indices were all greater than 0.720, indicating a strong correlation between the common peaks and the antioxidant activity and suggesting that the antioxidant activity of this herb results from the combined effect of multiple components rather than from a single substance. In terms of the ABTS scavenging rate, the corresponding components of the common peaks of Rh. tanguticum were ranked in descending order as follows: Peak 2 > Peak 8 > Peak 1 > Peak 7 > Peak 11 > Peak 16 > Peak 3 > Peak 6 > Peak 14 > Peak 5 > Peak 13 > Peak 9 > Peak 15 > Peak 18 > Peak 17 > Peak 10 > Peak 4 > Peak 12. Among these, Peak 2 (rhein-8-O-glucoside), Peak 8 (chrysophanol-8-O-glucoside), Peak 1 (aloe-emodin-8-O-glucoside), Peak 7, Peak 11 (physcion-8-O-glucoside), Peak 16 (emodin), Peak 3 (sennoside B), Peak 6 (sennoside A), Peak 14 (rhein), Peak 5 (sennoside C), and Peak 13 (aloe-emodin) had a relatively significant effect on the ABTS scavenging rate, with correlation degrees (γ0, i) in the range of 0.8 ≤ γ0, i < 0.9. In addition, Peak 9 (emodin-8-O-glucoside), Peak 15, Peak 18 (physcion), Peak 17 (chrysophanol), Peak 10, Peak 4 and Peak 12 also had significant effects on the ABTS scavenging rate (0.7 ≤ γ0, i < 0.8).

The order of the impact of the corresponding components of each common peak on the DPPH radical scavenging rate was as follows: Peak 8 > Peak 2 > Peak 1 > Peak 7 > Peak 11 > Peak 3 > Peak 6 > Peak 16 > Peak 14 > Peak 5 > Peak 9 > Peak 13 > Peak 15 > Peak 17 > Peak 18 > Peak 10 > Peak 4 > Peak 12. Among these, Peak 8 (chrysophanol-8-O-glucoside), Peak 2 (rhein-8-O-glucoside), Peak 1 (aloe-emodin-8-O-glucoside), Peak 7, Peak 11 (physcion-8-O-glucoside), Peak 3 (sennoside B), Peak 6 (sennoside A), Peak 16 (emodin), Peak 14 (rhein), Peak 5 (sennoside C), Peak 9 (emodin-8-O-glucoside), Peak 13 (aloe-emodin) and Peak 15 exerted relatively significant effects on the DPPH free radical scavenging rate (0.8 ≤ γ0, i < 0.9). Moreover, Peak 17 (chrysophanol), Peak 18 (physcion), Peak 10, Peak 4 and Peak 12 had significant effects on the DPPH scavenging rate (0.7 ≤ γ0, i < 0.8).

The effects of the corresponding components of each common peak on the FRAP antioxidant capacity were ranked as follows: Peak 2 > Peak 1 > Peak 8 > Peak 7 > Peak 11 > Peak 3 > Peak 16 > Peak 6 > Peak 13 > Peak 5 > Peak 14 > Peak 15 > Peak 9 > Peak 17 > Peak 18 > Peak 4 > Peak 10 > Peak 12. Among these, Peak 2 (rhein-8-O-glucoside), Peak 1 (aloe-emodin-8-O-glucoside), Peak 8 (chrysophanol-8-O-glucoside), Peak 7, Peak 11 (physcion-8-O-glucoside), Peak 3 (sennoside B), Peak 16 (emodin), Peak 6 (sennoside A), Peak 13 (aloe-emodin), Peak 5 (sennoside C) and Peak 14 (rhein) had a relatively significant effect on the FRAP antioxidant capacity (0.8 ≤ γ0, i < 0.9). In addition, Peak 15, Peak 9 (emodin-8-O-glucoside), Peak 17 (chrysophanol), Peak 18 (physcion), Peak 4, Peak 10 and Peak 12 had significant effects on the FRAP antioxidant capacity (0.7 ≤ γ0, i < 0.8).

A comparison of the correlation results for the three antioxidant activities (Fig. 5(a)) showed that the trends and inflection points of the three indicators were essentially identical, indicating that the common components of Rh. tanguticum influence the three antioxidant activity indicators in a generally consistent way. This uniformity is noteworthy considering the distinct mechanisms underlying each assay: ABTS and DPPH primarily assess free radical scavenging through electron transfer, while FRAP measures reducing power via single-electron donation. The consistency may arise from the dominant phytochemicals in Rh. tanguticum (e.g., anthraquinone glycosides) exerting multifaceted antioxidant effects, capable of both scavenging free radicals (ABTS/DPPH) and participating in electron transfer reactions (FRAP). Among these components, Peaks 2 and 8 exhibited the most significant antioxidant effects, while Peaks 4, 10 and 12 had relatively weaker effects. The antioxidant efficacy of Rh. tanguticum appears intrinsically linked to its conjugated anthraquinone profile. Compounds such as chrysophanol-8-O-glucoside, rhein-8-O-glucoside, aloe-emodin-8-O-glucoside, and physcion-8-O-glucoside played a dominant role in the antioxidant activity. This finding is consistent with previous research, which indicates a significant positive correlation between total anthraquinone glycosides and antioxidant activity. Studies (Huang et al., 2019) have also shown that the purgative effect of glycosylated anthraquinones is superior to that of free aglycones. Such dual functionality underscores a critical processing consideration for Rh. tanguticum: preserving sufficient glycoside levels during extraction and formulation not only optimizes antioxidant yield but also safeguards its traditional laxative properties.
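As an illustration of how grey relational degrees such as those in Table 3 are typically obtained, the following sketch implements a common textbook form of GRA. The mean normalization, the distinguishing coefficient ρ = 0.5, and the function name `grey_relational_degrees` are assumptions made for this example; the exact settings used in the study are not given in this section, and the input numbers are toy values rather than measured data.

```python
import numpy as np

def grey_relational_degrees(reference, comparisons, rho=0.5):
    """Grey relational degree of each comparison series to the reference.

    reference:   activity values across samples, shape (n_samples,)
    comparisons: peak areas, shape (n_peaks, n_samples)
    rho:         distinguishing coefficient (0.5 is a common default)
    """
    x0 = np.asarray(reference, dtype=float)
    X = np.asarray(comparisons, dtype=float)
    # Make the series dimensionless by dividing each by its own mean.
    x0 = x0 / x0.mean()
    X = X / X.mean(axis=1, keepdims=True)
    delta = np.abs(X - x0)
    d_min, d_max = delta.min(), delta.max()
    # Grey relational coefficient at every sample, then average per peak.
    xi = (d_min + rho * d_max) / (delta + rho * d_max)
    return xi.mean(axis=1)

# Toy data: the first series is proportional to the reference,
# the second runs in the opposite direction.
reference = [1.0, 2.0, 3.0, 4.0]
comparisons = [[2.0, 4.0, 6.0, 8.0],
               [4.0, 3.0, 2.0, 1.0]]
degrees = grey_relational_degrees(reference, comparisons)
print(degrees)
```

A series that follows the reference exactly receives a degree of 1, and the degree falls as the normalized profiles diverge, which is why the values in Table 3 can be ranked directly.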

Fig. 5.

Spectra-effect relationship analysis of antioxidant activity in Rh. tanguticum.

Comparison of the effects of common components on antioxidant activity in Rh. tanguticum (a). Importance analysis of each common peak for ABTS (b), DPPH (c), FRAP (d) and standardized regression coefficient of ABTS (e), DPPH (f), FRAP (g). Correlation analysis (h).

Note:* p ≤ 0.05; ** p ≤ 0.01; *** p ≤ 0.001.

3.3.2. PLSR

PLSR analysis was performed on the areas of the 18 common peaks and the three antioxidant indices, and the results are shown in Fig. 5(b–g). The VIP values and partial regression coefficients obtained for the fingerprint data of the 18 batches of Rh. tanguticum were compared across the three antioxidant indices. The VIP values of Peak 2 (rhein-8-O-glucoside), Peak 4, Peak 17 (chrysophanol) and Peak 18 (physcion) were all greater than 1, indicating that the substances corresponding to these four common peaks had an important influence on the in vitro antioxidant effect of Rh. tanguticum. Taking the regression coefficients and the VIP values together, Peak 2 and Peak 4 were positively correlated with the three antioxidant indices and had VIP values greater than 1, suggesting that these two components played a more prominent role in promoting the antioxidant activity of Rh. tanguticum. The inherent variability in the chemical composition of Rh. tanguticum samples collected from different regions introduces a degree of heterogeneity, which is reflected in the larger error bars observed in the PLSR VIP importance plot.
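The VIP criterion used above can be made concrete with a small sketch. The code below fits a single-response PLS model with the NIPALS algorithm and computes VIP scores from the weights and the response variance explained by each component. The data are synthetic, the autoscaling step and the choice of two components are assumptions, and `pls1_vip` is a hypothetical helper rather than the authors' implementation.

```python
import numpy as np

def pls1_vip(X, y, n_components=2):
    """Fit PLS1 by NIPALS and return one VIP score per column of X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Autoscale predictors and response (an assumption for this sketch).
    X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    y = (y - y.mean()) / y.std(ddof=1)
    n_vars = X.shape[1]
    weights = np.zeros((n_vars, n_components))
    explained = np.zeros(n_components)  # response sum of squares per component
    for a in range(n_components):
        w = X.T @ y
        w /= np.linalg.norm(w)      # unit-length weight vector
        t = X @ w                   # scores
        tt = t @ t
        p = X.T @ t / tt            # X loadings
        q = (y @ t) / tt            # y loading
        X = X - np.outer(t, p)      # deflate X
        y = y - q * t               # deflate y
        weights[:, a] = w
        explained[a] = q ** 2 * tt
    return np.sqrt(n_vars * (weights ** 2 @ explained) / explained.sum())

# Synthetic example: 18 samples x 18 peaks; only the peak at index 1 drives activity.
rng = np.random.default_rng(1)
areas = rng.normal(size=(18, 18))
activity = 2.0 * areas[:, 1] + rng.normal(scale=0.1, size=18)

vip = pls1_vip(areas, activity, n_components=2)
important_peaks = np.flatnonzero(vip > 1) + 1  # 1-based peak numbers
print(important_peaks)
```

Because the squared VIP scores average to 1 by construction, the threshold VIP > 1 simply selects variables that contribute more than the average variable.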

3.3.3. Pearson correlation

Pearson correlation analysis was conducted using Origin 2021 software, and the results are illustrated in Fig. 5(h). Peak 2 (rhein-8-O-glucoside) was significantly positively correlated with ABTS free radical scavenging activity, DPPH free radical scavenging activity and FRAP antioxidant capacity (p ≤ 0.01). Peak 4 was significantly positively correlated with DPPH free radical scavenging activity and FRAP antioxidant capacity (p ≤ 0.01), while Peak 18 (physcion) was significantly negatively correlated with FRAP antioxidant capacity (p ≤ 0.05). After p-value correction, only Peak 2 retained a significant positive correlation with the ABTS, DPPH and FRAP indices (corrected p ≤ 0.05), as shown in Table S3. This indicates that, among the correlations between peak areas and antioxidant indicators, only that of Peak 2 remains statistically significant for ABTS, DPPH and FRAP, suggesting that this peak may play an important role in antioxidant activity. These results are in accordance with those of the PLSR analysis in Section 3.3.2. The chemical structure of Peak 2 (rhein-8-O-glucoside) contains four phenolic hydroxyl groups and a carboxyl group, which may be one of the contributing factors to its relatively greater impact on the antioxidant activity.
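The effect of correcting p-values for multiple comparisons, as mentioned above, can be illustrated as follows. The text does not state which correction was applied, so the Benjamini-Hochberg procedure is used here purely as an assumption, and the raw p-values are hypothetical rather than taken from Table S3.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values for four peak/assay correlations.
raw_p = [0.001, 0.02, 0.04, 0.30]
adjusted_p = benjamini_hochberg(raw_p)
for raw, adj in zip(raw_p, adjusted_p):
    print(f"raw p = {raw:.3f} -> adjusted p = {adj:.4f}")
```

In this toy case the third correlation is significant at 0.05 before correction but not after, which mirrors how a peak can lose significance once multiple comparisons are taken into account.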

3.4. Rapid evaluation of antioxidant activity based on NIR spectroscopy

3.4.1. Spectral characteristics

The NIR spectra of the 18 batches of Rh. tanguticum samples are shown in Fig. 6. Six absorption peaks were identified, located at 8370 cm−1, 6337 cm−1, 5886 cm−1, 4825 cm−1, 4400 cm−1 and 4277 cm−1. The second overtone of CH2 (anthraquinone nucleus and glycosyl methyl/methylene) was observed near 8370 cm−1. The first overtone of O-H (glycoside hydroxyl groups and free anthraquinone phenolic hydroxyl groups) was found near 6337 cm−1. The first overtone of the aromatic C-H of the anthraquinone ring was at 5886 cm−1. Near 4825 cm−1, there was a combination band of O-H (from the hydroxyl groups of glycosides and polysaccharides). The combination band of O-H and C-O stretching vibrations, characteristic of conjugated anthraquinone glycosidic bonds, was observed near 4400 cm−1. The second overtone of the C-H bending vibration of polysaccharides (related to cellulose or starch components) occurred at 4277 cm−1. Together, these characteristic peaks reflect the chemical structure of the anthraquinones in rhubarb as well as the distribution of polysaccharides.

Fig. 6.

Original NIR spectrum of Rh. tanguticum.

3.4.2. Establishment of NIR spectroscopy evaluation model

3.4.2.1. Sample set division

In TQ Analyst, the sample set was divided by the concentration gradient method, whereas in the Python machine learning workflow it was divided randomly. The ratio of calibration set to prediction set was optimized separately for each approach (see Table 4). With the gradient division in TQ Analyst, the best modeling performance for ABTS and DPPH was achieved with a 3:1 division of the sample set, while for FRAP the best performance was observed with a 5:1 division. With the random division in Python, a 4:1 ratio was optimal for all three antioxidant indices (ABTS, DPPH, and FRAP).
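The two ways of dividing the sample set can be sketched as below. Both functions are illustrative assumptions: `gradient_split` is one simple interpretation of a concentration gradient division (sort by reference value and send every (ratio + 1)-th sample to the prediction set, keeping the extremes for calibration), not the internal TQ Analyst procedure, and the 20 FRAP values are hypothetical.

```python
import random

def random_split(n_samples, ratio=4, seed=0):
    """Randomly assign sample indices to calibration and prediction sets (ratio:1)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_prediction = round(n_samples / (ratio + 1))
    return sorted(indices[n_prediction:]), sorted(indices[:n_prediction])

def gradient_split(values, ratio=3):
    """Sort by reference value and take every (ratio + 1)-th sample for prediction."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    prediction = set(order[ratio // 2 :: ratio + 1])
    calibration = [i for i in order if i not in prediction]
    return sorted(calibration), sorted(prediction)

# Hypothetical FRAP reference values for 20 spectra.
frap = [0.75 + 0.025 * i for i in range(20)]

cal_random, pred_random = random_split(len(frap), ratio=4)
cal_gradient, pred_gradient = gradient_split(frap, ratio=3)
print(len(cal_random), len(pred_random))      # 16 4
print(len(cal_gradient), len(pred_gradient))  # 15 5
```

The gradient division guarantees that the prediction samples span the calibration range, while the random division depends on the seed, which is one reason the two approaches can favor different ratios.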

Table 4.

Results of sample set optimization.

Rval at each calibration:prediction ratio
Columns: Modeling software; Antioxidant indicator; 2:1; 3:1; 4:1; 5:1
TQ Analyst ABTS 0.365 0.400 0.153 0.256
TQ Analyst DPPH 0.704 0.758 0.679 0.543
TQ Analyst FRAP 0.776 0.714 0.665 0.816
Python ABTS 0.567 0.729 0.901 0.846
Python DPPH 0.880 0.766 0.892 0.765
Python FRAP 0.826 0.855 0.919 0.909

3.4.2.2. Remove outliers

The modeling spectra for the three antioxidant index models were screened for abnormal spectra. The Mahalanobis distances (MD) and principal component scores of the spectral data are shown in Fig. S1. According to the MD method, the MD values of all modeling spectra were less than 1.8, indicating that no abnormal spectra were detected by this method. The PCA score diagnosis showed that all samples were within the 95 % confidence interval. Therefore, based on both methods, the near-infrared spectra of all samples were deemed normal, and none required elimination.
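A minimal sketch of Mahalanobis-distance screening on principal component scores is given below. The spectra are synthetic with one planted anomaly, the use of three components is an assumption, and `mahalanobis_on_scores` is a hypothetical helper. The 1.8 limit quoted in the text is not applied here, because it is unclear from this section how the software scales its distances.

```python
import numpy as np

def mahalanobis_on_scores(spectra, n_components=3):
    """Mahalanobis distance of each spectrum in PCA score space."""
    X = np.asarray(spectra, dtype=float)
    Xc = X - X.mean(axis=0)
    # PCA via singular value decomposition; scores = U * S.
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    inv_cov = np.linalg.inv(np.cov(scores, rowvar=False))
    centered = scores - scores.mean(axis=0)
    return np.sqrt(np.einsum("ij,jk,ik->i", centered, inv_cov, centered))

# Synthetic spectra: one broad band with random intensity plus small noise.
rng = np.random.default_rng(2)
wavenumbers = np.linspace(4000, 10000, 60)
base = np.exp(-((wavenumbers - 6300) / 800) ** 2)
spectra = base * rng.normal(1.0, 0.05, size=(30, 1)) + rng.normal(0, 0.002, size=(30, 60))
# Plant an extra band in spectrum 12 to act as an abnormal spectrum.
spectra[12] += 0.5 * np.exp(-((wavenumbers - 8400) / 300) ** 2)

md = mahalanobis_on_scores(spectra)
print(int(np.argmax(md)), round(float(md.max()), 2))
```

In practice a spectrum is flagged when its distance exceeds a chosen limit; in the study no spectrum exceeded the limit, so none was removed.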

3.4.2.3. Modeling band selection

When the model is established, the original spectrum encompasses all of the measured wavenumbers. However, the full-spectrum model contains redundant information, which may have a negative impact on the predictive capability of the model. Wavelength selection has been shown to enhance the performance of calibration models (Liu et al., 2025). Therefore, the VIP method was used to select the spectral bands of the modeling set, and the results are shown in Fig. S2.

In the modeling process, the full band (10,000–4000 cm−1) was modeled first, and then the bands with VIP > 1 (5254–4000 cm−1 and 7074–5651 cm−1) were selected for band optimization. Both software packages were used to optimize the modeling interval of each index under the determined pretreatment and modeling methods (see Table 5). In terms of Rval, the full-band models of ABTS, DPPH and FRAP generally performed better than the VIP > 1 models under both software packages; the only exception was the TQ Analyst DPPH model, for which the VIP > 1 bands gave a slightly higher Rval (0.783 vs. 0.761).

Table 5.

Optimization results of band selection method.

Columns: Modeling software; Antioxidant indicator; Full range (RMSEC, Rcal, RMSEP, Rval); VIP > 1 bands (RMSEC, Rcal, RMSEP, Rval)
TQ Analyst ABTS 0.295 0.511 0.309 0.400 0.295 0.511 0.310 0.394
TQ Analyst DPPH 0.100 0.681 0.085 0.761 0.097 0.695 0.085 0.783
TQ Analyst FRAP 0.135 0.685 0.113 0.776 0.130 0.734 0.138 0.633
Python ABTS 0.108 0.950 0.142 0.901 0.100 0.944 0.221 0.798
Python DPPH 0.034 0.970 0.055 0.892 0.010 0.997 0.092 0.758
Python FRAP 0.047 0.964 0.081 0.919 0.040 0.971 0.111 0.742

3.4.3. Spectral preprocessing and modeling method selection

The NIR spectral data, together with the measured ABTS, DPPH and FRAP values, were imported into TQ Analyst and into the Python machine learning workflow. The optimized modeling bands and sample set ratios were then used for model construction. The results obtained with different modeling software, pretreatment methods and modeling methods were compared to identify the best pretreatment and modeling methods, and the results are shown in Table 6.

Table 6.

Optimization results of NIR spectroscopy modeling for antioxidant activity.

Columns: Antioxidant indicator; Modeling software; Model set ratio; Modeling method; Preprocessing method; Modeling band (cm−1); RMSEC; Rcal; RMSEP; Rval; RMSECV; Rcv; RPD; External validation prediction rate
ABTS TQ Analyst 3:1 PLS SNV 10,000–4000 0.295 0.511 0.309 0.400 0.329 0.337 1.118 68.92 %
ABTS Python 4:1 PLS 1D + MSC 10,000–4000 0.108 0.950 0.142 0.901 0.362 0.812 2.433 87.03 %
DPPH TQ Analyst 2:1 PCR 1D + SG smoothing 10,000–4000 0.100 0.681 0.085 0.761 0.107 0.618 1.684 74.35 %
DPPH Python 4:1 PLS 1D + MSC 10,000–4000 0.034 0.970 0.055 0.892 0.217 0.873 2.634 83.88 %
FRAP TQ Analyst 2:1 PLS MSC 10,000–4000 0.135 0.685 0.113 0.776 0.151 0.588 1.745 80.31 %
FRAP Python 4:1 PLS 1D + MSC 10,000–4000 0.047 0.964 0.081 0.919 0.190 0.720 2.434 88.30 %
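The pretreatment methods named in Table 6 (SNV, MSC and the first derivative, 1D) can be sketched with NumPy as follows. This is a minimal illustration on synthetic spectra, not the authors' pipeline: the order "derivative first, then MSC" for the 1D + MSC combination, the use of the mean spectrum as the MSC reference, and the simple finite-difference derivative are all assumptions.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: center and scale each spectrum individually."""
    X = np.asarray(spectra, dtype=float)
    return (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, ddof=1, keepdims=True)

def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference spectrum."""
    X = np.asarray(spectra, dtype=float)
    ref = X.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(X)
    for i, row in enumerate(X):
        # Regress each spectrum on the reference, then remove offset and gain.
        slope, intercept = np.polyfit(ref, row, deg=1)
        corrected[i] = (row - intercept) / slope
    return corrected

def first_derivative(spectra, spacing=1.0):
    """First derivative along the wavenumber axis by finite differences."""
    return np.gradient(np.asarray(spectra, dtype=float), spacing, axis=1)

# Synthetic spectra: the same band shape with random gain and offset (scatter).
rng = np.random.default_rng(3)
x = np.linspace(0, 1, 100)
band = np.exp(-((x - 0.5) / 0.1) ** 2)
gain = rng.uniform(0.8, 1.2, size=(5, 1))
offset = rng.uniform(-0.1, 0.1, size=(5, 1))
raw = gain * band + offset

processed = msc(first_derivative(raw))  # "1D + MSC" under the assumed order
print(np.allclose(processed, processed[0]))
```

In this toy case the derivative removes the additive offsets and MSC removes the multiplicative gains, so all processed spectra coincide; with real spectra the remaining differences carry the chemical information used for modeling.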

With machine learning modeling in Python, the best model for the ABTS antioxidant index was the PLS model built on NIR spectra pretreated with the 1D + MSC method, as shown in Fig. 7(a). This model showed low RMSEC (0.108) and RMSEP (0.142) values, along with high Rcal (0.950) and Rval (0.901) values. Cross-validation yielded an RMSECV of 0.362 and an Rcv of 0.812. With an RPD value of 2.433 and an external validation prediction rate of 87.03 %, these results indicate that the model has good predictive ability and robustness.

Fig. 7.

Optimal modeling results obtained with Python for ABTS (a), DPPH (b) and FRAP (c), and with TQ Analyst for ABTS (d), DPPH (e) and FRAP (f).

Similarly, based on the machine learning modeling in Python software, for the DPPH antioxidant index, the model established by PLS method after pretreatment with 1D + MSC technique exhibited the optimal performance, as shown in Fig. 7 (b). The model has lower RMSEC (0.034) and RMSEP (0.055), and higher Rcal (0.970) and Rval (0.892). The cross-validation results yielded an RMSECV of 0.217 and an Rcv of 0.873. The RPD value was 2.634, and the external validation prediction rate was 83.88 %, which further confirmed the accuracy and reliability of the model.

For the FRAP antioxidant index, under machine learning modeling in Python software, the model established by PLS method using 1D + MSC pretreated spectrum has the best effect, as shown in Fig. 7 (c). The model showed lower RMSEC (0.047) and RMSEP (0.081), and higher Rcal (0.964) and Rval (0.919). The cross-validation results yielded an RMSECV of 0.190 and an Rcv of 0.720. The RPD value was 2.434, and the external validation prediction rate still reached 88.30 %.
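The performance figures quoted above (RMSE, correlation coefficient R and RPD) can be computed as in the sketch below. The definition of RPD as the sample standard deviation of the reference values divided by the RMSE of prediction is a common convention and is assumed here; the five reference and predicted values are hypothetical.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def correlation(y_true, y_pred):
    """Pearson correlation coefficient R between reference and predicted values."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])

def rpd(y_true, y_pred):
    """Ratio of the reference standard deviation to the prediction error."""
    return float(np.std(y_true, ddof=1)) / rmse(y_true, y_pred)

# Hypothetical reference and predicted values for a prediction set.
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]

model_rmse = rmse(reference, predicted)
model_r = correlation(reference, predicted)
model_rpd = rpd(reference, predicted)
print(round(model_rmse, 3), round(model_r, 3), round(model_rpd, 2))
```

The same functions apply to the calibration set (RMSEC, Rcal), the prediction set (RMSEP, Rval) and cross-validation (RMSECV, Rcv); only the data passed in change.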

The average external validation prediction rate for each indicator was above 80 %, and the prediction rates of the ABTS and FRAP models were both above 85 %, indicating that these models are capable of making accurate predictions for unknown samples. Since NIR spectroscopy comprehensively reflects the chemical composition and structural information of the samples, it enables accurate prediction of the antioxidant activity. Because PLS effectively balances information retention and model complexity, it exhibits stronger prediction robustness (Einarson et al., 2021). When dividing the samples into calibration and prediction sets, the Python workflow does not screen and divide the samples in descending order of antioxidant activity values; instead, it divides them randomly, which may be the primary reason for the differences between the results obtained with the two modeling programs.

4. Conclusions

In this study, HPLC was employed to establish the chromatographic fingerprint of Rh. tanguticum from different producing areas, and the common peaks were assigned to clarify the chemical components they represent. Cluster analysis showed that the sampling sites fell into northern and southern regions divided by the Laji Mountains, with Qilian County and Huangzhong County located in the north and Banma County, Zeku County and Tongren County in the south. The spectrum-effect relationship analysis revealed that the antioxidant activity of Rh. tanguticum was the result of a combined effect of multiple components. The grey relational values of Peak 8 (chrysophanol-8-O-glucoside), Peak 2 (rhein-8-O-glucoside), Peak 1 (aloe-emodin-8-O-glucoside), Peak 7, Peak 11 (physcion-8-O-glucoside), Peak 3 (sennoside B), Peak 6 (sennoside A), Peak 16 (emodin), Peak 14 (rhein), Peak 5 (sennoside C), Peak 9 (emodin-8-O-glucoside) and Peak 13 (aloe-emodin) were relatively high. This indicates that the combined effect of conjugated anthraquinone glycosides and free anthraquinones is the primary contributor to the antioxidant activity. The PLSR analysis showed that the VIP values of Peak 2 (rhein-8-O-glucoside), Peak 4, Peak 17 (chrysophanol) and Peak 18 (physcion) were greater than 1, and the Pearson correlation analysis showed that Peak 2 (rhein-8-O-glucoside) was significantly positively correlated with the ABTS free radical scavenging activity, the DPPH free radical scavenging activity and the FRAP antioxidant capacity (p ≤ 0.05). These results show that these components carry a high weight in explaining the antioxidant activity and can be considered priority markers for quality control purposes.

In the quantitative detection model of antioxidant activity established by NIR, the three antioxidant indices were best modeled through machine learning in Python software. The optimal ABTS model (with an RPD of 2.433), the optimal DPPH model (with an RPD of 2.634), and the optimal FRAP model (with an RPD of 2.434) are capable of achieving rapid and accurate evaluation of the in vitro antioxidant activity of Rh. tanguticum. The NIR models combined with machine learning algorithms offer high efficiency and accuracy, with prediction rates above 83 % and robust RPD values, significantly reducing the time and cost associated with traditional in vitro antioxidant assays. The non-destructive and rapid nature of NIR spectroscopy, which allows for immediate results with minimal sample preparation, holds promise for large-scale screening applications. Moreover, the versatility of these models suggests they may be applicable to other herbal materials and functional foods, potentially offering theoretical support for the development of standardized quality control protocols and quality improvement strategies across various industries, subject to further real-world validation.

This study provides a scientific foundation for the improvement of quality standards and clinical application of Rh. tanguticum. In the future, it is feasible to further explore the molecular mechanisms underlying the relationship between the chemical composition and antioxidant activity. Optimizing the modeling strategies could involve expanding the sample size, incorporating samples from more diverse locations, and employing advanced machine learning techniques, such as deep learning, to enhance model performance. Future work may also focus on refining the geographic scope, enriching the dataset, and incorporating biological validation to further strengthen the findings.

CRediT authorship contribution statement

Xiaoming Song: Writing – original draft, Visualization, Validation, Investigation, Formal analysis, Data curation. Dan Feng: Writing – original draft, Visualization, Formal analysis. Jiamin Li: Investigation. Liyan Zang: Methodology. Hongmei Li: Supervision. Jing Sun: Writing – review & editing, Resources, Project administration, Funding acquisition, Conceptualization.

Funding

This study was funded by the National Natural Science Foundation of China (Grant No. 32270402), the Central Forestry and Grassland Ecological Protection and Restoration Fund Project of 2023 (Grant No. 463, Qinglinbao [2024]), and the Qinghai Innovation Platform Construction Project (2022–2024).

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Footnotes

Appendix A

Supplementary data to this article can be found online at https://doi.org/10.1016/j.fochx.2025.103019.

Data availability

The data utilized in this study can be provided upon request to the corresponding author. These data are not currently stored in publicly accessible databases.

References

  1. Bai S., Luo D., Zhong G., Yang S., Ouyang H., Rao X., et al. Phytochemical analysis. PCA; 2023. Exploration of plant metabolomics variation and absorption characteristics of water-extracted Rheum tanguticum and ethanol-extracted Rheum tanguticum by UHPLC-Q-TOF-MS/MS. [DOI] [PubMed] [Google Scholar]
  2. Cai Y., Zhao F., Hu P., Jun L., Wang X., Guo L., et al. The stir-frying process of Citrus reticulata ‘Chachi’: Process optimization, physical and chemical properties, and spectrum-activity relationship. LWT. 2024;212:116968. doi: 10.1016/J.LWT.2024.116968. [DOI] [Google Scholar]
  3. Cao X., Ren X., Wang M., Wang L., Deng Y. Research progress on quality standards of specifications and grades of Chinese medicinal materials and decoction pieces. Journal of Chinese Medicinal Materials. 2021;44(02):490–494. doi: 10.13863/j.issn1001-4454.2021.02.044. [DOI] [Google Scholar]
  4. Chang Y., Hou N., Fei J., Zhi Q., Yu L., Zhang Z., et al. Uncovering phenolic profiles of different forms in safflower seeds and their antioxidant capacity, and biological activity. Journal of Food Science. 2025;90(3) doi: 10.1111/1750-3841.70025. [DOI] [PubMed] [Google Scholar]
  5. Chen T., Liu Y., Zou D., Chen C., You J., Zhou G., et al. Application of an efficient strategy based on liquid-liquid extraction, high-speed counter-current chromatography, and preparative HPLC for the rapid enrichment, separation, and purification of four anthraquinones from Rheum tanguticum. Journal of Separation Science. 2014;37(1–2):165–170. doi: 10.1002/jssc.201300648. [DOI] [PubMed] [Google Scholar]
  6. Choi J.Y., Kang S., Tran M.N., Lee S., Ryu S.M., Chae S.W., et al. Antiepileptic and neuroprotective effects of Rheum tanguticum root extract on Trimethyltin-induced epilepsy and neurodegeneration: In vivo and in silico analyses. Journal of Integrative Neuroscience. 2024;23, 6:122. doi: 10.31083/J.JIN2306122. [DOI] [PubMed] [Google Scholar]
  7. Commission., N. P . vol. I. China Medical Science and Technology Publishing House; Beijing: 2020. Pharmacopoeia of the People's Republic of China. [Google Scholar]
  8. Dai L., Miao X., Yang X., Zuo L., Lan Z., Li B., et al. High value-added application of two renewable sources as healthy food: The nutritional properties, chemical compositions, antioxidant, and antiinflammatory activities of the stalks of Rheum officinale Baill. and Rheum tanguticum Maxim. ex. Frontiers in Nutrition. 2022;8:770264. doi: 10.3389/FNUT.2021.770264.
  9. Einarson K.A., Baum A., Olsen T.B., Larsen J., Armagan I., Santacoloma P.A., Clemmensen L.K., et al. Predicting pectin performance strength using near-infrared spectroscopic data: A comparative evaluation of 1-D convolutional neural network, partial least squares, and ridge regression modeling. Journal of Chemometrics. 2021;36(2). doi: 10.1002/CEM.3348.
  10. Hassan H.M., Hamdan A.M., Alattar A., Alshaman R., Bahattab O., Gayyar M.M.H.A. Evaluating anticancer activity of emodin by enhancing antioxidant activities and affecting PKC/ADAMTS4 pathway in thioacetamide-induced hepatocellular carcinoma in rats. Redox Report: Communications in Free Radical Research. 2024;29(1):2365590. doi: 10.1080/13510002.2024.2365590.
  11. Hoseini S.A., Vazifedoost M., Hajirostamloo B., Didar Z., Nematshahi M.M. Supercritical fluid extraction and encapsulation of Rivas (Rheum ribes) flower: Principal component analysis (PCA). Heliyon. 2025;11(2):e41746. doi: 10.1016/J.HELIYON.2025.E41746.
  12. Huang F., Yin X., Tang G., Lian Y., Liu X., Xu X., et al. Research on contents of anthraquinones, dianthrones and tannins in Rheum tanguticum on PCA and CA. China Journal of Chinese Materia Medica. 2019;44(05):920–926. doi: 10.19540/j.cnki.cjcmm.20181226.009.
  13. Iqbal N., Kibtia M., Khoder R.M., Javed M., Khalifa I., Kanwal R., et al. Valorization of peanut shell polyphenols as natural antioxidants for preserving silver carp mince during refrigerated storage. Food Chemistry. 2025;493(Pt 1) doi: 10.1016/J.FOODCHEM.2025.145762.
  14. Jing Z., Ping Z., Yudi X., Feng J., Xin Z., Huaiyou W., et al. Metabolic profile and dynamic characteristic of rhubarb during the vitro biotransformation by human gut microbiota. Food Chemistry. 2022;397:133840. doi: 10.1016/J.FOODCHEM.2022.133840.
  15. Leonardo S.R.J., Maria B.S., Renan R.d.A., Dib B.M., Kátia P.S., Alan S.L. Metabolic syndrome and cardiovascular diseases: Going beyond traditional risk factors. Diabetes/Metabolism Research and Reviews. 2021;38(3):e3502. doi: 10.1002/DMRR.3502.
  16. Li G., Li J., Liu H., Wang Y. Geographic traceability of Gastrodia elata Blum based on combination of NIRS and chemometrics. Food Chemistry. 2025;464(Pt 1):141529. doi: 10.1016/J.FOODCHEM.2024.141529.
  17. Li Y., Chen Z., Xu Z., Wu J., Yang D., Yu Y., et al. TMT-based quantitative proteomics and non-targeted metabolomics reveal the antibacterial mechanism of rhein against Bacillus cereus and its potential application as a food preservative. LWT. 2025;218:117487. doi: 10.1016/J.LWT.2025.117487.
  18. Liang Y., Cao X., Mao M., Guo Y. Analysis of bioactive compounds, antioxidant and antibacterial activity of different polar parts of rhubarb. Clinical Complementary Medicine and Pharmacology. 2024;48(04):388–397. doi: 10.16466/j.issn1005-5509.2024.04.002.
  19. Liu S., Liu H., Li J., Wang Y. Artificial and algorithmic screening of infrared spectral feature bands of Gastrodia elata to achieve rapid identification of its species. Journal of Chemometrics. 2025;39(1):e3641. doi: 10.1002/CEM.3641.
  20. Lu L., Qin T., Chen K., Xie J., Pan L., Xi B. Enhancing the antioxidant activity by the combination use of resveratrol and emodin. Russian Journal of Bioorganic Chemistry. 2024;50(4):1466–1475. doi: 10.1134/S1068162024040319.
  21. Lu W., Chen T., Shen C., Zou D., Luo J., Wang S., et al. Rapid screening of potential α-glucosidase inhibitors from the waste leaves of Rheum tanguticum by activity-oriented extraction and enrichment optimization, UPLC-QTOF-MS/MS, molecular docking and in vitro validation. Microchemical Journal. 2024;202:110687. doi: 10.1016/J.MICROC.2024.110687.
  22. Ma L., Peng Y., Pei Y., Zeng J., Shen H., Cao J., et al. Systematic discovery about NIR spectral assignment from chemical structural property to natural chemical compounds. Scientific Reports. 2019;9(1):1–17. doi: 10.1038/s41598-019-45945-y.
  23. Nguyen P.N., Nguyen N.L., Pham H.N., Nguyen P.K., Thi K.O.N. Antioxidant and antibacterial activities of Pteris vittata L. extracts from metalliferous soils: Correlations with phenolic compounds. Natural Product Communications. 2025;20(2). doi: 10.1177/1934578X251322868.
  24. Shreyasi H., Suchandra D., Layla K.K. Evaluation of phytochemical content and in vitro antioxidant properties of methanol extract of Allium cepa, Carica papaya and Cucurbita maxima blossoms. Food Chemistry Advances. 2022;1 doi: 10.1016/J.FOCHA.2022.100104.
  25. Xu M., Yu Y., Xu H., Li M. Identification of active substances in pear paste for the treatment of cough in mice using spectrum-effect relationship study. Journal of Functional Foods. 2024;122:106491. doi: 10.1016/j.jff.2024.106491.
  26. Yang F., Ran J., Liu H., Song J., Xie C. Geographic variation of functional components and the climatic response characteristics of Rheum tanguticum Maxim. ex Balf. Acta Ecologica Sinica. 2021;41(09):3645–3655. doi: 10.5846/stxb201912312854.
  27. Yang H., Yang T., Gong D., Li X., Sun G., Guo P. A trinity fingerprint evaluation system of traditional Chinese medicine. Journal of Chromatography A. 2022;1673:463118. doi: 10.1016/J.CHROMA.2022.463118.
  28. Yang T., Zhao X., Sun Q., Zhang Y., Xie J. Elucidating the anti-inflammatory activity of platycodins in lung inflammation through pulmonary distribution dynamics and grey relational analysis of cytokines. Journal of Ethnopharmacology. 2024;323:117706. doi: 10.1016/J.JEP.2024.117706.
  29. Yuan H., Luo J., Lu M., Jiang S., Qiu Y., Tian X., et al. An integrated approach to Q-marker discovery and quality assessment of edible Chrysanthemum flowers based on chromatogram–effect relationship and bioinformatics analyses. Industrial Crops and Products. 2022;188(Pt B). doi: 10.1016/J.INDCROP.2022.115745.
  30. Zhang X., Wang L., Li R., Wang L., Fu Z., He F., et al. Identification strategy of Fructus Gardeniae and its adulterant based on UHPLC/Q-orbitrap-MS and UHPLC-QTRAP-MS/MS combined with PLS regression model. Talanta. 2024;267:125136. doi: 10.1016/J.TALANTA.2023.125136.
  31. Zhong L., Sun J., Li S., Qi Y., Luo M., Dong L., et al. Scorch processing of rhubarb (Rheum tanguticum Maxim. ex Balf.) pyrolyzed anthraquinone glucosides into aglycones and improved the therapeutic effects on thromboinflammation via regulating the complement and coagulation cascades pathway. Journal of Ethnopharmacology. 2024;333:118475. doi: 10.1016/J.JEP.2024.118475.
  32. Zhuang T., Gu X., Zhou N., Ding L., Yang L., Zhou M. Hepatoprotection and hepatotoxicity of Chinese herb Rhubarb (Dahuang): How to properly control the “General (Jiang Jun)” in Chinese medical herb. Biomedicine & Pharmacotherapy. 2020;127 doi: 10.1016/j.biopha.2020.110224.

Associated Data

Supplementary Materials

Supplementary material

mmc1.docx (136.3KB, docx)

Data Availability Statement

The data utilized in this study can be provided upon request to the corresponding author. These data are not currently stored in publicly accessible databases.


Articles from Food Chemistry: X are provided here courtesy of Elsevier
