Candexch algorithm-enhanced chemometric determination of a novel anti-COVID-19 therapeutics in plasma and paxlovid formulation using advanced multivariate modeling: a sustainability-centered bioanalytical approach

Ahmed Emad F Abbas; Nisreen F Abo Talib; Mohamed R Elghobashy; Omkulthom Al kamaly; Michael K Halim

doi:10.1186/s13065-026-01788-z

. 2026 Apr 13;20(1):88. doi: 10.1186/s13065-026-01788-z

PMCID: PMC13085575 PMID: 41975461

Abstract

This work reports the development of an algorithm-assisted chemometric spectrophotometric method for the concurrent quantification of the anti-COVID-19 therapeutics nirmatrelvir, ritonavir, and the active molnupiravir metabolite N4-hydroxycytidine in pharmaceutical formulations and human plasma. A structured fractional five-level factorial calibration design consisting of 25 mixtures was employed to construct the calibration dataset, while the external validation set was generated using D-optimal sample selection via the Candexch algorithm to ensure uniform coverage of the experimental domain and minimize sampling bias relative to random dataset partitioning. Quantitative modeling was performed using four multivariate regression strategies: Principal Component Regression (PCR), Genetic Algorithm-assisted Partial Least Squares (GA-PLS), Firefly Algorithm-assisted Partial Least Squares (FA-PLS), and Multivariate Curve Resolution–Alternating Least Squares (MCR-ALS). Model optimization, including latent variable selection, wavelength selection, and parameter tuning, was performed exclusively using the calibration dataset through internal cross-validation (LOO-CV) based on minimum RMSECV, while the external validation set was kept completely independent and used only for final prediction. Among the assessed models, the MCR-ALS algorithm demonstrated the best overall predictive performance, yielding correlation coefficients exceeding 0.9997 and root mean square prediction errors ranging from 0.076 to 0.213 µg mL⁻¹. NAS-based sensitivity assessment produced detection limits between 0.109 and 0.876 µg mL⁻¹, demonstrating adequate sensitivity within the investigated concentration ranges. Matrix-matched validation employing 25 calibration and 13 external validation mixtures prepared in fortified human plasma confirmed predictive robustness across both plasma and Paxlovid® dosage matrices.
Multidimensional sustainability appraisal revealed favorable environmental and operational attributes. The method satisfied all National Environmental Methods Index criteria, achieved a Greenness Evaluation Metric for Analytical Methods score of 7.502, and displayed a calculated carbon footprint of 0.021 kg CO₂/sample. Complementary operational and innovation assessments yielded Blue Applicability Grade Index and Violet Innovation Grade Index scores of 90.00 and 80.00, respectively, while the integrated Normalized Quality Score reached 83%. Collectively, the developed platform provides a cost-efficient and environmentally considerate analytical approach suitable for pharmaceutical quality control and preliminary bioanalytical screening in fortified plasma matrices, particularly in laboratories lacking access to advanced chromatographic instrumentation.

Supplementary Information

The online version contains supplementary material available at 10.1186/s13065-026-01788-z.

Keywords: D-optimal experimental design, Chemometric bioanalysis, Anti-COVID-19 therapeutics, Sustainable analytical chemistry, Multivariate optimization

Introduction

The reliability and predictive performance of multivariate analytical methods are strongly influenced by the structure and representativeness of the calibration dataset [1]. Conventional random sampling approaches may introduce distributional bias, including uneven population of experimental domains, clustering of calibration points, and incomplete coverage of compositional space, which can compromise model generalizability and inflate apparent validation performance [2].

Design of experiments (DoE) methodologies provide a systematic framework for constructing calibration datasets that efficiently explore the experimental domain while minimizing experimental effort. Unlike traditional one-variable-at-a-time strategies, DoE enables simultaneous variation of multiple factors and improves the statistical robustness of multivariate models by ensuring balanced coverage of the design space and reducing parameter uncertainty. The methodological principles and analytical applications of DoE-driven method development have been extensively discussed in the analytical chemistry literature, particularly in the context of chemometric calibration and analytical quality-by-design strategies [3, 4].

To implement such structured experimental design strategies in chemometric calibration, algorithm-assisted approaches have been increasingly employed. Among these, the MATLAB-implemented Candexch Algorithm (MCA) provides a D-optimal design framework that systematically distributes calibration samples across multidimensional experimental space [5, 6]. By maximizing the determinant of the information matrix, MCA generates space-filling experimental arrays with minimal inter-sample collinearity and improved variable orthogonality, thereby enhancing calibration representativeness without increasing experimental workload [7].

Chemometrics has emerged as an essential discipline for extracting meaningful information from complex analytical data, optimizing analytical processes, and enhancing prediction accuracy by addressing instrumental noise, systematic drift, and matrix effects [8]. Chemometric models, including PCR [9], GA-PLS [10], FA-PLS [11], and MCR-ALS, have demonstrated success across pharmaceutical analysis, environmental monitoring, and analytical studies involving complex matrices [12].

Nevertheless, the predictive validity of these algorithms remains strongly dependent on the structural integrity of the calibration dataset, reinforcing the importance of optimized experimental design integration. The coupling of MCA-optimized calibration design with advanced chemometric modeling therefore represents a rational methodological progression aimed at enhancing predictive reliability, minimizing sampling bias, and improving resource utilization efficiency within multivariate analytical workflows.

The COVID-19 pandemic has highlighted the critical importance of effective therapeutic interventions, with antiviral medications playing crucial roles in reducing morbidity and mortality. SARS-CoV-2 has resulted in approximately 700 million reported cases worldwide with nearly 7 million deaths, emphasizing the urgent need for pharmaceutical quality control and pharmacological investigations of antiviral therapies [13]. The nirmatrelvir (NIR)-ritonavir (RIT) combination (Paxlovid®), approved in December 2021, demonstrated high efficacy with approximately 89% reduction in hospitalization or death in clinical trials, establishing it as a priority COVID-19 treatment [14].

NIR (Fig. 1) functions as a specialized SARS-CoV-2 main protease (Mpro) inhibitor, preventing viral replication through disruption of polyprotein processing [15]. RIT (Fig. 1) serves as a potent CYP3A inhibitor that enhances NIR bioavailability by inhibiting its metabolism, thereby increasing plasma concentrations and extending half-life [16, 17]. Molnupiravir's active metabolite, N4-hydroxycytidine (NHC) (Fig. 1), provides complementary antiviral activity through RNA-dependent RNA polymerase inhibition via lethal mutagenesis mechanisms [18]. The combination of these mechanistically distinct compounds offers comprehensive therapeutic approaches to COVID-19 management [19, 20].

Currently, bioanalytical quantification of these compounds is mainly performed using liquid chromatography–tandem mass spectrometry (LC–MS/MS) due to its high sensitivity and selectivity [21]. Despite its analytical power, LC–MS/MS requires expensive instrumentation, extensive solvent consumption, and complex operational infrastructure, which may limit its accessibility in routine analytical laboratories. Consequently, there is increasing interest in developing complementary analytical approaches that maintain adequate performance while reducing operational complexity and environmental burden.

In this context, the present study integrates MCA-optimized experimental design with chemometric modeling for the simultaneous analysis of NIR, RIT, and NHC using UV spectrophotometry. The proposed framework employs chemometric spectral deconvolution as a separation-free analytical strategy applicable to both pharmaceutical formulations and matrix-matched plasma samples.

Rather than replacing LC–MS/MS methodologies, the proposed approach provides a complementary analytical platform suitable for pharmaceutical quality control and preliminary analytical investigations in fortified plasma matrices, particularly in laboratories with limited access to advanced chromatographic systems. The environmental performance of the developed method was evaluated using multidimensional green analytical chemistry metrics to assess its sustainability profile. By combining algorithm-optimized experimental design, multivariate spectral resolution, and sustainability assessment, this work proposes an integrated analytical framework aimed at improving calibration reliability while reducing analytical resource consumption.

Experimental

Instrumentation and software

UV–visible spectrophotometric measurements were made on a Shimadzu UV-1800 double-beam spectrophotometer (Shimadzu Corporation, Kyoto, Japan) fitted with matched quartz cells of 1-cm optical path length. Spectra were acquired with UV-Probe software (version 2.42). The acquisition settings were a spectral slit width of 1.0 nm, a data interval of 1.0 nm, and fast scan mode, chosen to shorten acquisition time and limit baseline drift. All measurements were conducted at a controlled temperature (25 ± 1 °C) to ensure spectral reproducibility.

During sample preparation, materials were weighed on a Shimadzu AUX-220 analytical balance with a readability of 0.1 mg. Ultrasonic-assisted extraction was carried out in an ultrasonic bath (Julabo Labortechnik, Model USC200TH, Seelbach, Germany) maintained at 25 °C and operated at a frequency of 40 kHz.

Chemometric computations and multivariate data analyses were performed in the MATLAB R2021a computational environment (version 9.10.0, MathWorks, Natick, MA, USA). Models were constructed with PLS Toolbox version 8.9 (Eigenvector Research Inc., Manson, WA, USA) and MCR-ALS Toolbox version 2.0 (www.mcrals.info). The MCA for D-optimal sample selection was run with MATLAB's Statistics and Machine Learning Toolbox, and the Genetic Algorithm and Firefly Algorithm metaheuristic optimization routines were run with the Global Optimization Toolbox.

Materials and chemicals

Pharmaceutical-grade reference standards

NIR (purity: 99.67%) and RIT (purity: 99.52%) were obtained from Pfizer Inc. (Cairo, Egypt) under material transfer agreement. NHC reference standard (purity: 98.4%, HPLC grade) was purchased from HY Pharma Co., Ltd (Zhejiang, China).

Pharmaceutical formulations

Paxlovid® tablets (Batch No: 220030, Manufacturing Date: December 2022, Expiry Date: December 2025) were provided by Pfizer Inc. (Cairo, Egypt). Each treatment pack contains NIR tablets (150 mg, pink film-coated, oval-shaped) co-packaged with RIT tablets (100 mg, white film-coated, oval-shaped). The recommended therapeutic regimen consists of two NIR tablets plus one RIT tablet administered orally every 12 h for five consecutive days.

For analytical standardization, tablets were sampled from a single authenticated manufacturing batch to eliminate inter-batch compositional variability and to ensure that the study focused exclusively on evaluating analytical method performance rather than manufacturing heterogeneity.

Reagents and solvents

HPLC-grade ethanol (spectrophotometric grade, ≥ 99.8%) was supplied by Sigma-Aldrich (St. Louis, MO, USA). Ultrapure water with a resistivity of 18.2 MΩ cm was obtained from a Milli-Q water purification system (Millipore Corporation, Billerica, MA, USA). Before use, all solvents were filtered through 0.45 μm PTFE membrane filters to remove particles that could interfere with the spectrophotometric measurements.

Biological matrices

Drug-free human plasma (HP) was used as the biological matrix for bioanalytical modeling and validation. The commercially procured blank plasma was obtained from the Egyptian Holding Company for Biological Products and Vaccines (VACSERA, Giza, Egypt), a certified biorepository that operates in accordance with national biospecimen governance standards. The supplied plasma was pooled human plasma derived from multiple anonymous healthy donors. Pooling was intentionally employed to minimize donor-specific biochemical variability, normalize endogenous protein composition, and enhance chemometric model generalizability. All biospecimens were screened, certified drug-free, and handled in accordance with the institutional biosafety and ethical handling frameworks established by VACSERA.

Upon receipt, plasma aliquots were stored at − 80 °C under controlled cryogenic conditions to preserve protein integrity and prevent enzymatic degradation prior to spiking, extraction, and spectrophotometric analysis.

Before analytical use, plasma samples were allowed to thaw gradually at ambient laboratory temperature and were vortex-homogenized to ensure compositional uniformity.

Preparation of standard solution

Primary stock solutions (1000 µg mL⁻¹) were prepared by accurately weighing 25.0 mg of each reference standard into a 25 mL volumetric flask and dissolving it in ethanol:water (50:50, v/v). Complete dissolution was achieved by ultrasonic agitation for 15 min at room temperature. Stock solutions stored in amber glass containers at 4 °C remained stable for 30 days, with repeated tests indicating less than 2% degradation. Working standard solutions were prepared daily by serial dilution of the stock solutions with ethanol:water (50:50, v/v). Intermediate solutions of 100 µg mL⁻¹ were prepared to minimize dilution errors. The final working concentrations covered the analytical ranges of NIR (5.0–25.0 µg mL⁻¹), RIT (1.0–20.0 µg mL⁻¹), and NHC (2.0–10.0 µg mL⁻¹).
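The dilution scheme follows the usual C1·V1 = C2·V2 relation. The short Python sketch below works through the arithmetic for the stated stock, intermediate, and working concentrations; the 10 mL final flask volume used for the intermediate and working solutions is an illustrative assumption, as are the function names.

```python
# Sketch of the C1*V1 = C2*V2 dilution arithmetic.
# The 10 mL final volume for intermediate/working solutions is an assumption.

def stock_concentration(mass_mg: float, volume_ml: float) -> float:
    """Concentration in µg/mL from a weighed mass (mg) and a flask volume (mL)."""
    return mass_mg * 1000.0 / volume_ml


def aliquot_volume_ml(c_source: float, c_target: float, v_final_ml: float) -> float:
    """Volume of source solution (mL) that gives c_target in v_final_ml."""
    if c_target > c_source:
        raise ValueError("target concentration exceeds source concentration")
    return c_target * v_final_ml / c_source


stock = stock_concentration(25.0, 25.0)                    # 25.0 mg in 25 mL
v_intermediate = aliquot_volume_ml(stock, 100.0, 10.0)     # stock -> 100 µg/mL
v_nir_low = aliquot_volume_ml(100.0, 5.0, 10.0)            # lowest NIR level
v_nir_high = aliquot_volume_ml(100.0, 25.0, 10.0)          # highest NIR level

print(stock, v_intermediate, v_nir_low, v_nir_high)
```

Under these assumptions the lowest and highest NIR levels require 0.5 mL and 2.5 mL of the intermediate solution, volumes that are practical to dispense with calibrated micropipettes.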

Linearity and spectral characteristics

Primary stock solutions with a concentration of 1000 µg mL⁻¹ were produced by weighing 25.0 mg of each reference standard into 25 mL volumetric flasks and dissolving it in a 50:50 (v/v) mixture of ethanol and water. The material was completely dissolved by ultrasonic agitation for 15 min at room temperature. Stock solutions remained stable for 30 days when stored in amber glass containers at 4 °C, as shown by repeated tests indicating less than 2% degradation. Working standard solutions were produced daily by serially diluting the stock solutions with ethanol:water (1:1, v/v). An intermediate solution of 100 µg mL⁻¹ was prepared to minimize dilution errors. The final working concentrations were obtained by appropriate dilution to cover the analytical ranges of NIR (5.0–25.0 µg mL⁻¹), RIT (1.0–20.0 µg mL⁻¹), and NHC (2.0–10.0 µg mL⁻¹).

Design of experiments and chemometric sample space optimization

The experimental architecture was established through a dual-matrix strategy encompassing both biological and pharmaceutical systems. Two fully independent experimental pipelines were executed in parallel:

Biological matrix pathway: ternary mixtures of NIR, RIT, and NHC were spiked into drug-free pooled human plasma and subjected to matrix-matched extraction and clean-up prior to spectrophotometric acquisition.
Pharmaceutical matrix pathway: synthetic mixtures were prepared via direct solvent dilution to simulate Paxlovid® dosage form composition without biological interference.

Each matrix possessed its own discrete calibration set, validation set, and chemometric modeling workflow, thereby preserving matrix orthogonality and eliminating cross-domain statistical interference. In the present experimental strategy, calibration mixtures were generated using a structured fractional multilevel factorial design consisting of 25 mixtures selected from the possible 5³ combinations of the three analytes, whereas the external validation mixtures were selected using the MCA design. Importantly, calibration and validation datasets were treated as strictly independent throughout the study, with validation samples excluded from all stages of model development, optimization, and parameter tuning to ensure unbiased assessment of predictive performance. For clarity in replication reporting throughout the manuscript, the term independent replicate refers to independently prepared samples generated through separate sample preparation procedures. In contrast, technical repeats refer to repeated spectral scans of the same prepared sample performed to evaluate instrumental repeatability. Unless otherwise specified, the symbol n refers to the number of independent sample preparations, while repeated scans are treated as technical repeats and averaged prior to chemometric processing.

Calibration design construction

For both matrices, the calibration domain was generated using a fractional five-level factorial concentration design to uniformly cover the ternary analyte space. Five concentration levels (coded −2, −1, 0, +1, +2) were defined for each analyte across its analytical range, and a balanced subset of 25 mixtures was selected from the possible 125 combinations to reduce inter-analyte collinearity while ensuring representative and uniform coverage of the ternary concentration space. The calibration dataset was exclusively used for all model development procedures, including latent variable selection, wavelength selection, and parameter optimization. The calibration set comprised 25 independently prepared mixtures covering the analytical ranges of NIR (5.0–25.0 µg mL⁻¹), RIT (1.0–20.0 µg mL⁻¹), and NHC (2.0–10.0 µg mL⁻¹). Calibration solutions were prepared in 10 mL Class-A volumetric flasks using calibrated micropipettes. Each mixture represented an independent preparation, while spectral acquisition was performed in triplicate scans. The averaged spectrum was used for chemometric modeling to reduce instrumental noise. Accordingly, each calibration mixture represents an independent analytical replicate (n = 25 per matrix), whereas the triplicate scans correspond to technical repeats averaged prior to model construction. This factorial design ensured orthogonal coverage of the ternary concentration domain and served as the primary dataset for model calibration.
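To make the idea of a balanced 25-run subset of the 5³ grid concrete, the sketch below builds one such subset in Python. The generator used here (third factor = sum of the first two, modulo 5, a Latin-square construction) and the evenly spaced concentration levels are assumptions chosen for illustration; the text does not state which generator or level spacing was actually used.

```python
from itertools import product

# Analytical ranges from the text (µg/mL). Evenly spaced levels are an assumption.
RANGES = {"NIR": (5.0, 25.0), "RIT": (1.0, 20.0), "NHC": (2.0, 10.0)}
CODES = (-2, -1, 0, 1, 2)


def decode(code: int, low: float, high: float) -> float:
    """Map a coded level in {-2, ..., +2} linearly onto [low, high]."""
    return low + (code + 2) * (high - low) / 4.0


def fractional_design():
    """Return 25 of the 125 coded combinations.

    Hypothetical generator: level index of the third analyte = (a + b) mod 5,
    so every pair of levels of any two analytes occurs exactly once.
    """
    runs = []
    for a, b in product(range(5), repeat=2):
        c = (a + b) % 5
        runs.append((CODES[a], CODES[b], CODES[c]))
    return runs


def correlation(x, y) -> float:
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


design = fractional_design()
columns = list(zip(*design))                       # one tuple of codes per analyte
mixtures = [
    {name: decode(code, *RANGES[name]) for name, code in zip(RANGES, run)}
    for run in design
]
print(mixtures[0], correlation(columns[0], columns[2]))
```

Because each pair of levels occurs exactly once, the coded concentration columns are mutually uncorrelated, which is the property the text refers to as reduced inter-analyte collinearity.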

Validation design

External validation sets were generated using the MCA algorithm based on a D-optimal design. For each matrix, the validation set consisted of 13 independently prepared ternary mixtures. Each mixture represented an independent preparation, while spectral acquisition was performed in triplicate scans, which were averaged prior to prediction. Accordingly, validation statistics were calculated using the independent validation samples (n = 13 per matrix), whereas the triplicate scans correspond to technical repeats used to reduce instrumental noise. Importantly, the validation dataset was kept completely independent and was not used at any stage of model construction, latent variable selection, wavelength selection, or parameter tuning. The D-optimal selection ensured that the validation samples were uniformly distributed within the calibration domain while remaining strictly independent from the calibration dataset, thereby enabling unbiased and statistically robust evaluation of predictive performance.
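D-optimal selection picks rows from a candidate set so that the determinant of the information matrix XᵀX is as large as possible. The NumPy sketch below implements a simplified row-exchange search of this kind. It is not MATLAB's `candexch` routine; the full 5³ coded grid as candidate set, the full quadratic model, and all function names are assumptions made for illustration.

```python
from itertools import product

import numpy as np


def model_matrix(points: np.ndarray) -> np.ndarray:
    """Full quadratic model in three factors (assumed model form)."""
    x1, x2, x3 = points.T
    return np.column_stack([
        np.ones(len(points)), x1, x2, x3,
        x1 * x2, x1 * x3, x2 * x3, x1 ** 2, x2 ** 2, x3 ** 2,
    ])


def log_det_information(X: np.ndarray) -> float:
    """log det(X'X); minus infinity when the design is singular."""
    sign, logdet = np.linalg.slogdet(X.T @ X)
    return float(logdet) if sign > 0 else float("-inf")


def d_optimal_exchange(candidates, n_select, seed=0, max_passes=50):
    """Greedy row exchange: swap a selected row for a candidate row
    whenever the swap increases log det(X'X)."""
    rng = np.random.default_rng(seed)
    X_all = model_matrix(candidates)
    chosen = [int(i) for i in rng.choice(len(candidates), size=n_select, replace=False)]
    best = log_det_information(X_all[chosen])
    for _ in range(max_passes):
        improved = False
        for pos in range(n_select):
            for cand in range(len(candidates)):
                if cand in chosen:
                    continue                      # no duplicated mixtures
                trial = chosen.copy()
                trial[pos] = cand
                value = log_det_information(X_all[trial])
                if value > best + 1e-12:
                    chosen, best, improved = trial, value, True
        if not improved:
            break                                 # local optimum reached
    return sorted(chosen), best


grid = np.array(list(product((-2, -1, 0, 1, 2), repeat=3)), dtype=float)
selected, best_logdet = d_optimal_exchange(grid, n_select=13)
print(grid[selected], best_logdet)
```

In practice the calibration mixtures could be removed from the candidate grid before the search so that the selected validation points never coincide with a calibration point.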

Sample preparation and spectral acquisition

All mixtures were prepared in 10 mL Class-A volumetric flasks using ethanol:water as the dilution solvent. Plasma mixtures underwent protein precipitation-assisted matrix matching prior to dilution, whereas pharmaceutical mixtures were prepared by direct solvent dilution of reference standards. Each calibration and validation mixture represented an independent preparation, and spectral acquisition was performed in triplicate scans. The average spectrum was used as the chemometric input dataset. Accordingly, the independently prepared mixtures represent the analytical replicates (n), while the triplicate scans correspond to technical repeats averaged prior to analysis. Spectra were recorded using 1 cm quartz cuvettes over 200–400 nm against an ethanol:water blank. Raw spectra were preprocessed using Savitzky–Golay smoothing, baseline correction, and removal of non-informative regions. The final analytical window was 205–300 nm at 1 nm intervals, yielding a 96-variable spectral matrix per mixture.
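The averaging and windowing steps can be sketched in a few lines of NumPy. The example below uses a synthetic Gaussian band with added noise in place of real scans, and it omits the Savitzky–Golay smoothing and baseline correction; the function name and the noise level are assumptions for illustration only.

```python
import numpy as np


def average_and_window(scans, wavelengths, lo=205, hi=300):
    """Average technical repeats, then keep the lo..hi nm window (inclusive)."""
    scans = np.asarray(scans, dtype=float)        # shape: (n_scans, n_wavelengths)
    wavelengths = np.asarray(wavelengths)
    mean_spectrum = scans.mean(axis=0)
    keep = (wavelengths >= lo) & (wavelengths <= hi)
    return wavelengths[keep], mean_spectrum[keep]


wavelengths = np.arange(200, 401)                 # 200-400 nm at 1 nm steps
rng = np.random.default_rng(1)
band = np.exp(-((wavelengths - 250) / 20.0) ** 2) # synthetic absorption band
scans = band + rng.normal(0.0, 0.002, size=(3, wavelengths.size))  # three repeats

wl, spectrum = average_and_window(scans, wavelengths)
print(wl.size, wl[0], wl[-1])
```

The window from 205 to 300 nm inclusive at 1 nm steps contains 96 points, matching the 96-variable spectral matrix stated in the text.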

Building chemometric models

Four chemometric models were evaluated for multivariate calibration. The 25-mixture calibration set was used for model construction, and all models were developed independently for each matrix. All model development procedures, including latent variable selection, wavelength selection, and parameter tuning, were performed exclusively using the calibration dataset, while predictive performance was subsequently evaluated using an independent external validation set. For PCR, latent variables (LVs) were optimized using Venetian blinds cross-validation over 1–10 LVs, with the optimal model selected based on minimization of RMSECV. The GA-PLS model combined genetic algorithm–based wavelength selection with PLS regression [22, 23]. The GA iteratively optimized spectral variables through selection, crossover, and mutation operations to minimize prediction error, using RMSECV within the calibration set as the fitness function, and the optimal subset of variables was selected prior to final PLS model construction. In FA-PLS, spectral variable selection was performed using the firefly algorithm, followed by PLS regression with LV optimization. The optimization process was conducted within the calibration dataset. Across all regression-based approaches, model complexity was determined using leave-one-out or Venetian blinds cross-validation, with RMSECV minimization serving as the primary criterion for model selection. In the case of MCR-ALS, the spectral matrix was decomposed into concentration and spectral profiles using alternating least squares optimization with non-negativity constraints implemented via NNLS. The number of components and convergence criteria were determined using the calibration dataset, and iterative optimization was continued until convergence was achieved.
Importantly, the external validation dataset was not involved in any stage of model optimization, variable selection, or parameter tuning, and was used exclusively for final evaluation of predictive performance.
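As a rough illustration of how model complexity can be chosen from the calibration set alone, the NumPy sketch below fits PCR models with 1 to 10 components and selects the number with the lowest leave-one-out RMSECV. The synthetic three-component spectra, the noise level, and all function names are assumptions; this is not the PLS Toolbox implementation used in the study.

```python
import numpy as np


def pcr_fit(X, y, n_components):
    """Principal component regression on mean-centered data."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T                        # loadings
    T = Xc @ V                                     # scores
    q, *_ = np.linalg.lstsq(T, y - y_mean, rcond=None)
    return x_mean, y_mean, V @ q                   # regression vector in X space


def pcr_predict(model, X):
    x_mean, y_mean, coef = model
    return (X - x_mean) @ coef + y_mean


def rmsecv_loo(X, y, n_components):
    """Leave-one-out RMSECV computed within the calibration set only."""
    errors = []
    for i in range(len(y)):
        mask = np.arange(len(y)) != i
        model = pcr_fit(X[mask], y[mask], n_components)
        errors.append(y[i] - pcr_predict(model, X[i:i + 1])[0])
    return float(np.sqrt(np.mean(np.square(errors))))


def select_components(X, y, max_components=10):
    scores = {k: rmsecv_loo(X, y, k) for k in range(1, max_components + 1)}
    return min(scores, key=scores.get), scores


# Synthetic calibration data: 25 mixtures x 96 wavelengths (illustrative only).
rng = np.random.default_rng(0)
wl = np.arange(205, 301)
pure = np.stack([0.03 * np.exp(-((wl - c) / 15.0) ** 2) for c in (220, 240, 260)])
C = np.column_stack([rng.uniform(5, 25, 25), rng.uniform(1, 20, 25), rng.uniform(2, 10, 25)])
X = C @ pure + rng.normal(0.0, 0.001, size=(25, wl.size))

best_k, cv_scores = select_components(X, C[:, 0])  # model for the first analyte
print(best_k, cv_scores[best_k])
```

The external validation mixtures never enter `select_components`, mirroring the separation between optimization and final prediction described in the text.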

Multivariate analytical figures of merit and validation protocol

A comprehensive validation framework was established to rigorously evaluate the analytical reliability, predictive competence, and regulatory suitability of the developed multivariate calibration models. Method performance was assessed through a hierarchy of chemometric figures of merit derived from both calibration and external validation datasets, ensuring holistic characterization of behavior across the fitting and prediction domains [24, 25]. To ensure methodological transparency, a clear distinction was maintained between internal cross-validation (used exclusively for model optimization) and external validation (used exclusively for independent performance evaluation).

Unless otherwise specified, statistical parameters and validation metrics reported in this section were calculated based on independent sample preparations (n), while repeated spectral scans were treated as technical repeats and averaged prior to chemometric modeling. A schematic representation of the model optimization and validation workflow, highlighting the strict separation between calibration-based optimization and external validation, is provided in Fig. 2.

Fig. 2 — Schematic overview of the chemometric model optimization and validation workflow. All preprocessing, latent variable selection, wavelength selection, and parameter tuning were performed exclusively using the calibration dataset via internal cross-validation (LOO-CV) with model selection based on minimum RMSECV. The external validation set was kept completely independent and used only once for final prediction without any re-optimization. Model performance was evaluated using standard chemometric metrics to ensure unbiased predictive assessment

Calibration error diagnostics

Global calibration error was quantified using the Root Mean Square Error (RMSE), which reflects the aggregated deviation between reference and model-predicted concentrations:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$

where $y_i$ and $\hat{y}_i$ denote the reference and model-predicted concentrations for sample $i$, respectively, and $n$ represents the number of calibration samples.

To further evaluate calibration performance, additional residual diagnostics were calculated, including Standard Error of Calibration (SEC), Root Mean Square Error of Calibration (RMSEC), and Root Mean Square Error of Cross-Validation (RMSECV).

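SEC accounts for systematic bias according to:

$$\mathrm{SEC} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(y_i - \hat{y}_i - \mathrm{bias}\right)^2}$$

where bias is defined as:

$$\mathrm{bias} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)$$

with $y_i$ and $\hat{y}_i$ the reference and predicted concentrations of sample $i$ and $n$ the number of calibration samples.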

RMSECV was calculated using cross-validation (Venetian blinds or leave-one-out) applied exclusively within the calibration dataset and served as the primary criterion for model optimization, including latent variable selection and parameter tuning.
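The calibration diagnostics can be computed directly from paired reference and predicted concentrations. The Python sketch below assumes residuals defined as reference minus predicted and an n − 1 denominator for SEC, which are common conventions but are assumptions here; the example concentrations are hypothetical.

```python
import math


def residuals(reference, predicted):
    """Residuals taken as reference minus predicted (assumed sign convention)."""
    return [y - y_hat for y, y_hat in zip(reference, predicted)]


def rmse(reference, predicted):
    e = residuals(reference, predicted)
    return math.sqrt(sum(v * v for v in e) / len(e))


def bias(reference, predicted):
    e = residuals(reference, predicted)
    return sum(e) / len(e)


def sec(reference, predicted):
    """Bias-corrected standard error with an n - 1 denominator (assumed)."""
    e = residuals(reference, predicted)
    b = sum(e) / len(e)
    return math.sqrt(sum((v - b) ** 2 for v in e) / (len(e) - 1))


# Hypothetical NIR reference and predicted concentrations (µg/mL).
reference = [5.0, 10.0, 15.0, 20.0, 25.0]
predicted = [5.1, 9.9, 15.2, 19.8, 25.0]

print(rmse(reference, predicted), bias(reference, predicted), sec(reference, predicted))
```

Applying the same `rmse` function to predictions collected from cross-validation folds gives RMSECV, while applying it to the fitted calibration samples gives RMSEC.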

External predictive capability

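Model generalizability was evaluated using an independent D-optimal validation dataset. Predictive error was determined using the RMSE of Prediction (RMSEP):

$$\mathrm{RMSEP} = \sqrt{\frac{1}{n_v}\sum_{i=1}^{n_v}\left(y_i - \hat{y}_i\right)^2}$$

where $y_i$ and $\hat{y}_i$ are the reference and predicted concentrations of validation sample $i$ and $n_v$ is the number of validation samples.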

To normalize prediction error relative to concentration magnitude, the Relative RMSEP (RRMSEP) was calculated as:

$$\mathrm{RRMSEP}\,(\%) = 100 \times \frac{\mathrm{RMSEP}}{\bar{y}}$$

where $\bar{y}$ represents the mean reference concentration.

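Bias-corrected prediction variance was expressed as the Bias-Corrected RMSEP (BCRMSEP):

$$\mathrm{BCRMSEP} = \sqrt{\mathrm{RMSEP}^2 - \mathrm{bias}^2}$$

where bias is the mean prediction error over the validation samples.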

All predictive metrics (RMSEP, RRMSEP, and BCRMSEP) were calculated exclusively using the external validation dataset, which was not involved in any stage of model optimization or parameter selection. These metrics collectively quantify prediction accuracy, systematic deviation, and model robustness for unknown samples.
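A compact way to see how the three prediction metrics relate is to compute them together. The sketch below assumes the common definitions RRMSEP = 100 × RMSEP / mean reference concentration and BCRMSEP² = RMSEP² − bias²; the function name and the example values are hypothetical.

```python
import math


def prediction_metrics(reference, predicted):
    """RMSEP, bias, RRMSEP (%) and BCRMSEP for an external validation set."""
    n = len(reference)
    errors = [y - y_hat for y, y_hat in zip(reference, predicted)]
    rmsep = math.sqrt(sum(e * e for e in errors) / n)
    mean_error = sum(errors) / n
    mean_reference = sum(reference) / n
    return {
        "RMSEP": rmsep,
        "bias": mean_error,
        "RRMSEP": 100.0 * rmsep / mean_reference,
        # max(..., 0.0) guards against tiny negative values from rounding
        "BCRMSEP": math.sqrt(max(rmsep ** 2 - mean_error ** 2, 0.0)),
    }


# Hypothetical case: every prediction is 0.1 µg/mL too high.
metrics = prediction_metrics([4.0, 6.0, 8.0], [4.1, 6.1, 8.1])
print(metrics)
```

In this constructed case the whole error is systematic, so RMSEP is 0.1 µg/mL while BCRMSEP is essentially zero, which shows how the bias-corrected metric separates a constant offset from random prediction scatter.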

Accuracy and recovery assessment

The verification of accuracy was achieved by conducting recovery studies at three concentration levels that encompassed the established analytical ranges. These concentration levels were as follows: NIR: 10, 15, and 20 µg mL⁻¹, RIT: 5, 10, and 15 µg mL⁻¹, and NHC: 4, 6, and 8 µg mL⁻¹.

Each level was independently prepared in triplicate (n = 3 independent sample preparations), and percentage recovery (%R) was calculated to evaluate proportional accuracy across low, medium, and high concentration domains.

For each preparation, spectral measurements were obtained from three technical scan repeats, which were averaged prior to concentration prediction.

Precision profiling

Precision was evaluated under both repeatability and intermediate-precision conditions. Intra-day precision (repeatability) was assessed from analyses performed in triplicate within a single analytical sequence. Inter-day precision (intermediate precision) was assessed from replicate analyses carried out over three consecutive days. Results were reported as the relative standard deviation (%RSD), which reflects both short-term instrumental stability and temporal reproducibility.
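Both the percentage recovery used for accuracy and the %RSD used for precision are simple ratios. The Python sketch below computes them for one hypothetical triplicate at a nominal NIR level of 10 µg mL⁻¹; the found concentrations are invented for illustration and are not results from the study.

```python
import statistics


def percent_recovery(found: float, nominal: float) -> float:
    """Recovery (%) = found concentration / nominal concentration x 100."""
    return 100.0 * found / nominal


def percent_rsd(values) -> float:
    """Relative standard deviation (%) using the sample standard deviation."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


nominal = 10.0                                   # µg/mL, NIR mid-low level
found = [9.95, 10.02, 10.06]                     # hypothetical triplicate

recoveries = [percent_recovery(v, nominal) for v in found]
rsd = percent_rsd(found)
print(recoveries, rsd)
```

The same `percent_rsd` function applies to intra-day and inter-day data; only the set of replicate values passed in changes.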

Robustness evaluation

To determine the robustness of the method, we purposefully perturbed several important spectrophotometric parameters, including the following:

Spectral resolution: 1.0 vs. 1.1 nm
Scan velocity: medium vs. fast
Spectral bandwidth: 1.0 vs. 0.9 nm

Minimal variation in predictive results under these conditions confirmed the robustness and operational stability of the analytical procedure.

Multivariate sensitivity and detection capability

The analytical sensitivity and detection capability of the developed chemometric models were evaluated using the Net Analyte Signal (NAS) framework, which enables selective quantification in multicomponent systems by isolating the portion of the signal uniquely attributable to each analyte [24–30].

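For each component, the NAS vector was calculated according to:

$$\mathbf{NAS}_k = \left(\mathbf{I} - \mathbf{R}_{-k}\mathbf{R}_{-k}^{+}\right)\mathbf{r}_k$$

where $\mathbf{r}_k$ represents the spectral response of analyte $k$, $\mathbf{R}_{-k}$ denotes the matrix containing spectral responses of interfering components, $\mathbf{R}_{-k}^{+}$ represents the Moore–Penrose pseudoinverse of $\mathbf{R}_{-k}$, and $\mathbf{I}$ is the identity matrix.

Analytical sensitivity ($\gamma_k$) was defined as:

$$\gamma_k = \frac{\lVert \mathbf{NAS}_k \rVert}{\delta_r}$$

where $\delta_r$ represents instrumental noise estimated from replicate blank spectra and residual calibration variance.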

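Limits of detection and quantification (LOD and LOQ) were calculated using multivariate detection theory:

$$\mathrm{LOD}_k = \frac{3.3\,\delta_r}{\lVert \mathbf{NAS}_k \rVert}, \qquad \mathrm{LOQ}_k = \frac{10\,\delta_r}{\lVert \mathbf{NAS}_k \rVert}$$

where $\lVert \mathbf{NAS}_k \rVert$ is the norm of the net analyte signal of analyte $k$ and $\delta_r$ is the instrumental noise.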

It should be noted that NAS-derived detection limits reflect multivariate model sensitivity and are not directly comparable with univariate or chromatographic LOD values, because the multivariate approach accounts for spectral collinearity and matrix interference. Consequently, the obtained detection thresholds more realistically represent the analytical sensitivity achievable under multicomponent spectroscopic conditions.
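The NAS projection and the figures of merit derived from it can be sketched with NumPy. The example below uses three synthetic Gaussian unit-concentration spectra and an assumed noise level; the 3.3 and 10 multipliers for LOD and LOQ are the commonly used factors and are an assumption here, as are all names.

```python
import numpy as np


def net_analyte_signal(r_k, R_other):
    """Part of r_k orthogonal to the space spanned by the interferent spectra."""
    projector = np.eye(len(r_k)) - R_other @ np.linalg.pinv(R_other)
    return projector @ r_k


def detection_limits(r_k, R_other, noise_sd):
    """Sensitivity, analytical sensitivity, LOD and LOQ from the NAS norm."""
    sen = float(np.linalg.norm(net_analyte_signal(r_k, R_other)))
    gamma = sen / noise_sd
    return {"SEN": sen, "gamma": gamma, "LOD": 3.3 / gamma, "LOQ": 10.0 / gamma}


# Synthetic unit-concentration spectra over the 205-300 nm window (illustrative).
wl = np.arange(205, 301)
spectra = np.column_stack(
    [0.03 * np.exp(-((wl - c) / 15.0) ** 2) for c in (220, 240, 260)]
)

k = 0                                            # analyte of interest
r_k = spectra[:, k]
R_other = np.delete(spectra, k, axis=1)          # spectra of the interferents
nas = net_analyte_signal(r_k, R_other)
result = detection_limits(r_k, R_other, noise_sd=0.001)
print(result)
```

Because the NAS norm can never exceed the norm of the full analyte spectrum, stronger spectral overlap with the interferents lowers the sensitivity and raises the detection limit, which is why NAS-based limits tend to be higher than univariate ones.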

Matrix-resolved validation strategy

All validation experiments, including recovery, precision, robustness, and detection capability, were conducted independently for each analytical matrix. Separate calibration architectures were maintained for plasma and pharmaceutical formulations to ensure that reported figures of merit accurately reflect matrix-specific spectral complexity and interference structure. This matrix-resolved validation strategy was applied consistently using independently optimized calibration models and completely independent validation datasets for each matrix.

Analysis of Paxlovid® dosage form

Protocol for preparing samples

Tablets were accurately weighed and ground in an agate mortar and pestle. A portion equivalent to the labeled drug content was transferred into a 100 mL volumetric flask and extracted with 70 mL of ethanol:water under ultrasonic agitation for 15 min at room temperature. The mixture was filtered through 0.45 μm PTFE membrane filters to remove insoluble excipients. The residue was washed with two 10 mL portions of extraction solvent, and the combined filtrate was diluted to volume, giving stock solutions containing 3.0 mg/mL of NIR and 1.0 mg/mL of RIT.
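The stated stock concentrations are consistent with powder equivalent to one dose (two 150 mg NIR tablets and one 100 mg RIT tablet) in the 100 mL flask; that reading is an inference from the labeled strengths, not something the text states. The sketch below checks the arithmetic and shows how a single dilution factor places both drugs inside their calibration ranges; the 15 µg mL⁻¹ NIR target is an arbitrary illustrative choice.

```python
def stock_mg_per_ml(labeled_mg: float, n_tablets: int, flask_ml: float) -> float:
    """Stock concentration (mg/mL) for powder equivalent to n_tablets."""
    return labeled_mg * n_tablets / flask_ml


# Assumed tablet equivalents: two NIR tablets and one RIT tablet per 100 mL.
nir_stock = stock_mg_per_ml(150.0, 2, 100.0)
rit_stock = stock_mg_per_ml(100.0, 1, 100.0)

# Dilution factor that brings NIR to an illustrative 15 µg/mL working level.
nir_target = 15.0                                          # µg/mL
dilution_factor = nir_stock * 1000.0 / nir_target
rit_working = rit_stock * 1000.0 / dilution_factor         # µg/mL after same dilution

print(nir_stock, rit_stock, dilution_factor, rit_working)
```

Because the two drugs share one extract, their working concentrations stay in a fixed 3:1 ratio, so any dilution that places NIR within 5.0–25.0 µg mL⁻¹ also places RIT within its 1.0–20.0 µg mL⁻¹ range.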

Analytical procedure

Working solutions were prepared by serial dilution of the stock solutions to cover the experimental concentration ranges. Spectra were recorded over 205–300 nm under the same acquisition conditions used during method development. Drug concentrations in formulation samples were predicted using the developed chemometric models and compared with theoretical concentrations based on labeled claims.
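The dilution arithmetic behind this step can be sketched as follows. This is a minimal illustration using the stock concentrations stated above; the 15 µg/mL target and the function names are hypothetical and are not taken from the study.

```python
# Stock concentrations from the text, converted to µg/mL.
STOCK_UG_PER_ML = {"NIR": 3000.0, "RIT": 1000.0}

def dilution_factor(stock_conc, target_conc):
    """Fold dilution needed to reach target_conc (from C1 * V1 = C2 * V2)."""
    if target_conc <= 0 or target_conc > stock_conc:
        raise ValueError("target must be positive and not exceed the stock")
    return stock_conc / target_conc

def diluted_concentrations(factor):
    """Concentration of every analyte after diluting the shared stock by `factor`."""
    return {name: conc / factor for name, conc in STOCK_UG_PER_ML.items()}

# Hypothetical target: 15 µg/mL NIR in the working solution.
factor = dilution_factor(STOCK_UG_PER_ML["NIR"], 15.0)
working = diluted_concentrations(factor)
print(factor, working)
```

Because both drugs come from the same tablet extract, any dilution keeps the NIR:RIT ratio fixed at 3:1, so formulation samples occupy a single line in the concentration space spanned by the calibration design.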

Validation in pharmaceutical matrix

Method validation was carried out in accordance with ICH Q2(R1) requirements, adapted for chemometric analysis. Standard addition experiments were conducted using pharmaceutical samples to assess matrix effects. Recovery trials were performed at 80%, 100%, 120%, and 140% of the labeled content, with six replicates at each level. Pre-analyzed samples spiked with known quantities of NIR and RIT standards were evaluated by the same procedure. Method performance was evaluated using %R, %RSD, bias, and ANOVA comparison with the reported method.
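As an illustration of the recovery statistics named above, the sketch below computes mean %R, %RSD, and bias for one spiking level. The six replicate values are invented for the example, and the use of the sample standard deviation (n − 1) is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev

def recovery_stats(found, nominal):
    """Return mean %R, %RSD and mean bias (same units as nominal) for replicates."""
    recoveries = [100.0 * f / nominal for f in found]
    mean_r = mean(recoveries)
    rsd = 100.0 * stdev(recoveries) / mean_r   # sample SD, n - 1 denominator
    bias = mean(found) - nominal
    return mean_r, rsd, bias

# Hypothetical six replicates at a nominal 10.0 µg/mL level (not data from the study).
found = [9.9, 10.1, 10.0, 9.8, 10.2, 10.0]
mean_r, rsd, bias = recovery_stats(found, 10.0)
print(f"%R = {mean_r:.2f}, %RSD = {rsd:.2f}, bias = {bias:+.3f}")
```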

Plasma matrix application

Biological sample preparation

Matrix-matched calibration standards were prepared using drug-free pooled HP. Blank plasma aliquots were spiked with predefined concentrations of NIR, RIT, and NHC to construct ternary mixtures covering the validated concentration ranges. The chemometric design included 25 independently prepared calibration mixtures and 13 validation mixtures, both generated exclusively in the plasma matrix and independent from pharmaceutical datasets. Each sample represented an independent preparation, and spectral acquisition was performed in triplicate scans, with averaged spectra used for subsequent chemometric processing. Importantly, plasma calibration and validation datasets were treated as strictly independent, and validation samples were not used at any stage of model optimization or parameter tuning. These fortified plasma samples were prepared to assess the analytical feasibility of the developed method in a biologically relevant matrix under controlled experimental conditions.

Plasma sample processing and extraction protocol

Plasma samples (1.0 mL) were subjected to protein precipitation using ACN at a plasma:ACN ratio of 1:3 (v/v). The mixture was vortex-mixed for 60 s to ensure protein denaturation and analyte extraction, followed by centrifugation at 4000×g for 10 min at 4 °C. The resulting supernatant was transferred to evaporation tubes and dried under a nitrogen stream at 40 °C to remove ACN. The residue was reconstituted in 1.0 mL ethanol–water (1:1, v/v) and filtered through 0.22 μm PTFE syringe filters prior to spectrophotometric analysis.

Bioanalytical method validation

Plasma-specific chemometric models were constructed using processed matrix-matched standards and evaluated using an independent validation set. Validation followed standard bioanalytical guidelines, including assessment of linearity, accuracy, precision, recovery, matrix effect, and stability. Accuracy and precision were assessed at three concentration levels (low, medium, high) within the calibration range, each analyzed in triplicate. Intra-day and inter-day precision were evaluated using QC samples analyzed within the same day and across three consecutive days. Matrix effects were examined by comparing analyte responses in plasma extracts with those in neat solutions. Stability studies included freeze–thaw cycles, bench-top exposure, and post-preparation stability. Recovery experiments were conducted by spiking NIR, RIT, and NHC into drug-free HP at three concentration levels (n = 3). Statistical evaluation included %R, %RSD, and bias. The external validation set used for plasma analysis was kept completely independent and was used exclusively for final evaluation of predictive performance. The validated procedure was subsequently applied to spiked plasma samples containing predefined concentrations of the analytes to simulate biological matrices. It should be noted that the present validation was conducted using fortified plasma samples rather than authentic clinical specimens; therefore, the results demonstrate analytical feasibility in matrix-matched plasma but do not represent full clinical bioanalytical validation.

Results and discussion

Sustainable solvent selection strategy

Selection of an environmentally compatible extraction medium was conducted through a structured solvent sustainability screening workflow integrating both quantitative and qualitative greenness assessment tools. Preliminary solvent prioritization was performed using the Green Solvent Selection Tool (GSST), enabling comparative ranking of candidate solvents based on composite sustainability metrics encompassing toxicity, safety, and waste burden [31], as shown in (Fig. S1). Complementary qualitative profiling was subsequently conducted using the Spider Diagram Assessment of Greenness Index (SDAGI) [32], which evaluates multidimensional hazard attributes derived from Safety Data Sheet (SDS) parameters (Table S1). Combined outputs from GSST and SDAGI facilitated the identification of ethanol and water as the most environmentally favorable solvent pair exhibiting optimal safety–performance balance. Subsequent experimental verification confirmed that the ethanol–water binary system (1:1, v/v) provided superior spectral resolution and analyte solubilization efficiency. A detailed description of solvent screening criteria, scoring frameworks, and comparative greenness ranking is provided in the Supplementary Information (Section S1).

The development of chemometric models

Initially, traditional spectrophotometric techniques were applied to quantify the ternary combination; however, the initial results revealed substantial analytical limitations. Direct measurement at individual λmax values, derivative spectrophotometry, ratio-derivative methods, and dual-wavelength approaches were all explored, but considerable spectral overlap among NIR, RIT, and NHC prevented these methods from providing sufficient selectivity, as illustrated in (Fig. 3). Spectral characterization showed that the absorption patterns of the three analytes partially overlapped. NIR displayed a sharp maximum at 210 nm with only modest absorption beyond 250 nm, RIT displayed a biphasic profile with maxima at 205 nm and approximately 245 nm, and NHC showed a broad band centered near 240 nm extending across 220–280 nm. The region 205–250 nm therefore represented a critical overlap zone where all analytes contributed substantially to the total absorbance signal. This spectral interference prevented reliable quantification using univariate methods and required the application of multivariate chemometric calibration to resolve the mixed spectral contributions. Accordingly, chemometric models were built using the full spectral information obtained from the mixtures. Spectral acquisition was initially performed over 200–400 nm, followed by evaluation of signal quality, baseline stability, and spectral informativeness. Based on these criteria, the analytical window was restricted to 205–300 nm at 1 nm intervals, generating a 96-variable spectral matrix for each sample while excluding regions dominated by solvent absorption or instrumental noise.
Model development was conducted independently for two analytical matrices: (i) plasma matrix, using matrix-matched mixtures of NIR, RIT, and NHC prepared in blank plasma and processed by protein precipitation, and (ii) pharmaceutical matrix, using synthetic mixtures representing Paxlovid^® formulations. Separate calibration and validation sets were constructed for each matrix to ensure matrix-specific model optimization and predictive reliability. In accordance with the experimental design described in “Design of experiments and chemometric sample space optimization” section, the calibration datasets were generated using a fractional multilevel factorial design, whereas the external validation datasets were obtained using the MCA implementing a D-optimal selection strategy.
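To make the 96-variable window and the additive overlap concrete, the sketch below builds a mixture spectrum as a sum of single-band Gaussian profiles. Only the band centres follow the maxima quoted above; the widths, heights, and the single-band shape for RIT (which the text describes as biphasic) are invented simplifications, so the output is not a simulation of the measured spectra.

```python
import math

WAVELENGTHS = list(range(205, 301))  # 205-300 nm at 1 nm steps

def gaussian_band(center_nm, width_nm, height):
    """Hypothetical unit-concentration absorptivity profile (not a measured spectrum)."""
    return [height * math.exp(-0.5 * ((w - center_nm) / width_nm) ** 2)
            for w in WAVELENGTHS]

# Band centres follow the maxima quoted in the text; widths and heights are invented.
PURE = {
    "NIR": gaussian_band(210, 8, 0.040),
    "RIT": gaussian_band(245, 15, 0.030),
    "NHC": gaussian_band(240, 18, 0.050),
}

def mixture_spectrum(conc):
    """Additive (Beer-Lambert) absorbance of a mixture; conc maps analyte -> µg/mL."""
    return [sum(conc[a] * PURE[a][i] for a in PURE) for i in range(len(WAVELENGTHS))]

spectrum = mixture_spectrum({"NIR": 15, "RIT": 10, "NHC": 6})
print(len(WAVELENGTHS), len(spectrum))
```

Because every wavelength in the overlap zone carries contributions from more than one analyte, no single absorbance reading isolates one concentration, which is the motivation for full-spectrum multivariate calibration.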

Fig. 3 — The zero-order absorption spectrum of NIR, RIT, and NHC

Experimental design strategy

For both biological and pharmaceutical matrices, the calibration datasets were constructed using a fractional five-level/three-factor experimental design in accordance with established chemometric design principles as described by Brereton et al. [2]. Five concentration levels (coded −2, −1, 0, +1, +2) were defined for each analyte, and a balanced subset of 25 mixtures was selected from the possible 125 combinations of the full 5³ design to provide uniform and representative coverage of the ternary concentration space while minimizing inter-analyte correlation. Thus, a total of 25 calibration mixtures were generated, each corresponding to a specific combination of coded concentration levels (−2 to +2), with the central level (0) representing the midpoint of each analyte’s concentration range (Table 1). The factorial design ensured orthogonal distribution of concentration profiles, low inter-analyte correlation, and uniform coverage of the experimental domain. Such orthogonal structuring enhances latent variable stability, reduces model bias, and improves predictive robustness. Moreover, the space-filling nature of the design maximized the information content of the calibration dataset while minimizing the number of required experimental runs, thereby supporting both analytical efficiency and green analytical chemistry objectives. Importantly, this calibration dataset served as the exclusive basis for all model development and optimization procedures.

Table 1.

Experimental design layout illustrating the concentration combinations of three factors at five levels for the 25 calibration mixtures and 13 validation mixtures used in chemometric model construction

Calibration set (µg/mL)
Mix number	NIR	RIT	NHC
1.	15	10	6
2.	15	1	2
3.	5	1	10
4.	5	20	4
5.	25	5	10
6.	10	20	6
7.	25	10	4
8.	15	5	4
9.	10	5	8
10.	10	15	10
11.	20	20	8
12.	25	15	6
13.	20	10	10
14.	15	20	10
15.	25	20	2
16.	25	1	8
17.	5	15	2
18.	20	1	6
19.	5	10	8
20.	15	15	8
21.	20	15	4
22.	20	5	2
23.	10	1	4
24.	5	5	6
25.	10	10	2

Validation set (µg/mL)
Mix number	NIR	RIT	NHC
1.	7	3	6
2.	21	7	9
3.	9	11	6
4.	15	16	7
5.	11	2	9
6.	17	15	4
7.	6	10	3
8.	19	5	8
9.	20	19	10
10.	23	12	5
11.	25	7	2
12.	14	20	6
13.	13	14	4
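The balance and orthogonality claimed for the calibration design can be checked directly from the Table 1 values. The sketch below is one way to do this; the coding of each analyte's five sorted concentrations onto −2 to +2 follows the description in the text, while the helper names are illustrative.

```python
from collections import Counter

# Calibration mixtures (NIR, RIT, NHC in µg/mL) transcribed from Table 1.
CALIBRATION = [
    (15, 10, 6), (15, 1, 2), (5, 1, 10), (5, 20, 4), (25, 5, 10),
    (10, 20, 6), (25, 10, 4), (15, 5, 4), (10, 5, 8), (10, 15, 10),
    (20, 20, 8), (25, 15, 6), (20, 10, 10), (15, 20, 10), (25, 20, 2),
    (25, 1, 8), (5, 15, 2), (20, 1, 6), (5, 10, 8), (15, 15, 8),
    (20, 15, 4), (20, 5, 2), (10, 1, 4), (5, 5, 6), (10, 10, 2),
]

def coded(column):
    """Map the five sorted concentration levels of one analyte to -2..+2."""
    levels = sorted(set(column))
    return [levels.index(v) - 2 for v in column]

columns = [coded([row[j] for row in CALIBRATION]) for j in range(3)]

# Balance: how many times each coded level occurs per analyte.
level_counts = [Counter(col) for col in columns]

# Orthogonality: cross-products of coded levels for each analyte pair.
# Each coded column sums to zero, so a zero cross-product means zero correlation.
cross_products = {
    (a, b): sum(x * y for x, y in zip(columns[a], columns[b]))
    for a, b in [(0, 1), (0, 2), (1, 2)]
}
print(level_counts)
print(cross_products)
```

Each level occurs five times per analyte and all three pairwise cross-products vanish, which is the sense in which the 25-mixture subset is balanced and mutually uncorrelated.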


Validation set design

Reliable evaluation of chemometric predictive performance requires validation datasets that adequately represent the entire analytical design space. Conventional random sampling strategies often result in uneven sample distribution, which can bias performance estimation and compromise model generalizability. To address this limitation, validation sets were generated using the MCA. This approach systematically identifies the most informative validation samples by maximizing the determinant of the information matrix (|X′X|), which reduces the variance of the regression coefficients and enhances statistical robustness. The optimization workflow (Fig. 4) involved defining the experimental domain and model structure, initializing a candidate design matrix, and iteratively exchanging candidate samples to improve D-optimality until convergence on an information-maximized subset was achieved. For each matrix, this procedure produced 13 optimally distributed validation mixtures. The three-dimensional scatter plot (Fig. 5) demonstrates the uniform spatial distribution of validation samples in comparison to calibration mixtures across the NIR, RIT, and NHC concentration ranges. This ensures that the design space is covered in its entirety, including both the central and boundary regions. Complementary visualization using the parallel coordinate plot (Fig. 6) additionally confirms the validation mixtures’ representativeness, with concentration trajectories closely aligned with those of the calibration set, a feature particularly important in systems influenced by inter-analyte interactions and concentration variability. Importantly, this validation design strategy was executed independently for plasma and pharmaceutical matrices, generating two matrix-specific validation datasets tailored to their respective compositional complexities.
This dual-pathway validation framework enabled direct comparison of model robustness across biologically complex and excipient-rich environments. Compared with conventional random or uniform sampling, the candexch-derived D-optimal design improves predictive accuracy, enhances representativeness of the experimental domain, and reduces the number of required validation experiments while supporting green analytical chemistry principles through minimized material consumption. Collectively, this structured validation strategy provides a statistically rigorous and resource-efficient framework for chemometric model assessment, ensuring reliable predictive performance across diverse pharmaceutical and bioanalytical applications.
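The row-exchange idea described above (start from a candidate subset, swap points while |X′X| increases) can be sketched in plain Python. This is a simplified illustration and not the MATLAB candexch routine itself: the linear model with intercept, the coded 5 × 5 × 5 candidate grid, the random starting design, and the stopping rule are all assumptions made for the example.

```python
import itertools
import random

def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n, d = len(a), 1.0
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(a[r][i]))
        if abs(a[p][i]) < 1e-12:
            return 0.0
        if p != i:
            a[i], a[p] = a[p], a[i]
            d = -d
        d *= a[i][i]
        for r in range(i + 1, n):
            f = a[r][i] / a[i][i]
            for c in range(i, n):
                a[r][c] -= f * a[i][c]
    return d

def d_criterion(rows):
    """|X'X| for an assumed first-order model with intercept."""
    X = [[1.0, *r] for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    return det(xtx)

def exchange(candidates, n_select, seed=0):
    """Swap design rows for candidate rows while the determinant keeps improving."""
    rng = random.Random(seed)
    design = rng.sample(candidates, n_select)
    start = best = d_criterion(design)
    improved = True
    while improved:
        improved = False
        for i in range(n_select):
            for cand in candidates:
                trial = design[:i] + [cand] + design[i + 1:]
                value = d_criterion(trial)
                if value > best * (1 + 1e-9):
                    design, best, improved = trial, value, True
    return design, start, best

# Candidate set: every combination of coded levels -2..+2 for three analytes.
candidates = list(itertools.product(range(-2, 3), repeat=3))
design, start, best = exchange(candidates, 13)
print(len(design), start, best)
```

With a purely first-order model the exchange tends to favour boundary points; the model terms supplied to the algorithm therefore shape which candidates are selected, and those terms are not specified here.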

Fig. 4 — Flowchart of MCA steps for D-optimal design

Fig. 5 — Three-dimensional visualization of the MCA-designed validation dataset showing optimal coverage of the multivariate experimental domain

Fig. 6 — Parallel coordinate diagram comparing analyte concentration patterns between calibration and external validation samples

PCR model

PCR combines PCA-based dimensionality reduction with multiple linear regression to address the multicollinearity frequently found in strongly correlated spectrophotometric datasets, as shown in (Fig. 7). In the present study, the spectral matrix obtained within the analytical window of 205–300 nm (96 wavelength variables) was used as the predictor matrix (X), while analyte concentrations served as the response matrix (Y). PCR modeling was performed using MATLAB with custom chemometric scripts.

Fig. 7 — Principal Component Regression workflow diagram

Prior to model construction, spectral data were preprocessed to enhance signal quality and minimize non-informative variability. Preprocessing included Savitzky–Golay smoothing (second-order polynomial, 11-point window) to reduce instrumental noise, followed by baseline correction and mean-centering of variables. Mean-centering eliminated systematic offsets across wavelength variables, so that the PCA decomposition described variation about the mean spectrum rather than the mean spectrum itself.
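Two of the preprocessing steps can be sketched briefly. The weights below are the standard tabulated Savitzky–Golay coefficients for a quadratic fit over 11 points; leaving the five points at each edge unsmoothed is an assumption, and baseline correction is omitted because the method used is not specified in the text.

```python
# Standard Savitzky-Golay smoothing weights: quadratic fit, 11-point window.
SG_WEIGHTS = [w / 429 for w in (-36, 9, 44, 69, 84, 89, 84, 69, 44, 9, -36)]

def sg_smooth(signal):
    """Smooth one spectrum; the five points at each edge are left unchanged (assumption)."""
    half = len(SG_WEIGHTS) // 2
    out = list(signal)
    for i in range(half, len(signal) - half):
        window = signal[i - half:i + half + 1]
        out[i] = sum(w * v for w, v in zip(SG_WEIGHTS, window))
    return out

def mean_center(matrix):
    """Subtract each wavelength's column mean; rows are samples."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    centered = [[v - m for v, m in zip(row, means)] for row in matrix]
    return centered, means

# A quadratic signal passes through a second-order SG filter unchanged.
quadratic = [0.5 * x * x - 3.0 * x + 2.0 for x in range(30)]
smoothed = sg_smooth(quadratic)
centered, means = mean_center([[1.0, 4.0], [3.0, 8.0]])
print(centered, means)
```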

Separate PCR models were constructed for the pharmaceutical and plasma matrices using their respective calibration datasets (n = 25 mixtures per matrix) generated according to the structured experimental design described in “Design of experiments and chemometric sample space optimization” section. The PCA step decomposed the spectral matrix into orthogonal latent variables (LVs) that capture the dominant sources of spectral variance. In the final regression step, these latent variables were used as predictors to model the relationship between spectral response and concentration for each analyte.

The optimal number of latent variables was determined exclusively using leave-one-out cross-validation (LOO-CV) applied to the calibration dataset, with RMSECV serving as the selection criterion. During cross-validation, each calibration sample was sequentially excluded from the model and predicted using the remaining samples. Model complexity was optimized by selecting the number of latent variables corresponding to the minimum RMSECV. Additional latent variables were not retained once the decrease in RMSECV became negligible or when the inclusion of extra components failed to provide statistically meaningful improvement in predictive accuracy. This criterion served as an explicit safeguard against model overfitting. Importantly, the external validation dataset was not used during latent variable selection and was reserved exclusively for independent evaluation of predictive performance.
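The leave-one-out selection of latent variables can be illustrated with a small, dependency-free PCR sketch. It is not the authors' MATLAB script: components are extracted with the NIPALS iteration, the spectra are synthetic (hypothetical Gaussian bands sampled every 5 nm with added noise), and the simple minimum-RMSECV rule is used without the additional parsimony check described above.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fit_pcr(X, y, n_comp, n_iter=100, tol=1e-12):
    """PCR by NIPALS: returns (x_mean, y_mean, loadings, score coefficients)."""
    n = len(X)
    x_mean = [sum(col) / n for col in zip(*X)]
    y_mean = sum(y) / n
    E = [[v - m for v, m in zip(row, x_mean)] for row in X]  # mean-centred spectra
    r = [v - y_mean for v in y]                              # centred response
    loadings, coefs = [], []
    for _ in range(n_comp):
        p = max(E, key=lambda row: dot(row, row))            # start from largest row
        for _ in range(n_iter):
            t = [dot(row, p) for row in E]                   # scores t = E p
            p_new = [dot(col, t) for col in zip(*E)]         # direction E' t
            norm = math.sqrt(dot(p_new, p_new))
            p_new = [v / norm for v in p_new]
            change = sum((a - b) ** 2 for a, b in zip(p, p_new))
            p = p_new
            if change < tol:
                break
        t = [dot(row, p) for row in E]
        q = dot(t, r) / dot(t, t)                            # regress response on score
        loadings.append(p)
        coefs.append(q)
        E = [[v - ti * pj for v, pj in zip(row, p)] for row, ti in zip(E, t)]  # deflate
        r = [ri - q * ti for ri, ti in zip(r, t)]
    return x_mean, y_mean, loadings, coefs

def predict_pcr(model, x, n_comp):
    """Predict one sample using the first n_comp components of a fitted model."""
    x_mean, y_mean, loadings, coefs = model
    e = [v - m for v, m in zip(x, x_mean)]
    y_hat = y_mean
    for p, q in zip(loadings[:n_comp], coefs[:n_comp]):
        t = dot(e, p)
        y_hat += q * t
        e = [v - t * pj for v, pj in zip(e, p)]
    return y_hat

def rmsecv_curve(X, y, max_comp):
    """Leave-one-out RMSECV for models with 1..max_comp components."""
    sq = [0.0] * max_comp
    for i in range(len(X)):
        model = fit_pcr(X[:i] + X[i + 1:], y[:i] + y[i + 1:], max_comp)
        for k in range(1, max_comp + 1):
            sq[k - 1] += (predict_pcr(model, X[i], k) - y[i]) ** 2
    return [math.sqrt(s / len(X)) for s in sq]

# Synthetic calibration set: hypothetical band shapes, random concentrations, noise.
rng = random.Random(1)
wl = range(205, 301, 5)

def band(center, width):
    return [math.exp(-0.5 * ((w - center) / width) ** 2) for w in wl]

pure = [band(210, 8), band(245, 15), band(240, 18)]
conc = [[rng.uniform(5, 25), rng.uniform(1, 20), rng.uniform(2, 10)] for _ in range(25)]
X = [[sum(c * 0.03 * s[j] for c, s in zip(row, pure)) + rng.gauss(0, 0.002)
      for j in range(len(wl))] for row in conc]
y = [row[0] for row in conc]                                 # first analyte as response

curve = rmsecv_curve(X, y, 4)
best_k = 1 + min(range(4), key=lambda k: curve[k])
print(curve, best_k)
```

Because the left-out sample never enters the fit that predicts it, the curve estimates predictive rather than fitting error, which is what makes it usable for choosing model complexity without touching the external validation set.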

For the pharmaceutical matrix, the optimal PCR models required four latent variables for NIR (RMSECV = 0.141 µg mL⁻¹) and three latent variables for RIT (RMSECV = 0.071 µg mL⁻¹), collectively explaining 98.7% of the total spectral variance (Fig. S2). In the plasma matrix, the increased spectral complexity associated with endogenous matrix components required five latent variables for NIR (RMSECV = 0.176 µg mL⁻¹) and four latent variables for both RIT (RMSECV = 0.082 µg mL⁻¹) and NHC (RMSECV = 0.051 µg mL⁻¹) (Fig. S3).

Inspection of PCA loading profiles revealed matrix-dependent wavelength contributions. The NIR models showed dominant loadings within the 210–230 nm region, corresponding to the primary absorption band of the molecule, whereas RIT and NHC exhibited broader loading contributions across 240–280 nm, reflecting their characteristic spectral absorption envelopes. These loading patterns confirm that the PCR models successfully captured chemically meaningful spectral information relevant to each analyte.

Overall, PCR provided reliable baseline predictive performance for the overlapped ternary system investigated in this study. However, the method required a relatively larger number of latent variables compared with variable-selection-based chemometric algorithms. This observation reflects the inherent limitation of full-spectrum regression approaches, where the inclusion of non-informative spectral variables may increase model complexity and reduce predictive efficiency in highly collinear spectral datasets.

GA-PLS model

GA-PLS was employed to perform adaptive wavelength selection prior to multivariate calibration in order to reduce spectral redundancy and improve predictive performance. The method combines the variable selection capability of a genetic algorithm (GA) with the regression power of PLS modeling, as shown in (Fig. 8). All computations were performed in MATLAB using custom chemometric routines.

Fig. 8 — Schematic workflow of the Genetic Algorithm–Partial Least Squares (GA-PLS) modeling procedure used for wavelength selection and multivariate calibration

The predictor matrix consisted of the preprocessed spectral dataset (205–300 nm, 96 wavelength variables) described previously. Each chromosome in the genetic algorithm represented a binary selection vector encoding candidate subsets of spectral variables, where a value of “1” indicated inclusion of a wavelength variable in the PLS model and “0” indicated exclusion. This encoding allowed the GA to explore multiple combinations of spectral variables during the optimization process.

GA optimization was performed independently for each analyte and for each analytical matrix. Initial chromosome populations consisted of 40, 50, and 46 individuals for NIR, RIT, and NHC, respectively (Table 2). The evolutionary process was conducted using tournament selection, crossover probability of 0.8, and mutation probability of 0.005. These GA parameters were selected based on preliminary optimization trials and literature-reported settings to balance exploration of the solution space with convergence stability.

Table 2.

Optimized parameters of GA chosen as the variable selection method to improve the models’ ability to make predictions

Parameter	NIR (best value)	RIT (best value)	NHC (best value)
Population size	40	50	46
Maximum generations	41	60	38
Mutation rate	0.005	0.005	0.005
Initial wavelength inclusion (%)	15	15	15
Spectral window width	2	2	2
Convergence population (%)	80	80	80
Crossover type	Double	Double	Double
Maximum latent variables	3	3	3
Cross-validation strategy	Random	Random	Random
Cross-validation subsets	5	5	5
Cross-validation iterations	2	2	2


The fitness function used to evaluate candidate solutions was the RMSECV obtained from LOO-CV applied exclusively to the calibration dataset. In each iteration, the selected wavelength subset was used to construct a PLS model, and its predictive performance was evaluated using LOO-CV. The genetic algorithm iteratively updated chromosome populations until the RMSECV objective function stabilized. Convergence was reached after 41, 60, and 38 generations for NIR, RIT, and NHC, respectively. Importantly, wavelength selection and GA optimization were performed entirely within the calibration dataset, and the external validation set was not used at any stage of the optimization process.
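The GA loop described above can be sketched as follows. The crossover and mutation probabilities, population size, generation count, and 15% initial inclusion are the values reported for NIR; the tournament size of two and the `toy_fitness` function are assumptions. In particular, `toy_fitness` is only a placeholder that counts mismatches against a hypothetical target mask; in the study the fitness is the cross-validated RMSECV of a PLS model built on the selected wavelengths.

```python
import random

N_VARS = 96      # wavelengths in the 205-300 nm window
P_CROSS = 0.8    # crossover probability quoted in the text
P_MUT = 0.005    # per-bit mutation probability quoted in the text

# Hypothetical "informative" region used only by the placeholder fitness below.
TARGET = [1 if i < 26 else 0 for i in range(N_VARS)]

def toy_fitness(chrom):
    """Placeholder for the RMSECV of a PLS model on the selected wavelengths."""
    return sum(c != t for c, t in zip(chrom, TARGET))

def random_chromosome(rng, p_include=0.15):
    """Binary mask; 1 keeps a wavelength (about 15% initially, as in Table 2)."""
    return [1 if rng.random() < p_include else 0 for _ in range(N_VARS)]

def tournament(pop, scores, rng, k=2):
    """Return the lowest-scoring chromosome among k random contestants."""
    picks = rng.sample(range(len(pop)), k)
    return pop[min(picks, key=lambda i: scores[i])]

def double_crossover(a, b, rng):
    """Two-point crossover applied with probability P_CROSS."""
    if rng.random() >= P_CROSS:
        return a[:]
    i, j = sorted(rng.sample(range(1, N_VARS), 2))
    return a[:i] + b[i:j] + a[j:]

def mutate(chrom, rng):
    """Flip each bit independently with probability P_MUT."""
    return [1 - g if rng.random() < P_MUT else g for g in chrom]

def evolve(fitness, pop_size=40, generations=41, seed=0):
    """Return the best chromosome evaluated during the run and its score."""
    rng = random.Random(seed)
    pop = [random_chromosome(rng) for _ in range(pop_size)]
    best, best_score = None, float("inf")
    for _ in range(generations):
        scores = [fitness(c) for c in pop]
        for c, s in zip(pop, scores):
            if s < best_score:
                best, best_score = c[:], s
        pop = [mutate(double_crossover(tournament(pop, scores, rng),
                                       tournament(pop, scores, rng), rng), rng)
               for _ in range(pop_size)]
    return best, best_score

best, best_score = evolve(toy_fitness)
print(sum(best), best_score)
```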

The optimized GA-PLS models retained 37% and 42% of the original spectral variables for NIR and RIT in pharmaceutical samples, while 34%, 39%, and 41% of variables were retained for NIR, RIT, and NHC in plasma, respectively. The selected wavelength regions corresponded primarily to chemically informative absorption bands, particularly 205–230 nm for NIR and 240–280 nm for RIT and NHC. Following variable selection, PLS regression models were constructed using the selected wavelength subsets. The optimal number of latent variables was determined using LOO-CV applied to the calibration dataset, selecting the model corresponding to the minimum RMSECV while avoiding unnecessary increases in model complexity. This strategy served as a safeguard against overfitting. The external validation dataset was subsequently used exclusively for independent evaluation of predictive performance. The optimized GA-PLS models required three latent variables for pharmaceutical samples and four latent variables for plasma samples. Compared with the PCR baseline model, GA-PLS significantly improved predictive performance. RMSECV values were 0.112 µg mL⁻¹ for NIR and 0.051 µg mL⁻¹ for RIT in pharmaceutical samples (Fig. S2), and 0.139 µg mL⁻¹ for NIR, 0.066 µg mL⁻¹ for RIT, and 0.049 µg mL⁻¹ for NHC in plasma samples (Fig. S3).

These results demonstrate that GA-based wavelength selection effectively reduces spectral dimensionality while retaining chemically relevant information, thereby improving predictive accuracy and robustness of the multivariate calibration models.

FA-PLS model

The FA-PLS model was used to perform swarm-intelligence-based wavelength selection prior to multivariate calibration. The firefly algorithm (FA) is a nature-inspired metaheuristic optimization technique in which individual fireflies represent candidate solutions and move within the search space according to their relative brightness, which reflects the quality of the solution, as shown in (Fig. 9). In the present study, each firefly encoded a binary vector representing a candidate subset of wavelength variables selected from the preprocessed spectral matrix (205–300 nm, 96 variables).

Fig. 9 — Schematic workflow of the Firefly Algorithm–Partial Least Squares (FA-PLS) modeling procedure used for wavelength selection and multivariate calibration

All computations were performed in MATLAB using custom chemometric routines. Spectral preprocessing steps were identical to those described previously (Savitzky–Golay smoothing, baseline correction, and mean-centering). Separate FA-PLS models were developed independently for the pharmaceutical and plasma matrices in order to account for matrix-specific spectral characteristics.

For each analyte, firefly populations of 41, 37, and 40 individuals were initialized for NIR, RIT, and NHC, respectively (Table 3). The optimization process was conducted for a maximum of 200 generations. Randomization parameters (α) were set to 0.1, 0.2, and 0.1 for NIR, RIT, and NHC, respectively, to introduce stochastic exploration of the solution space. The attractiveness coefficient (β₀) and light absorption coefficient (γ) were both set to 1.0, controlling the intensity of attraction among fireflies and regulating convergence behavior of the swarm. These FA parameters were selected based on preliminary optimization trials and literature-reported values to balance exploration of the solution space with convergence stability.

Table 3.

Tuned Firefly Algorithm configuration used for wavelength selection and predictive optimization of PLS models

Parameter	NIR	RIT	NHC
Number of fireflies	41	37	40
Maximum number of generations	200	200	200
Randomization parameter (α)	0.1	0.2	0.1
Attractiveness coefficient (β₀)	1	1	1
Absorption coefficient (γ)	1	1	1


The fitness function used to evaluate candidate wavelength subsets was defined as RMSECV obtained using LOO-CV applied to the calibration dataset. In each iteration, the selected wavelength subset was used to construct a PLS model, and its predictive performance was evaluated using cross-validation. Fireflies with lower RMSECV values were considered brighter and attracted neighboring fireflies toward improved variable combinations. Importantly, wavelength selection and FA optimization were performed entirely within the calibration dataset, and the external validation set was not used at any stage of the optimization process.
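A binary firefly search along these lines might look like the sketch below. The attractiveness law β(r) = β₀·exp(−γr²) with β₀ = γ = 1 follows the reported settings, but the text does not describe how firefly moves were mapped onto binary wavelength masks. The bit-copy rule, the use of normalized Hamming distance for r, the α scaling, and `toy_fitness` (a stand-in for RMSECV) are therefore all assumptions, and the demonstration run uses a much smaller swarm and fewer generations than Table 3 to keep it fast.

```python
import math
import random

BETA0, GAMMA = 1.0, 1.0   # attractiveness and absorption coefficients (Table 3)

def attractiveness(r):
    """beta(r) = beta0 * exp(-gamma * r^2)."""
    return BETA0 * math.exp(-GAMMA * r * r)

def hamming_fraction(a, b):
    """Fraction of positions at which two binary masks differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def move_towards(dimmer, brighter, alpha, rng):
    """Copy differing bits from the brighter firefly with probability beta, then
    flip bits at random with probability alpha / length (hypothetical scaling)."""
    beta = attractiveness(hamming_fraction(dimmer, brighter))
    moved = [b if (d != b and rng.random() < beta) else d
             for d, b in zip(dimmer, brighter)]
    flip = alpha / len(moved)
    return [1 - g if rng.random() < flip else g for g in moved]

def firefly_search(fitness, n_fireflies, n_vars, generations, alpha, seed=0):
    """Return the brightest (lowest-fitness) firefly after the search."""
    rng = random.Random(seed)
    swarm = [[rng.randint(0, 1) for _ in range(n_vars)] for _ in range(n_fireflies)]
    scores = [fitness(f) for f in swarm]
    for _ in range(generations):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if scores[j] < scores[i]:          # lower error means brighter
                    swarm[i] = move_towards(swarm[i], swarm[j], alpha, rng)
                    scores[i] = fitness(swarm[i])
    k = min(range(n_fireflies), key=lambda i: scores[i])
    return swarm[k], scores[k]

# Placeholder fitness: mismatches against a hypothetical informative region.
TARGET = [1 if i < 26 else 0 for i in range(96)]

def toy_fitness(mask):
    return sum(m != t for m, t in zip(mask, TARGET))

best, best_score = firefly_search(toy_fitness, n_fireflies=10, n_vars=96,
                                  generations=20, alpha=0.1)
print(sum(best), best_score)
```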

The FA optimization procedure effectively reduced spectral dimensionality by retaining only the most informative wavelengths. In the pharmaceutical dataset, 40% and 48% of the original spectral variables were retained for NIR and RIT, respectively. In the plasma dataset, variable retention increased slightly to 45%, 53%, and 68% for NIR, RIT, and NHC, reflecting the higher spectral complexity associated with biological matrix components.

Following variable selection, final PLS regression models were constructed using the selected wavelength subsets. The optimal number of latent variables was determined using cross-validation (LOO-CV) applied to the calibration dataset, selecting the model corresponding to the minimum RMSECV while avoiding unnecessary increases in model complexity. All FA-PLS models achieved optimal predictive performance with three latent variables, providing a suitable balance between model simplicity and predictive accuracy and serving as an additional safeguard against overfitting. The external validation dataset was subsequently used exclusively for independent evaluation of predictive performance. Compared with PCR and GA-PLS, the FA-PLS approach produced lower cross-validation errors. RMSECV values for the pharmaceutical matrix were 0.091 µg mL⁻¹ for NIR and 0.042 µg mL⁻¹ for RIT (Fig. S2). In the plasma matrix, RMSECV values were 0.119 µg mL⁻¹ for NIR, 0.053 µg mL⁻¹ for RIT, and 0.032 µg mL⁻¹ for NHC (Fig. S3). These results indicate that FA-PLS improves predictive accuracy relative to PCR and GA-PLS by effectively selecting chemically informative spectral regions while maintaining a compact latent-variable structure.
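The size of these improvements can be expressed as a percentage reduction in RMSECV relative to the PCR baseline. The short calculation below uses only the pharmaceutical-matrix values quoted in the text; the helper name is illustrative.

```python
# RMSECV values (µg/mL) quoted in the text for the pharmaceutical matrix.
RMSECV = {
    "NIR": {"PCR": 0.141, "GA-PLS": 0.112, "FA-PLS": 0.091},
    "RIT": {"PCR": 0.071, "GA-PLS": 0.051, "FA-PLS": 0.042},
}

def percent_reduction(baseline, value):
    """Relative decrease of `value` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - value) / baseline

for analyte, values in RMSECV.items():
    for model in ("GA-PLS", "FA-PLS"):
        reduction = percent_reduction(values["PCR"], values[model])
        print(f"{analyte} {model}: {reduction:.1f}% lower RMSECV than PCR")
```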

MCR-ALS model

MCR-ALS is a bilinear-decomposition technique that enables simultaneous resolution of concentration and spectral profiles from mixed analytical signals. Unlike regression-based calibration approaches, MCR-ALS decomposes the experimental data matrix into two bilinear matrices corresponding to pure spectral profiles and concentration contributions, as shown in (Fig. 10). In this study, MCR-ALS modeling was carried out in the MATLAB environment using the MCR-ALS toolbox, ensuring transparent and reproducible implementation [33, 34]. The experimental spectral matrix (X) consisted of preprocessed UV spectra recorded within the analytical window of 205–300 nm (96 wavelength variables). Separate MCR-ALS models were constructed for the plasma and pharmaceutical matrices using their respective calibration datasets (n = 25 mixtures per matrix). The plasma dataset contained three analytes (NIR, RIT, and NHC), whereas the pharmaceutical dataset contained two analytes (NIR and RIT) in accordance with the formulation composition. All MCR-ALS modeling steps, including component estimation, initialization, and iterative optimization, were performed exclusively using the calibration dataset, while the external validation set was reserved for independent evaluation of predictive performance.

Fig. 10 — Schematic workflow of the multivariate curve resolution–alternating least squares (MCR-ALS) modeling procedure used for spectral decomposition and quantitative analysis

The number of components in the bilinear model was estimated using Evolving Factor Analysis (EFA) applied to the calibration datasets. A logarithmic eigenvalue threshold of −4 was applied to distinguish significant chemical factors from noise contributions. This analysis confirmed the presence of three significant factors in the plasma dataset and two factors in the pharmaceutical dataset, consistent with the expected chemical composition of each matrix.

The EFA results provided initial estimates of the spectral and concentration profiles, which were then refined by alternating least-squares (ALS) optimization. During each ALS iteration, the spectral and concentration matrices were updated sequentially until convergence of the objective function was achieved.

The following constraint set was applied during ALS optimization to ensure chemically meaningful solutions:

Non-negativity constraints applied to both spectral and concentration matrices to maintain physically realistic profiles.
Correlation constraints aligned with the known experimental design structure of the calibration mixtures.
Spectral shape constraints to prevent distortion of characteristic absorbance bands.

Model convergence was assessed by monitoring the relative change in the lack-of-fit (LOF) function across consecutive iterations. Iterations were terminated when the relative improvement in LOF fell below the predefined threshold or when the maximum iteration number was reached. This procedure served as a safeguard against unnecessary model complexity and potential overfitting.
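The alternating updates and the LOF-based stopping rule can be sketched in a few lines. This is a deliberately reduced illustration, not the MCR-ALS toolbox: only the non-negativity constraint is applied (by clipping), the correlation and spectral shape constraints are omitted, the tolerance and iteration cap are arbitrary, and the two-component data are synthetic, built from hypothetical Gaussian bands and the first six NIR/RIT pairs of Table 1.

```python
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def solve(G, R):
    """Solve G Z = R by Gauss-Jordan elimination (G small, square, non-singular)."""
    n = len(G)
    M = [list(g) + list(r) for g, r in zip(G, R)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        piv = M[i][i]
        M[i] = [v / piv for v in M[i]]
        for r in range(n):
            if r != i:
                f = M[r][i]
                M[r] = [a - f * b for a, b in zip(M[r], M[i])]
    return [row[n:] for row in M]

def clip(A):
    """Non-negativity imposed by setting negative entries to zero."""
    return [[max(v, 0.0) for v in row] for row in A]

def lack_of_fit(D, C, St):
    """LOF (%) = 100 * sqrt(sum of squared residuals / sum of squared data)."""
    fit = matmul(C, St)
    num = sum((d - f) ** 2 for dr, fr in zip(D, fit) for d, f in zip(dr, fr))
    den = sum(d * d for dr in D for d in dr)
    return 100.0 * math.sqrt(num / den)

def mcr_als(D, St, max_iter=50, tol=1e-4):
    """Resolve D (samples x wavelengths) into non-negative C and St with D ~ C St."""
    prev = None
    for it in range(1, max_iter + 1):
        S = transpose(St)
        # Concentrations: solve (St S) C' = St D' for C'.
        C = clip(transpose(solve(matmul(St, S), matmul(St, transpose(D)))))
        Ct = transpose(C)
        # Spectra: solve (C' C) St = C' D.
        St = clip(solve(matmul(Ct, C), matmul(Ct, D)))
        lof = lack_of_fit(D, C, St)
        if prev is not None and abs(prev - lof) <= tol * max(prev, 1e-12):
            break
        prev = lof
    return C, St, lof, it

wl = range(205, 301, 5)

def band(center, width):
    return [math.exp(-0.5 * ((w - center) / width) ** 2) for w in wl]

S_true = [band(210, 8), band(245, 15)]            # hypothetical pure profiles
C_true = [[15, 10], [15, 1], [5, 1], [5, 20], [25, 5], [10, 20]]
D = matmul(C_true, S_true)
St0 = [band(212, 9), band(243, 14)]               # deliberately imperfect start
C, St, lof, n_iter = mcr_als(D, St0)
print(round(lof, 4), n_iter)
```

Without the additional constraints the resolved profiles are subject to scaling and rotational ambiguity, which is one reason the study anchors the decomposition to the known calibration design.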

The plasma MCR-ALS model converged after 18 ALS iterations, producing a coefficient of determination (R² = 0.9999) and a lack-of-fit value of 0.0071%. The pharmaceutical model converged after 15 iterations, yielding R² = 0.9998 and a LOF value of 0.0084%, reflecting the lower spectral complexity of the formulation matrix.

The spectral profiles recovered from the MCR-ALS decomposition showed remarkable concordance with experimentally obtained reference spectra (Fig. 11). The resolved NIR spectrum exhibited a dominant absorption maximum near 210 nm, while RIT exhibited a broader absorption band centered near 245 nm. The NHC spectrum, resolved exclusively in the plasma model, showed a characteristic absorption feature around 235 nm. This strong agreement confirms that the MCR-ALS algorithm successfully resolved chemically meaningful spectral profiles.

Fig. 11 — Absorption spectra and MCR-ALS resolved spectra for analytes in the plasma matrix (A, B, C) and pharmaceutical matrix (D, E)

Overall, the MCR-ALS models demonstrated excellent fitting performance and reliable concentration prediction for both biological and pharmaceutical datasets. The ability of MCR-ALS to simultaneously resolve spectral and concentration profiles provides an additional level of analytical interpretability compared with purely regression-based chemometric approaches. This capability proved particularly advantageous in the highly overlapped ternary spectral system investigated in this work, where endogenous matrix contributions may otherwise compromise univariate analysis. Among the investigated chemometric models, MCR-ALS exhibited the most reliable overall analytical performance, combining low prediction errors with chemically interpretable spectral resolution.

Validation of the chemometric spectrophotometric methods

A comprehensive validation study was performed to assess the analytical reliability of the proposed chemometric spectrophotometric methodology for the concurrent quantification of NIR, RIT, and NHC in pharmaceutical formulations and matrix-matched human plasma samples. Method performance was assessed using multiple complementary validation parameters including linearity (intercept, slope, range, and coefficient of determination), sensitivity (LOD and LOQ), calibration and prediction errors (RMSEC, RMSEP, RRMSEP, BCRMSEP, and SEC), accuracy, precision, and robustness, in accordance with widely accepted validation principles for multivariate analytical methods [27, 35]. To ensure unbiased evaluation, calibration-related metrics (e.g., RMSEC and RMSECV) were derived from the calibration dataset, whereas predictive performance metrics (e.g., RMSEP, RRMSEP, and BCRMSEP) were calculated exclusively using the independent external validation dataset. Explicit acceptance criteria were defined prior to evaluation, taking into account the different analytical contexts of pharmaceutical formulations and fortified plasma matrices.

For pharmaceutical formulation analysis, which represents a controlled analytical environment with minimal matrix interference, acceptable analytical performance was defined as:

Linearity: coefficient of determination (r²) ≥ 0.999
Accuracy: recovery within 98–102%
Precision: intra-day and inter-day %RSD ≤ 2%
Prediction error: RMSEP values substantially smaller than the investigated concentration ranges
Sensitivity: LOD and LOQ below the lowest calibration levels

For plasma matrix investigations, slightly broader acceptance limits were adopted due to the increased spectral complexity and potential endogenous interference associated with biological matrices. Because the plasma experiments were performed using matrix-matched fortified plasma rather than authentic clinical specimens, the validation was conducted as an analytical feasibility assessment rather than full regulatory bioanalytical validation. Accordingly, acceptable performance was defined as:

Linearity: r² ≥ 0.99
Accuracy: recovery within 95–105%
Precision: %RSD ≤ 5%
Prediction errors: RMSEP values significantly smaller than the investigated concentration ranges
Sensitivity: LOD and LOQ below the lowest investigated concentration levels
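
The two sets of acceptance limits can be encoded as a simple lookup so that a result set can be screened against them. The sketch below is illustrative only: the dictionary layout, the function name, and the example inputs are assumptions rather than part of the reported workflow; only the numeric limits are taken from the lists above.

```python
# Acceptance limits as listed above; the data layout and names are illustrative.
CRITERIA = {
    "formulation": {"r2_min": 0.999, "recovery": (98.0, 102.0), "rsd_max": 2.0},
    "plasma": {"r2_min": 0.99, "recovery": (95.0, 105.0), "rsd_max": 5.0},
}


def meets_criteria(matrix, r2, recovery_pct, rsd_pct):
    """Return pass/fail flags for one analyte measured in one matrix."""
    limits = CRITERIA[matrix]
    low, high = limits["recovery"]
    return {
        "linearity": r2 >= limits["r2_min"],
        "accuracy": low <= recovery_pct <= high,
        "precision": rsd_pct <= limits["rsd_max"],
    }


# Hypothetical example values, not taken from the study.
print(meets_criteria("plasma", r2=0.9995, recovery_pct=99.4, rsd_pct=0.7))
```

The RMSEP and sensitivity criteria are left out of this sketch because they are stated relative to the concentration range rather than as fixed numbers.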

Model performance was further assessed using calibration and prediction error metrics (RMSEC, RMSEP, RRMSEP, BCRMSEP, and SEC). Because these parameters depend on concentration ranges and measurement units, fixed universal thresholds are generally not defined. Instead, acceptable performance was considered when calibration and prediction errors were substantially smaller than the investigated concentration ranges and typically below 5–10% of the mean analyte concentration. Relative prediction errors (RRMSEP) below 10% were considered indicative of reliable multivariate calibration performance, while close agreement between RMSEP and BCRMSEP values confirmed minimal systematic bias. All investigated chemometric models satisfied these predefined acceptance criteria, demonstrating reliable analytical performance across both calibration and independent validation datasets.
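
As a minimal sketch of how these error metrics relate to one another, the function below computes RMSEP, bias, BCRMSEP, and RRMSEP from reference and predicted concentrations. The formulas are the commonly used definitions and are an assumption here, since they are not restated in this section; the function name and the example concentrations are hypothetical.

```python
import math


def prediction_errors(y_true, y_pred):
    """Compute prediction error metrics under commonly used definitions (assumed)."""
    n = len(y_true)
    residuals = [p - t for p, t in zip(y_pred, y_true)]
    rmsep = math.sqrt(sum(r * r for r in residuals) / n)
    bias = sum(residuals) / n
    # Bias-corrected RMSEP: the part of the error not explained by a constant offset.
    bcrmsep = math.sqrt(max(rmsep ** 2 - bias ** 2, 0.0))
    # Relative RMSEP as a percentage of the mean reference concentration.
    rrmsep = 100.0 * rmsep / (sum(y_true) / n)
    return {"RMSEP": rmsep, "bias": bias, "BCRMSEP": bcrmsep, "RRMSEP": rrmsep}


# Hypothetical validation concentrations in µg/mL, for illustration only.
reference = [5.0, 10.0, 15.0, 20.0]
predicted = [5.2, 9.9, 15.1, 19.8]
print(prediction_errors(reference, predicted))
```

When the bias is close to zero, RMSEP and BCRMSEP nearly coincide, which is the agreement the text uses as evidence of minimal systematic error.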

Performance evaluation in plasma matrix

Model validation in fortified plasma demonstrated progressive improvements in predictive accuracy across the evaluated algorithms (Table 4). Predictive performance was assessed using the independent external validation dataset. The PCR model provided baseline performance with RMSEP values of 0.398, 0.325, and 0.187 µg mL⁻¹ for NIR, RIT, and NHC, respectively, and correlation coefficients between 0.9341 and 0.9587. Although acceptable linear relationships were observed, the relatively higher prediction errors highlight the limitations of full-spectrum regression approaches when applied to strongly overlapped multicomponent spectral systems.

Table 4.

Performance evaluation of the suggested chemometric methods for determination of NIR, RIT, and NHC in human plasma using calibration and validation sample sets

	PCR			GA-PLS			FA-PLS			MCR-ALS
	NIR	RIT	NHC	NIR	RIT	NHC	NIR	RIT	NHC	NIR	RIT	NHC
Calibration set
MEAN	95.67	94.23	98.45	97.34	96.78	99.67	98.89	98.12	100.45	99.78	99.34	101.23
SD	1.234	1.567	0.892	1.056	1.234	0.745	0.823	0.967	0.634	0.567	0.689	0.456
%RSD	1.289	1.663	0.906	1.085	1.275	0.748	0.832	0.985	0.631	0.568	0.693	0.450
RMSEC^(a)	0.342	0.289	0.156	0.287	0.234	0.123	0.234	0.187	0.098	0.187	0.143	0.076
Validation set
MEAN	94.89	93.56	97.78	96.67	96.12	99.23	98.34	97.89	100.12	99.45	99.01	100.89
SD	1.456	1.789	1.023	1.234	1.456	0.867	0.967	1.123	0.734	0.678	0.823	0.534
%RSD	1.534	1.912	1.046	1.276	1.515	0.874	0.983	1.147	0.733	0.682	0.831	0.529
RMSEP^(b)	0.398	0.325	0.187	0.321	0.267	0.145	0.267	0.213	0.115	0.213	0.167	0.089

^a Root mean square error of calibration

^b Root mean square error of prediction

Application of wavelength-selection strategies improved predictive performance. The GA-PLS model reduced prediction errors to 0.321, 0.267, and 0.145 µg mL⁻¹ for NIR, RIT, and NHC, respectively, while maintaining correlation coefficients higher than 0.96 for all analytes. Further improvement was achieved with the FA-PLS model, which produced RMSEP values of 0.267, 0.213, and 0.115 µg mL⁻¹ with correlation coefficients above 0.98, demonstrating the effectiveness of swarm-intelligence optimization for selecting chemically informative spectral variables.

Of the assessed models, MCR-ALS exhibited the most reliable overall predictive performance, providing the lowest prediction errors with RMSEP of 0.213, 0.167, and 0.089 µg mL⁻¹ for NIR, RIT, and NHC, respectively, together with correlation coefficients higher than 0.999. The superior predictive capability of MCR-ALS arises from its bilinear decomposition framework, which simultaneously resolves concentration and spectral profiles while imposing chemically meaningful constraints. In addition to improved quantitative prediction, MCR-ALS enabled recovery of pure component spectral profiles, providing enhanced interpretability for the highly overlapped ternary system.
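
For reference, the bilinear model underlying MCR-ALS can be written as below. This is the generic textbook formulation, not a restatement of the exact settings used in this work: the symbols are assumptions, and non-negativity is shown only as a typical example of a chemically meaningful constraint. Here D is the matrix of mixture spectra (samples × wavelengths), C holds the concentration profiles, Sᵀ holds the pure component spectra, E is the residual matrix, and the superscript + denotes the pseudoinverse.

```latex
\begin{align}
\mathbf{D} &= \mathbf{C}\,\mathbf{S}^{\mathsf{T}} + \mathbf{E}, \\
\mathbf{S}^{\mathsf{T}} &\leftarrow \mathbf{C}^{+}\mathbf{D}, \qquad
\mathbf{C} \leftarrow \mathbf{D}\,\bigl(\mathbf{S}^{\mathsf{T}}\bigr)^{+}, \qquad
\mathbf{C} \ge 0,\ \mathbf{S} \ge 0, \\
\mathrm{LOF}\,(\%) &= 100\,\sqrt{\frac{\sum_{i,j} e_{ij}^{2}}{\sum_{i,j} d_{ij}^{2}}}.
\end{align}
```

The two update steps are alternated until the lack of fit (LOF) stops decreasing. The resolved rows of Sᵀ are what give the method its spectral interpretability.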

It should be emphasized that the plasma validation experiments were conducted using matrix-matched fortified plasma samples prepared under controlled laboratory conditions. While this experimental design allows rigorous evaluation of spectral interference and matrix effects, it does not fully represent the variability associated with authentic clinical samples. Therefore, the present results demonstrate the suitability of the proposed methodology for analytical investigations and preliminary bioanalytical screening in fortified plasma matrices, whereas routine therapeutic drug monitoring or pharmacokinetic applications would require additional validation using authentic clinical specimens.

Performance evaluation in pharmaceutical formulation analysis

Validation results for pharmaceutical formulations exhibited performance trends similar to those observed in the plasma matrix (Table 5). As with the plasma evaluation, predictive performance was assessed using an independent external validation dataset. The PCR model demonstrated acceptable baseline performance with RMSEP of 0.267 µg mL⁻¹ for NIR and 0.221 µg mL⁻¹ for RIT, although the relatively higher prediction errors indicate potential limitations of full-spectrum regression in highly overlapped spectral systems.

Table 5.

Comparative evaluation of chemometric model performance for determination of NIR and RIT in dosage forms based on calibration and validation sample sets

	PCR		GA-PLS		FA-PLS		MCR-ALS
	NIR	RIT	NIR	RIT	NIR	RIT	NIR	RIT
Calibration set
MEAN	99.23	98.87	99.67	99.34	100.12	99.78	100.45	100.23
SD	0.634	0.723	0.523	0.578	0.423	0.467	0.321	0.356
%RSD	0.639	0.731	0.525	0.582	0.422	0.468	0.320	0.355
RMSEC^(a)	0.234	0.198	0.187	0.154	0.143	0.121	0.098	0.087
Validation set
MEAN	98.89	98.45	99.34	99.12	99.89	99.56	100.23	100.01
SD	0.756	0.834	0.623	0.689	0.501	0.567	0.389	0.423
%RSD	0.764	0.847	0.627	0.695	0.501	0.569	0.388	0.423
RMSEP^(b)	0.267	0.221	0.213	0.176	0.167	0.139	0.113	0.102

^a Root mean square error of calibration

^b Root mean square error of prediction

Application of wavelength-selection algorithms improved predictive performance relative to PCR. GA-PLS yielded RMSEP values of 0.213 and 0.176 µg mL⁻¹ for NIR and RIT, respectively, and FA-PLS yielded 0.167 and 0.139 µg mL⁻¹ (Table 5). Both models reduced spectral redundancy and maintained reliable predictions despite the presence of excipient-related spectral contributions. Correlation coefficients for both models were higher than 0.99, supporting their suitability for routine pharmaceutical quality control.

Consistent with the plasma results, the MCR-ALS model exhibited the best predictive performance, producing RMSEP of 0.113 µg mL⁻¹ for NIR and 0.102 µg mL⁻¹ for RIT. The ability of MCR-ALS to resolve pure component spectra enabled efficient separation of drug signals from excipient contributions, resulting in correlation coefficients exceeding 0.9998 and confirming its superior predictive capability in pharmaceutical matrix analysis.

Analytical method validation parameters

Detailed validation parameters for both analytical matrices are summarized in Tables S2 and S3 [36–39]. Detection limits ranged from 0.109 to 1.534 µg mL⁻¹ across all analytes and calibration strategies, with MCR-ALS consistently providing the lowest detection limits. The improved sensitivity of MCR-ALS can be attributed to its enhanced signal-to-noise discrimination through pure component resolution.

Limits of quantification followed a similar trend. In plasma applications, MCR-ALS demonstrated LOQ of 0.330, 0.434, and 0.230 µg mL⁻¹ for NIR, RIT, and NHC, respectively. These values are well within the concentration ranges relevant for pharmaceutical analysis and analytical investigations in fortified plasma matrices.
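
The plasma LOQ values quoted above can be related to detection limits under the widely used 3.3σ/10σ convention. The sketch below assumes that convention, which is not restated in this section; `noise_sd` and `sensitivity` are hypothetical argument names for the residual noise and the analyte sensitivity.

```python
def lod_loq(noise_sd, sensitivity):
    """LOD and LOQ under the assumed 3.3*sigma/SEN and 10*sigma/SEN convention."""
    return 3.3 * noise_sd / sensitivity, 10.0 * noise_sd / sensitivity


def lod_from_loq(loq):
    """Back-calculate the LOD implied by an LOQ under the same assumed convention."""
    return loq * 3.3 / 10.0


# MCR-ALS plasma LOQ values quoted in the text (µg/mL).
plasma_loq = {"NIR": 0.330, "RIT": 0.434, "NHC": 0.230}
for analyte, loq in plasma_loq.items():
    print(analyte, round(lod_from_loq(loq), 3))
```

Under this assumption the NIR value works out to 0.109 µg mL⁻¹, which coincides with the lower end of the detection-limit range reported above.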

Accuracy was assessed through recovery studies. All evaluated models showed satisfactory analytical accuracy, with recovery percentages ranging from 95.67 to 101.23% in plasma samples and 98.89–100.45% in pharmaceutical formulations. MCR-ALS consistently yielded recoveries around 100%, signifying negligible systematic bias.

Precision was assessed for repeatability and intermediate precision. Intra-day precision values (%RSD) were below 1.0% for all analytes and models, with MCR-ALS demonstrating especially stable performance (%RSD < 0.5%). Inter-day precision values were slightly higher but remained within acceptable analytical limits (< 2.0% RSD) for all evaluated approaches. These validation results, derived from both calibration and independent external validation datasets, confirm the reliability and robustness of the proposed chemometric models across different analytical conditions.

Robustness assessment

The robustness of the method was evaluated by deliberately introducing small variations in critical instrumental parameters, namely spectral resolution (± 0.1 nm), scan speed, and temperature (± 2 °C), to simulate typical experimental fluctuations encountered during routine spectrophotometric analysis.

All investigated chemometric models maintained acceptable predictive performance under these modified conditions, with only minor variations in prediction statistics (Tables S2, S3). Among the evaluated models, MCR-ALS demonstrated the lowest sensitivity to instrumental perturbations, confirming its superior robustness.

Overall, the robustness study confirms that the proposed spectrophotometric–chemometric methodology maintains reliable analytical performance under realistic experimental conditions, supporting its applicability for pharmaceutical quality control and controlled analytical investigations in fortified plasma matrices.

Comparative statistical evaluation of chemometric model performance

To objectively determine whether predictive performance differences among the investigated calibration strategies were statistically meaningful, inferential comparison was conducted using one-way analysis of variance (ANOVA) at a 95% confidence level (α = 0.05). The evaluated factor comprised calibration model type, incorporating all proposed models (4 models) in addition to the reported chromatographic reference method [21]. Independent ANOVA models were constructed for each analyte within each analytical matrix to preserve matrix-specific variance structures and ensure unbiased statistical comparison.

Plasma matrix

For NIR quantification, the ANOVA test yielded F(4, 60) = 2.187, p = 0.081. Since p > 0.05, the differences in predictive performance among the evaluated calibration models were not statistically significant at the 95% confidence level. Although minor numerical differences in prediction errors were observed among models, these variations fall within the expected experimental variability.

In contrast, the analysis of RIT produced F(4, 60) = 3.456, p = 0.012, indicating statistically significant differences among the assessed calibration strategies (p < 0.05). This result suggests that the predictive accuracy of the investigated models is not equivalent for RIT determination under plasma matrix conditions.

For NHC determination, the ANOVA result was F(4, 60) = 1.893, p = 0.123. Because p > 0.05, no statistically significant differences among the calibration models were detected for this analyte within the plasma matrix.

Pharmaceutical formulation matrix

For pharmaceutical formulation samples, the ANOVA evaluation of NIR produced F(4, 60) = 1.113, p = 0.360. Since p > 0.05, the predictive performances of the evaluated calibration models were statistically comparable under these experimental conditions.

Similarly, for RIT determination in pharmaceutical formulations, ANOVA yielded F(4, 60) = 1.601, p = 0.186. As the calculated p-value exceeds the significance threshold (α = 0.05), no statistically significant differences were observed among the investigated calibration approaches. Although slight numerical variations in prediction accuracy were observed among models, these differences are not statistically meaningful according to the ANOVA results.

Statistical model structure and transparency

Each ANOVA model incorporated five analytical groups corresponding to the evaluated calibration strategies. The total variance was partitioned into between-group (df₁ = 4) and within-group (df₂ = 60) components. Mean square values were calculated from the corresponding sums of squares, and F statistics were derived as the ratio of between-group to within-group variance.
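
The variance partitioning described above can be sketched in a few lines of standard-library Python. The function names are hypothetical, and the p-value routine uses a closed-form expression for the F distribution that holds only when the numerator degrees of freedom are even (as with df₁ = 4 here). With five groups, df₂ = 60 implies 65 observations in total, which would correspond to 13 per group; that group size is an inference, not a figure stated in this section.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


def f_sf_even_df1(f_value, df1, df2):
    """Upper-tail probability P(F > f_value); valid only for even df1."""
    if df1 % 2:
        raise ValueError("closed form requires an even numerator df")
    a, b = df2 / 2.0, df1 // 2
    x = df2 / (df2 + df1 * f_value)
    # Finite-sum form of the regularized incomplete beta function I_x(a, b), integer b.
    term = x ** a
    total = term
    for j in range(1, b):
        term *= (a + j - 1) / j * (1.0 - x)
        total += term
    return total


# F statistic reported for NIR in plasma, with the stated degrees of freedom.
print(round(f_sf_even_df1(2.187, 4, 60), 3))
```

For the reported NIR plasma statistic, F(4, 60) = 2.187, this expression returns a p-value of about 0.081, in line with the value quoted in the text.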

Prior to ANOVA implementation, homogeneity of variance was evaluated using Levene’s test, while the normality of residual distributions was assessed using the Shapiro–Wilk test. No violations of ANOVA assumptions were detected. Complete ANOVA datasets, including sums of squares, mean squares, degrees of freedom, F statistics, and associated probability values, are provided in the Supplementary Information (Tables S4, S5) to ensure full methodological transparency and reproducibility.

Analysis of Paxlovid^® tablets

After validation, the chemometric models were applied to the quantification of NIR and RIT in Paxlovid^® tablets to demonstrate their applicability to pharmaceutical quality control. All models provided reliable predictions without detectable interference from tablet excipients, confirming the selectivity of the spectrophotometric–chemometric approach (Table 6). Method accuracy was assessed using standard addition experiments to evaluate potential matrix effects. Recovery values for NIR ranged from 98.68% to 101.0% with RSD < 1.0%, while RIT recoveries varied between 99.84% and 100.52% with RSD < 1.5%. These values fall within commonly accepted validation limits for pharmaceutical assays, supporting the suitability of the proposed models for routine analysis. The ethanol–water extraction system efficiently solubilized the analytes while minimizing co-extraction of excipients. Magnesium stearate, lactose monohydrate, croscarmellose sodium, microcrystalline cellulose, and other common tablet excipients produced negligible spectral interference. Standard addition recoveries within 98–102% further indicated the absence of appreciable matrix effects and systematic bias.

Table 6.

Assay results for NIR and RIT in pharmaceutical preparations using chemometric analysis supported by standard addition recovery testing

	%Recovery ± %RSD ^(a)
Preparation	PCR	GA-PLS	FA-PLS	MCR-ALS
NIR
Application	100.18 ± 0.75	100.19 ± 0.86	100.57 ± 0.98	100.03 ± 0.68
Standard addition	99.77 ± 0.57	98.68 ± 0.53	100.74 ± 0.55	99.22 ± 0.48
RIT
Application	100.22 ± 0.92	100.52 ± 0.99	100.99 ± 0.89	100.97 ± 0.69
Standard addition	100.13 ± 0.53	100.05 ± 0.59	100.28 ± 0.45	99.86 ± 0.51

^a Values represent the mean ± RSD of six independent sample preparations (n = 6), with three spectral scans recorded for each preparation and averaged prior to analysis
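
The recovery and precision figures in a standard addition experiment are typically derived as sketched below. The function names and the example numbers are hypothetical and serve only to show the arithmetic; they are not data from Table 6.

```python
import statistics


def standard_addition_recovery(found_spiked, found_unspiked, added):
    """Percent recovery of the added standard from a spiked sample."""
    return 100.0 * (found_spiked - found_unspiked) / added


def mean_and_rsd(values):
    """Mean and percent relative standard deviation (sample SD, n - 1)."""
    mean = statistics.mean(values)
    return mean, 100.0 * statistics.stdev(values) / mean


# Hypothetical recoveries from six independent preparations (n = 6).
recoveries = [99.4, 100.1, 99.8, 100.6, 99.2, 100.3]
mean, rsd = mean_and_rsd(recoveries)
print(f"{mean:.2f} ± {rsd:.2f}")
```

A recovery close to 100% for the added standard indicates that the tablet matrix neither suppresses nor enhances the predicted signal.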

Plasma bioanalytical application and clinical contextualization

The analytical performance of the developed chemometric platform in a biological matrix was evaluated using matrix-matched calibration in drug-free human plasma, enabling assessment under physiologically relevant spectral interference conditions. In the present investigation, plasma samples were prepared by spiking drug-free pooled plasma with predefined analyte concentrations to generate fortified matrix samples for method evaluation. Validated concentration ranges were NIR (5.0–25.0 µg mL⁻¹), RIT (1.0–20.0 µg mL⁻¹), and NHC (2.0–10.0 µg mL⁻¹). These intervals encompass reported peak plasma concentrations following therapeutic dosing (Cmax ≈ 12.42 µg mL⁻¹ for NIR, 2.001 µg mL⁻¹ for RIT, and 3.79 µg mL⁻¹ for NHC), indicating that the investigated ranges are consistent with concentrations reported in pharmacological studies.
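
The coverage claim can be checked directly from the numbers quoted above. The sketch below uses only the validated ranges and reported Cmax values from this paragraph; the variable and function names are illustrative.

```python
# Validated ranges and reported Cmax values quoted in the text (µg/mL).
VALIDATED_RANGES = {"NIR": (5.0, 25.0), "RIT": (1.0, 20.0), "NHC": (2.0, 10.0)}
REPORTED_CMAX = {"NIR": 12.42, "RIT": 2.001, "NHC": 3.79}


def within_range(value, bounds):
    """True when value lies inside the closed interval given by bounds."""
    low, high = bounds
    return low <= value <= high


coverage = {
    analyte: within_range(REPORTED_CMAX[analyte], VALIDATED_RANGES[analyte])
    for analyte in VALIDATED_RANGES
}
print(coverage)
```

All three reported Cmax values fall inside their validated intervals, although the RIT and NHC values sit close to the lower calibration limits.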

The lower calibration limits were optimized for peak-level quantification rather than ultra-trace detection; therefore, the method is intended to demonstrate analytical feasibility within the studied concentration domain, while LC–MS/MS remains preferable for applications requiring sub- µg mL⁻¹ sensitivity. Sample preparation employed protein precipitation followed by ethanol–water (1:1, v/v) reconstitution, producing optically clear extracts suitable for spectral analysis.

Method performance was evaluated at three concentration levels across the calibration range. As summarized in Table 7, recoveries ranged from 96.47 to 99.25% for NIR, 94.15–99.57% for RIT, and 96.38–101.31% for NHC, with RSD values below 0.65%. These results confirm reliable quantification in fortified plasma samples under controlled experimental conditions. Matrix-matched calibration effectively compensated for endogenous spectral contributions, and no significant matrix interference was observed within the selected spectral range. Stability studies showed that processed plasma extracts remained stable for 24 h at room temperature and 72 h under refrigerated conditions, supporting routine analytical workflows.

Table 7.

Assay results for NIR, RIT, and NHC in plasma using the proposed methods

Conc. (µg/mL)			Found recovery (%R)^a
			PCR			GA-PLS
NIR	RIT	NHC	NIR	RIT	NHC	NIR	RIT	NHC
7.5	2.0	3.0	94.82	93.26	97.62	96.47	95.87	98.92
12.5	8.0	5.0	95.63	94.15	98.25	97.28	96.52	99.47
15.0	12.0	6.5	96.35	94.72	98.84	97.88	97.12	99.86
20.0	15.0	8.0	95.42	93.88	98.45	97.19	96.33	99.66
22.5	18.0	9.5	96.16	94.44	98.73	97.66	96.81	99.93
Mean ± %RSD			95.68 ± 0.64	94.09 ± 0.59	98.38 ± 0.49	97.29 ± 0.56	96.53 ± 0.49	99.57 ± 0.41

Conc. (µg/mL)			Found recovery (%R)^a
			FA-PLS			MCR-ALS
NIR	RIT	NHC	NIR	RIT	NHC	NIR	RIT	NHC
7.5	2.0	3.0	98.17	97.38	99.79	99.25	98.97	100.42
12.5	8.0	5.0	98.72	97.97	100.29	99.83	99.34	101.12
15.0	12.0	6.5	99.22	98.45	100.68	100.15	99.77	101.31
20.0	15.0	8.0	98.93	98.11	100.45	99.73	99.52	101.01
22.5	18.0	9.5	98.86	98.29	100.87	99.93	99.69	101.29
Mean ± %RSD			98.78 ± 0.39	98.04 ± 0.42	100.42 ± 0.41	99.78 ± 0.33	99.46 ± 0.32	101.03 ± 0.36

^a Values represent the mean ± RSD of three independent plasma preparations (n = 3), each measured as the average of three replicate spectral scans

It should be emphasized that the present plasma investigation represents a proof-of-concept evaluation using fortified plasma rather than authentic clinical specimens. Consequently, while the results demonstrate the analytical capability of the proposed spectrophotometric–chemometric strategy in a biological matrix, further studies involving real patient samples would be required before considering its application in routine therapeutic drug monitoring or comprehensive pharmacokinetic investigations.

Comparative model performance analysis

The chemometric models under investigation were systematically compared utilizing a variety of analytical performance criteria, including prediction errors (RMSEP), correlation coefficients, recovery accuracy, and precision statistics. This multi-criteria evaluation allowed a balanced assessment of both calibration quality and predictive reliability across pharmaceutical and plasma matrices.

Across all evaluated datasets, the MCR-ALS model showed the most dependable overall analytical performance. It consistently produced the lowest prediction errors and the highest correlation coefficients among the investigated models. In addition to improved predictive accuracy, MCR-ALS uniquely enabled resolution of pure component spectral profiles, which enhanced analyte selectivity and reduced susceptibility to spectral interferences originating from matrix components. This dual capability of quantitative prediction and spectral resolution provided a distinct analytical advantage in the highly overlapped ternary system investigated in this study.

The firefly algorithm effectively optimized wavelength selection by exploring multiple spectral-variable combinations, resulting in reduced model complexity and improved predictive accuracy relative to PCR and GA-PLS. In several cases, FA-PLS achieved lower prediction errors than GA-PLS, demonstrating the effectiveness of swarm-intelligence optimization in multivariate calibration problems.

GA-PLS also provided improved performance relative to PCR through genetic-algorithm-based wavelength selection. Although its predictive statistics were slightly inferior to those obtained with FA-PLS, GA-PLS consistently reduced spectral redundancy and improved model robustness compared with full-spectrum regression approaches.

PCR served as the baseline multivariate calibration approach in the present study. While PCR successfully extracted relevant spectral information and produced acceptable predictive performance, the use of the full spectral matrix resulted in higher prediction errors and a greater number of latent variables compared with the algorithmically optimized methods. Overall, the comparative evaluation indicates a consistent performance hierarchy: MCR-ALS > FA-PLS > GA-PLS > PCR.

This ranking reflects the combined influence of predictive accuracy, spectral interpretability, and robustness across both investigated matrices.
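
The prediction-error part of this hierarchy can be reproduced from the RMSEP rows of Tables 4 and 5. In the sketch below, averaging RMSEP across analytes is an illustrative aggregation chosen for simplicity, not a procedure described in the text; the variable and function names are likewise hypothetical.

```python
# RMSEP values (µg/mL) from Table 4 (plasma: NIR, RIT, NHC)
# and Table 5 (formulation: NIR, RIT).
RMSEP = {
    "plasma": {
        "PCR": [0.398, 0.325, 0.187],
        "GA-PLS": [0.321, 0.267, 0.145],
        "FA-PLS": [0.267, 0.213, 0.115],
        "MCR-ALS": [0.213, 0.167, 0.089],
    },
    "formulation": {
        "PCR": [0.267, 0.221],
        "GA-PLS": [0.213, 0.176],
        "FA-PLS": [0.167, 0.139],
        "MCR-ALS": [0.113, 0.102],
    },
}


def rank_models(table):
    """Order models from lowest to highest mean RMSEP (best first)."""
    return sorted(table, key=lambda model: sum(table[model]) / len(table[model]))


for matrix, table in RMSEP.items():
    print(matrix, " > ".join(rank_models(table)))
```

Both matrices give the same order, MCR-ALS, FA-PLS, GA-PLS, then PCR, and the same order also holds analyte by analyte in each table.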

Implementation guidelines and method selection criteria

The comprehensive validation results provide practical guidance for selecting the most appropriate calibration strategy depending on analytical requirements and laboratory conditions.

The MCR-ALS approach is recommended when the highest predictive accuracy and spectral interpretability are required, particularly in applications involving complex matrices or strongly overlapped spectral systems. The ability of MCR-ALS to resolve pure component spectra while maintaining accurate quantitative prediction makes it particularly advantageous for multicomponent pharmaceutical analysis and matrix-rich biological samples.

FA-PLS and GA-PLS represent effective alternatives when variable-selection-based regression models are preferred. These approaches significantly improve predictive performance compared with classical PCR by reducing spectral redundancy and focusing calibration on chemically informative wavelength regions. FA-PLS generally provides slightly better predictive performance than GA-PLS due to its swarm-intelligence optimization strategy.

PCR remains a useful baseline method when methodological simplicity and straightforward implementation are prioritized. Although its predictive accuracy is somewhat lower than that of optimized chemometric approaches, PCR can still provide reliable analytical results in applications where computational simplicity is desired.

In practice, the selection of an optimal calibration strategy should consider laboratory-specific factors such as available computational resources, operator expertise, sample throughput requirements, and the desired balance between model interpretability and predictive performance. All evaluated methods demonstrated acceptable analytical performance and may therefore be considered suitable for their respective application contexts.

Integrated sustainability profiling of the proposed analytical platform

The sustainability characteristics of the developed analytical methodology were systematically evaluated using a multidimensional assessment framework encompassing environmental compatibility, operational practicality, innovation capacity, and overall analytical quality [40]. Greenness was investigated using three complementary metrics: the National Environmental Methods Index (NEMI) (Fig. S4) [41], the Green Evaluation Metric for Analytical Methods (GEMAM) (Fig. S5) [42], and Carbon Footprint Assessment (CFA) [43]. The method demonstrated full compliance with NEMI criteria, reflecting the absence of persistent, bioaccumulative, or highly hazardous reagents and limited waste generation. Semi-quantitative evaluation via GEMAM yielded a composite score of 7.502, indicative of favorable environmental performance across the use of reagents, sample management, instrumentation, and waste domains. Quantitative emission analysis estimated a carbon footprint of 0.021 kg CO₂ per sample, substantially lower than chromatographic counterparts due to elimination of solvent-intensive separation steps and reduced instrumental energy demand.

Operational applicability (“blueness”) was evaluated using the Blue Applicability Grade Index (BAGI) [44], which produced a score of 90.00, reflecting high sample throughput, minimal preprocessing requirements, and reduced material consumption. Innovation potential (“violetness”) evaluated through the Violet Innovation Grade Index (VIGI) (Fig. S6) [45] generated a score of 80.00, driven primarily by algorithm-assisted calibration design and chemometric data extraction strategies.

Holistic methodological quality (“whiteness”) was further examined using the RGBfast model (Fig. S7) [46], integrating analytical performance with environmental and operational attributes. The proposed platform achieved a composite RGBfast score of 85.00, confirming balanced performance across trueness, precision, sensitivity, reagent safety, and energy consumption dimensions.

Finally, sustainability alignment was contextualized through the Normalized Quality Score (NQS) framework (Fig. S8) [47], which integrates analytical reliability, environmental stewardship, and societal relevance. The developed methodology achieved an overall sustainability index of 83%, demonstrating strong concordance with contemporary sustainable analytical chemistry principles. As indicated in Table 8, these ratings reflect how well the proposed method aligns with the UN-SDGs.

Table 8.

Alignment of the proposed method with UN-SDGs 3, 4, 5, 7, 9, 11, 12, 13, 14, 15, and 17

SDG	Goal	Proposed method
3	Good health and well-being	• Enables reliable quantification of antiviral therapeutics in pharmaceutical and biological matrices, supporting therapeutic monitoring and quality assurance • Minimizes analyst exposure to hazardous organic solvents through adoption of ethanol–water extraction media • Eliminates chromatographic separation requirements, reducing occupational chemical risk
4	Quality Education	• Provides a didactic platform for applied chemometrics, multivariate calibration, and D-optimal experimental design • Facilitates interdisciplinary training at the interface of analytical chemistry, data science, and pharmaceutical analysis • Demonstrates algorithm-assisted calibration modeling for advanced analytical curricula
5	Gender Equality	• Utilization of accessible spectrophotometric instrumentation reduces infrastructural barriers in resource-limited laboratories • Simplified analytical workflows promote broader participation across diverse research environments • Reduced reliance on highly specialized instrumentation supports inclusive laboratory capacity building
7	Affordable and clean energy	• Significantly reduced energy consumption relative to LC-based techniques due to elimination of pumps, vacuum systems, and thermal interfaces • Rapid spectral acquisition (< 5 min per analysis) lowers instrument power demand • Chemometric data processing replaces energy-intensive physical separations
9	Industry, Innovation, and Infrastructure	• Integrates D-optimal sample design (MCA) with advanced multivariate modeling for pharmaceutical analysis • Enhances analytical efficiency through algorithm-assisted validation set optimization • Demonstrates scalable computational workflows adaptable to industrial quality control laboratories
11	Sustainable Cities and Communities	• Reduced chemical waste generation supports environmentally responsible laboratory operation • Minimal solvent storage and disposal requirements decrease urban hazardous waste burden • Compact instrumentation footprint supports sustainable laboratory infrastructure
12	Responsible consumption and production	• Optimized experimental design minimizes reagent and sample consumption • Ethanol–water solvent system replaces toxic chlorinated solvents • High information density per experiment reduces material expenditure
13	Climate Action	• Lower carbon emissions due to elimination of chromatographic solvent usage and reduced instrument energy demand • Short analysis time contributes to decreased laboratory greenhouse gas output • Supports climate-conscious analytical method development
14	Life below water	• Avoids discharge of persistent halogenated solvent waste into aquatic systems • Employs biodegradable solvent media with reduced ecotoxicological impact • Minimizes analytical effluent generation
15	Life on land	• Reduces environmental burden associated with solvent production and disposal • Limits hazardous laboratory waste requiring land-based treatment • Promotes resource-efficient analytical workflows
17	Partnerships for the goals	• Methodology can be implemented using widely available instrumentation, facilitating international collaboration • Computational chemometric workflows support cross-institutional data sharing • Promotes harmonized sustainable analytical practices across regulatory and academic sectors

Detailed scoring criteria, computational pathways, and comparative sustainability tool analyses are supplied in the Supplementary Information (Section S2 and Table S6).

Comparative analytical positioning vs. LC–MS/MS

To contextualize analytical performance, the developed chemometric method was benchmarked against reported LC–MS/MS methodologies for concurrent analysis of NIR, RIT, and NHC in plasma.

Published LC–MS/MS assays typically achieve lower limits of quantification within the 1–50 ng mL⁻¹ range, reflecting the superior sensitivity of tandem mass spectrometric detection. In contrast, the present UV–chemometric platform exhibited limits of quantification within the µg mL⁻¹ domain, consistent with spectrophotometric detection capabilities.

However, within clinically relevant therapeutic exposure ranges—where plasma concentrations frequently exceed 2–10 µg mL⁻¹—the developed models demonstrated high predictive accuracy, with RMSEP of 0.076–0.213 µg mL⁻¹ and recovery rates approaching 100%.

These findings position the proposed methodology not as a replacement for LC–MS/MS in ultra-trace pharmacokinetic investigations, but rather as a sustainable, cost-efficient complementary alternative for high-throughput pharmaceutical analysis and therapeutic monitoring scenarios where advanced instrumentation is inaccessible.

Conclusion

An integrated chemometric spectrophotometric methodology was successfully developed and validated for the simultaneous quantification of NIR, RIT, and NHC in pharmaceutical preparations and fortified human plasma matrices. The analytical framework combined UV spectrophotometry with a structured fractional factorial calibration design and Candexch algorithm-assisted D-optimal selection of validation samples, together with multivariate calibration modeling, enabling multicomponent resolution without chromatographic separation. Comparative evaluation of four models demonstrated that MCR-ALS provided the most reliable overall predictive performance, exhibiting the lowest prediction errors, the highest correlation coefficients, and the additional capability to resolve pure component spectral profiles for all analytes across both investigated matrices.

Sustainability profiling confirmed favorable environmental and operational characteristics. The analytical workflow complied fully with NEMI criteria, achieved a GEMAM score of 7.502, and exhibited a markedly reduced carbon emission burden of 0.021 kg CO₂ per analysis as determined by CFA. Operational feasibility was supported by a BAGI index of 90.00, while innovation assessment through VIGI yielded a score of 80.00. Integrated RGBfast and NQS evaluations further substantiated balanced analytical performance and overall sustainability alignment.

From an application perspective, the developed methodology offers a sustainable and economically accessible complementary platform for multicomponent antiviral quantification. While LC–MS/MS remains the reference standard for ultra-trace therapeutic drug monitoring and regulatory bioanalysis, the proposed approach provides a practical alternative for pharmaceutical quality control and analytical investigations in matrix-matched plasma samples prepared under controlled laboratory conditions. Further studies involving authentic clinical specimens would be required before considering its implementation in routine therapeutic drug monitoring or comprehensive pharmacokinetic investigations, particularly within resource-constrained analytical settings.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary Material 1 (docx, 3.4 MB)

Acknowledgements

The authors extend their appreciation to Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2026R917), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.

Author contributions

A.E.F.A.: conceptualization of the work, formal analysis, investigation, visualization of the data, methodology, validation, and writing (original draft). N.F.A.: conceptualization of the work, visualization of the data, and writing (review and editing). M.R.E.: formal analysis, investigation, visualization of the data, and writing (review and editing). O.A.: funding acquisition, methodology, and writing (review and editing). M.K.H.: conceptualization of the work, investigation, visualization of the data, methodology, validation, and writing (review and editing). All authors have revised and approved the manuscript.

Funding

This research was funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2026R917), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.

Data availability

The study did not generate or analyze any crystallographic or macromolecular structure data. All data supporting the findings of this study are included within the article and its supplementary information files. Additional data are available from the corresponding author upon reasonable request.

Declarations

Ethics approval and consent to participate

The present investigation did not involve direct human recruitment, clinical intervention, or prospective collection of biological specimens. Drug-free human plasma was procured as a commercially available biospecimen from the Egyptian Holding Company for Biological Products and Vaccines (VACSERA), Giza, Egypt. In accordance with Article 3 of the Egyptian Clinical Research Law No. 214 (2020), as well as VACSERA institutional policies governing secondary use of anonymized biological materials, analytical studies utilizing commercially sourced, de-identified plasma samples for methodological development purposes are exempt from prior ethical committee approval and individual informed consent requirements. All experimental procedures were performed in alignment with the ethical principles outlined in the Declaration of Helsinki.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Ahmed Emad F. Abbas, Email: dr.ahmedeemad@gmail.com, Email: ahmed.emad.pha@o6u.edu.eg

Michael K. Halim, Email: michaelkamelhalim@gmail.com


