Current Research in Food Science. 2024 Jun 17;9:100788. doi: 10.1016/j.crfs.2024.100788

Predicting the evolution of pH and total soluble solids during coffee fermentation using near-infrared spectroscopy coupled with chemometrics

Vicente Tirado-Kulieva a,b, Carlos Quijano-Jara c, Himer Avila-George d,, Wilson Castro e,⁎⁎
PMCID: PMC11245949  PMID: 39005496

Abstract

Currently, coffee fermentation is controlled by visual inspection, which results in incomplete or excessive processes and coffees with undesirable characteristics. In response, pH and total soluble solids (TSS) have been shown to be good fermentation indicators, although their use requires rapid, accurate, and chemical-free measurement techniques such as NIR spectroscopy. However, the complexity of NIR spectra requires optimization steps in which variable selection techniques simplify the profiles and the subsequent models. This work tests a new covering array feature selection (CAFS) approach on NIR spectra to optimize prediction models for coffee samples during fermentation. Spectral profiles in the range 1100–2100 nm were extracted from coffee beans (Typica, Caturra, and Catimor varieties), both raw and during fermentation (4, 8, 12, 16, 20, and 24 h). Partial least-squares regression (PLSR) models were built on the full spectra using a five-fold cross-validation strategy for training and validation. The relevant wavelengths were then selected using the β coefficients, the variable importance in projection (VIP), and the CAFS method. Finally, optimized models were built using the relevant wavelengths and compared through their statistical metrics. The models built with the variables selected by CAFS (22–47) showed the best performance in predicting pH (R2 = 0.825–0.903, RMSE = 0.096–0.158, RPD = 6.33–10.38) and TSS (R2 = 0.865–0.922, RMSE = 0.688–1.059, RPD = 0.94–1.45) compared to the other methods. These findings suggest that simple and efficient models could be built and implemented in routine analysis owing to the maximum coverage and minimum cardinality of CAFS.

Keywords: Coffee, Feature selection, Fermentation, pH, NIRS

Highlights

  • Non-invasive coffee fermentation monitoring using NIR spectroscopy coupled with advanced chemometric techniques.

  • TSS and pH, reliable indicators of fermentation progress, were modeled from NIR spectral data.

  • Robust and accurate prediction models were developed using PLSR.

  • The PLSR model optimized by CAFS showed superior performance in predicting TSS and pH.

1. Introduction

Coffee (Coffea sp.) is the second most consumed commodity on the market after crude oil and the second most consumed beverage after water (Bosso et al., 2023). In 2021, around 10.9 million tons of coffee were produced with constant growth in demand and supply. Peru is the seventh largest producer of conventional coffee (FAO, 2024), in addition to being the first producer and exporter of organic coffee (MIDAGRI, 2022). Coffee is one of the main crops in the country, an important source of jobs and a key piece of the economy.

Fermentation is one of the stages that has the greatest influence on coffee quality, but its control is a challenge because the end of the process is determined visually or manually (Pereira and Moreira, 2021). This causes incomplete or excessive fermentations that induce unpleasant aromas and flavors (Pereira and Moreira, 2021). This was shown in several studies, such as in Zhang et al. (2019), where the quality of the cup decreased considerably when the coffee fermentation time was not optimal.

The capacity of different parameters to monitor fermentation has been studied. Jackels and Jackels (2005) tested pH, glucose, lactic acid, and ethanol and established that pH was the only reliable indicator, a finding accepted by the scientific community (Córdoba-Castro and Guerrero-Fajardo, 2016). Similarly, Prakash et al. (2022) determined that total soluble solids (TSS) can indicate mucilage removal and fermentation progress. Therefore, pH and TSS can be used to determine the end of the process.

The determination methods for pH and TSS require trained personnel, chemicals for calibration, and significant time, especially when many samples must be tested, making it a tedious process (Velmourougane, 2013). Therefore, sensitive, rapid, nondestructive, multiparametric, and chemical-free techniques, such as near-infrared spectroscopy (NIRS), are required. NIRS measures the energy absorption by the sample exposed to electromagnetic radiation in the wavelength range of 750–2500 nm, collecting information on the spectral characteristics of hydrogen-containing functional groups (O–H, C–H, N–H and S–H) (Araújo et al., 2020; Castro et al., 2022). NIRS has been widely used in food analysis and has been effective in identifying the chemical properties of coffee, but has focused on the roasting process or ground coffee (de Carvalho Pires et al., 2021; Tugnolo et al., 2021).

A common characteristic of data obtained using spectroscopic techniques is the presence of numerous variables but few samples, which complicates the analytical problem. From a practical point of view, variable selection (VS) is important because it reduces noise and avoids overfitting, resulting in simpler models to facilitate data collection and interpretation without compromising their predictive capacity (de Araújo Gomes et al., 2022). This strategy also optimizes computational calculations and reduces hardware costs, which is desirable to guide the industry in the development of low-cost, compact, fast and lightweight tools (Fatemi et al., 2022). However, finding the most significant variables is challenging because no single method can extract the most essential and significant variables for a particular application (Kamruzzaman et al., 2022).

In food engineering applications, the effects of VS techniques such as the genetic algorithm (GA), the successive projection algorithm (SPA), the variable importance projection (VIP), stepwise regression (SWR), competitive adaptive reweighted sampling (CARS), and regression coefficients or β coefficients have been tested (Fatemi et al., 2022; Kamruzzaman et al., 2022). Since it is difficult to determine which VS algorithm is suitable for a specific type of data, it is necessary to compare and test different algorithms for various applications to select the best (Vásquez et al., 2018). The dynamic chemical complexity of food products during fermentation makes this challenge more difficult.

A good search strategy is needed for VS methods because it is difficult to find a subset of wavebands that predicts parameters well during coffee fermentation (Castro et al., 2022). This is where covering arrays (CAs) come into play. CAs are mathematical objects that offer the advantages of maximum coverage and minimum cardinality, making them ideal for selecting a reduced set of variables that capture all relevant information in the data (Avila-George et al., 2012; Torres-Jimenez et al., 2019a). CAs have been used successfully in VS processes (Dorado et al., 2019; Solarte-Martinez et al., 2019; Villegas et al., 2018; Vivas et al., 2019). Our team recently proposed a new CA-based VS method, called Covering Array Feature Selection (CAFS), and applied it to NIR spectra to discriminate Amazonian cacao-clone nibs (Castro et al., 2022).

This research evaluated how CAFS, NIR spectra, and PLSR can be combined to predict the pH and TSS of coffee samples during fermentation, and compared CAFS with the β coefficients and VIP methods.

2. Material and methods

Fig. 1 shows the main steps of the methodology proposed in this study. Each step is detailed and commented on in the following subsections.

Fig. 1. Proposed methodology for the study.

2.1. Coffee fermentation

Arabica coffee beans (Coffea arabica L.) var. Typica, Caturra, and Catimor were picked in June 2023 from the districts of Sícchez and Ayabaca in Piura, Peru. The fruits were transferred to the laboratory within 6 h after harvest, following the methodology of Elhalis et al. (2020). The green and dry fruits were removed and the selected fruits were pressed under aseptic conditions. Fermentation was carried out for 24 h with a dry must, which included coffee beans and a proportion of skin without additional water (Pereira et al., 2020).

The samples were collected at 0, 4, 8, 12, 16, 20, and 24 h from various locations within the prehomogenized mixture. Each sample was analyzed in duplicate and comprised approximately 100 defect-free grains of similar size. The grinding was carried out with a high-speed DAMAI HC-1000Y mill (Wuyi Haina Electric Appliance Co., Ltd., China) and the particle size was standardized to 300 μm using a sieve (Riceli Equipos, Peru).

2.2. Physicochemical characterization

The pH was measured directly in the coffee mass with a pH meter (HANNA Instruments, Italy). A digital refractometer (HANNA Instruments, Italy) was used to measure TSS (°Brix) by taking a sample of liquid mucilage. All analyses were performed in triplicate at room temperature and expressed as mean ± standard deviation.

Finally, the normality of the pH and TSS values was tested using the Kolmogorov-Smirnov test.

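$$D = \max\left(D^{+}, D^{-}\right) \tag{1}$$

where $D^{+} = \max_{i}\left(\frac{i}{n} - Z_i\right)$, $D^{-} = \max_{i}\left(Z_i - \frac{i-1}{n}\right)$, $Z_i = F(X_i)$, $F$ is the cumulative distribution function of the normal distribution, $X_i$ is the i-th order statistic of a random sample, $1 \le i \le n$, and $n$ is the sample size.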
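As an illustration of Equation (1), the following minimal sketch computes the Kolmogorov-Smirnov statistic in Python using only the standard library. It is not the authors' Matlab routine. The function name `ks_statistic` is hypothetical, and the sketch assumes that the mean and standard deviation of the reference normal distribution are estimated from the sample itself. The input list reuses the six initial and final pH values reported in Section 3.1 purely as toy input.

```python
from statistics import NormalDist, mean, stdev


def ks_statistic(sample):
    """Return D = max(D+, D-) for a sample against a fitted normal distribution."""
    xs = sorted(sample)  # order statistics X_(1) <= ... <= X_(n)
    n = len(xs)
    # Assumption: the normal parameters are estimated from the sample.
    dist = NormalDist(mean(xs), stdev(xs))
    z = [dist.cdf(x) for x in xs]  # Z_i = F(X_i)
    # Python indices start at 0, so i/n becomes (i + 1)/n and (i - 1)/n becomes i/n.
    d_plus = max((i + 1) / n - z[i] for i in range(n))
    d_minus = max(z[i] - i / n for i in range(n))
    return max(d_plus, d_minus)


# Toy input: initial and final pH values quoted in Section 3.1.
ph_values = [5.60, 5.43, 5.57, 4.77, 4.60, 4.67]
print(f"D = {ks_statistic(ph_values):.3f}")
```

A large D (relative to the critical value for the sample size) leads to rejecting normality; the p-value computation itself is omitted from this sketch.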

2.3. NIR profile extraction and pretreatment

NIR spectral profiles were measured at room temperature. A Polytec PSS-A-T01 NIR spectrometer (Polytec GmbH, Germany), equipped with a tungsten halogen lamp as a light source and an InGaAs (Indium–Gallium–Arsenic) detector in the range of 1100–2100 nm, and a resolution of 2 nm, was used.

The database comprises 2100 profiles extracted from three coffee varieties, encompassing seven fermentation times and two sub-samples, with 50 profiles per sample. The NIR spectrometer was programmed to extract one profile at a time, with the sample remaining in place between measurements until fifty profiles were obtained.
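The size of the database follows directly from the experimental design (3 varieties × 7 fermentation times × 2 sub-samples × 50 profiles). The short sketch below enumerates that design in Python; the tuple layout and variable names are illustrative assumptions, not the structure of the authors' dataset.

```python
from itertools import product

varieties = ["Typica", "Caturra", "Catimor"]
times_h = [0, 4, 8, 12, 16, 20, 24]  # fermentation times in hours
subsamples = [1, 2]
profiles_per_sample = 50

# One (variety, time, sub-sample, replicate) label per spectral profile.
index = [
    (variety, t, s, r)
    for variety, t, s in product(varieties, times_h, subsamples)
    for r in range(profiles_per_sample)
]

print(len(index))  # 2100
```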

Finally, the spectra acquired in reflectance mode were converted to absorbance according to Equation (2).

$$A_{\lambda} = \log\left(\frac{1}{R_{\lambda}}\right) \tag{2}$$

where A is the absorbance, λ is the wavelength, and R is the reflectance.
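Equation (2) can be applied element-wise to a reflectance profile. The sketch below is a minimal Python version; it assumes a base-10 logarithm (the usual convention for absorbance, which the text does not state explicitly) and reflectance values expressed as fractions between 0 and 1. The function name is hypothetical.

```python
import math


def reflectance_to_absorbance(reflectance):
    """Apply A = log10(1 / R) to each reflectance value (0 < R <= 1)."""
    return [math.log10(1.0 / r) for r in reflectance]


# Illustrative reflectance values, not measured data.
print(reflectance_to_absorbance([1.0, 0.5, 0.1]))
```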

Unwanted effects such as noise, interference signals, sample heterogeneity, and baseline drift can distort spectral profiles, so spectral corrections are needed (Vásquez et al., 2018). The NIR spectra were smoothed using a Savitzky-Golay filter with a second-order polynomial and a three-step window; see Equation (3).

$$y_j^{o} = \frac{\sum_{i=-m}^{m} C_i \, Y_{j+i}}{N} \tag{3}$$

where $Y$ is the original profile, $y^{o}$ is the smoothed profile, $C_i$ is the coefficient of the i-th term of the profile, and $N$ is the number of convolution stages.
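Equation (3) is a weighted moving average, which the following Python sketch makes explicit. The default coefficients (-3, 12, 17, 12, -3) with N = 35 are the classic five-point quadratic Savitzky-Golay weights and are used here only as an illustrative assumption; the exact window used in the study may differ. Leaving the edge points unchanged is also an assumption of this sketch.

```python
def savgol_smooth(profile, coeffs=(-3, 12, 17, 12, -3), norm=35):
    """Smooth a profile with y_j = sum_i C_i * Y_(j+i) / N, for i = -m..m."""
    m = len(coeffs) // 2
    smoothed = list(profile)  # edge points are left unchanged (assumption)
    for j in range(m, len(profile) - m):
        acc = sum(c * profile[j + i] for i, c in zip(range(-m, m + 1), coeffs))
        smoothed[j] = acc / norm
    return smoothed


# A single noisy spike is attenuated and spread over its neighbours.
print(savgol_smooth([0.0, 0.0, 1.0, 0.0, 0.0]))
```

Because the weights come from a local quadratic fit, a profile that is itself exactly quadratic passes through the filter unchanged, which is a convenient sanity check.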

2.4. PLSR modeling

All wavelengths were used to build partial least squares regression (PLSR) models to predict the pH and TSS of the coffee samples of each variety. PLSR was chosen as the test model because, in conjunction with spectroscopic techniques, it is one of the most widely used multivariate regression techniques in food analysis (Vásquez et al., 2018).

PLSR relates the predictor variables (X) to the response variables (Y). It decomposes X and Y and projects them onto new directions that describe their joint variation to the greatest extent possible (Vásquez et al., 2018). A regression step is then performed with the decomposed X and Y, whose model is shown in Equation (4):

$$Y = X\beta + e \tag{4}$$

where $Y$ represents the pH or TSS values of the coffee, $X$ is the absorbance data matrix (n observations × m wavelengths), $\beta$ is the coefficient matrix, and $e$ is the residual error.
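To make the decomposition and regression steps concrete, the sketch below implements single-response PLS (PLS1) with the NIPALS algorithm in plain Python. This is one common way to fit a PLSR model and is an assumption here: the study used Matlab and does not state which algorithm was applied. The function names and the toy data are hypothetical and chosen so that the response is an exact linear function of two predictors.

```python
import math


def pls1_fit(X, y, n_components):
    """Fit PLS1 by NIPALS; returns column means, response mean, and components."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                # centered y
    comps = []
    for _ in range(n_components):
        # Weight vector: direction in X most covariant with the residual response.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # nothing left to explain
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]  # loadings
        q = sum(f[i] * t[i] for i in range(n)) / tt  # response loading
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        comps.append((w, p, q))
    return x_mean, y_mean, comps


def pls1_predict(model, x):
    """Predict the response for one new observation x."""
    x_mean, y_mean, comps = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in comps:
        t = sum(ej * wj for ej, wj in zip(e, w))
        y_hat += q * t
        e = [ej - t * pj for ej, pj in zip(e, p)]
    return y_hat


# Toy data (hypothetical): y = 2*x0 - x1 + 3.
X = [[1, 2], [2, 1], [3, 5], [4, 3], [5, 8]]
y = [3, 6, 4, 8, 5]
model = pls1_fit(X, y, n_components=2)
print(round(pls1_predict(model, [6, 4]), 6))
```

With as many components as predictors the fit coincides with ordinary least squares; in practice the number of components is kept much smaller than the number of wavelengths and tuned by cross-validation.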

2.5. Feature selection

Removing irrelevant spectral information yields simplified and practical models, which requires identifying the most important spectral bands for modeling. The methods tested are described below:

  • β coefficients or Regression coefficients (BC). They are unique measures of association between each variable and the response. The wavelengths are related to the absolute load weights of the entire PLSR model and are selected according to their value and predictive ability (Vásquez et al., 2018).

  • Variable Importance in Projection (VIP). The VIP, represented as vj, is a measure of the contribution of each variable according to the variance explained by each component of PLSR, as shown in Equation (5) (Mehmood et al., 2020).

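$$v_j = \sqrt{\frac{p \sum_{a=1}^{A} SS_a \left(w_{aj} / \lVert w_a \rVert\right)^2}{\sum_{a=1}^{A} SS_a}} \tag{5}$$

where $p$ is the number of variables, $A$ is the number of PLSR components, $SS_a$ is the sum of squares explained by the a-th component, and $\left(w_{aj} / \lVert w_a \rVert\right)^2$ represents the importance or weight of the j-th variable. Variables with VIP values greater than 1 were considered informative variables.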

  • CAFS. Covering arrays (CAs) are defined as a matrix of N rows and k columns over an alphabet of v symbols, such that for each set of t columns, each tuple of symbols is covered at least once, denoted by CA(N; t, k, v) (Torres-Jimenez et al., 2018, 2019b). More details about this variable selection method are provided in Castro et al. (2022).
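The covering array definition given in the last item can be checked mechanically. The sketch below verifies whether a small array is a CA(N; t, k, v) by confirming that every t-tuple of symbols appears in every choice of t columns. It illustrates the definition only and is not the CAFS selection procedure, which is described in Castro et al. (2022). The function name and the example array are illustrative.

```python
from itertools import combinations, product


def is_covering_array(rows, t, v):
    """Check that every t-tuple over {0..v-1} appears in every set of t columns."""
    k = len(rows[0])
    needed = set(product(range(v), repeat=t))
    for cols in combinations(range(k), t):
        seen = {tuple(row[c] for c in cols) for row in rows}
        if not needed <= seen:
            return False
    return True


# A CA(4; 2, 3, 2): 4 rows, strength 2, 3 columns, binary alphabet.
ca = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
print(is_covering_array(ca, t=2, v=2))  # True
```

Four rows suffice here to cover all pairwise combinations of three binary columns, whereas exhaustive enumeration would need eight rows. This is the sense in which covering arrays combine maximum coverage with minimum cardinality.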

2.6. Statistical metrics determination

The models were trained using a K-Fold cross-validation strategy (K = 5) carried out with 30 repetitions. In each repetition, the following statistical performance metrics were determined: coefficient of determination (R2), root mean square error (RMSE), and the ratio of performance to deviation (RPD), defined by Equations (6), (7), and (8).

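$$R^2_{cv} = 1 - \frac{\sum_{i=1}^{n} (\hat{y}_i - y_i)^2}{\sum_{i=1}^{n} (y_i - y_m)^2} \tag{6}$$
$$RMSE_{cv} = \sqrt{\frac{\sum_{i=1}^{n} (\hat{y}_i - y_i)^2}{n}} \tag{7}$$
$$RPD_{cv} = \frac{SD}{RMSE_{cv}} \tag{8}$$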

where y^i and yi are the predicted and reference response variables of the i-th sample, ym is the mean value of the samples, n is the number of samples, and SD is the standard deviation of y. The subscript cv refers to the calculations performed in the cross-validation strategy.
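The validation scheme and metrics described above can be sketched as follows in Python. This is a hedged illustration rather than the authors' Matlab code: the function names are hypothetical, the total sum of squares in R2 is taken around the mean of the reference values (the usual convention), SD is computed as the sample standard deviation (n − 1 in the denominator), and the fold assignment by shuffling is an assumption.

```python
import math
import random


def cv_metrics(y_true, y_pred):
    """Return R2, RMSE, and RPD for paired reference and predicted values."""
    n = len(y_true)
    y_mean = sum(y_true) / n
    ss_res = sum((p - t) ** 2 for p, t in zip(y_pred, y_true))
    ss_tot = sum((t - y_mean) ** 2 for t in y_true)
    rmse = math.sqrt(ss_res / n)
    sd = math.sqrt(ss_tot / (n - 1))  # assumption: sample standard deviation
    return {"r2": 1 - ss_res / ss_tot, "rmse": rmse, "rpd": sd / rmse}


def repeated_kfold(n_samples, k=5, repeats=30, seed=0):
    """Yield (train, test) index lists for K-fold CV repeated several times."""
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        for fold in range(k):
            test = idx[fold::k]
            held_out = set(test)
            train = [i for i in idx if i not in held_out]
            yield train, test


# Illustrative values only, not data from the study.
metrics = cv_metrics([1, 2, 3, 4, 5], [1.1, 1.9, 3.2, 3.8, 5.0])
print({name: round(value, 3) for name, value in metrics.items()})
print(sum(1 for _ in repeated_kfold(100)))  # 5 folds x 30 repetitions = 150 splits
```

In a full workflow, a model would be fitted on each `train` split and evaluated on the matching `test` split, and the metrics would be summarized over the 150 splits.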

The processing of spectral data, the development of the models, and the selection of relevant variables, functions, and routines were implemented in Matlab R2023a (The MathWorks, Inc., USA) and run on a computer with 16 GB RAM, Core i7, 11th generation processor.

The pH and TSS data were analyzed with Minitab 18.0 (Minitab Inc., USA) using a one-way analysis of variance (ANOVA), and the means were compared with a Tukey multiple comparison test at a 95% confidence level (α = 0.05).

3. Results

3.1. Evolution of physicochemical characteristics

Fig. 2(a) shows that the pH of the unfermented coffee beans was 5.60, 5.43, and 5.57 for the varieties Typica, Caturra, and Catimor, respectively. According to the National Institute of Agrarian Innovation in Peru, the pH of coffee for fermentation should be between 5.0 and 6.0 (Paredes Espinosa et al., 2022). Other authors stated that the optimal pH should be between 5.0 and 5.6 (Galarza and Figueroa, 2022; Córdoba-Castro and Guerrero-Fajardo, 2016). At the end of fermentation, the pH of the Typica, Caturra, and Catimor coffee had decreased to 4.77, 4.60, and 4.67, respectively. From the beginning until 16 h of fermentation, significant differences (p < 0.05) were evident in the pH values within each coffee variety, whereas the values stabilized after 20 h.

Fig. 2. (a) Evolution of pH and (b) TSS during coffee fermentation. Different capital letters indicate significant differences (p < 0.05) in the pH and TSS values of the same coffee variety (same color line) and different lowercase letters indicate significant differences (p < 0.05) between varieties. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)

The decrease in pH in the coffee mass is attributed to the degradation of complex organic substances in the mucilage into simpler sugars and the consequent production of acids (Elhalis et al., 2023). After 16 h of fermentation, the alternating decreases and increases in pH suggest that the increased production of acids and alcohols halted fermentation (Sunoj et al., 2016). Similar results were found when evaluating Arabica coffee from the Catuaí Vermelho variety (pH from 5.7 to 4.5) in Brazil (de Jesus Cassimiro et al., 2022), the Caturra (5.5–4.4) and Castillo (5.6–4.4) varieties in Colombia (Córdoba-Castro and Guerrero-Fajardo, 2016), and the Sln 795 (5.6–4.2) variety in India (Shankar et al., 2022).

The evolution of pH between coffee varieties was significant (p < 0.05) in the first half of the process and subsequently the variation decreased until stability was reached. Similar trends were observed during the fermentation of Caturra and Castillo varieties (Córdoba-Castro and Guerrero-Fajardo, 2016), as well as Mundo Novo, Ouro Amarelo, and Catuaí Vermelho varieties (Ribeiro et al., 2018), and Tabi, Castillo General, and Colombia varieties (Holguín-Sterling et al., 2023). These differences reflect variations in the initial chemical composition of coffee cherries, the microbial activity, and the specific interactions of each variety with its fermentation environment.

Coffee varieties significantly influence the diversity of the microbiota and the formation of metabolites during fermentation, which is reflected in the disparity of pH values (Ribeiro et al., 2018). Holguín-Sterling et al. (2023) found notable differences in microbial diversity among coffee varieties during fermentation, with this diversity decreasing towards the end of the process. Furthermore, Peñuela-Martínez et al. (2023) confirmed these observations, showing significant differences in pH during the early stages of fermentation.

Since a final pH close to 4.6 was established to indicate the end of the fermentation process (Córdoba-Castro and Guerrero-Fajardo, 2016; Jackels and Jackels, 2005), an important point is that at 16 h the pH of Typica and Caturra coffee was already around 4.6, unlike Catimor coffee, which barely reached 4.67 after 24 h of fermentation. Pothakos et al. (2020) reported that microbial diversity is restricted at pH levels lower than 4.5–4.0 during long fermentation, where communities of lactic acid bacteria tolerant to the acidic environment prevail, affecting the flavor of the product. Under this premise, the fermentation time should be limited to 16 h.

Fig. 2(b) shows that the initial TSS of Typica, Caturra, and Catimor coffee was 15.87, 15.13, and 16.53 °Brix, respectively, similar to the value of around 16 °Brix reported by de Jesus Cassimiro et al. (2022), but lower than the approximately 19.9 °Brix reported by Galarza and Figueroa (2022) and Puerta and Echeverry (2015). Galarza and Figueroa (2022) also indicated that coffee fruits should be harvested in the 12–24 °Brix range for adequate processing.

During fermentation, there was a reduction in TSS as a function of time, with significant differences (p < 0.05) between each variety, indicating the use of sugar by microorganisms. Until 16 h, a notable reduction in TSS was observed, similar to that described by Prakash et al. (2022) when evaluating Robusta coffee fermentation. According to those authors, it could be attributed to the decomposition of complex carbohydrates into monosaccharides by the metabolic activity of yeast. As fermentation is prolonged, the population of yeast decreases due to the conversion of metabolites to alcohols and the consequent growth of lactic acid bacteria that continue to consume sugars, but at a slower rate (Prakash et al., 2022).

Significant differences (p < 0.05) were also observed in the evolution of TSS between coffee varieties, attributed to the composition and predominance of microorganisms during fermentation. Similar results were reported in the dynamics of certain sugars and acids (Holguín-Sterling et al., 2023; Peñuela-Martínez et al., 2023; Ribeiro et al., 2018).

Other factors influencing differences in characteristics, such as pH and compounds associated with TSS between coffee varieties during fermentation, include variations in the thickness of the mucilage layer (Velmourougane, 2013), as well as differences in the size and density of the beans (Peñuela-Martínez et al., 2023).

From the normality analysis (Kolmogorov-Smirnov test), p values of 1.7 × 10^−108 and 3.9 × 10^−21 were determined for the pH and TSS values, respectively, indicating that neither parameter follows a normal distribution. This is consistent with what is shown in Fig. 3 for both parameters.

Fig. 3. Normal distribution function vs distribution of (a) pH and (b) TSS.

3.2. NIR spectral profiles

Table 1 shows the complete spectral profiles of the three coffee varieties and their relationship with pH and TSS. The shape of the spectral profile is similar to that described in Araújo et al. (2020), Chakravartula et al. (2022), Tugnolo et al. (2021) for ground and roasted coffee. Regarding the relevant peaks, there were slight differences in absorbance in the region of 1550–1800 nm that could be associated with variations in the content of chlorogenic acid, carbohydrates, amino acids, and caffeine (Munyendo et al., 2021), but the trends remained invariable.

Table 1. Spectral profiles and full PLSR results per variety and physicochemical parameter.

The peak at 1205 nm is attributed to C–H band vibrations corresponding to fatty acids, amino acids, and lignin (Munyendo et al., 2021). Oktavianawati et al. (2020) revealed the presence of different amino acids in coffee beans, whose concentrations increased after fermentation. Shen et al. (2023) found 209 fatty acids in coffee beans subjected to different fermentation processes. Lignin is one of the main compounds in the endocarp or parchment of coffee beans. This peak is also associated with the second harmonic of C–H stretching corresponding to carbohydrates, lipids, and quinic acid (Munyendo et al., 2021). Sugars and lipids were determined to be characteristic markers of coffee fermentation (Agnoletti et al., 2022). Hu et al. (2020) showed that quinic acid and the coffee body score have a positive relationship. Shen et al. (2023) found 527 lipids in coffee beans, which retain the volatile flavor compounds and vitamins that contribute to texture and mouthfeel.

As expected, the predominant peaks at 1451 and 1927 nm are associated with the high water content in the coffee samples (Araújo et al., 2020; Tugnolo et al., 2021). They are also found in the region associated with the content of carbohydrates and chlorogenic acid (Munyendo et al., 2021), the latter responsible for the bitterness and astringency of coffee (Munyendo et al., 2021; Shen et al., 2023). The peaks at 1729 and 1927 nm are associated with the C–H harmonic bands of caffeine (Chakravartula et al., 2022; Sim et al., 2023); similarly, at 1839 nm, the C–H bonds could refer to cellulose, another predominant polysaccharide in the coffee endocarp (Sim et al., 2023).

3.3. Full PLSR models

Table 1 also displays the performance of the full PLSR models, reflected in high R2 values (from 0.946 to 0.978). In the pH prediction, low RMSE values between 0.064 and 0.082 were obtained, while in the TSS prediction the dispersion of the data was greater, with RMSE values of 0.364–0.600. RPD ranged from 2.25 to 15.55, except for the TSS prediction model in Caturra coffee (RPD = 1.67). These values indicate that prediction is feasible, with higher values reflecting a better fit (Araújo et al., 2020).

The pH prediction models showed RPD values of 12.21, 12.48, and 15.55 for the Typica, Catimor, and Caturra varieties, respectively, reflecting a better fit than those obtained in other studies that combined NIR and PLSR. Among these, Priambodo et al. (2022) determined the pH of cocoa beans during fermentation and obtained R2 from 0.63 to 0.70 and RMSE of 0.28–0.31, while Sunoj et al. (2016), under similar conditions, obtained a comparable fit (R2 = 0.58–0.75 and RMSE = 0.26–0.35), with RPD between 1.52 and 2.05. Similar results were reported for other fermented products: Wu et al. (2015) predicted pH during Chinese rice wine fermentation (R2 = 0.90, RMSE = 0.15), and Giovenzana et al. (2014) predicted pH during craft beer fermentation (R2 = 0.38–0.89 and RMSE = 0.10–0.29).

According to Kasemsumran et al. (2023), the low performance of some results could be due to the complexity of biological processes, which is reflected in several overlapping absorption bands. At the same time, the poorer fit of PLSR for TSS could be due to a nonlinear relationship, which da Silva Melo et al. (2022) and Zeng et al. (2024) addressed through nonlinear models.

3.4. Optimized PLSR models

Table 2 shows the wavebands selected by the proposed CAFS method and by the β coefficients and VIP methods. For pH prediction, high R2 values (see Tables 3 and 4) were obtained using the VIP (> 0.921), β coefficients (> 0.749), and CAFS (> 0.825) approaches, although the RPD values varied in the ranges 9.15–11.56, 6.28–6.69, and 6.33–10.38, respectively.

Table 2. Selected features per each method.

Table 3. Optimized PLSR results per each method.

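Table 4. Performance metrics of the full and optimized PLSR models.

| Parameter | PLSR model | Typica R2 | Typica RMSE | Typica RPD | Caturra R2 | Caturra RMSE | Caturra RPD | Catimor R2 | Catimor RMSE | Catimor RPD |
|---|---|---|---|---|---|---|---|---|---|---|
| pH | Full | 0.955 | 0.082 | 12.21 | 0.957 | 0.064 | 15.55 | 0.946 | 0.080 | 12.48 |
| pH | Opt, VIP | 0.921 | 0.109 | 9.15 | 0.922 | 0.087 | 11.56 | 0.921 | 0.097 | 10.33 |
| pH | Opt, β coefficients | 0.832 | 0.159 | 6.28 | 0.749 | 0.155 | 6.44 | 0.811 | 0.150 | 6.69 |
| pH | Opt, CAFS | 0.834 | 0.158 | 6.33 | 0.903 | 0.096 | 10.38 | 0.825 | 0.144 | 6.96 |
| TSS | Full | 0.978 | 0.364 | 2.75 | 0.958 | 0.600 | 1.67 | 0.957 | 0.445 | 2.25 |
| TSS | Opt, VIP | 0.936 | 0.626 | 1.60 | 0.921 | 0.823 | 1.22 | 0.917 | 0.617 | 1.62 |
| TSS | Opt, β coefficients | 0.811 | 1.073 | 0.93 | 0.861 | 1.093 | 0.92 | 0.854 | 0.820 | 1.22 |
| TSS | Opt, CAFS | 0.922 | 0.688 | 1.45 | 0.869 | 1.059 | 0.94 | 0.865 | 0.787 | 1.27 |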

The results (see Table 3) were better than those of other studies using different VS approaches. Li et al. (2018) used GA and SPA to predict the pH of cherries with 24 and 28 variables, respectively, achieving R2 values of 0.819 and 0.815 and RPD values of 2.351 and 2.326. Cheng et al. (2022) used CARS, SPA, and UVE to predict the pH of pears with 52, 11, and 557 variables, respectively, achieving R2 values of 0.862–0.915 and an RMSE of 0.024–0.033. Lamptey et al. (2023) used iPLS, Si-PLS, and Bi-PLS to predict the pH of mangoes with 52, 11, and 557 variables, respectively, achieving R2 values of 0.79–0.89, RMSE of 0.44–0.51, and RPD of 2.20–2.97.

Table 4 summarizes the statistical metrics and shows that the optimized models fit comparatively worse when TSS was predicted. The VIP, β coefficients, and CAFS approaches yielded R2 ≥ 0.811 and RMSE from 0.626 to 1.093. Likewise, the instability of the models is reflected in the low and variable RPD values: 1.22–1.62 for VIP, 0.92–1.22 for β coefficients, and 0.94–1.45 for CAFS. Saavedra et al. (2014) obtained similar results when predicting the TSS of cape gooseberry using iPLS, GA, rPLS, and CovSel, with R2 of 0.48–0.58, RMSE of 0.61, and RPD of 1.61–1.77. Lamptey et al. (2023) used iPLS, Si-PLS, and Bi-PLS to predict the TSS of mangoes with 52, 11, and 557 variables, respectively, achieving R2 values of 0.60–0.67, RMSE of 1.85–1.94, and RPD of 1.58–1.74. Li et al. (2018) used GA and SPA to predict the TSS of cherries with 54 and 28 variables, respectively, achieving R2 values of 0.863 and 0.771, RMSE of 1.210 and 1.563, and RPD of 2.700 and 2.089. Superior results were obtained by Cheng et al. (2022) using CARS, SPA, and SVU to predict the TSS of pears with 70, 18, and 505 variables, respectively, achieving R2 values of 0.876–0.943 and RMSE of 0.142–0.248.

Regarding fermentation monitoring, Kasemsumran et al. (2022) used SCMWPLS to optimize the models and obtain R2 values of 0.996, RMSE of 0.166 and RPD of 27.82 in the prediction of TSS during the fermentation of pineapple wine. Despite the high predictive capacity, the model still contained a large number of variables (181); although it is applicable in a laboratory for experimental purposes, it is not very viable for practical purposes that involve the development of portable or online devices. Kasemsumran et al. (2023) used the same spectrometer to monitor TSS in pineapple vinegar broth during fermentation. The model was optimized with only 21 variables using stability CARS (SCARS), obtaining an R2 of 0.903 and an RMSE of 0.875.

Regarding the effect of each VS method on the metrics of the PLSR models, VIP obtained the best results, but selected a large number of variables (from 164 to 221) compared to CAFS (from 22 to 47) and β coefficients (15). The selection of variables with VIP was concentrated at some points (see Table 2), which could have been the product of the codependence of the variables. Although the results were comparable to those obtained with the full spectrum, the model is not viable for practical applications. Similarly, in Liao et al. (2012), the VIP-PLSR method to predict pH in fresh beef selected 20.13% of the variables but showed similar results to when all variables were used. Similar trends were reported when using VIP to predict the same parameters (Afonso et al., 2022; Cevoli et al., 2024). The VIP could have selected many variables because it focuses on the weight of the variable in the projection but ignores its stability. Some variables whose stability was the same as the noise played an important role within the latent projection but provided a performance deficiency in the PLSR model (Liao et al., 2012).
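To put the variable counts above in proportion, the following Python sketch expresses them as a share of the full spectrum. It assumes one variable every 2 nm between 1100 and 2100 nm (501 wavelengths), which is inferred from the stated range and resolution in Section 2.3 and may not match the exact number of points exported by the instrument. The helper name is hypothetical.

```python
def n_wavelengths(start_nm, stop_nm, step_nm):
    """Number of grid points from start to stop (inclusive) at a fixed step."""
    return (stop_nm - start_nm) // step_nm + 1


total = n_wavelengths(1100, 2100, 2)  # assumption: 2 nm sampling interval

# (minimum, maximum) number of variables selected by each method, from the text.
selected = {"VIP": (164, 221), "CAFS": (22, 47), "beta coefficients": (15, 15)}

for method, (low, high) in selected.items():
    share_low = 100 * low / total
    share_high = 100 * high / total
    print(f"{method}: {share_low:.1f}%-{share_high:.1f}% of {total} variables")
```

Under this assumption, VIP retains roughly a third to almost half of the spectrum, while CAFS and the β coefficients retain less than a tenth, which is the practical argument for compact sensing devices made in the text.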

Regarding β coefficients, Chong and Jun determined that, in terms of quantitative selection of relevant predictors, the PLSR-β coefficients method is superior to the PLSR-VIP method, which was corroborated in this study. However, the performance of the PLSR-β coefficients models for the prediction of TSS was low (RPD of 0.92–1.22) compared to the estimation of pH (RPD of 6.28–6.69). According to Cattaldo et al. (2024), the β coefficients work well even for small samples when the true relationship is linear, but they are less suitable for modeling nonlinear problems such as the relationship between the spectra and TSS discussed above.

The β coefficients and CAFS approaches were comparable in terms of the number of variables selected but, as expected, the latter showed better performance. With the β coefficients approach, variables were selected in certain bands, limiting the search to local minima, while CAFS selected relevant variables throughout the spectrum. In this way, the modeling of TSS was also improved. TSS involves several compounds such as vitamins, sugars, and amino acids, so it is difficult to assign importance to a single region of the spectrum, as reported by Agulheiro-Santos et al. (2022) when predicting TSS in strawberries. In some cases, searching for variables throughout the spectral region is not desirable because it can lead to the selection of unrelated variables, as reported in the prediction of TSS in tomatoes (Li et al., 2020). This was addressed thanks to the maximum coverage and minimum cardinality of CAFS. Similar results were obtained when using CAFS to discriminate Amazonian cacao-clone nibs (Castro et al., 2022).

The results highlighted the potential of CAFS to select effective wavelengths in the NIR spectrum and develop robust PLSR models to predict the evolution of pH and TSS during coffee fermentation. Coffee producers could replicate the findings of this study to avoid excessive or incomplete fermentations, which would represent a significant step in the transition from traditional rural practice to technology-based processing, with a view to industrial-scale adoption.

4. Conclusions

This study proposed using CAFS, a new VS approach to regression problems based on covering arrays. CAFS was tested to predict the evolution of the physicochemical parameters during the fermentation of three varieties of coffee using NIR spectra and PLSR models. Compared with common approaches such as VIP and β coefficients, the models with wavelengths selected by CAFS showed the best performance in pH prediction (R2 = 0.825–0.903, RMSE = 0.096–0.158, RPD = 6.33–10.38) and TSS prediction (R2 = 0.865–0.922, RMSE = 0.688–1.059, RPD = 0.94–1.45). CAFS selected a limited number of important variables (22–47) throughout the spectrum owing to its maximum coverage and minimum cardinality. This is convenient in practice because it allows the development of low-cost miniaturized systems for real-time monitoring of the process, facilitating the production of quality coffee for small coffee growers. The performance of the models can be improved by increasing the number of samples, testing nonlinear models, and using different features or CA families. Furthermore, research can potentially be extended to other quality parameters and applied to other food processes and products.

Declaration of competing interest

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests:

Wilson Castro reports financial support was provided by Programa Nacional de Investigación Científica y Estudios Avanzados (PROCIENCIA, Peru), Project "Sistema prototipo de determinación de calidad de taza para café: estudio de técnicas deep learning", CONTRACT No. PE501080928-2022-PROCIENCIA. If there are other authors, they declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

The authors would like to express their gratitude for the support provided to this work by the following projects: (1) “Prototype System for Determining Coffee Cup Quality: Study of Deep Learning Techniques” under contract N° PE501080928-2022-PROCIENCIA; and (2) “Creation of the laboratory service for food safety research at the Universidad Nacional de Frontera” with Unique Investment Code N° 2439545.

Contributor Information

Himer Avila-George, Email: himer.avila@academicos.udg.mx.

Wilson Castro, Email: wcastro@unf.edu.pe.

Data availability

Data will be made available on request.

References

  1. Afonso A.M., Antunes M.D., Cruz S., Cavaco A.M., Guerra R. Non-destructive follow-up of ‘jintao’ kiwifruit ripening through vis-nir spectroscopy–individual vs. average calibration model's predictions. Postharvest Biol. Technol. 2022;188 doi: 10.1016/j.postharvbio.2022.111895. [DOI] [Google Scholar]
  2. Agnoletti B.Z., dos Santos Gomes W., de Oliveira G.F., da Cunha P.H., Nascimento M.H.C., Neto Á.C., Pereira L.L., de Castro E.V.R., da Silva Oliveira E.C., Filgueiras P.R. Effect of fermentation on the quality of conilon coffee (coffea canephora): chemical and sensory aspects. Microchem. J. 2022;182 doi: 10.1016/j.microc.2022.107966. [DOI] [Google Scholar]
  3. Agulheiro-Santos A.C., Ricardo-Rodrigues S., Laranjo M., Melgão C., Velázquez R. Non-destructive prediction of total soluble solids in strawberry using near infrared spectroscopy. J. Sci. Food Agric. 2022;102:4866–4872. doi: 10.1002/jsfa.11849. [DOI] [PubMed] [Google Scholar]
  4. Araújo C.d.S., Macedo L.L., Vimercati W.C., Ferreira A., Prezotti L.C., Saraiva S.H. Determination of ph and acidity in green coffee using near-infrared spectroscopy and multivariate regression. J. Sci. Food Agric. 2020;100:2488–2493. doi: 10.1002/jsfa.10270. [DOI] [PubMed] [Google Scholar]
  5. Avila-George H., Torres-Jimenez J., Hernández V. In: Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA 2012) Arabnia H.R., Hiroshi Ishii M.I., Kazuki Joe H.N., editors. CSREA Press, Las Vegas; Nevada, USA: 2012. Parallel simulated annealing for the covering arrays construction problem; pp. 522–528. [Google Scholar]
  6. Bosso H., Barbalho S.M., de Alvares Goulart R., Otoboni A.M.M.B. Green coffee: economic relevance and a systematic review of the effects on human health. Crit. Rev. Food Sci. Nutr. 2023;63:394–410. doi: 10.1080/10408398.2021.1948817. [DOI] [PubMed] [Google Scholar]
  7. Castro W., De-la Torre M., Avila-George H., Torres-Jimenez J., Guivin A., Acevedo-Juárez B. Amazonian cacao-clone nibs discrimination using nir spectroscopy coupled to naïve bayes classifier and a new waveband selection approach. Spectrochim. Acta Mol. Biomol. Spectrosc. 2022;270 doi: 10.1016/j.saa.2021.120815. [DOI] [PubMed] [Google Scholar]
  8. Cattaldo M., Ferrer A., Måge I. Variable time delay estimation in continuous industrial processes. Chemometr. Intell. Lab. Syst. 2024 doi: 10.1016/j.chemolab.2024.105082. [DOI] [Google Scholar]
  9. Cevoli C., Iaccheri E., Fabbri A., Ragni L. Data fusion of ft-nir spectroscopy and vis/nir hyperspectral imaging to predict quality parameters of yellow flesh “jintao” kiwifruit. Biosyst. Eng. 2024;237:157–169. doi: 10.1016/j.biosystemseng.2023.12.011. [DOI] [Google Scholar]
  10. Chakravartula S.S.N., Moscetti R., Bedini G., Nardella M., Massantini R. Use of convolutional neural network (cnn) combined with ft-nir spectroscopy to predict food adulteration: a case study on coffee. Food Control. 2022;135 doi: 10.1016/j.foodcont.2022.108816. [DOI] [Google Scholar]
  11. Cheng T., Guo S., Pan Z., Fan S., Ju S., Xin Z., Zhou X.-G., Jiang F., Zhang D. Near-infrared model and its robustness as affected by fruit origin for ‘dangshan’ pear soluble solids content and ph measurement. Agriculture. 2022;12:1618. doi: 10.3390/agriculture12101618. [DOI] [Google Scholar]
  12. Chong I.-G., Jun C.-H. Performance of some variable selection methods when multicollinearity is present. Chemometr. Intell. Lab. Syst. 2005;78:103–112. doi: 10.1016/j.chemolab.2004.12.011. [DOI] [Google Scholar]
  13. Córdoba-Castro N., Guerrero-Fajardo J.E. Characterization of traditional coffee fermentation processes in the department of nariño [caracterización de los procesos tradicionales de fermentación de café en el departamento de nariño] Biotecnología en el Sector agropecuario y agroindustrial. 2016;14:75–83. doi: 10.18684/BSAA(14)75-83. [DOI] [Google Scholar]
  14. da Silva Melo B.H., Sales R.F., da Silva Bastos Filho L., da Silva J.S.P., de Almeida Sousa A.G.C., Peixoto D.M.C., Pimentel M.F. Handheld near infrared spectrometer and machine learning methods applied to the monitoring of multiple process stages in industrial sugar production. Food Chem. 2022;369 doi: 10.1016/j.foodchem.2021.130919. [DOI] [PubMed] [Google Scholar]
  15. de Araújo Gomes A., Azcarate S.M., Diniz P.H.G.D., de Sousa Fernandes D.D., Veras G. Variable selection in the chemometric treatment of food data: a tutorial review. Food Chem. 2022;370 doi: 10.1016/j.foodchem.2021.131072. [DOI] [PubMed] [Google Scholar]
  16. de Carvalho Pires F., Pereira R.G.F.A., Baqueta M.R., Valderrama P., da Rocha R.A. Near-infrared spectroscopy and multivariate calibration as an alternative to the agtron to predict roasting degrees in coffee beans and ground coffees. Food Chem. 2021;365 doi: 10.1016/j.foodchem.2021.130471. [DOI] [PubMed] [Google Scholar]
  17. de Jesus Cassimiro D.M., Batista N.N., Fonseca H.C., Naves J.A.O., Dias D.R., Schwan R.F. Coinoculation of lactic acid bacteria and yeasts increases the quality of wet fermented arabica coffee. Int. J. Food Microbiol. 2022;369 doi: 10.1016/j.ijfoodmicro.2022.109627. [DOI] [PubMed] [Google Scholar]
  18. Dorado H., Cobos C., Torres-Jimenez J., Burra D.D., Mendoza M., Jiménez D. Wrapper for building classification models using covering arrays. IEEE Access. 2019;7:148297–148312. doi: 10.1109/access.2019.2944641. [DOI] [Google Scholar]
  19. Elhalis H., Cox J., Zhao J. Ecological diversity, evolution and metabolism of microbial communities in the wet fermentation of australian coffee beans. Int. J. Food Microbiol. 2020;321 doi: 10.1016/j.ijfoodmicro.2020.108544. [DOI] [PubMed] [Google Scholar]
  20. Elhalis H., Cox J., Zhao J. Coffee fermentation: expedition from traditional to controlled process and perspectives for industrialization. Appl. Food Res. 2023;3 doi: 10.1016/j.afres.2022.100253. [DOI] [Google Scholar]
  21. FAO. 2024. Crops and Livestock Products: Coffee, Green. URL: https://www.fao.org/faostat/en/#data/QCL [Google Scholar]
  22. Fatemi A., Singh V., Kamruzzaman M. Identification of informative spectral ranges for predicting major chemical constituents in corn using nir spectroscopy. Food Chem. 2022;383 doi: 10.1016/j.foodchem.2022.132442. [DOI] [PubMed] [Google Scholar]
  23. Galarza G., Figueroa J.G. Volatile compound characterization of coffee (coffea arabica) processed at different fermentation times using spme–gc–ms. Molecules. 2022;27:2004. doi: 10.3390/molecules27062004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. Giovenzana V., Beghi R., Guidetti R. Rapid evaluation of craft beer quality during fermentation process by vis/nir spectroscopy. J. Food Eng. 2014;142:80–86. doi: 10.1016/j.jfoodeng.2014.06.017. [DOI] [Google Scholar]
  25. Holguín-Sterling L., Pedraza-Claros B., Pérez-Salinas R., Ortiz A., Navarro-Escalante L., Góngora C.E. Physical–chemical and metataxonomic characterization of the microbial communities present during the fermentation of three varieties of coffee from Colombia and their sensory qualities. Agriculture. 2023;13:1980. doi: 10.3390/agriculture13101980. [DOI] [Google Scholar]
  26. Hu G., Peng X., Gao Y., Huang Y., Li X., Su H., Qiu M. Effect of roasting degree of coffee beans on sensory evaluation: research from the perspective of major chemical ingredients. Food Chem. 2020;331 doi: 10.1016/j.foodchem.2020.127329. [DOI] [PubMed] [Google Scholar]
  27. Jackels S.C., Jackels C.F. Characterization of the coffee mucilage fermentation process using chemical indicators: a field study in Nicaragua. J. Food Sci. 2005;70:C321–C325. doi: 10.1111/j.1365-2621.2005.tb09960.x. [DOI] [Google Scholar]
  28. Kamruzzaman M., Kalita D., Ahmed M.T., ElMasry G., Makino Y. Effect of variable selection algorithms on model performance for predicting moisture content in biological materials using spectral data. Anal. Chim. Acta. 2022;1202 doi: 10.1016/j.aca.2021.339390. [DOI] [PubMed] [Google Scholar]
  29. Kasemsumran S., Boondaeng A., Ngowsuwan K., Jungtheerapanich S., Apiwatanapiwat W., Janchai P., Meelaksana J., Vaithanomsat P. Simultaneous monitoring of the evolution of chemical parameters in the fermentation process of pineapple fruit wine using the liquid probe for near-infrared coupled with chemometrics. Foods. 2022;11:377. doi: 10.3390/foods11030377. [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Kasemsumran S., Boondaeng A., Jungtheerapanich S., Ngowsuwan K., Apiwatanapiwat W., Janchai P., Vaithanomsat P. Assessing fermentation broth quality of pineapple vinegar production with a near-infrared fiber-optic probe coupled with stability competitive adaptive reweighted sampling. Molecules. 2023;28:6239. doi: 10.3390/molecules28176239. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Lamptey F.P., Teye E., Abano E.E., Amuah C.L. Application of handheld nir spectrometer for simultaneous identification and quantification of quality parameters in intact mango fruits. Smart Agricultural Technol. 2023;6 doi: 10.1016/j.atech.2023.100357. [DOI] [Google Scholar]
  32. Li X., Wei Y., Xu J., Feng X., Wu F., Zhou R., Jin J., Xu K., Yu X., He Y. Ssc and ph for sweet assessment and maturity classification of harvested cherry fruit based on nir hyperspectral imaging technology. Postharvest Biol. Technol. 2018;143:112–118. doi: 10.1016/j.postharvbio.2018.05.003. [DOI] [Google Scholar]
  33. Li H., Zhu J., Jiao T., Wang B., Wei W., Ali S., Ouyang Q., Zuo M., Chen Q. Development of a novel wavelength selection method vcpa-pls for robust quantification of soluble solids in tomato by on-line diffuse reflectance nir. Spectrochim. Acta Mol. Biomol. Spectrosc. 2020;243 doi: 10.1016/j.saa.2020.118765. [DOI] [PubMed] [Google Scholar]
  34. Liao Y., Fan Y., Cheng F. On-line prediction of ph values in fresh pork using visible/near-infrared spectroscopy with wavelet de-noising and variable selection methods. J. Food Eng. 2012;109:668–675. doi: 10.1016/j.jfoodeng.2011.11.029. [DOI] [Google Scholar]
  35. Mehmood T., Sæbø S., Liland K.H. Comparison of variable selection methods in partial least squares regression. J. Chemometr. 2020;34 doi: 10.1002/cem.3226. [DOI] [Google Scholar]
  36. MIDAGRI. 2022. Peru is the world's leading producer and exporter of organic coffee along with ethiopia [perú es el primer productor y exportador mundial de café orgánico junto con etiopía] URL: https://www.gob.pe/institucion/midagri/noticias/647409 [Google Scholar]
  37. Munyendo L., Njoroge D., Hitzmann B. The potential of spectroscopic techniques in coffee analysis—a review. Processes. 2021;10:71. doi: 10.3390/pr10010071. [DOI] [Google Scholar]
  38. Oktavianawati I., Arimurti S., Suharjono S. The impacts of traditional fermentation method on the chemical characteristics of arabica coffee beans from bondowoso district, east java. Forest@. 2020;3:4. doi: 10.21776/ub.jpacr.2020.009.02.526. [DOI] [Google Scholar]
  39. Paredes Espinosa R., Arias Ricaldi J.N., Abarca Piñan V.E., Montañez Artica A.G. Instituto Nacional de Innovación Agraria; 2022. Harvesting and Wet Processing for Specialty Coffees [Cosecha y beneficio húmedo para cafés especiales] [Google Scholar]
  40. Peñuela-Martínez A.E., García-Duque J.F., Sanz-Uribe J.R. Characterization of fermentations with controlled temperature with three varieties of coffee (coffea arabica l.) Fermentation. 2023;9:976. doi: 10.3390/fermentation9110976. [DOI] [Google Scholar]
  41. Pereira L.L., Moreira T.R. Springer; 2021. Quality Determinants in Coffee Production. [DOI] [Google Scholar]
  42. Pereira L.L., Guarçoni R.C., Pinheiro P.F., Osório V.M., Pinheiro C.A., Moreira T.R., Ten Caten C.S. New propositions about coffee wet processing: chemical and sensory perspectives. Food Chem. 2020;310 doi: 10.1016/j.foodchem.2019.125943. [DOI] [PubMed] [Google Scholar]
  43. Pothakos V., De Vuyst L., Zhang S.J., De Bruyn F., Verce M., Torres J., Callanan M., Moccand C., Weckx S. Temporal shotgun metagenomics of an ecuadorian coffee fermentation process highlights the predominance of lactic acid bacteria. Curr. Res. Biotechnol. 2020;2:1–15. doi: 10.1016/j.crbiot.2020.02.001. [DOI] [Google Scholar]
  44. Prakash I., Kumar P., Om H., Basavaraj K., Murthy P.S., et al. Metabolomics and volatile fingerprint of yeast fermented robusta coffee: a value-added coffee. Lebensm. Wiss. Technol. 2022;154 doi: 10.1016/j.lwt.2021.112717. [DOI] [Google Scholar]
  45. Priambodo D.C., Saputro D., Pahlawan M.F.R., Saputro A.D., Masithoh R.E. IOP Conference Series: Earth and Environmental Science. 2022. Determination of acid level (ph) and moisture content of cocoa beans at various fermentation level using visible near-infrared (vis-nir) spectroscopy. [DOI] [Google Scholar]
  46. Puerta G.I., Echeverry J.G. Centro Nacional de Investigaciones de Café (Cenicafé); 2015. Controlled Coffee Fermentation: Technology for Adding Value to Quality [Fermentación controlada del café: Tecnología para agregar valor a la calidad] Technical Report. [Google Scholar]
  47. Ribeiro L.S., Evangelista S.R., da Cruz Pedrozo Miguel M.G., van Mullem J., Silva C.F., Schwan R.F. Microbiological and chemical-sensory characteristics of three coffee varieties processed by wet fermentation. Ann. Microbiol. 2018;68:705–716. doi: 10.1007/s13213-018-1377-4. [DOI] [Google Scholar]
  48. Shankar S.R., Sneha H.P., Prakash I., Khan M., H. N P.K., Om H., Basavaraj K., Murthy P.S. Microbial ecology and functional coffee fermentation dynamics with pichia kudriavzevii. Food Microbiol. 2022;105 doi: 10.1016/j.fm.2022.104012. [DOI] [PubMed] [Google Scholar]
  49. Shen X., Zi C., Yang Y., Wang Q., Zhang Z., Shao J., Zhao P., Liu K., Li X., Fan J. Effects of different primary processing methods on the flavor of coffea arabica beans by metabolomics. Fermentation. 2023;9:717. doi: 10.3390/fermentation9080717. [DOI] [Google Scholar]
  50. Sim J., McGoverin C., Oey I., Frew R., Kebede B. Near-infrared reflectance spectroscopy accurately predicted isotope and elemental compositions for origin traceability of coffee. Food Chem. 2023;427 doi: 10.1016/j.foodchem.2023.136695. [DOI] [PubMed] [Google Scholar]
  51. Solarte-Martinez J., Cobos C., Mendoza M. Algorithm for instance selection in classification problems based on covering arrays [algoritmo para la selección de instancias en problemas de clasificación basado en arreglos de cobertura] RISTI Rev. Ibérica Sist. Tecnol. Informação. 2019:215–229. [Google Scholar]
  52. Sunoj S., Igathinathane C., Visvanathan R. Nondestructive determination of cocoa bean quality using ft-nir spectroscopy. Comput. Electron. Agric. 2016;124:234–242. doi: 10.1016/j.compag.2016.04.012. [DOI] [Google Scholar]
  53. Torres-Jimenez J., Izquierdo-Marquez I., Avila-George H. Search-based software engineering for constructing covering arrays. IET Softw. 2018;12:324–332. doi: 10.1049/iet-sen.2018.5141. [DOI] [Google Scholar]
  54. Torres-Jimenez J., Izquierdo-Marquez I., Avila-George H. Methods to construct uniform covering arrays. IEEE Access. 2019;7:42774–42797. doi: 10.1109/ACCESS.2019.2907057. [DOI] [Google Scholar]
  55. Tugnolo A., Giovenzana V., Malegori C., Oliveri P., Casson A., Curatitoli M., Guidetti R., Beghi R. A reliable tool based on near-infrared spectroscopy for the monitoring of moisture content in roasted and ground coffee: a comparative study with thermogravimetric analysis. Food Control. 2021;130 doi: 10.1016/j.foodcont.2021.108312. [DOI] [Google Scholar]
  56. Vásquez N., Magán C., Oblitas J., Chuquizuta T., Avila-George H., Castro W. Comparison between artificial neural network and partial least squares regression models for hardness modeling during the ripening process of swiss-type cheese using spectral profiles. J. Food Eng. 2018;219:8–15. doi: 10.1016/j.jfoodeng.2017.09.008. [DOI] [Google Scholar]
  57. Velmourougane K. Impact of natural fermentation on physicochemical, microbiological and cup quality characteristics of arabica and robusta coffee. Proc. Natl. Acad. Sci. India B Biol. Sci. 2013;83:233–239. doi: 10.1007/s40011-012-0130-1. [DOI] [Google Scholar]
  58. Villegas J., Cobos C., Mendoza M., Herrera-Viedma E. Advances in Artificial Intelligence-IBERAMIA 2018: 16th Ibero-American Conference on AI, Trujillo, Peru, November 13-16, 2018, Proceedings 16. 2018. Feature selection using sampling with replacement, covering arrays and rule-induction techniques to aid polarity detection in twitter sentiment analysis; pp. 467–480. [DOI] [Google Scholar]
  59. Vivas S., Cobos C., Mendoza M. Machine Learning, Optimization, and Data Science: 4th International Conference, LOD 2018, Volterra, Italy, September 13-16, 2018, Revised Selected Papers 4. 2019. Covering arrays to support the process of feature selection in the random forest classifier; pp. 64–76. [DOI] [Google Scholar]
  60. Wu Z., Long J., Xu E., Wu C., Wang F., Xu X., Jin Z., Jiao A. Application of ft-nir spectroscopy and ft-ir spectroscopy to Chinese rice wine for rapid determination of fermentation process parameters. Anal. Methods. 2015;7:2726–2737. doi: 10.1039/c4ay02851a. [DOI] [Google Scholar]
  61. Zeng S., Zhang Z., Cheng X., Cai X., Cao M., Guo W. Prediction of soluble solids content using near-infrared spectra and optical properties of intact apple and pulp applying plsr and cnn. Spectrochim. Acta Mol. Biomol. Spectrosc. 2024;304 doi: 10.1016/j.saa.2023.123402. [DOI] [PubMed] [Google Scholar]
  62. Zhang S.J., De Bruyn F., Pothakos V., Torres J., Falconi C., Moccand C., Weckx S., De Vuyst L. Following coffee production from cherries to cup: microbiological and metabolomic analysis of wet processing of coffea arabica. Appl. Environ. Microbiol. 2019;85 doi: 10.1128/AEM.02635-18. [DOI] [PMC free article] [PubMed] [Google Scholar]

Articles from Current Research in Food Science are provided here courtesy of Elsevier
