Heliyon. 2023 Nov 4;9(11):e21991. doi: 10.1016/j.heliyon.2023.e21991

Optimization studies on batch extraction of phenolic compounds from Azadirachta indica using genetic algorithm and machine learning techniques

Sunita S Patil a, Umesh B Deshannavar b,c,, Shambala N Gadekar-Shinde d, Amith H Gadagi e, Santosh A Kadapure b
PMCID: PMC10658312  PMID: 38027702

Abstract

Phenolic compounds play a crucial role as secondary metabolites due to their substantial biological activity and medicinal value. These compounds are present in various parts of plant species. This study focused on solid-liquid batch extraction to recover total phenolic compounds from Azadirachta indica leaves. The experimental design was based on the Taguchi L16 array, considering four independent factors: extraction time, temperature, particle size, and solid-to-solvent ratio. Among these factors, the particle size exerted the maximum influence. Particle size inversely affects the yield of total phenolic content (TPC), while temperature, time, and solid-to-liquid ratio have a direct impact. The process factors concerned were investigated both experimentally and through machine learning techniques. Support vector regression (SVR) and random forest method (RFM) algorithms were utilized for predicting TPC, while a genetic algorithm (GA) was employed to derive optimal process parameters. The GA predicts the optimal extraction factors, yielding the maximum TPC. During this study, these factors were the following: particle size of 0.15 mm, extraction time of 40 min, solid-to-liquid ratio of 1:25 g/mL, and a temperature of 55 °C, with a predicted value of 23.039 mg GAE/g of plant material. Notably, in this study, the SVR values of TPC yield closely matched the experimental values for the training and test data set when compared with the random forest method values.

Keywords: Batch extraction, Optimization, Total phenolic content, Genetic algorithm, Machine learning

1. Introduction

The extraction of phenolic compounds from natural sources has garnered increasing attention in recent years due to their distinctive pharmaceutical properties. These compounds are increasingly being utilized as antioxidants, antimicrobials, anti-inflammatories, and antiviral agents in the pharmaceutical industry. Phenolic compounds are extracted from fruits, vegetables, herbal plants, and various edible and non-edible plant species, finding application in medical, food, and cosmetic industries. Interestingly, fruit waste products such as peels, seeds, stalks, and leaves often exhibit higher phenolic compound percentages compared to the edible fleshy parts. Examples include grape (Vitaceae) skin [1], papaya (Carica papaya) leaves [2], rambutan (Nephelium lappaceum) peel [3], avocado (Persea americana) pulp [4], pomegranate (Punica granatum) peel [5], and mango (Mangifera indica) peel [6]. Phenolic compounds are also extracted from various vegetable and plant parts, including rose (Pingyin rose) [7], arrowleaf dock (Rumex dentatus) [8], fenugreek (Trigonella foenum-graecum) seeds [9], bay (Laurus nobilis L.) leaves [10], yarrow dust (Achillea millefolium) [11], green tea (Camellia sinensis) leaves [12], oak (Quercus) bark [13], mushroom waste (Shiitake) (Lentinula edodes) [14], bugleweed (Ajuga ciliata) leaves [15], and brinjal (Solanum melongena) [16]. Phenolic compound extraction can be achieved through both conventional and advanced methods. Traditional techniques encompass Soxhlet extraction, maceration, percolation, and solid-liquid batch extraction. Advanced methodologies involve ultrasound-assisted extraction, microwave-assisted extraction, supercritical fluid extraction, and pressurized liquid extraction [17]. Extraction parameters such as particle size, temperature, extraction time, solid-to-liquid ratio, and solvent type profoundly influence the extraction process.
Particularly in solid-liquid batch extraction, the choice of solvent significantly impacts process efficiency. Different solvents possess varying solubility characteristics towards distinct bioactive compounds. Polar solvents, including methanol, ethanol, n-propanol, and acetone, dissolve polar solutes, while nonpolar solvents like hexane, toluene, and benzene are effective for dissolving nonpolar solutes [18]. During solvent selection, crucial physical properties such as vapor pressure, boiling point, viscosity, surface tension, and polarity should be considered. Given that phenolic compounds exhibit polarity, the preliminary study employs polar solvents such as methanol, ethanol, and acetone.

Neem (Azadirachta indica) represents a valuable natural resource abundant in essential chemical constituents. In India, neem has been revered for its medicinal uses since ancient times. All parts of the neem tree, namely leaves, bark, twigs, trunk, and fruit, harbor substantial quantities of phenolic compounds [19]. Traditional extraction methods have been harnessed to extract phenolic compounds from neem leaves, revealing antioxidant and antimicrobial activities [20–22]. The total phenolic content (TPC) has been extracted from neem leaves using conventional stirred batch extraction, altering one variable at a time [23]. However, this conventional optimization approach proves labor-intensive, costly, and time-consuming. Modern optimization techniques such as particle swarm optimization (PSO), genetic algorithm (GA), and ant colony optimization (ACO), which are cost-effective and efficient in terms of time and labor, can be employed [24].

This study endeavors to apply batch extraction methodology utilizing the Taguchi L16 experimental design for extracting phenolic compounds from neem leaves. Additionally, modern optimization tools and machine learning techniques are integrated. The objectives encompass identifying optimal process parameters for maximizing TPC yield using a GA, experimental exploration of extraction factors, and employing machine learning techniques such as support vector regression (SVR) and random forest method (RFM) for predicting TPC.

2. Materials and methods

2.1. Plant material and chemicals

Healthy and mature leaves of Azadirachta indica were collected from the college campus in Pune, India. The leaves underwent thorough cleaning with tap water followed by distilled water to remove dirt. After rinsing, the leaves were dried in a hot air oven at 40–45 °C until a constant weight was achieved. The dried material was ground using a Jaipan grinder, then sieved through screens of different mesh numbers, and stored in plastic bags at 5 °C. Analytical grade gallic acid, Folin-Ciocalteu reagent, anhydrous sodium carbonate, and selected solvents were sourced from S. D. Fine-Chem Ltd., Mumbai, India.

2.2. Design of experiments for batch extraction

Numerous experimental factors such as temperature, pressure, particle size, solvent, solid-to-liquid ratio, time, and frequency can influence extraction efficiency [25]. The experimental design for the four selected factors and their respective levels is based on Taguchi's method. The independent factors considered are extraction time, temperature, particle size, and solid-to-solvent ratio as detailed in Table 1. These independent factors and their levels were screened during a preliminary study using a conventional one-factor-at-a-time approach (Figs. S1–S4, Supplementary data). The ranges for the selected factors were: particle size (0.09, 0.15, 0.212, 0.425, 0.60, and 1.0 mm), temperature (15, 25, 35, 45, 55, and 60 °C), extraction time (15, 30, 45, 60, 75, and 90 min), and solid-to-liquid ratio (1:10, 1:20, 1:30, 1:40, 1:50, and 1:60 g/mL). After each experimental run, the extract underwent filtration using Whatman no. 42 filter paper at atmospheric pressure and then cooled to room temperature for determination of phenolic content. The ranges of these factors were subsequently employed to design the Taguchi experimental design, which is presented in Table 2. The Taguchi experimental design encompassed 16 experiments aimed at investigating the impact of each factor and its level on phenolic content extraction. This approach provided optimal conditions for achieving higher total phenolic yield.

Table 1.

Process factors and their respective levels in the Taguchi design.

Factor  Description             Unit  L1     L2     L3     L4
A       Particle size           mm    0.150  0.212  0.425  0.600
B       Time                    min   15     30     45     60
C       Solid-to-solvent ratio  g/mL  1:20   1:30   1:40   1:50
D       Temperature             °C    25     35     45     55

L1: Level 1, L2: Level 2, L3: Level 3, L4: Level 4.

Table 2.

Taguchi design and results for batch extraction.

Run  Particle size (mm)  Time (min)  Solid-to-solvent ratio (g/mL)  Temperature (°C)  TPC yield (mg GAE/g plant material)  S/N ratioᵃ (dB)
1    0.150               15          1:20                           25                5.109 ± 0.17                         14.167
2    0.150               30          1:30                           35                10.105 ± 1.10                        20.090
3    0.150               45          1:40                           45                11.923 ± 1.03                        21.527
4    0.150               60          1:50                           55                17.268 ± 0.92                        24.745
5    0.212               15          1:30                           45                7.389 ± 1.00                         17.372
6    0.212               30          1:20                           55                8.882 ± 1.10                         18.970
7    0.212               45          1:50                           25                7.359 ± 0.15                         17.336
8    0.212               60          1:40                           35                9.915 ± 0.38                         19.925
9    0.425               15          1:40                           55                6.760 ± 0.14                         16.598
10   0.425               30          1:50                           45                7.752 ± 0.75                         17.788
11   0.425               45          1:20                           35                6.191 ± 0.26                         15.835
12   0.425               60          1:30                           25                4.535 ± 0.27                         13.131
13   0.600               15          1:50                           35                3.519 ± 0.52                         10.928
14   0.600               30          1:40                           25                3.488 ± 0.42                         10.851
15   0.600               45          1:30                           55                6.053 ± 0.55                         15.639
16   0.600               60          1:20                           45                4.102 ± 0.95                         12.259

ᵃ S/N ratio: Signal-to-noise ratio.

2.3. Experimental methods

2.3.1. Preliminary study for solvent selection

A preliminary study was conducted to determine the suitable solvent for phenolic compound extraction. Acetone, ethanol, and methanol were evaluated using batch extraction methodology, with a particle size of 0.425 mm, solid-to-liquid ratio of 1:40 g/mL, and a temperature of 30 °C. The experiments were carried out in a 500 mL stirred glass reactor. Post 60 min, the extracts were filtered using Whatman no. 42 filter paper and subjected to total phenolic content analysis.

2.3.2. Soxhlet extraction

Dried, powdered Azadirachta indica leaves (particle size: 0.425 mm), weighing 10 g, were placed in the thimble of a Soxhlet apparatus. For the extraction of TPC, 400 mL of methanol was added to a 500 mL three-neck round-bottom flask [26]. During extraction, the solvent siphoned from the thimble holder back into the flask each time it reached the overflow level. The extraction was carried out at the boiling point of methanol and yielded the maximum recoverable TPC.

2.3.3. Solid-liquid batch extraction

Batch extraction experiments were conducted utilizing a 500 mL agitated cylindrical baffled glass reactor equipped with a marine propeller. Temperature control was achieved using a water bath. The study evaluated the impact of the following four independent factors: extraction time (15–60 min), particle size (0.15–0.60 mm), solid-to-solvent ratio (1:20–1:50 g/mL), and temperature (25–55 °C) on TPC yield. Following each experiment (as detailed in Table 2), the extracts were filtered using Whatman no. 42 filter paper and analyzed by following the process described in Section 2.4. Each of the experiments involved was conducted thrice.

2.4. Analysis of total phenolic content

TPC in the plant extract was determined using the Folin-Ciocalteu method [27]. In this approach, 1 mL of filtered extract was mixed with 60 mL of distilled water in a 100 mL volumetric flask. Subsequently, 5 mL of Folin-Ciocalteu reagent was added and thoroughly mixed. Between 1 and 8 min later, 15 mL of 20 % (w/v) sodium carbonate solution was added, followed by thorough shaking. The flask was then filled to volume with distilled water and incubated for 30 min at 40 °C. Sample absorbance was measured at 765 nm using a UV spectrophotometer (UV-1900, Shimadzu). The concentration of total phenolic compounds was determined via the calibration plot and recorded as mg/L of gallic acid equivalent. For the calibration plot, the absorbance of standard gallic acid solutions with concentrations of 50, 100, 150, 200, 250, 500, and 1000 mg/L was measured using a UV spectrophotometer (R² = 0.9964). Results from three analyses were averaged, with TPC reported as mg GAE/g of plant material.

In the Taguchi method, process performance is assessed through a quality loss function, which is converted into a signal-to-noise (S/N) ratio. The S/N ratio falls into three categories: nominal-the-best, larger-the-better, and smaller-the-better. Given the study's focus on maximizing phenolic compound yield, the “larger-the-better” approach was selected for the S/N ratio, which was calculated using Equation (1) [28,29].

$$\mathrm{S/N} = -10\,\log_{10}\!\left(\frac{1}{n}\sum_{i=1}^{n}\frac{1}{R_i^{2}}\right) \tag{1}$$

where n represents the number of repetitions and R_i represents the response (TPC yield) for the ith experiment.
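The larger-the-better S/N ratio can be illustrated with a short Python sketch. It assumes the usual form of this ratio, namely −10·log10 of the mean of 1/R², and the function name `sn_larger_is_better` is illustrative rather than taken from the study. The mean TPC yields of runs 1 and 4 in Table 2 are each treated as a single response, which appears to reproduce the tabulated S/N values for those runs.

```python
import math


def sn_larger_is_better(responses):
    """Return the larger-the-better S/N ratio (dB) for a list of responses."""
    n = len(responses)
    if n == 0:
        raise ValueError("at least one response is required")
    # Mean of the inverse squared responses, as in the larger-the-better form.
    mean_inverse_square = sum(1.0 / (r * r) for r in responses) / n
    return -10.0 * math.log10(mean_inverse_square)


# Mean TPC yields (mg GAE/g) of runs 1 and 4 in Table 2, each used as one response.
sn_run1 = sn_larger_is_better([5.109])
sn_run4 = sn_larger_is_better([17.268])
print(round(sn_run1, 3), round(sn_run4, 3))
```

With a single response R the expression reduces to 20·log10(R), so a higher yield always gives a higher S/N ratio.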

2.5. Ranking of factors

The average TPC yield was calculated for each factor at every level. Delta signifies the difference between the maximum and minimum values of each factor. Factor ranking was determined based on delta values, with factors possessing higher delta values deemed more influential [29].
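The ranking procedure can be sketched in Python as follows. The data are the mean TPC yields listed in Table 2; the helper names `level_means` and `rank_factors` are illustrative assumptions and are not part of the original analysis, which was carried out in statistical software.

```python
# Rows of Table 2: (particle size, time, solid-to-solvent ratio, temperature, mean TPC yield)
RUNS = [
    (0.150, 15, "1:20", 25, 5.109), (0.150, 30, "1:30", 35, 10.105),
    (0.150, 45, "1:40", 45, 11.923), (0.150, 60, "1:50", 55, 17.268),
    (0.212, 15, "1:30", 45, 7.389), (0.212, 30, "1:20", 55, 8.882),
    (0.212, 45, "1:50", 25, 7.359), (0.212, 60, "1:40", 35, 9.915),
    (0.425, 15, "1:40", 55, 6.760), (0.425, 30, "1:50", 45, 7.752),
    (0.425, 45, "1:20", 35, 6.191), (0.425, 60, "1:30", 25, 4.535),
    (0.600, 15, "1:50", 35, 3.519), (0.600, 30, "1:40", 25, 3.488),
    (0.600, 45, "1:30", 55, 6.053), (0.600, 60, "1:20", 45, 4.102),
]
FACTORS = ["Particle size", "Time", "Solid-to-solvent ratio", "Temperature"]


def level_means(runs, factor_index):
    """Average the TPC yield over all runs that share a level of one factor."""
    groups = {}
    for run in runs:
        groups.setdefault(run[factor_index], []).append(run[-1])
    return {level: sum(values) / len(values) for level, values in groups.items()}


def rank_factors(runs):
    """Return the delta of each factor and the factors ordered by decreasing delta."""
    deltas = {}
    for index, name in enumerate(FACTORS):
        means = level_means(runs, index)
        deltas[name] = max(means.values()) - min(means.values())
    ordered = sorted(deltas, key=deltas.get, reverse=True)
    return deltas, ordered


deltas, ranking = rank_factors(RUNS)
for name in ranking:
    print(f"{name}: delta = {deltas[name]:.3f}")
```

A larger delta means that moving the factor across its levels changes the average yield more, which is why the factor with the largest delta is ranked first.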

2.6. Multiple regression analysis and genetic algorithm

2.6.1. Multiple regression analysis

In this study, a multiple regression analysis was conducted to establish an objective function essential for optimizing process parameters: particle size (mm), time (min), solid-to-solvent ratio (g/mL), and temperature (°C). This was done to maximize TPC yield.

2.6.2. Genetic algorithm

GA serves as a heuristic optimization technique inspired by biological reproductive systems. GA is applied to optimize both constrained and unconstrained objective functions. Commencing with a randomized set of populations, the algorithm iteratively refines this population. Each individual in the population is represented by a string of genes known as a chromosome, where each gene signifies a process or design variable. These strings commonly adopt binary bits, alphabets, or real numbers. Each iteration of GA generates offspring through the random pairing of parents, who are selected from the assumed population set based on their fitness levels. This reproductive process begets children for subsequent generations. The algorithm strives to converge upon an optimal set of populations with each iteration. GA aptly handles objective functions characterized by nonlinearity, heuristics, non-differentiability, and discontinuity. The operational course of GA is illustrated by the flowchart (Fig. 1).

Fig. 1. Flowchart of GA [30].

With reference to Fig. 1, the GA workflow can be stated as follows:

  • a. GA generates the random set of the initial population pool using a random number generator.

  • b. Parent pairing, as generated in Step (a), leads to the creation of a fresh set of populations termed as children. This process, known as crossover, involves selecting parents based on their fitness levels.

  • c. The pool of children resulting from Step (b) undergoes a mutation process wherein bits in a child's string undergo complementation.

  • d. Children resulting from Step (b) become parents for the next generation.

  • e. GA continues iterating until the stipulated stopping criteria are satisfied.
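The workflow above can be sketched as a minimal real-coded GA in Python. This is an illustration under stated assumptions, not the MATLAB implementation used in the study: genes are real numbers, so mutation is a small Gaussian perturbation instead of bit complementation; parents are chosen by binary tournament; the only stopping criterion is a fixed number of generations; and the objective `toy_objective`, with its maximum at (1, −2), is hypothetical.

```python
import random


def run_ga(fitness, bounds, pop_size=50, generations=200,
           crossover_fraction=0.8, mutation_rate=0.1, elite_count=2, seed=0):
    """Maximize `fitness` over box `bounds` with a simple real-coded GA."""
    rng = random.Random(seed)
    # Step (a): random initial population inside the bounds.
    population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]

    def select(scored):
        # Fitness-based selection: the fitter of two randomly drawn individuals.
        (fit_a, ind_a), (fit_b, ind_b) = rng.sample(scored, 2)
        return ind_a if fit_a >= fit_b else ind_b

    for _ in range(generations):  # Step (e): stop after a fixed number of generations.
        scored = sorted(((fitness(ind), ind) for ind in population),
                        key=lambda pair: pair[0], reverse=True)
        # Elitism: the best individuals survive unchanged.
        next_population = [list(ind) for _, ind in scored[:elite_count]]
        while len(next_population) < pop_size:
            parent_a, parent_b = select(scored), select(scored)
            # Step (b): crossover as a random weighted average of the parents.
            if rng.random() < crossover_fraction:
                w = rng.random()
                child = [w * a + (1.0 - w) * b for a, b in zip(parent_a, parent_b)]
            else:
                child = list(parent_a)
            # Step (c): mutation, followed by clipping to the bounds.
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    child[i] += rng.gauss(0.0, 0.05 * (hi - lo))
                child[i] = min(max(child[i], lo), hi)
            next_population.append(child)
        population = next_population  # Step (d): children become the next parents.

    best = max(population, key=fitness)
    return best, fitness(best)


def toy_objective(individual):
    """Hypothetical objective with its maximum value 0 at (1, -2)."""
    x, y = individual
    return -((x - 1.0) ** 2) - (y + 2.0) ** 2


best_point, best_fitness = run_ga(toy_objective, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
print(best_point, best_fitness)
```

Because the elite individuals are copied unchanged into each new generation, the best fitness found never decreases from one generation to the next.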

3. Machine learning techniques

3.1. Random forest method

The RFM is extensively applied to regression and classification problems. This supervised machine-learning algorithm hinges on the popular bagging concept. The RFM integrates a multitude of decision trees; predictions from each tree contribute to the final prediction, which is computed by averaging the predictions of all trees. Essential constituents of the random forest include the root node, leaf node, and decision nodes. Nodes emerging after root node splits are recognized as decision nodes, while nodes without further subdivisions are referred to as leaf nodes.

The development of the random forest model involves the following steps:

  • a. Creation of subsets from the original data, followed by feature and row sampling.

  • b. Construction of individual decision trees for each data subset.

  • c. Enhancement of tree growth through node splitting. The node with the lowest impurity serves as the splitting point, and impurity is calculated by Equation (2):

$$\mathrm{Gini\ index} = 1 - \sum_{b=1}^{B} (P_b)^{2} \tag{2}$$

    Here, B signifies the number of individual nodes over which the sum is taken, and P_b corresponds to the probability value for node b.

  • d. Acquisition of the output from each decision tree.

  • e. Computation of the final prediction as the average of the predictions from all decision trees, as per Equation (3):

$$\hat{y}(n) = \frac{1}{M}\sum_{m=1}^{M} y_m(n) \tag{3}$$

    where n signifies an individual sample, M denotes the total tree count, y_m(n) represents the prediction of the mth tree, and ŷ(n) signifies the average prediction.
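Equations (2) and (3) can be illustrated with a few lines of Python. The sketch assumes the usual reading of the Gini index as one minus the sum of squared probabilities, and of the forest output as the arithmetic mean of the tree outputs. The function names and the numerical inputs are hypothetical and are not taken from the study.

```python
def gini_index(probabilities):
    """Gini impurity: one minus the sum of the squared probabilities."""
    return 1.0 - sum(p * p for p in probabilities)


def forest_prediction(tree_predictions):
    """Average the predictions of the individual trees for one sample."""
    if not tree_predictions:
        raise ValueError("at least one tree prediction is required")
    return sum(tree_predictions) / len(tree_predictions)


# An evenly mixed split is maximally impure for two outcomes; a pure split has zero impurity.
print(gini_index([0.5, 0.5]))
print(gini_index([1.0, 0.0]))

# Hypothetical TPC predictions (mg GAE/g) from three trees for the same sample.
print(forest_prediction([10.0, 11.0, 12.0]))
```

Averaging over many trees is what reduces the variance of the individual trees, each of which is grown on a different sample of the rows and features.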

3.2. Support vector regression

SVR ranks among the most potent supervised machine learning methods and extends the support vector machine (SVM) to regression problems. It excels in handling nonlinear facets of processes owing to its kernel method implementation. SVR effectively estimates real-valued functions and relies on a symmetric loss function for training, which penalizes over- and under-predictions equally. SVR leverages Vapnik's ε-insensitive approach to establish a flexible tube surrounding the estimated function. Within this tube, known as the ε-tube, errors smaller than the threshold ε are ignored, while points outside the tube are penalized. SVR then formulates an optimization problem to determine the tube that best approximates the continuous-valued function while balancing prediction error and model complexity. The optimization function, derived from the geometrical parameters of the tube and the loss function, is minimized to yield the flattest tube encompassing the maximum number of training instances. SVR primarily strives to identify the optimal hyperplane such that all data points are at a minimum distance from it.

The relationship between dependent and independent parameters in SVR is approximated by Equation (4):

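$$A = f(B) = W \times \varphi(B) + C \tag{4}$$

Here, A signifies the dependent variable, B denotes the independent variable, W represents the weight matrix, C is a constant, and φ(B) signifies the nonlinear mapping of B into a high-dimensional space.

The regularized function in SVR, given by Equation (5), is minimized subject to the constraints given by Equation (6).

$$\text{Minimize: } R(f) = \gamma\,\frac{1}{N}\sum_{j=1}^{M}\left(\delta_j + \delta_j^{*}\right) + \frac{1}{2}\lVert W\rVert^{2} \tag{5}$$

$$\text{Constraints: } \begin{cases} A_j - W \times \varphi(B_j) - C \le \varepsilon + \delta_j \\ W \times \varphi(B_j) + C - A_j \le \varepsilon + \delta_j^{*} \\ \delta_j,\ \delta_j^{*} \ge 0 \end{cases} \tag{6}$$

Incorporating the kernel function K(B, B_j) and the Lagrangian multipliers (α_j, α_j*), the optimization problem can be represented by Equation (7), subject to the constraints in Equation (8).

$$\text{Maximize: } -\frac{1}{2}\sum_{j=1}^{m}\sum_{k=1}^{m}\left(\alpha_j - \alpha_j^{*}\right)\left(\alpha_k - \alpha_k^{*}\right)K(B_j, B_k) - \varepsilon\sum_{j=1}^{m}\left(\alpha_j + \alpha_j^{*}\right) + \sum_{j=1}^{m}A_j\left(\alpha_j - \alpha_j^{*}\right) \tag{7}$$

$$\text{Constraints: } \begin{cases} \sum_{j=1}^{m}\left(\alpha_j - \alpha_j^{*}\right) = 0 \\ \alpha_j,\ \alpha_j^{*} \in [0, \gamma] \end{cases} \tag{8}$$

The solution to the optimization problem in Equation (7), subject to the constraints in Equation (8), yields the final SVR function, as expressed by Equation (9).

$$A = f(B) = \sum_{j=1}^{m}\left(\alpha_j - \alpha_j^{*}\right) \times K(B, B_j) + C \tag{9}$$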
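The final SVR function and the ε-insensitive loss can be sketched in plain Python. This is an illustration under assumptions, not the implementation used in the study: the prediction is taken as a sum over the support vectors of the coefficient (α_j − α_j*) times a radial basis function kernel, plus the constant C. The default kernel width of 0.5 and ε of 0.07 follow the hyperparameters listed in Section 4.6, while the support vector, coefficient and intercept in the usage lines are hypothetical.

```python
import math


def rbf_kernel(b, b_j, gamma=0.5):
    """Radial basis function kernel K(B, B_j) = exp(-gamma * ||B - B_j||^2)."""
    squared_distance = sum((u - v) ** 2 for u, v in zip(b, b_j))
    return math.exp(-gamma * squared_distance)


def svr_predict(b, support_vectors, dual_coefficients, intercept, gamma=0.5):
    """Evaluate f(B) as the kernel-weighted sum over support vectors plus the constant."""
    weighted_sum = sum(coefficient * rbf_kernel(b, vector, gamma)
                       for coefficient, vector in zip(dual_coefficients, support_vectors))
    return weighted_sum + intercept


def epsilon_insensitive_loss(actual, predicted, epsilon=0.07):
    """Errors inside the epsilon-tube cost nothing; larger errors are penalized linearly."""
    return max(0.0, abs(actual - predicted) - epsilon)


# Hypothetical model with a single support vector, coefficient 2.0 and constant 1.0.
prediction = svr_predict([1.0, 2.0], [[1.0, 2.0]], [2.0], 1.0)
print(prediction)
print(epsilon_insensitive_loss(1.0, 1.05), epsilon_insensitive_loss(1.0, 1.2))
```

The second line of output shows the tube at work: an error of 0.05 lies inside the tube and is not penalized, whereas an error of 0.2 is charged only for the part that exceeds ε.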

4. Results and discussions

4.1. Preliminary study for solvent selection

An assessment of three chosen solvents—ethanol, methanol, and acetone—was conducted to evaluate their impact on TPC yield. Fig. 2 indicates that solvent efficacy concerning phenolic compounds follows the order: methanol > ethanol > acetone. Variations in phenolic compound count across different solvents may be attributed to their respective solubilities. Given that diffusion serves as the primary mechanism in solid-liquid extraction, smaller solvent molecules permeate the plant matrix more effectively. The superior yield of TPC with methanol can be attributed to its smaller molecular size and physical properties [31]. Consequently, methanol was chosen for subsequent investigations.

Fig. 2. Effect of solvent on TPC yield.

4.2. Effect of factors on TPC yield

The graphical representation of TPC yield acquired from the Taguchi L16 experimental design runs is presented in Figs. 3–6. The average TPC yield was computed for all four factors at the specified levels.

Fig. 3. Effect of particle size on TPC yield.

Fig. 4. Effect of temperature on TPC yield.

Fig. 5. Effect of time on TPC yield.

Fig. 6. Effect of solid-to-solvent ratio on TPC yield.

Fig. 3 illustrates the effect of particle size on TPC yield. Notably, decreasing particle size was associated with an increase in TPC yield. The Taguchi optimization study revealed the highest TPC yield of 11.101 mg GAE/g of plant material for a particle size of 0.15 mm. In contrast, a particle size of 0.6 mm yielded 4.29 mg GAE/g of plant material. Diminishing particle size facilitated greater surface interaction between the solid and solvent. The reduced particle size minimized the diffusion path between solute and solvent, consequently augmenting the mass transfer rate [32,33]. The reduction in plant material particle size directly exposed a higher number of plant cells to the solvent [34]. Bucic-Kojic et al. [35] examined the impact of particle size on total polyphenol kinetics in grape seeds and discovered that a smaller particle size range yielded the highest total polyphenol concentration. Our findings align with this prior research. Notably, the literature on neem leaves lacks mention of a lower particle size limit.

As temperature ascended from 25 to 55 °C, the TPC yield exhibited an upward trajectory, with maximum yield attained at 55 °C (Fig. 4). The elevated TPC yield can be attributed to increased solute solubility and diffusivity at higher temperatures, resulting from decreased solution viscosity and consequent acceleration of the extraction process [36]. Temperature elevation induced softening and swelling of plant material [37], intensifying the mass transfer of total phenolics from Azadirachta indica leaves and thus elevating the TPC yield. Notably, the study of solid-liquid extraction of TPC from grape marc showed a rise in extraction rate over the temperature range of 25–60 °C [38]. Similarly, anthocyanin extraction from milled berries exhibited increased yields with temperature elevation from 6 to 30 °C [32]. However, excessively high temperatures lead to solvent evaporation and thermal degradation of thermolabile phenolic compounds [39]. Consequently, the optimal temperature was determined as 55 °C due to the maximum TPC yield observed at this point.

Fig. 5 portrays the effect of extraction time on TPC yield, indicating that prolonged time positively influenced TPC yield. The TPC yield escalated from 5.69 to 8.95 mg GAE/g of plant material with extended extraction time. The peak yield of 8.955 mg GAE/g of plant material was achieved after 60 min of extraction. A comparable positive time effect emerged in the batch extraction of phenolic compounds from neem leaves [40]. The extraction process initially exhibited faster extraction rates owing to the higher concentration gradient between solvent and solid material. As the concentration gradient diminished and equilibrium approached, the extraction process decelerated.

The affirmative impact of the solid-to-solvent ratio on TPC yield is evident in Fig. 6. The maximum yield of TPC was achieved at a ratio of 1:50 g/mL. A higher solid-to-solvent ratio was found to amplify the concentration gradient, propelling the transfer of phenolic compounds from neem leaves into the solvent during extraction [40]. Comparable findings were noted in the study of solid-liquid batch extraction of phenolic compounds from neem leaves, where an increase in the solid-to-solvent ratio from 1:10 to 1:50 g/mL led to enhanced extraction of total phenolic compounds [23]. Sivarajan et al. [41] also observed an increase in the percentage extraction yield of phenolic content from two spice varieties with a solid-to-solvent ratio escalation from 1:20 to 1:50 g/mL.

The impact of factors on TPC yield was validated through factor ranking. As per Table 3, particle size exerted the most substantial influence on TPC yield, followed by temperature, time, and solid-to-liquid ratio.

Table 3.

Ranking of process factors based on TPC yield.

Level  Particle size (mm)  Time (min)  Solid-to-solvent ratio (g/mL)  Temperature (°C)
1      11.101              5.694       6.071                          5.123
2      8.386               7.557       7.021                          7.432
3      6.309               7.882       8.021                          7.792
4      4.290               8.955       8.975                          9.741
Delta  6.811               3.261       2.904                          4.618
Rank   1                   3           4                              2

4.3. Multiple regression analysis

The regression equation obtained in MINITAB Statistical Software by utilizing the experimental data of the Taguchi L16 array is provided by Equation (10). The ANOVA results (Table 4) indicate that all process variables, including particle size, time, solid-to-liquid ratio, and temperature, possess a P-Value of less than 0.05, significantly affecting TPC yield with a 95 % confidence level.

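$$\begin{aligned}
\text{TPC yield} ={}& 21.7 + 47.2\,\text{ParticleSize} - 0.547\,\text{Time} + 329\,\text{SolidtoLiquidratio} - 0.796\,\text{Temperature} \\
& - 0.319\,\text{ParticleSize}\times\text{Time} - 84\,\text{ParticleSize}\times\text{SolidtoLiquidratio} - 0.093\,\text{ParticleSize}\times\text{Temperature} \\
& + 8.43\,\text{Time}\times\text{SolidtoLiquidratio} + 0.01289\,\text{Time}\times\text{Temperature} + 0.61\,\text{SolidtoLiquidratio}\times\text{Temperature} \\
& - 65.2\,\text{ParticleSize}^{2} - 0.00072\,\text{Time}^{2} - 9641\,\text{SolidtoLiquidratio}^{2} + 0.00563\,\text{Temperature}^{2}
\end{aligned} \tag{10}$$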
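The regression model of Equation (10) can be evaluated with a short Python function. The sketch rests on two assumptions: the terms whose operator is not visible in the printed equation are taken to be subtracted, and the solid-to-liquid ratio is entered as a decimal fraction (for example, 1:50 g/mL as 0.02). The function name `tpc_yield` is illustrative. Because the printed coefficients are rounded, values computed this way may differ from those reported in the article.

```python
def tpc_yield(particle_size, time, ratio, temperature):
    """Evaluate the second-order regression model for TPC yield (mg GAE/g).

    particle_size in mm, time in min, ratio as a decimal fraction in g/mL,
    temperature in degrees Celsius.
    """
    ps, t, r, temp = particle_size, time, ratio, temperature
    return (21.7 + 47.2 * ps - 0.547 * t + 329 * r - 0.796 * temp
            - 0.319 * ps * t - 84 * ps * r - 0.093 * ps * temp
            + 8.43 * t * r + 0.01289 * t * temp + 0.61 * r * temp
            - 65.2 * ps ** 2 - 0.00072 * t ** 2 - 9641 * r ** 2
            + 0.00563 * temp ** 2)


# Conditions of run 4 in Table 2: 0.150 mm, 60 min, 1:50 g/mL, 55 degrees Celsius.
predicted_run4 = tpc_yield(0.150, 60, 1 / 50, 55)
print(round(predicted_run4, 3))
```

Under these assumptions the model returns a value close to the experimental mean of 17.268 mg GAE/g for run 4, which supports the reading of the signs used here.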

Table 4.

ANOVA.

Source                 DFᵃ  Adj SSᵇ  Adj MSᶜ  F-Value  P-Value
Regression             4    172.89   43.222   24.95    0.000
Particle size          1    94.60    94.599   54.60    0.000
Time                   1    20.43    20.430   11.79    0.006
Solid-to-liquid ratio  1    17.46    17.457   10.08    0.009
Temperature            1    40.40    40.402   23.32    0.001
Error                  11   19.06    1.733
Total                  15   191.95

ᵃ Degrees of freedom. ᵇ Adjusted sums of squares. ᶜ Adjusted mean squares.

4.4. Genetic algorithm

In this study, the GA is implemented in MATLAB R2022b software, and its specifics are as follows:

  • Population size = 50

  • Elite count = 2.5

  • Population type = double vector

  • Crossover fraction = 0.8

  • Max generations = 400

  • Max stall generations = 50

  • Function tolerance = 1 × 10⁻⁶

The regression equation serves as the objective function for optimization in GA, and the constraints are expressed in Equations (11)–(14). Because the GA solver minimizes its objective function and the present problem is one of maximization, the regression equation is negated before being passed to the GA.

0.15 ≤ Particle size (mm) ≤ 0.6 (11)
15 ≤ Time (min) ≤ 60 (12)
1:50 ≤ Solid-to-liquid ratio (g/mL) ≤ 1:20 (13)
25 ≤ Temperature (°C) ≤ 55 (14)

The GA-optimized values of particle size, time, solid-to-liquid ratio, temperature, and their corresponding maximum TPC yield are provided in Table 5.

Table 5.

GA-optimized values for maximum TPC yield.

Particle size (mm)  Time (min)  Solid-to-solvent ratio (g/mL)  Temperature (°C)  TPC yield (mg GAE/g)
0.15                40          1:25                           55                23.0399

The variation of the fitness value across generations is depicted in Fig. 7; the fitness value stabilizes within the early generations.

Fig. 7. Fitness value vs. generation.

Upon substitution into the regression equation, the GA-optimized parameters yielded a TPC value of 23.03992 mg GAE/g, which closely aligns with the GA TPC yield value of 23.03990 mg GAE/g.

4.5. Random forest method

The random forest algorithm is implemented in Python using a Jupyter Notebook. A total of 100 decision trees were employed in this study. The experimental data from Taguchi's L16 array were utilized to train the random forest machine learning model. Comparison plots of experimental and random forest values of TPC yield for the training and test data sets are shown in Figs. 8 and 9, respectively.

Fig. 8. Comparison plot of experimental and random forest values of TPC yield for the training data set.

Fig. 9. Comparison plot of experimental and random forest values of TPC yield for the test data set.

It is evident from Figs. 8 and 9 that the random forest values of TPC yield closely align with the experimental values for both the training and test data sets. The average and maximum errors in the random forest–estimated values of TPC yield for the training data set are 0.16 % and 0.289 %, respectively. The maximum and average errors in the random forest–predicted TPC yield values for the test data set are 3.1 % and 2.98 %, respectively.

Figs. 10–15 reveal that the random forest contours of TPC yield (panel b) do not align perfectly with the corresponding experimental contours (panel a).

Fig. 10. Contour plots of TPC yield vs. particle size and time: (a) experimental (b) random forest.

Fig. 11. Contour plots of TPC yield vs. particle size and solid-to-solvent ratio: (a) experimental (b) random forest.

Fig. 12. Contour plots of TPC yield vs. particle size and temperature: (a) experimental (b) random forest.

Fig. 13. Contour plots of TPC yield vs. time and solid-to-solvent ratio: (a) experimental (b) random forest.

Fig. 14. Contour plots of TPC yield vs. time and temperature: (a) experimental (b) random forest.

Fig. 15. Contour plots of TPC yield vs. solid-to-solvent ratio and temperature: (a) experimental (b) random forest.

4.6. Support vector regression

The SVR algorithm is implemented in Python using a Jupyter Notebook, with the following SVR model hyperparameters:

  • Penalty parameter (C): 10,000

  • Kernel: Radial basis function

  • Gamma: 0.5

  • Epsilon: 0.07

The experimental data from Taguchi's L16 array were used to train the SVR machine-learning model. A comparison of the experimental and SVR values of TPC yield for the training and test data sets is depicted in Figs. 16 and 17, respectively.

Fig. 16. Comparison plot of experimental and SVR values of TPC yield for the training data set.

Fig. 17. Comparison plot of experimental and SVR values of TPC yield for the test data set.

Fig. 16, Fig. 17 illustrate that the SVR TPC yield values closely align with the experimental values for both training and test data sets. The average and maximum errors in the SVR-predicted TPC yield values for the training data set are 0.16 % and 0.289 %, respectively. Likewise, the maximum and average errors between the SVR-estimated TPC yield values for the test data set are 3.1 % and 2.98 %, respectively.
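The average and maximum percentage errors quoted above can be computed as in the following Python sketch. It assumes that the error of each sample is the absolute deviation from the experimental value expressed as a percentage of that value, which the article does not state explicitly. The function names and the numbers in the usage lines are hypothetical and are not data from the study.

```python
def percentage_errors(experimental, predicted):
    """Absolute percentage error of each prediction relative to its experimental value."""
    if len(experimental) != len(predicted):
        raise ValueError("both sequences must have the same length")
    return [abs(e - p) / abs(e) * 100.0 for e, p in zip(experimental, predicted)]


def summarize_errors(experimental, predicted):
    """Return the average and the maximum percentage error."""
    errors = percentage_errors(experimental, predicted)
    return sum(errors) / len(errors), max(errors)


# Hypothetical experimental and model-predicted TPC yields (mg GAE/g).
experimental_tpc = [10.0, 20.0, 8.0]
predicted_tpc = [10.1, 19.0, 8.0]
average_error, maximum_error = summarize_errors(experimental_tpc, predicted_tpc)
print(round(average_error, 2), round(maximum_error, 2))
```

Reporting both figures is useful because a low average error can hide a single poorly predicted run, which the maximum error exposes.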

From the SVR contour in Fig. 18 (b), the maximum TPC is obtained when the particle size is minimized and time is maximized. The SVR contour in Fig. 19 (b) indicates that the minimum particle size with the minimum solid-to-liquid ratio yields maximum TPC. In Fig. 20 (b), the SVR contour suggests that the combination of minimum particle size and maximum temperature results in maximum TPC. The plot in Fig. 21 (b) shows that the maximum TPC is achieved with a combination of maximum time and minimum solid-to-liquid ratio. According to the SVR contour in Fig. 22 (b), the maximum TPC can be attained by maintaining both the maximum time and temperature. In Fig. 23 (b), the SVR contour demonstrates that the combination of maximum temperature and minimum solid-to-liquid ratio produces maximum TPC. Overall, Figs. 18–23 illustrate that the SVR contours of TPC yield (panel b) match the corresponding experimental contours (panel a).

Fig. 18. Contour plots of TPC yield vs. particle size and time: (a) experimental (b) SVR.

Fig. 19. Contour plots of TPC yield vs. particle size and solid-to-solvent ratio: (a) experimental (b) SVR.

Fig. 20. Contour plots of TPC yield vs. particle size and temperature: (a) experimental (b) SVR.

Fig. 21. Contour plots of TPC yield vs. time and solid-to-solvent ratio: (a) experimental (b) SVR.

Fig. 22. Contour plots of TPC yield vs. time and temperature: (a) experimental (b) SVR.

Fig. 23. Contour plots of TPC yield vs. solid-to-solvent ratio and temperature: (a) experimental (b) SVR.

5. Conclusions

The present study on solid-liquid batch extraction aimed to extract TPC from neem leaves using the Taguchi L16 experimental design. For this process, methanol was identified as the optimal solvent among the options considered. The ranking of process factors indicated that the particle size of neem leaves holds the greatest influence in the said process. The GA predicted the optimum process factors: 0.15 mm particle size, 40 min extraction time, 1:25 g/mL solid-to-solvent ratio, and 55 °C temperature. Under these predicted extraction conditions, the TPC yielded 23.039 mg GAE/g. The employed machine learning techniques demonstrated that SVR values of TPC yield exhibited a strong correlation with experimental values for both training and test data sets, as compared to the values from the RFM.

Funding

This research received no specific grant from funding agencies belonging to either the public, commercial, or not-for-profit sectors.

Data availability

Data included in article/supp. material/referenced in article.

CRediT authorship contribution statement

Sunita S. Patil: Investigation, Writing – original draft, Data curation, Methodology. Umesh B. Deshannavar: Conceptualization, Writing – review & editing, Supervision. Shambala N. Gadekar-Shinde: Conceptualization. Amith H. Gadagi: Software, Validation. Santosh A. Kadapure: Conceptualization, Writing – review & editing.

Declaration of competing interest

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Dr. Umesh B. Deshannavar reports was provided by DR MS Sheshgiri College of Engineering and Technology. Dr. Umesh B. Deshannavar reports a relationship with DR MS Sheshgiri College of Engineering and Technology that includes: employment.

Footnotes

Appendix A

Supplementary data to this article can be found online at https://doi.org/10.1016/j.heliyon.2023.e21991.

Appendix A. Supplementary data

The following is the Supplementary data to this article:

Multimedia component 1
mmc1.docx (83.6KB, docx)



Articles from Heliyon are provided here courtesy of Elsevier
