Abstract
Premise
Plants are frequently exposed to combinations of abiotic and biotic stresses that pose a greater threat to yield and productivity than individual stresses. However, knowledge of the impact of many stress combinations in numerous plants is limited due to the lack of experimental data, which could take decades to generate. To overcome this limitation, we utilized existing literature data from various plant species and stress combinations to derive biological inferences, thereby gaining a comprehensive understanding of plant responses through a computational tool.
Methods
Public databases were used to gather literature on the impact of various abiotic and biotic stress combinations. Then, a composite artificial neural network (ANN)–based multi‐target classification and regression deep learning model was developed using machine learning algorithms.
Results
The model predicted the impact of stress interactions in plants, including the morphological parameters affected and percentage changes in those parameters, with an overall accuracy of 76.33%. Predicted reductions in yield were validated in rice under combined drought and heat stress.
Discussion
The ANN‐based model developed in this study is a valuable resource for plant researchers seeking to understand the impact of stress combinations. The tool can make use of multivariate and complex combined stress datasets.
Keywords: artificial intelligence, artificial neural networks, combined stresses, computational phenomics, deep learning, digital plant phenomics, knowledge‐based systems
Combined biotic and abiotic stresses, originating from factors like extreme climatic conditions, pathogens, pests, and weeds, exert a continuous and complex influence on plants and impact their growth and productivity (Ramegowda and Senthil‐Kumar, 2015; Zandalinas and Mittler, 2022; Priya et al., 2023). Several abiotic stresses, such as drought, heat, salinity, high light, and nutrient deficiencies, are known to co‐occur and have a cumulative impact on plant health and on vulnerability to pests and pathogens. For instance, a plant weakened by drought may become more susceptible to disease, compounding the damage and reducing resilience (Chilakala et al., 2022; Pandey and Senthil‐Kumar, 2024). Furthermore, interactions between two pathogens (biotic–biotic stress combinations) can result in additive damage to plants, as exemplified by the co‐infection of Fusarium oxysporum f. sp. medicaginis and Rhizoctonia solani, which causes greater growth reduction in alfalfa than either mono‐infection (Fang et al., 2021). In the recently established Stress Combinations and their Interaction in Plants Database (SCIPDb) (Priya et al., 2023), 123 stress combinations involving diverse abiotic and biotic stressors were identified that affect the growth and productivity of 118 economically significant crop species. Of these 123 stress combinations, 69 were found to have a negative impact on plants (Priya et al., 2023). Well‐known examples include the detrimental effects of co‐occurring drought and heat on plant health and vulnerability to pests and pathogens (Desaint et al., 2021; Choudhary and Senthil‐Kumar, 2022; Rivero et al., 2022; Zandalinas and Mittler, 2022). Interestingly, not all stress combinations are deleterious: 20 stress combinations in SCIPDb, termed “positive stress combinations,” were found to have less harmful effects on plants than the individual stresses (Priya et al., 2023; Pandey et al., 2024).
Various factors influence the outcome of the interaction between two stresses, including the sequence in which they occur. The first stressor can modulate plant defenses, impacting responses to subsequent stressors (Ramegowda and Senthil‐Kumar, 2015). Moreover, the plant's response to combined stresses varies considerably with the developmental stage and tissue type, highlighting the complexity of interactions between stressors and plants. For example, the stomata on the flowers remain open in soybean plants under combined drought and heat stress, whereas those on the leaves close (Sinha et al., 2022). Numerous studies emphasize the importance of understanding combined stress and its implications for plant health and productivity (Ahuja et al., 2010; Atkinson et al., 2013; Prasch and Sonnewald, 2013; Barah et al., 2016; Zandalinas et al., 2024). Although stress combinations can have varied physiological consequences that differ by time and tissue type, plant morphological traits are often good indicators of acclimation to stress conditions. A meta‐analysis of data from 120 research articles on crop responses to combined drought and heat stress revealed the significant impact of the stress combination on several yield‐related parameters (Cohen et al., 2017). Among the different plant performance traits (morphological, physiological, and biochemical), morphological traits can be a good predictor of plant performance under various environmental stresses. For example, leaf traits are essential indicators of plant performance under arid conditions (Guo et al., 2017). Thus, morphological and yield‐related traits can be good predictors of plant performance under combined stress conditions.
Despite significant progress in combined stress research over the past decade, there are still gaps in knowledge, particularly regarding the specific mechanisms underlying combined stress responses, the variability across different plant species and environments, and the practical application of research findings to predict the impact of the stress combinations on plants to improve crop resilience (Priya et al., 2023; Pandey and Senthil‐Kumar, 2024). Addressing these gaps through advanced computational approaches is critical.
As a robust tool for end‐to‐end learning, deep neural networks have been successfully applied across diverse domains, including biological (Han and Liu, 2019), agricultural (Misra et al., 2022), medical (Ha, 2022; Ha and Park, 2023), pharmaceutical (Mak and Pichika, 2019), environmental (Kumari and Pandey, 2023), and engineering applications (Nti et al., 2022). Structurally, a neural network comprises nodes that process numerical inputs from incoming edges and produce numerical outputs. In deep neural networks, multiple layers of nodes are stacked to map inputs to outputs, and training involves adjusting network parameters—including node functions and edge weights—to optimize the accuracy of this mapping.
Because the impact of stress interactions on plants is a complex and nonlinear biological process, it can be effectively studied using artificial neural networks (ANNs), a subset of machine learning that is based on the behavior of neurons in the brain and capable of learning and modeling complex nonlinear relationships (Marsland, 2014). ANNs are applied to cluster, predict, and classify complex datasets, fitting the patterns in the data and exhibiting robustness to noise and the ability to infer unseen relationships (Masters, 2018; Russell and Norvig, 2021).
Machine learning has several applications in plant science, such as species identification from herbarium specimens (Carranza‐Rojas et al., 2017), weed detection in field settings (Yu et al., 2019), and plant disease classification, particularly leaf diseases caused by biotic stresses (Geetharamani and Pandian, 2019). Machine learning models have also been used to segment roots in chicory (Smith et al., 2020), wheat and rapeseed (Yasrab et al., 2019), and Arabidopsis (Gaggion et al., 2021), as well as to enumerate leaves in tobacco (Ubbens and Stavness, 2017). ANN‐based simulation modeling was applied to predict dry root rot incidence in chickpea using weather and experimental data (Sinha et al., 2021). Overall, ANNs have been shown to be effective for modeling complex, non‐linear plant responses to stress, although the optimal choice of model often depends on the crop, stress type, and available dataset.
The model developed in this study aims to predict the impact of combined stresses on plants beyond the available data points and, in bridging this gap, to provide valuable insights to researchers. The ANN model predicts the measurable and agriculturally important parameters that are affected, as well as the percentage change in the respective parameter values, for a given plant–stress combination. Because stress combination data were not available for a large number of stress combinations and plants, we aimed for this ANN model to fill this gap by extrapolating the impact of combined stress to a wide range of stress combinations in herbaceous or crop plant species in which experimental studies may not have been conducted. This intuitive knowledge can expedite the validation process of combined stress data in the laboratory.
METHODS
Machine learning libraries
The deep learning model in this study was implemented using the Keras API v3.4.1 (https://www.tensorflow.org/api_docs/python/tf/keras), the Python package scikit‐learn v1.4.2 (https://scikit-learn.org/stable/), and Google TensorFlow v2.17.0 (https://www.tensorflow.org/).
Data mining
The SCIPDb FTP server was used to download the morphological dataset for 41 distinct stress combinations (https://db.nipgr.ac.in/plant_complete/downloads.php; accessed December 2021) (Priya et al., 2023). The dataset integrated into SCIPDb was obtained through literature mining using relevant and carefully designed keywords (Appendix S1). The use of the major search engines (Appendix S2) and of various keyword variants ensured comprehensive coverage of relevant articles. In addition, the bibliographies of the identified articles were examined to augment the dataset, producing a thorough collection of information on the topic (Appendix S1). Articles were manually curated to retain only genuine combined stress studies, and only articles reporting morphological data were considered for data extraction.
Data extraction for the model
Among the 939 studies covering 123 stress combinations in the database (Priya et al., 2023), 503 studies covering 41 stress combinations were analyzed (Appendix S1). Of these, 148 studies were further selected, as they provided information on the effects of combined stresses on both vegetative and reproductive plant growth. The chosen studies examined the impact of 16 abiotic–abiotic, 17 abiotic–biotic, and eight biotic–biotic stress combinations across 54 plant species from 16 families (Appendix S1, Figure 1A). The abiotic–abiotic stress combinations category consisted of different combinations of stresses like cold, heat, drought, flooding, high light, low light, heavy metals, shading, ultraviolet (UV) radiation, ozone, and salinity. The abiotic–biotic stress combinations included the effect of abiotic factors like drought, waterlogging, salinity, ozone, UV radiation, heavy metal, and nutrient stress on fungal, bacterial, and viral infections, as well as insect infestations on different plants. The biotic–biotic stress combinations dealt with the impact of co‐infections of viruses, fungi, oomycetes, bacteria, nematodes, and insects on plants.
Figure 1.

The input dataset used to train the model. (A) Schematic representation of the steps in data collection and extraction for feeding into the model. Literature was mined to explore research on 16 abiotic–abiotic, 17 abiotic–biotic, and eight biotic–biotic stress combinations. The studies in which the impact of stress combinations on morphological and yield‐related parameters was discussed were chosen for data extraction. Data values pertaining to 41 parameters were extracted and analyzed to calculate the percentage change in parameter values. These values or data points (748 in total, consisting of 255, 185, and 308 data points from abiotic–abiotic, abiotic–biotic, and biotic–biotic stress combinations, respectively) were fed into the model to predict the impact of the stress combinations. AA, abiotic–abiotic; AB, abiotic–biotic; BB, biotic–biotic stress combinations. (B) Representation of the dataset used for model generation using a Sankey diagram. The first (left side) level depicts stress combinations denoted by numbers. The second level (center) represents the two major parameter classes (biomass and yield). The last (right side) level indicates the individual parameters that belong to this major parameter category used in the current study. The width of the nodes indicates the number of studies corresponding to a parameter type in a particular stress combination. All of the parameters are listed in Appendix S1.
The parameters assessed in these studies were categorized into two main groups: biomass and yield (Table 1). Our objective in categorizing these parameters was to ascertain whether the stress combinations had a more pronounced impact on plant growth, morphology, or yield‐related traits. Data values pertaining to the parameters listed in Table 1 were extracted from the corresponding stress combination data from the SCIPDb FTP server. As explained in the database, these values were either directly taken from the source articles (Appendix S1) or extracted using the GetData Graph Digitizer 2.26 tool (https://getdata-graph-digitizer.software.informer.com/) for better accuracy (Priya et al., 2023). For example, to extract absolute values that were represented graphically rather than numerically for individual and combined stresses, GetData Graph Digitizer 2.26 was employed. To ensure effective and precise data extraction, the process was initiated by capturing a visual image of an enlarged graph using the Snip & Sketch tool; this file was then opened in GetData Graph Digitizer. The scale parameters (xmin, xmax, ymin, and ymax) were then set by identifying and clicking on specific points within the image and manually inputting their corresponding values. Once the scale was configured correctly, the point capture mode was utilized, which automatically displayed the raw values (both x‐ and y‐coordinates) on a panel to the right whenever a point on the bar/line graph was captured. After all of the necessary values were captured, they were copied into an Excel spreadsheet for further analysis. The extraction of raw values from graphs using GetData Graph Digitizer ensured precision and efficiency by accurately capturing the points on the graphs and providing a magnified view to enhance extraction accuracy. The raw values were automatically extracted, and the data were securely saved for future use, providing a more efficient approach than manual extraction.
Table 1.
List of the morphological parameters included in the model.
| Serial number | Biomass‐related parameters | Yield‐related parameters |
|---|---|---|
| 1 | Shoot weight | 1000 grain weight |
| 2 | Total dry weight/dry weight | Filled grain weight |
| 3 | Total plant weight | Grain set index |
| 4 | Dry weight of stems | Grain weight per plant |
| 5 | Total biomass | Grain yield |
| 6 | Root dry weight | Number of grains per ear |
| 7 | Shoot dry weight | Number of grains per spike |
| 8 | Root weight/root fresh weight | Mean tuber weight |
| 9 | Shoot fresh weight | Number of filled grains per panicle |
| 10 | Leaf dry weight | Number of seeds |
| 11 | Fresh weight | Number of spikelets per panicle |
| 12 | Shoot dry mass | Pod weight |
| 13 | Leaf blade mass | Seed weight |
| 14 | Shoot biomass | Seed yield |
| 15 | | Weight per grain |
| 16 | | Yield |
| 17 | | Yield of ripe fruits |
| 18 | | Seed number |
Note: The listed parameters were used to assess the impact of 41 different stress combinations on various plant species, as documented in 148 research articles. The parameters listed under the two broad categories were taken from the research articles to indicate the different ways a parameter can be measured.
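Because the data consisted of different parameters measured in various units and values, we used the percentage change in each parameter relative to its control value, calculated as:

$$\text{Percentage change (\%)} = \frac{\text{value under stress} - \text{value under control}}{\text{value under control}} \times 100$$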
This normalization process allowed us to effectively compare and interpret the data, mitigating variations arising from the inherent diversity of the parameters under investigation.
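The normalization step can be illustrated with a minimal Python sketch. It assumes the conventional definition of percentage change relative to the control; the function name and the example measurements are hypothetical and are not taken from SCIPDb.

```python
def percentage_change(stress_value: float, control_value: float) -> float:
    """Percentage change of a measurement under stress relative to its control.

    Assumes the conventional definition (stress - control) / control * 100,
    so negative results indicate a reduction under stress.
    """
    if control_value == 0:
        raise ValueError("control value must be non-zero")
    return (stress_value - control_value) / control_value * 100.0


# Hypothetical measurements in different units become directly comparable.
grain_yield_change = percentage_change(stress_value=3.1, control_value=5.0)     # g per plant
shoot_weight_change = percentage_change(stress_value=42.0, control_value=60.0)  # mg
```

Expressing every parameter on this common scale is what allows grain weights, biomass values, and counts to be pooled as a single regression target.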
ANN‐based deep learning model generation
The percentage changes in parameter values extracted from the SCIPDb stress combination data (Appendix S3) were used as inputs for the model to predict the impact of stress combinations. The dataset included the percentage change in affected parameters for different plants under various simultaneously or sequentially applied stress combinations. The features comprised the stress combination, plant species, treatment (single or combined, simultaneous or sequential), plant performance (percentage change in the parameter relative to the control), and the parameter under study, along with factors including salinity, average temperature, drought, UV radiation, ozone, and soil concentration of heavy metals such as boron, cadmium, manganese, lead, zinc, and nickel. Categorical columns were one‐hot encoded (Appendix S4): each category was assigned a separate binary column, with 1 indicating presence and 0 indicating absence.
Similarly, biotic stress–related features (e.g., nematode, fungus, and weed) were converted into dummy variables, where a value of 1 indicated their presence in a particular combination and 0 indicated their absence. UV data values were set to 1 for all scenarios except UV stress, while boron concentration was kept constant at 30 ppm.
Considering the influence of stress intensity, the studies were categorized based on drought intensity, with moderate and severe drought stress assigned values of 1 and 2, respectively, based on reported effects in the literature. In the same way, light intensity values were set to −1 for low light, 0 for optimal light, and 1 for high light. Additionally, plant family information was incorporated as a feature in the data modeling process, allowing for the capture of family‐specific alterations observed in plants under combined stress.
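The encoding scheme described above can be sketched in plain Python as follows. The record layout, column names, and helper functions are hypothetical, and the value 0 for the absence of drought is an assumption; only the codes for moderate and severe drought and for the three light levels come from the text.

```python
# Ordinal codes stated in the text; "none": 0 is an assumption for unstressed plants.
DROUGHT_LEVELS = {"none": 0, "moderate": 1, "severe": 2}
LIGHT_LEVELS = {"low": -1, "optimal": 0, "high": 1}


def one_hot(values):
    """One-hot encode a categorical column; returns (categories, binary rows)."""
    categories = sorted(set(values))
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return categories, rows


def encode_record(record, biotic_stressors):
    """Encode one study record: dummy variables for biotic stressors,
    ordinal codes for drought and light intensity."""
    row = {f"biotic_{name}": int(name in record["biotic"])
           for name in biotic_stressors}
    row["drought"] = DROUGHT_LEVELS[record["drought"]]
    row["light"] = LIGHT_LEVELS[record["light"]]
    return row


# Hypothetical example: plant family as a one-hot feature.
families, family_columns = one_hot(["Poaceae", "Fabaceae", "Poaceae"])
```

Ordinal codes preserve the ordering of intensity levels, whereas one-hot columns treat categories such as plant family as unordered.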
To develop ANN‐based deep learning models, several configurations were tested by varying the number of layers, the number of neurons, and the types of activation functions (Figure 2A, Appendix S5). Ultimately, the ANN architecture with six layers consistently outperformed other configurations in terms of its ability to capture the complexity of the data and produce meaningful results. Thus, an ANN‐based deep learning model was developed with six hidden layers to analyze the morphological datasets. The model had two dropout layers, two output layers (one for parameter identification and another for predicting the percentage change in the parameter's value), and a normalization layer before the input layer (Figure 2B).
Figure 2.

Schematic pipeline for model design, development, structure, and implementation. (A) Illustration of the step‐by‐step process of feature preparation, dataset partitioning, artificial neural network (ANN) model construction, hyperparameter tuning, and performance evaluation. (B) Final ANN architecture with multiple hidden layers, dropout layers (dropout rates indicated by the numbers 0.3 and 0.2), and regression and/or classification output layers. Overall, the model processes and transforms input data through multiple layers, allowing it to learn patterns and make predictions for both regression and classification tasks.
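A two‐output network of this general shape could be assembled with the Keras functional API roughly as sketched below. This is an illustration under stated assumptions, not the authors' implementation: the hidden‐layer widths and the positions of the two dropout layers are hypothetical placeholders (the tuned configuration is reported in Appendix S5), and the layer names are invented for readability. Only the six hidden layers, the dropout rates of 0.3 and 0.2, the input normalization, and the two output heads come from the text.

```python
# Layer plan. Widths and dropout positions are hypothetical placeholders.
HIDDEN_UNITS = [128, 128, 64, 64, 32, 32]   # six hidden layers
DROPOUT_AFTER = {2: 0.3, 4: 0.2}            # hidden-layer number -> dropout rate


def build_model(train_features, n_classes, activation="tanh"):
    """Build a composite classification + regression network.

    train_features: 2-D NumPy array used to fit the normalization statistics.
    n_classes: number of parameter classes for the classification head.
    """
    # Imported lazily so the layer plan can be inspected without TensorFlow.
    from tensorflow import keras

    normalizer = keras.layers.Normalization(name="normalize")
    normalizer.adapt(train_features)  # learn feature means and variances

    inputs = keras.Input(shape=(train_features.shape[1],), name="features")
    x = normalizer(inputs)
    for number, units in enumerate(HIDDEN_UNITS, start=1):
        x = keras.layers.Dense(units, activation=activation,
                               name=f"hidden_{number}")(x)
        if number in DROPOUT_AFTER:
            x = keras.layers.Dropout(DROPOUT_AFTER[number],
                                     name=f"dropout_{number}")(x)

    # Head 1: which parameter class is affected (classification).
    parameter = keras.layers.Dense(n_classes, activation="softmax",
                                   name="parameter")(x)
    # Head 2: percentage change in that parameter (regression).
    performance = keras.layers.Dense(1, activation="linear",
                                     name="performance")(x)
    return keras.Model(inputs=inputs, outputs=[parameter, performance])
```

Sharing the hidden layers between the two heads lets the classification and regression tasks learn from the same representation of the stress combination.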
The training and validation datasets were randomly selected and stratified based on the biological parameters used for the classification model. Different training:validation splits (75:25, 70:30, 80:20, and 90:10) were tested using the scikit‐learn library in Python. Because the available data were insufficient to train a deep learning model effectively, we applied data augmentation to the training subset only, leaving the test and validation sets entirely composed of original, non‐augmented data. This circumvented the dataset size limitation and improved model generalization by exposing the model to a broader range of variation, reducing overfitting and improving robustness. In other research, data augmentation by noise injection has been applied to enhance classification accuracy (Kim and Chung, 2024). In the present study, random noise was generated for each data point based on the mean and standard deviation of the original series to produce augmented samples reflecting realistic variability. For every original sample, nine additional samples were generated, resulting in a 10‐fold increase in dataset size. This augmentation ratio exposed the model to a broader set of patterns and temporal variations, leading to improved generalization (Kim and Chung, 2024). We employed a similar technique in our study, wherein a geometric region was defined around each sample point and synthetic samples were generated such that their mean matched that of the original point. Another study used the Geometric Small Data Oversampling Technique (GSDOT) and achieved a notable improvement in classification accuracy compared to both the original small dataset and several widely used artificial data generation methods (Douzas et al., 2022).
These results highlight the effectiveness of geometry‐driven augmentation in scenarios where training data are scarce (Douzas et al., 2022). We adapted a similar technique for continuous variables, where every row in the dataset was duplicated 10 times; all categorical columns were kept the same, and only the values of the continuous numerical columns (plant performance and average temperature) were varied by 0.05 or −0.05, such that the average value across the repeated rows was exactly the same as the original value provided in the dataset. This adaptation allowed us to increase the dataset size and enhance representation across different stress combinations by generating additional samples in a controlled and statistically consistent manner. Min‐max scaling was applied to reduce scatter in the target variable and minimize the number of outliers in the final training and validation datasets (Figure 2A, Appendix S6). Model performance was assessed while tuning the weight initializer, number of layers, number of epochs, and activation functions. Multiple activation functions, including the rectified linear unit (ReLU), scaled exponential linear unit (SELU), hyperbolic tangent (TANH), and exponential linear unit (ELU), were compared using evaluation metrics such as the root mean square error (RMSE), accuracy, precision, F1, the Matthews correlation coefficient (MCC), specificity, and recall. Furthermore, the model was trained with different loss weights for the two output layers, specifically [1,100], [1,50], [1,10], [1,1], [10,1], [50,1], and [100,1]. To determine the best‐performing loss weights, thorough evaluations were conducted using model metrics such as RMSE, precision, and recall.
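The mean‐preserving duplication can be sketched as follows. This is one possible reading of the procedure, assuming symmetric offsets of ±0.05, ±0.10, and so on, applied to the continuous columns; the function name, the dictionary‐based row format, and the example values are hypothetical.

```python
def augment_row(row, continuous_columns, step=0.05, copies_per_side=5):
    """Return perturbed copies of one training row.

    Offsets are symmetric (+step ... +k*step and -step ... -k*step), so the
    mean of each continuous column over the copies equals the original value.
    Categorical columns are copied unchanged.
    """
    offsets = [k * step for k in range(1, copies_per_side + 1)]
    offsets += [-offset for offset in offsets]  # symmetric: offsets sum to zero
    augmented = []
    for offset in offsets:
        new_row = dict(row)
        for column in continuous_columns:
            new_row[column] = row[column] + offset
        augmented.append(new_row)
    return augmented


# Hypothetical training row; only the training subset would be augmented.
original = {"plant": "rice", "performance": -0.40, "avg_temperature": 0.50}
copies = augment_row(original, ["performance", "avg_temperature"])
```

Because the offsets cancel, the original row together with its copies has the same mean as the original value, which is the property the augmentation is described as preserving.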
Experimental validation of model
Rice (Oryza sativa L.) genotypes AC35027, AC35006, and TIL10 were grown in pots and subjected to individual and combined drought and heat treatments at the reproductive stage as described by Da Costa et al. (2021a). Drought was imposed by withholding irrigation until the soil reached 50% field capacity. Heat treatment was applied by exposing the plants to 38°C at anthesis for three days in a growth chamber. For the combined stress treatment, drought‐stressed plants at 50% field capacity were subjected to heat stress of 38°C for three days. After the stress treatments, the plants were allowed to recover and grow to physiological maturity, at which point grain yield was measured.
RESULTS
Dataset design
The stress combinations dataset consisted of 748 data points (abiotic–abiotic: 255, abiotic–biotic: 185, and biotic–biotic: 308) (Figure 1A). Analysis of the studies revealed that biomass was the most extensively studied parameter class, followed by yield; among the individual parameters, yield was the most studied (Figure 1B). Thus, biomass and yield are among the most commonly reported and agronomically important traits in the literature, making them highly relevant as modeling inputs.
Only the training dataset was augmented, by duplicating each row 10 times: the value was first incremented by 0.05 successively five times and then decremented by 0.05 in the same manner, ensuring that the average of all 11 rows (the original row and its 10 copies) remained equal to the original value. The repetition function generates the new rows by iteratively adjusting the plant performance and average temperature values, so the augmented dataset contains the original data plus slightly modified copies that simulate variability for downstream analysis. This process resulted in approximately 3138 rows (Figure 2A, Appendices S4 and S5). As depicted in Appendix S7, the dataset's mean value post‐augmentation remained unchanged for all three major stress combination categories, ensuring that the augmented data remained statistically congruent with the original dataset. The median values (orange line) were found to be comparable across all categories, indicating that the data were normally distributed and not skewed, with no apparent differences between the two groups. Similar interquartile ranges indicated comparable data dispersion, and the ranges defined by minimum and maximum values showed that both non‐augmented and augmented datasets remained within a similar range without bias (Appendix S7).
Distribution of target variables
Min‐max scaling was applied to reduce target variability and minimize the influence of outliers. Outliers, or extreme values, were recognized as potentially affecting model performance despite a large dataset and robust model. The original unscaled dataset (Appendix S6A) was divided into training and validation sets, maintaining the relative frequencies of stratified parameters to ensure representative samples in both sets (Appendix S6B).
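Min‐max scaling of the regression target can be sketched as below. Fitting the minimum and maximum on the training targets and reusing them for the validation targets is an assumption about good practice rather than a detail reported here; the function names and values are hypothetical.

```python
def fit_min_max(values):
    """Return the (minimum, maximum) used to scale a column."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        raise ValueError("cannot scale a constant column")
    return lowest, highest


def min_max_scale(values, lowest, highest):
    """Map values linearly so that lowest -> 0 and highest -> 1."""
    return [(value - lowest) / (highest - lowest) for value in values]


# Hypothetical percentage changes in the training targets.
train_targets = [-80.0, -40.0, 0.0, 20.0]
lowest, highest = fit_min_max(train_targets)
scaled_train = min_max_scale(train_targets, lowest, highest)
```

Scaling compresses extreme percentage changes into a bounded range, which limits the leverage of outliers on the squared‐error regression loss.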
Training and validation split
The 80:20 training:validation split was selected because it gave the best model accuracy. The training set comprised 80% of the samples (3138 data points after augmentation), and the validation set comprised the remaining 20% (131 data points). An additional test dataset with 21 rows was used to validate the model on an unseen dataset (Appendix S8).
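A stratified 80:20 split keeps the proportion of each parameter class the same in both subsets. The standard‐library sketch below illustrates the idea; the study itself used scikit‐learn, where `train_test_split` with its `stratify` argument serves this purpose. The function name, row format, and example data here are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(rows, label_key, validation_fraction=0.2, seed=0):
    """Split rows into training and validation sets, preserving the
    relative frequency of each label in both sets."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    rng = random.Random(seed)  # fixed seed for a reproducible split
    train, validation = [], []
    for label in sorted(by_label):
        group = by_label[label][:]
        rng.shuffle(group)
        n_validation = round(len(group) * validation_fraction)
        validation.extend(group[:n_validation])
        train.extend(group[n_validation:])
    return train, validation


# Hypothetical dataset: 50 biomass rows and 30 yield rows.
rows = [{"id": i, "parameter": "biomass"} for i in range(50)]
rows += [{"id": 50 + i, "parameter": "yield"} for i in range(30)]
train_rows, validation_rows = stratified_split(rows, "parameter")
```

In this workflow the split would be made before augmentation, so that perturbed copies of a training row never appear in the validation set.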
Model training
The model was trained from scratch; that is, its parameters were initialized with random or specified initial values and then optimized for the dataset‐specific prediction problem. The model was built with [x, y] loss weights, where x applies to the classification of parameters and y to the regression of plant performance. Different loss weights for the two outputs were tested, including [100, 1], [50, 1], [10, 1], [1, 1], [1, 10], [1, 50], and [1, 100]. After careful evaluation of the model metrics (Figure 3) and the tradeoffs among them, the [100, 1] loss weights were chosen for training the model on both tasks (Figure 2A). The loss functions were mean squared error for regression and sparse categorical cross‐entropy for classification. The Adam optimizer (Kingma and Ba, 2015) was used with a learning rate of 0.003, 100 epochs, and a batch size of 50 (Appendix S5).
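The effect of the loss weights can be made concrete with a small sketch of the composite objective, assuming the total loss is the weighted sum of the two per‐output losses (this is how Keras combines losses when `loss_weights` is passed to `model.compile`). The sample format and function names are hypothetical.

```python
import math


def sparse_categorical_cross_entropy(true_class, probabilities):
    """Negative log-probability assigned to the true class index."""
    return -math.log(max(probabilities[true_class], 1e-12))


def squared_error(true_value, predicted_value):
    return (true_value - predicted_value) ** 2


def composite_loss(batch, loss_weights=(100.0, 1.0)):
    """Weighted sum of mean cross-entropy (classification head) and
    mean squared error (regression head) over a batch."""
    weight_cls, weight_reg = loss_weights
    n = len(batch)
    cross_entropy = sum(
        sparse_categorical_cross_entropy(s["true_class"], s["class_probs"])
        for s in batch) / n
    mse = sum(squared_error(s["true_value"], s["pred_value"])
              for s in batch) / n
    return weight_cls * cross_entropy + weight_reg * mse


# Hypothetical single-sample batch.
batch = [{"true_class": 0, "class_probs": [0.5, 0.5],
          "true_value": 0.2, "pred_value": 0.1}]
total = composite_loss(batch)
```

With weights of [100, 1], classification errors dominate the gradient, which may help when the scaled regression loss is numerically much smaller than the cross‐entropy.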
Figure 3.

Depiction of model metrics using the statistical measures root mean square error (RMSE), precision, and recall. (A) A line chart displaying the relationship between RMSE values and loss weights in the model. RMSE is a measure of the model's accuracy, with lower values indicating better performance. Each dot on the plot represents a specific combination of RMSE value and loss weight. (B) A scatter plot displaying precision and recall values at various loss weights in the model. Precision measures the accuracy of positive predictions, while recall assesses the model's ability to identify positive instances. Loss weights balance regression and classification losses. Each data point represents a specific loss weight and its corresponding precision and recall values. [100,1] gives the best tradeoff between the model performance on regression and classification tasks.
Model evaluation
A composite model was developed and evaluated using the testing dataset to accurately predict two target variables: the affected parameter and the percentage change in parameter value. To achieve this multi‐target prediction, a classification and regression model was implemented. Different deep learning models were constructed using various activation functions, including TANH, ReLU, ELU, and SELU, using the aforementioned dataset. Model performance on the test dataset was measured using a confusion matrix to calculate evaluation metrics such as accuracy, precision, recall, F1 score, specificity, and MCC (Figure 4). Among the activation functions tested, TANH consistently demonstrated the best performance across all criteria. The overall accuracy of the ANN model developed in the study was 76.33%; the performance of the models is detailed in Table 2. To assess the accuracy of the regression task, we calculated the percentage of predicted values falling within ±0.10 of the actual value: 80.2% of predictions in the validation set and 90% in the training set met this criterion (Figure 5).
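For a two‐class problem, the classification metrics named above follow directly from the four cells of the confusion matrix. The sketch below uses the standard definitions; the counts are hypothetical and are not taken from Figure 4, and the reported values may have been averaged across classes in a way not reproduced here.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Standard metrics from a 2x2 confusion matrix.

    tp/fp/fn/tn: true positives, false positives, false negatives,
    true negatives. Assumes no denominator is zero.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * recall / (precision + recall),
        "mcc": (tp * tn - fp * fn) / math.sqrt(
            (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }


# Hypothetical counts, treating one class (e.g., biomass) as "positive".
metrics = binary_metrics(tp=40, fp=10, fn=5, tn=45)
```

MCC uses all four cells and therefore remains informative when the two classes are imbalanced, which is one reason to report it alongside accuracy.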
Figure 4.

Confusion matrices depicting model classification performance on the training and validation datasets using the hyperbolic tangent (TANH) activation function for two target variables (biomass and yield). Each cell represents the number of instances belonging to a specific combination of true and predicted classes, thus presenting the key elements, including true positives (correctly predicted positives), true negatives (correctly predicted negatives), false positives (incorrectly predicted positives), and false negatives (incorrectly predicted negatives). The y‐axis represents the actual labels (biomass and yield), while the x‐axis represents the predicted class labels. The confusion matrix for the training dataset (A) shows that the model achieved 81% classification accuracy for biomass and 86% accuracy for yield, while the confusion matrix for the validation dataset (B) shows 76% classification accuracy for biomass and 77% accuracy for yield.
Table 2.
Comparison of model performance with different activation functions.
| Activation function | Accuracy (%), train | Accuracy (%), validation | Precision (%), train | Precision (%), validation | Recall (%), train | Recall (%), validation | RMSE, train | RMSE, validation | F1 (%), train | F1 (%), validation | MCC (%), train | MCC (%), validation | Specificity (%), train | Specificity (%), validation |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ELU | 80.9 | 74 | 83.2 | 77.2 | 80.9 | 74 | 0.067 | 74222.36 | 82.03 | 75.6 | 59.9 | 51.3 | 88 | 81 |
| SELU | 78.2 | 70.2 | 82.1 | 73.7 | 78.2 | 70.2 | 0.07 | 149017.2 | 80.1 | 71.9 | 50.1 | 43.3 | 78 | 77 |
| ReLU | 82.5 | 75.6 | 85.2 | 77.2 | 82.5 | 75.6 | 0.065 | 7039.4 | 83.8 | 76.4 | 66.14 | 52 | 86 | 77 |
| TANH | 82.8 | 76.3 | 84 | 77.7 | 82.8 | 76.3 | 0.063 | 0.08642 | 83.4 | 77 | 67.1 | 53 | 86 | 77 |
Note: ELU = exponential linear unit, MCC = Matthews correlation coefficient, ReLU = rectified linear unit, RMSE = root mean square error, SELU = scaled exponential linear unit, TANH = hyperbolic tangent.
Figure 5.

Scatter plots of actual (shown in blue) and predicted (shown in orange) values depicting the regression performance of the model. Each data point represents a specific observation, with the x‐axis showing the plant performance and the y‐axis displaying the sample index/number. This graphical representation offers an intuitive view of the model's accuracy for the regression task and its ability to capture the underlying patterns within the dataset. (A) 90% of the predicted values overlapped the actual values in the training set, while (B) 80.2% matched in the validation set (range of ±0.10).
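The ±0.10 agreement criterion and the RMSE used for the regression output can be computed as in the following sketch. The function names and the example values are hypothetical and serve only to show the calculation on scaled plant performance values.

```python
import math


def fraction_within_tolerance(actual, predicted, tolerance=0.10):
    """Share of predictions lying within +/- tolerance of the actual value."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= tolerance)
    return hits / len(actual)


def rmse(actual, predicted):
    """Root mean square error between actual and predicted values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical scaled plant performance values.
actual_values = [0.20, 0.50, 0.80, 0.30]
predicted_values = [0.25, 0.45, 0.60, 0.32]
share = fraction_within_tolerance(actual_values, predicted_values)
error = rmse(actual_values, predicted_values)
```

The tolerance‐based share is easier to interpret than RMSE alone because it states how often a prediction is close enough to be useful, whereas RMSE is dominated by the largest errors.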
Testing of the model
To further evaluate the performance of the model, we compared the actual experimental data procured from greenhouse experiments on rice for the combination of drought and heat stress with the model predictions (Figure 6, Appendices S8 and S9). Our greenhouse experiment revealed that drought, heat, and combined drought and heat stress caused a 34%, 37%, and 48% reduction, respectively, in grain weights in the three genotypes, indicating that the combined stress is the most damaging (Figure 6A). To check the model accuracy, we compared the percentage reduction in grain weight over the control of the three rice genotypes (observed) with the values predicted by the model (predicted). The model was fed with the test data of 21 different stress treatments covering individual and combined drought, heat, and fungal stresses (Aym and Zadors, 1979; Hernández‐Delgado et al., 2009; Chekali et al., 2011; Sinha et al., 2019, 2021; Da Costa et al., 2021b). The raw data from these publications were target encoded, and all categorical values were converted into numerical values. These data were then fed into the model to obtain the predictions, producing a two‐column output, where the two columns corresponded to the plant performance prediction (the percentage loss, i.e., the regression output) and the class (the target variable, yield or biomass, i.e., the classification output) (Table 3). For both the individual drought and heat stress and combined stress categories, the observed values and the model predictions were highly comparable. A Pearson correlation coefficient of 0.915 was observed, indicating a strong positive relationship between actual and predicted plant performance values and supporting the accuracy of the developed model (Figure 6B).
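The two calculations used in this comparison, the percentage reduction over the control and the Pearson correlation between observed and predicted values, can be sketched as follows. This is an illustration under stated assumptions: the reduction formula is the conventional one (the formula in the Methods section is not reproduced here), the observed values are the three reductions quoted above, and the predicted values are hypothetical placeholders rather than model output.

```python
from math import sqrt


def percent_reduction(control, stressed):
    """Percentage reduction of a trait under stress relative to the control."""
    return (control - stressed) / control * 100.0


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Observed reductions (%) quoted in the text: drought, heat, combined.
observed = [34.0, 37.0, 48.0]
# Hypothetical predictions, for illustration only (NOT the values in Table 3).
predicted = [36.0, 38.5, 47.0]

r = pearson_r(observed, predicted)
print(f"Pearson r = {r:.3f}")
```

A correlation close to 1 indicates that predictions rank and scale the treatments similarly to the observations; it does not by itself show that the absolute errors are small.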
Figure 6.

Evaluating the performance of the model using greenhouse experiments on rice. (A) Bar graph showing the effect (absolute values) of individual and combined drought and heat on filled grain weight from three rice genotypes (AC35006, AC35027, and TIL10). Data are expressed as the mean of replicates (n = 6), and error bars represent the standard error of the mean. One‐way ANOVA was used followed by Tukey's multiple comparison test. Significant differences with the control are represented as P < 0.01 (indicated by **) and P < 0.001 (indicated by ***). (B) Comparison between the actual and predicted values in percentage reduction in yield over the control. Actual values represent the average of percentage reduction in yield over the control in the three genotypes (n = 3).
Table 3.
Model‐predicted output on unseen wet lab experimental data.
| Serial number | Percentage loss/Regression outputᵃ | Predicted class/Classification outputᵃ |
|---|---|---|
| 1 | 40.75 | Yield |
| 2 | 37.3 | Yield |
| 3 | 41.21 | Yield |
| 4 | 40.08 | Biomass |
| 5 | 41.66 | Biomass |
| 6 | 49.83 | Yield |
| 7 | 39.27 | Biomass |
| 8 | 41.54 | Biomass |
| 9 | 46.88 | Yield |
| 10 | 40.29 | Biomass |
| 11 | 41.54 | Biomass |
| 12 | 49.82 | Yield |
| 13 | 39.3 | Biomass |
| 14 | 43.15 | Biomass |
| 15 | 34.48 | Biomass |
| 16 | 40.48 | Yield |
| 17 | 36.8 | Biomass |
| 18 | 51.8 | Yield |
| 19 | 40.75 | Yield |
| 20 | 37.3 | Yield |
| 21 | 55.67 | Yield |
ᵃ The regression output shows the percentage loss predicted by the model based on the additional test dataset (refer to Appendix S8), while the classification output represents the associated parameter.
Furthermore, published chickpea and wheat datasets under combined drought and fungal stress were incorporated to increase test set variability. We compared the percentage reduction in plant performance as reported in the published papers with the values predicted by the model. We found that the model's predictions closely aligned with experimental values (Table 4).
Table 4.
Model‐predicted output on unseen experimental data on drought and fungal stress combination.
| Plantᵇ | Treatments | Experimental dataᵃ: parameter | Experimental dataᵃ: reduction in plant performance (%) | Predicted by the model: parameter | Predicted by the model: reduction in plant performance (%) |
|---|---|---|---|---|---|
| Chickpea¹ | Drought and fungus (Macrophomina phaseolina) | Yieldᶜ | 53.05 | Yield | 46.881014 |
| Chickpea¹ | Drought and fungus (Macrophomina phaseolina) | Yieldᶜ | 46.43 | Yield | 49.817488 |
| Wheat² | Drought and fungus (Fusarium culmorum) | Yieldᵈ | 66.06 | Yield | 51.796275 |
ᵃ The reduction in plant performance was calculated using the data provided in the published articles, following the formula described in the Methods section.
ᶜ Parameter measured: Total seed weight.
ᵈ Parameter measured: Number of tillers per plant.
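To make the agreement in Table 4 concrete, the per‐row difference and the mean absolute error can be computed directly from the tabulated values. The sketch below uses only the numbers in Table 4; the variable names are illustrative.

```python
# (plant, experimental reduction %, model-predicted reduction %) from Table 4.
table4 = [
    ("Chickpea", 53.05, 46.881014),
    ("Chickpea", 46.43, 49.817488),
    ("Wheat", 66.06, 51.796275),
]

# Positive error: the model under-predicts the experimental reduction.
errors = [(plant, observed - predicted) for plant, observed, predicted in table4]
mae = sum(abs(err) for _, err in errors) / len(errors)

for plant, err in errors:
    print(f"{plant}: observed - predicted = {err:+.2f} percentage points")
print(f"Mean absolute error = {mae:.2f} percentage points")
```

With these three rows, the two chickpea predictions differ from the experimental values by roughly 6 and 3 percentage points, while the wheat prediction is about 14 percentage points lower than the reported reduction.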
Ablation study
Three approaches to ablation studies—input/feature ablation, functional ablation, and neuronal ablation—were employed to evaluate and optimize the neural network. Hyperparameter tuning was performed to determine optimal configurations of neurons and hidden layers, and the final ANN structure was established (Figure 2). Functional ablation was incorporated using various activation functions (TANH, SELU, ELU, ReLU; Table 2) and adjustments to learning rate, epochs, and batch size based on model metrics and loss plots. Input/feature ablation was conducted to identify critical features influencing model performance. Three features—temperature (heat), drought, and combined drought and heat—were ablated sequentially. The results indicated that drought was the most critical factor across all studies (Table 5), consistent with previous reports identifying drought as the dominant stressor under combined stress conditions (Zhou et al., 2017; Wen et al., 2022).
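For reference, the four activation functions compared in the functional ablation have the following standard definitions. This is a plain‐Python sketch and not the authors' implementation; the ELU parameter `alpha=1.0` is a common default assumed here, and the SELU constants are the standard published values.

```python
import math

# Standard SELU constants (self-normalizing networks).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def relu(x):
    """Rectified linear unit: passes positive inputs, zeroes negative ones."""
    return max(0.0, x)


def elu(x, alpha=1.0):
    """Exponential linear unit: smooth, saturating at -alpha for negative inputs."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def selu(x):
    """Scaled ELU with fixed alpha and scale."""
    return SELU_SCALE * (x if x > 0 else SELU_ALPHA * (math.exp(x) - 1.0))


def tanh(x):
    """Hyperbolic tangent: bounded output in (-1, 1), centered at zero."""
    return math.tanh(x)


for value in (-2.0, 0.0, 2.0):
    print(value, relu(value), round(elu(value), 4),
          round(selu(value), 4), round(tanh(value), 4))
```

Of the four, only TANH is bounded on both sides, whereas ReLU, ELU, and SELU are unbounded for positive inputs.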
Table 5.
Feature ablation results showing model performance after removing drought, high temperature, and combined drought–high temperature features.
| Ablated feature | Precision, train | Precision, validation | Recall, train | Recall, validation | Accuracy, train | Accuracy, validation | RMSE, train | RMSE, validation |
|---|---|---|---|---|---|---|---|---|
| Temperature | 73.5 | 73.21 | 72.92 | 72.5 | 72.92 | 72.5 | 8.22 | 8.23 |
| Drought | 29.84 | 29.92 | 54.42 | 54.62 | 54.42 | 54.62 | 27.55 | 27.38 |
| Temperature and drought | 72.12 | 72.74 | 71.09 | 71.45 | 71.09 | 71.45 | 7.03 | 6.85 |
Note: RMSE = root mean square error.
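One way to read Table 5 is to rank the ablated features by how far validation accuracy falls when each is removed. The sketch below does this using the validation accuracies in Table 5 and, as a baseline, the TANH validation accuracy from Table 2. Treating the two tables as directly comparable is an assumption made only for this illustration, because they may come from differently configured runs.

```python
# Baseline: TANH validation accuracy (%) from Table 2 (assumed comparable).
BASELINE_VAL_ACCURACY = 76.3

# Validation accuracy (%) after ablating each feature, from Table 5.
ablated_val_accuracy = {
    "Temperature": 72.5,
    "Drought": 54.62,
    "Temperature and drought": 71.45,
}

# Accuracy drop in percentage points; a larger drop suggests a more critical feature.
drops = {
    feature: BASELINE_VAL_ACCURACY - accuracy
    for feature, accuracy in ablated_val_accuracy.items()
}
ranked = sorted(drops, key=drops.get, reverse=True)

for feature in ranked:
    print(f"{feature}: -{drops[feature]:.2f} percentage points")
```

Under this reading, removing drought costs far more accuracy than removing either of the other two features, in line with the interpretation given in the text.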
DISCUSSION
Understanding the impact of combined stresses is crucial for developing effective crop protection strategies and climate‐resilient crops. Machine learning–based tools offer the ability to predict the effect of combined stresses on plants, beyond the available data. The recently published database SCIPDb provided a timely and biologically organized dataset developed through large‐scale manual curation of experimental data and served as the primary source of accurate training data for developing the ANN model used in this study (Priya et al., 2023).
Initially, we evaluated a total of 12 models using this dataset, including the multilayer perceptron (MLP) model (Popescu et al., 2009) and ANN, as well as the machine learning algorithms LightGBM (Ke et al., 2017), XGBoost (Chen and Guestrin, 2016), CatBoost (Prokhorenkova et al., 2018), random forest (Breiman, 2001), decision tree (Charbuty and Abdulazeez, 2021), support vector machine (Pisner and Schnyer, 2020), k‐nearest neighbors (Mucherino et al., 2009), AdaBoost (Ying et al., 2013), Bernoulli naive Bayes (McCallum, 1998), and Gaussian naive Bayes (Çınar, 2024). Among these, the ANN model demonstrated superior performance across multiple features and was therefore selected for the study.
After augmentation of only a subset of the training data, the mean value of the dataset remained unchanged for the plant performance categories (Appendix S7), indicating that the augmented data remained statistically congruent with the original dataset. This allowed us to preserve the dataset's underlying statistical characteristics while expanding its size through augmentation. The comparable median values of the non‐augmented and augmented groups indicated that augmentation did not shift or skew the data distribution and that there was likely no difference between the two groups. The similarity of the interquartile ranges also indicated that data dispersion was similar between the two groups. Furthermore, the range of the data, as depicted by the minimum and maximum scores, indicated that both the non‐augmented and augmented datasets lie within a comparable range without bias in their distribution. Together, these observations indicate that the augmentation approach expanded the dataset while maintaining its underlying statistical properties (Appendix S7).
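The comparison described here (mean, median, interquartile range, minimum, and maximum before and after augmentation) can be sketched with the Python standard library. The augmentation method is not specified in this section, so Gaussian noise injection is swapped in purely as an assumed stand‐in; the function names, the noise level `sigma`, and the data values are hypothetical.

```python
import random
import statistics


def summary(values):
    """Box-plot style summary: min, quartiles, max, mean, and IQR."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "mean": statistics.fmean(values),
        "iqr": q3 - q1,
    }


def augment_with_noise(values, sigma=0.5, seed=0):
    """Append one noisy copy of each value (assumed augmentation scheme)."""
    rng = random.Random(seed)
    return list(values) + [v + rng.gauss(0.0, sigma) for v in values]


# Hypothetical plant performance values (percentage loss), for illustration only.
original = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0]
augmented = augment_with_noise(original)

before, after = summary(original), summary(augmented)
for key in before:
    print(f"{key}: {before[key]:.2f} -> {after[key]:.2f}")
```

If the summary values stay close after augmentation, the enlarged dataset has retained the location and spread of the original, which is the property checked in Appendix S7.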
For the validation dataset, the model with the TANH activation function achieved the highest accuracy, with values of 76% for biomass and 77% for yield. Similarly, when applied to the training dataset, the model with the TANH activation function exhibited accuracies of 81% for biomass and 86% for yield (Figure 4). The scatter plot analysis offered an intuitive view of the model's accuracy and its ability to capture the underlying patterns within the dataset for the regression task (Figure 5). As in previous studies (Sinha et al., 2021), we evaluated the model's performance by comparing experimental greenhouse data on rice under drought and heat stress with model predictions (Figure 6A, Appendix S8). The model incorporated data from 41 stress combinations, drawn from the literature, specifically for morphological parameters (Priya et al., 2023). The model results substantiated previous findings that combined drought and heat stress reduces yield by lowering harvest index, shortening crop life, and decreasing seed number and size, particularly during reproductive stages (Cohen et al., 2021). These effects were confirmed in greenhouse experiments, with reduced grain weights observed across three rice genotypes. The observed and predicted reductions were closely aligned, validating the model (Figure 6B). Rice, which is highly sensitive to individual and combined drought and heat stress during the reproductive stage (Jagadish et al., 2015; Da Costa et al., 2021a, 2021b), was chosen due to dataset availability; this does not imply generalizability to all species, as morphological data for combined stresses remain limited. Furthermore, published chickpea and wheat datasets under combined drought and fungal stress were included to increase test set variability (Table 4). The predictions closely matched the experimental values, demonstrating the potential for cross‐species applicability and reliable performance estimation beyond the original training set (Table 4).
Importantly, the test set from the experiments was unique and not drawn from the training dataset. Moreover, ablation testing, which is widely used in neuroscience (Meyes et al., 2019) and recently applied in plant science (Lawal et al., 2021; Keceli et al., 2022), identified drought as the primary stressor in combined stress scenarios involving drought, consistent with previous studies (Table 5).
Our study addresses the urgent need for predictive tools on combined stresses by providing an ANN‐based model that forecasts crop responses across conditions. By predicting changes in biomass and yield under specific stress combinations, the model guides targeted crop improvement. For example, the model can help identify which combinations of stresses are most detrimental to crop productivity, enabling researchers to prioritize those interactions in field studies and breeding trials. In addition, it serves as a valuable resource for agronomists and crop modelers to simulate outcomes under future climatic or agronomic conditions, improving preparedness and resilience planning.
While our ANN‐based framework shows promise in predicting plant responses to combined abiotic and biotic stresses, we acknowledge several limitations that may affect its generalizability and accuracy. A primary limitation is the availability of training data. The current model is trained on publicly available datasets, which are often unevenly distributed across crop species, stress types, and experimental conditions. As a result, certain plant species, particularly under‐researched crops and stress combinations, may be underrepresented, potentially introducing bias into the model's predictions. As such, we view this study as a first step toward building a more comprehensive and dynamic predictive platform. Moreover, while our model accounts for major abiotic and biotic stresses, it does not yet incorporate environmental and edaphic variables such as soil type, nutrient status, or microbial interactions, which can significantly influence plant responses. Including these factors in future iterations of the model could improve its predictive accuracy and biological relevance. Another challenge lies in the complexity of stress interactions. Some combined stress effects are neither additive nor synergistic and may involve intricate physiological tradeoffs that are difficult to capture without mechanistic data.
CONCLUSIONS
In this study, we have developed an advanced composite deep learning model for predicting the parameters affected and percentage changes in the respective parameter values under various abiotic and biotic stress combinations (Appendix S10). Our approach sidesteps many of the issues encountered in traditional machine learning applications and provides a novel computational tool for making use of multivariate and complex combined stress datasets in future studies.
The model's performance has been validated using actual wet lab data to predict stress impact under drought and heat stress combinations in plants. We conducted a comparative analysis of different activation functions, including TANH, SELU, ELU, and ReLU, and our results showed that the model using the TANH activation function outperformed the models using the other activation functions tested in this study.
AUTHOR CONTRIBUTIONS
M.S.‐K. conceived and planned the study, designed the experiments, and provided the necessary resources. Pi.P., M.K., and R.J. developed the model, analyzed data, and presented the output data. S.J. was involved in the initial design and structure of the model. Pr.P. contributed to data collection and biological verification of output data, while Pr.P. and R.C. contributed to the phenomics data collection, organization, and analysis component. V.R. provided experimental validation. Pi.P. and Pr.P. drafted the manuscript, and M.S.‐K. finalized the manuscript. All authors contributed to the article editing and approved the final version.
CONFLICT OF INTEREST STATEMENT
Muthappa Senthil‐Kumar is a senior associate editor of Applications in Plant Sciences, but took no part in the peer review and decision‐making processes for this paper.
Supporting information
Appendix S1. Details of the search terms and literature used for analysis in the model.
Appendix S2. Search engines used for literature mining for the combined stress datasets.
Appendix S3. Data extraction example.
Appendix S4. Raw and processed training and validation dataset.
Appendix S5. Artificial neural network model development pipeline.
Appendix S6. Distribution of target variables.
Appendix S7. Box plot depicting the distribution of the non‐augmented and augmented datasets used for model training.
Appendix S8. Input test dataset for the artificial neural network model.
Appendix S9. Raw data related to experimental validation of the impact of combined drought and heat on rice grain yield.
Appendix S10. Overall framework of the artificial neural network model development for predicting the impact of combined stress in plants.
ACKNOWLEDGMENTS
The authors acknowledge the support provided by the Biotechnology Research and Innovation Council (BRIC)–National Institute of Plant Genome Research (NIPGR) through core funding to M.S.‐K. Pi.P. acknowledges the fellowship support from the Council of Scientific and Industrial Research (CSIR) (No. 13 (9106‐A)/2020‐Pool), while S.J. acknowledges the INSA‐IASc‐NASI Summer Research Fellowship (MATS233/2021). The authors are grateful to the Department of Biotechnology (DBT) eLibrary Consortium, India, and the NIPGR library for providing access to e‐resources. The authors also acknowledge the computational facilities provided by the Indian Biological Data Centre BRAHM‐HPC and the DBT‐DISC facility at NIPGR for sharing resources. The authors extend their appreciation to Mr. Aswin Reddy Chilakala and Mr. Shubhashish Ranjan from our lab for scrutinizing the raw data and internally reviewing the manuscript. The authors acknowledge the contribution of Mrs. Shikha Tuteja Chandna and Ms. Pranavi Jampa for their inputs in Appendix S5.
Priya, P., Pandey P., Jain R., Kandpal M., Jain S., Chaudhury R., Ramegowda V., and Senthil‐Kumar M. 2026. An artificial neural network–based deep learning model to predict combined stress impact and interaction in plants. Applications in Plant Sciences 14(2): e70047. 10.1002/aps3.70047
Piyush Priya, Prachi Pandey, Rubi Jain, and Manu Kandpal contributed equally to this work.
DATA AVAILABILITY STATEMENT
The scripts, Jupyter Notebooks, quick start guide, and example datasets used in this study are freely available at GitHub (https://github.com/scipdatabase/Prediction_model). The literature sources used for data extraction and for training the ANN model are provided in the Supporting Information. For details on various stress combinations and input data features, readers may refer to the Stress Combinations and their Interactions in Plants Database (SCIPDb) (Priya et al., 2023), available at https://db.nipgr.ac.in/plant_complete/index_orangesunset.php.
REFERENCES
- Ahuja, I., De Vos R. C. H., Bones A. M., and Hall R. D. 2010. Plant molecular stress responses face climate change. Trends in Plant Science 15: 664–674.
- Atkinson, N. J., Lilley C. J., and Urwin P. E. 2013. Identification of genes involved in the response of Arabidopsis to simultaneous biotic and abiotic stresses. Plant Physiology 162: 2028–2041.
- Aym, P. G., and Zadors J. C. 1979. Combined effects of powdery mildew disease and soil water level on the water relations and growth of barley. Physiological Plant Pathology 14: 347–361.
- Barah, P., Naika M. B. N., Jayavelu N. D., Sowdhamini R., Shameer K., and Bones A. M. 2016. Transcriptional regulatory networks in Arabidopsis thaliana during single and combined stresses. Nucleic Acids Research 44: 3147–3164.
- Breiman, L. 2001. Random forests. Machine Learning 45: 5–32.
- Carranza‐Rojas, J., Goeau H., Bonnet P., Mata‐Montero E., and Joly A. 2017. Going deeper in the automated identification of herbarium specimens. BMC Evolutionary Biology 17: e181.
- Charbuty, B., and Abdulazeez A. 2021. Classification based on decision tree algorithm for machine learning. Journal of Applied Science and Technology Trends 2: 20–28.
- Chekali, S., Gargouri S., Paulitz T., Nicol J. M., Rezgui M., and Nasraoui B. 2011. Effects of Fusarium culmorum and water stress on durum wheat in Tunisia. Crop Protection 30: 718–725.
- Chen, T., and Guestrin C. 2016. XGBoost: A scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785–794. Association for Computing Machinery, San Francisco, California, USA.
- Chilakala, A. R., Mali K. V., Irulappan V., Patil B. S., Pandey P., Rangappa K., Ramegowda V., et al. 2022. Combined drought and heat stress influences the root water relation and determine the dry root rot disease development under field conditions: A study using contrasting chickpea genotypes. Frontiers in Plant Science 13: e890551.
- Choudhary, A., and Senthil‐Kumar M. 2022. Drought attenuates plant defence against bacterial pathogens by suppressing the expression of CBP60g/SARD1 during combined stress. Plant, Cell & Environment 45: 1127–1145.
- Çınar, A. 2024. Multi‐class classification with the Gaussian naive Bayes algorithm. Journal of Data Applications 2024: 1–13.
- Cohen, I., Zandalinas S. I., Huck C., Fritschi F. B., and Mittler R. 2021. Meta‐analysis of drought and heat stress combination impact on crop yield and yield components. Physiologia Plantarum 171: 66–76.
- Cohen, S. P., Liu H., Argueso C. T., Pereira A., Vera Cruz C., Verdier V., and Leach J. E. 2017. RNA‐Seq analysis reveals insight into enhanced rice Xa7‐mediated bacterial blight resistance at high temperature. PLoS ONE 12: e0187625.
- Da Costa, M. V. J., Ramegowda Y., Ramegowda V., Karaba N. N., Sreeman S. M., and Udayakumar M. 2021a. Combined drought and heat stress in rice: Responses, phenotyping and strategies to improve tolerance. Rice Science 28: 233–242.
- Da Costa, M. V. J., Ramegowda V., Sreeman S., and Nataraja K. N. 2021b. Targeted phytohormone profiling identifies potential regulators of spikelet sterility in rice under combined drought and heat stress. International Journal of Molecular Sciences 22: e11690.
- Desaint, H., Aoun N., Deslandes L., Vailleau F., Roux F., and Berthomé R. 2021. Fight hard or die trying: When plants face pathogens under heat stress. New Phytologist 229: 712–734.
- Douzas, G., Lechleitner M., and Bacao F. 2022. Improving the quality of predictive models in small data GSDOT: A new algorithm for generating synthetic data. PLoS ONE 17: e0265626.
- Fang, X., Zhang C., Wang Z., Duan T., Yu B., Jia X., Pang J., et al. 2021. Co‐infection by soil‐borne fungal pathogens alters disease responses among diverse alfalfa varieties. Frontiers in Microbiology 12: e664385.
- Gaggion, N., Ariel F., Daric V., Lambert E., Legendre S., Roulé T., Camoirano A., et al. 2021. ChronoRoot: High‐throughput phenotyping by deep segmentation networks reveals novel temporal parameters of plant root system architecture. GigaScience 10: giab052.
- Geetharamani, G. A. P. J., and Pandian A. 2019. Identification of plant leaf diseases using a nine‐layer deep convolutional neural network. Computers and Electrical Engineering 76: 323–338.
- Guo, C., Ma L., Yuan S., and Wang R. 2017. Morphological, physiological and anatomical traits of plant functional types in temperate grasslands along a large‐scale aridity gradient in northeastern China. Scientific Reports 7: e40900.
- Ha, J. 2022. MDMF: Predicting miRNA–disease association based on matrix factorization with disease similarity constraint. Journal of Personalized Medicine 12: e885.
- Ha, J., and Park S. 2023. NCMD: Node2vec‐based neural collaborative filtering for predicting miRNA‐disease association. IEEE/ACM Transactions on Computational Biology and Bioinformatics 20: 1257–1268.
- Han, H., and Liu W. 2019. The coming era of artificial intelligence in biological data science. BMC Bioinformatics 20: e712.
- Hernández‐Delgado, S., Cantú‐Almaguer M. A., Arroyo‐Becerra A. L., Villalobos‐López M. A., González‐Prieto J. M., and Mayek‐Pérez N. 2009. Influence of water stress and Macrophomina phaseolina in growth and grain yield of common beans under controlled and field conditions. COOPERATIVE 2009: 92.
- Jagadish, S. V. K., Murty M. V. R., and Quick W. P. 2015. Rice responses to rising temperatures – Challenges, perspectives and future directions. Plant, Cell & Environment 38: 1686–1698.
- Ke, G., Meng Q., Finley T., Wang T., Chen W., Ma W., Ye Q., et al. 2017. LightGBM: A highly efficient gradient boosting decision tree. Advances in Neural Information Processing Systems 30, 4–9 December 2017, Long Beach, California, USA.
- Keceli, A. S., Kaya A., Catal C., and Tekinerdogan B. 2022. Deep learning‐based multi‐task prediction system for plant disease and species detection. Ecological Informatics 69: e101679.
- Kim, G. I., and Chung K. 2024. Extraction of features for time series classification using noise injection. Sensors 24: e6402.
- Kingma, D. P., and Ba J. 2015. Adam: A method for stochastic optimization. Third International Conference for Learning Representations (ICLR), San Diego, California, USA.
- Kumari, N., and Pandey S. 2023. Application of artificial intelligence in environmental sustainability and climate change. In Srivastav A. L., Dubey A. K., Kumar A., Narang S. K., and Khan M. A. [eds.], Visualization techniques for climate change with machine learning and artificial intelligence, 293–316. Elsevier, Amsterdam, the Netherlands.
- Lawal, O. M., Huamin Z., and Fan Z. 2021. Ablation studies on YOLOFruit detection algorithm for fruit harvesting robot using deep learning. IOP Conference Series: Earth and Environmental Science 922: e012001.
- Mak, K.‐K., and Pichika M. R. 2019. Artificial intelligence in drug development: Present status and future prospects. Drug Discovery Today 24: 773–780.
- Marsland, S. 2014. Machine learning: An algorithmic perspective, 2nd ed. Chapman and Hall/CRC, Boca Raton, Florida, USA.
- Masters, T. 2018. Deep belief nets in C++ and CUDA C, vol. 1. Apress, Berkeley, California, USA.
- McCallum, A. 1998. A comparison of event models for naive Bayes text classification. Proceedings of the AAAI‐98 Workshop on Learning for Text Categorization, 41–48. Madison, Wisconsin, USA.
- Meyes, R., Lu M., de Puiseau C. W., and Meisen T. 2019. Ablation studies in artificial neural networks. arXiv 1901.08644 [preprint]. Available at https://arxiv.org/abs/1901.08644 [posted 24 January 2019; accessed 19 February 2026].
- Misra, N. N., Dixit Y., Al‐Mallahi A., Bhullar M. S., Upadhyay R., and Martynenko A. 2022. IoT, big data, and artificial intelligence in agriculture and food industry. IEEE Internet of Things Journal 9: 6305–6324.
- Mucherino, A., Papajorgji P. J., and Pardalos P. M. 2009. k‐Nearest neighbor classification. Data Mining in Agriculture, 83–106. Springer, New York, New York, USA.
- Nti, I. K., Adekoya A. F., Weyori B. A., and Nyarko‐Boateng O. 2022. Applications of artificial intelligence in engineering and manufacturing: A systematic review. Journal of Intelligent Manufacturing 33: 1581–1601.
- Pandey, P., and Senthil‐Kumar M. 2024. Unmasking complexities of combined stresses for creating climate‐smart crops. Trends in Plant Science 29: 1172–1175.
- Pandey, P., Patil M., Priya P., and Senthil‐Kumar M. 2024. When two negatives make a positive: The favorable impact of the combination of abiotic stress and pathogen infection on plants. Journal of Experimental Botany 75: 674–688.
- Pisner, D. A., and Schnyer D. M. 2020. Support vector machine. In Mechelli A. and Vieira S. [eds.], Machine learning: Methods and applications to brain disorders, 101–121. Elsevier, Amsterdam, the Netherlands.
- Popescu, M.‐C., Balas V. E., Perescu‐Popescu L., and Mastorakis N. 2009. Multilayer perceptron and neural networks. WSEAS Transactions on Circuits and Systems 8: 579–588.
- Prasch, C. M., and Sonnewald U. 2013. Simultaneous application of heat, drought, and virus to Arabidopsis plants reveals significant shifts in signaling networks. Plant Physiology 162: 1849–1866.
- Priya, P., Patil M., Pandey P., Singh A., Babu V. S., and Senthil‐Kumar M. 2023. Stress combinations and their interactions in plants database: A one‐stop resource on combined stress responses in plants. The Plant Journal 116: 1097–1117.
- Prokhorenkova, L., Gusev G., Vorobev A., Dorogush A. V., and Gulin A. 2018. CatBoost: Unbiased boosting with categorical features. Advances in Neural Information Processing Systems 31, 3–8 December 2018, Montreal, Canada.
- Ramegowda, V., and Senthil‐Kumar M. 2015. The interactive effects of simultaneous biotic and abiotic stresses on plants: Mechanistic understanding from drought and pathogen combination. Journal of Plant Physiology 176: 47–54.
- Rivero, R. M., Mittler R., Blumwald E., and Zandalinas S. I. 2022. Developing climate‐resilient crops: Improving plant tolerance to stress combination. The Plant Journal 109: 373–389.
- Russell, S. J., and Norvig P. 2021. Artificial intelligence: A modern approach, vol. 4. Pearson, London, United Kingdom.
- Sinha, R., Irulappan V., Mohan‐Raju B., Suganthi A., and Senthil‐Kumar M. 2019. Impact of drought stress on simultaneously occurring pathogen infection in field‐grown chickpea. Scientific Reports 9: e5577.
- Sinha, R., Irulappan V., Patil B. S., Reddy P. C. O., Ramegowda V., Mohan‐Raju B., Rangappa K., et al. 2021. Low soil moisture predisposes field‐grown chickpea plants to dry root rot disease: Evidence from simulation modeling and correlation analysis. Scientific Reports 11: e6568.
- Sinha, R., Zandalinas S. I., Fichman Y., Sen S., Zeng S., Gómez‐Cadenas A., Joshi T., et al. 2022. Differential regulation of flower transpiration during abiotic stress in annual plants. New Phytologist 235: 611–629.
- Smith, A. G., Petersen J., Selvan R., and Rasmussen C. R. 2020. Segmentation of roots in soil with U‐Net. Plant Methods 16: e13.
- Ubbens, J. R., and Stavness I. 2017. Deep plant phenomics: A deep learning platform for complex plant phenotyping tasks. Frontiers in Plant Science 8: e1190.
- Wen, W., Timmermans J., Chen Q., and Van Bodegom P. M. 2022. Monitoring the combined effects of drought and salinity stress on crops using remote sensing in the Netherlands. Hydrology and Earth System Sciences 26: 4537–4552.
- Yasrab, R., Atkinson J. A., Wells D. M., French A. P., Pridmore T. P., and Pound M. P. 2019. RootNav 2.0: Deep learning for automatic navigation of complex plant root architectures. GigaScience 8: giz123.
- Ying, C., Qi‐Guang M., Jia‐Chen L., and Lin G. 2013. Advance and prospects of AdaBoost algorithm. Acta Automatica Sinica 39: 745–758.
- Yu, J., Schumann A. W., Cao Z., Sharpe S. M., and Boyd N. S. 2019. Weed detection in perennial ryegrass with deep learning convolutional neural network. Frontiers in Plant Science 10: e1422.
- Zandalinas, S. I., and Mittler R. 2022. Plant responses to multifactorial stress combination. New Phytologist 234: 1161–1167.
- Zandalinas, S. I., Peláez‐Vico M. Á., Sinha R., Pascual L. S., and Mittler R. 2024. The impact of multifactorial stress combination on plants, crops, and ecosystems: How should we prepare for what comes next? The Plant Journal 117: 1800–1814.
- Zhou, R., Yu X., Ottosen C.‐O., Rosenqvist E., Zhao L., Wang Y., Yu W., et al. 2017. Drought stress had a predominant effect over heat stress on three tomato cultivars subjected to combined stress. BMC Plant Biology 17: e24.