Decoding living systems: Reassessing crop model frontiers via biological dynamics and optimized phenotype

Edgar S Correa

doi:10.1371/journal.pone.0343530

2026 Mar 11;21(3):e0343530. doi: 10.1371/journal.pone.0343530


Edgar S Correa¹,²,³,*

Editor: Paulo Eduardo Teodoro⁴

¹Pontificia Universidad Javeriana, School of Engineering, Bogota, Colombia

²UMR AGAP Institut, Univ. Montpellier, CIRAD, INRAE, Institut Agro, Montpellier, France

³CIRAD, UMR AGAP Institut, Montpellier, France

⁴Federal University of Mato Grosso do Sul, Brazil

Competing Interests: The author has read the journal’s policy on competing interests and declares the following. The author declares no personal financial competing interests beyond the funding sources listed in the Funding section. For transparency, the author discloses prior professional and supervisory/mentoring relationships within the doctoral research context in which this work was developed. The individuals named did not contribute to the manuscript’s study design, data analysis, interpretation, decision to publish, or writing, and therefore do not meet authorship criteria for this article; they are disclosed solely because these relationships could reasonably be perceived as relevant to the peer review and editorial process, including the identification of potential non-independent evaluators. María Camila Rebolledo has had prior professional relationships with the author within the doctoral research context, including scientific discussions on project needs and potential applications; may have academic/professional interests related to this research. Julian Ramirez-Villegas and Alexandre B. Heinemann have had close professional ties and collaborations within the relevant research network for this work; these relationships could reasonably be perceived as affecting independence in any evaluative role related to this work. Myriam Adam and Julian Colorado are former mentors/supervisors during earlier stages of the doctoral research. The author affirms sole authorship of this work. No other competing interests are declared. There are no patents, products in development or marketed products associated with this research to declare. This does not alter the author’s adherence to PLOS ONE policies on sharing data and materials.


* E-mail: e_correa@javeriana.edu.co

Roles

Edgar S Correa: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Resources, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

Paulo Eduardo Teodoro: Editor

PMCID: PMC12978445 PMID: 41811935

Abstract

Modeling and optimizing phenotypic performance of biological systems demands understanding how physiological processes mediate genotype-by-environment interactions. While AI-driven approaches achieve predictive accuracy, they often function as black boxes that obscure biological causality. Process-based models address this limitation through explicit mechanistic representation, enabling both quantitative optimization and biological interpretation. This study contributes an inverse engineering framework with three integrated layers: a sensitivity analysis validating biological coherence, a genetic algorithm exploring virtual phenotypes to identify adaptive strategies, and a similarity analysis quantifying routes from computational optima to field-validated cultivars. Sensitivity analysis identified eight genetic-based coefficients governing yield with robust rankings (95% CI width = 0.04). The genetic algorithm explored 5,364 virtual cultivars across 40 generations, revealing two strategies: extended growth (116 days) achieving 4,837 kg/ha under higher water availability (815 mm, field capacity 0.30), and shortened cycles (100–103 days) maintaining high efficiency (HI: 0.55–0.58) under water deficit (540 mm, field capacity 0.23), together covering 89% of the cultivation area. Similarity analysis against 21 field-validated cultivars identified WAB56-50 (70.7%) and DKAP2 (67.2%) as breeding candidates, quantifying a 22–30% genetic gap between current germplasm and computational optima. The framework, built upon 3 years of field characterization, compressed the evaluation and selection cycle, enabling adaptation across regional precipitation gradients identified through GMM-based classification. The principles demonstrated here extend across biological scales, from organismal phenotyping to cellular systems where biological dynamics can be modeled and traits measured.

Introduction

Predicting and optimizing phenotypic performance in living systems requires mechanistic and dynamic understanding of how biological processes mediate genotype-by-environment interactions. This challenge spans biological scales, from cellular stress responses to organism-level adaptation, and demands integrative frameworks combining process-based modeling with computational optimization.

The implications extend from crop resilience under climate change to cellular dynamics in disease progression. As global food security demands rise and precision medicine advances, deciphering these genotype-environment interactions has become essential across the life sciences.

Recent advances in artificial intelligence, image processing, pattern recognition, and high-throughput phenotyping have accelerated data acquisition and predictive modeling [1–8]. Yet while AI-driven approaches achieve predictive accuracy, they often function as black boxes that obscure biological causality. Process-based models address this limitation through explicit mechanistic representation, enabling both quantitative optimization and biological interpretation.

Agriculture exemplifies this challenge: breeders must rapidly phenotype large populations and accurately target the most promising progeny [9–11]. Yet accurately predicting crop performance under diverse and variable conditions remains difficult due to complex genotype-by-environment (G×E) interactions [12–14].

Biological process-based models offer a systematic approach to this challenge by encoding biological mechanisms rather than relying solely on correlative patterns. Mechanistic crop models (MCMs) exemplify this approach, providing a structured framework to simulate plant development as a dynamic system, integrating physiological processes such as photosynthesis, biomass allocation, phenology, and soil–plant–atmosphere interactions [15,16]. MCMs such as DSSAT, APSIM, STICS, and AquaCrop simulate crop responses under different environmental and management scenarios through five core modules: assimilation, water/nutrient balance, biomass partitioning, grain formation, and phenology (Fig 1) [17–19].

Fig 1 — Source: Original diagram created by the author using AI-assisted tools.

MCMs have proven effective for climate adaptation research [20,21]. While they do not capture gene dynamics or 3D architecture at molecular scales, their capacity to simulate functional trait responses—phenology, biomass partitioning, stress tolerance—provides the biological foundation for ideotype design.

Traditional plant breeding has relied on two empirical strategies. The first, defect correction, eliminates undesirable traits such as disease susceptibility or lodging through backcrossing—improving what exists without defining what is optimal. The second, yield selection, identifies high-performing variants through iterative field trials across environments—discovering superior genotypes without predicting why they perform. Both approaches require 10–15 years and substantial resources, yet operate without explicit performance targets: breeders know what they find, not what they seek [22].

C. M. Donald proposed a third approach: predictive ideotype design. Rather than selecting from existing variation or correcting defects, this strategy defines optimal phenotypic configurations a priori using biological process-based models, then identifies or creates genotypes capable of achieving them [10,23,24]. The approach inverts traditional breeding logic—from “select the best available” to “define the target and engineer toward it.”

Genetic algorithms enable systematic exploration of this trait space through simulation. While molecular approaches optimize at the genotype level, process-based modeling captures genotype-by-environment causality, enabling regional-scale optimization across climate and soil gradients identified through systematic GMM-based environmental classification [25–28].

This study demonstrates an inverse engineering approach to phenotype optimization, using rainfed rice as a model system. The framework integrates three analytical layers: sensitivity analysis identifying control parameters, genetic algorithm optimization exploring fitness landscapes, and similarity analysis mapping routes from computational optima to existing germplasm. Unlike previous approaches centered primarily on yield-driven morphological optimization, a dual-performance metric is introduced, integrating Harvest Index (HI) and Water Use Efficiency (WUE), to quantify grain conversion efficiency relative to water availability.

Global sensitivity analysis using the Morris method identifies the most influential genotype-based parameters. These are optimized via Genetic Algorithm across 5,364 virtual cultivars and four distinct environments, systematically selected to represent 89% of the regional cultivation area based on soil-climate gradients [25]. The optimized ideotype is then compared against 21 field-characterized rice cultivars, spanning indica, japonica, and hybrid groups, using multidimensional similarity metrics [29,30]. This analysis identifies cultivars closest to computational optima, quantifying the genetic gap that breeding programs must traverse.

The integration of process-based modeling, AI-driven optimization, and genotypic similarity analysis provides a foundation that extends beyond agriculture to biological systems from organismal to cellular scales where measurable traits determine functional performance under environmental constraints.

Materials and methods

Biological-based growth modeling setup

The CERES-Rice model represents genotypic variation through eleven crop coefficients that govern phenological development, carbon partitioning, and stress responses (Fig 2). Of these, eight parameters (P1, P5, P2O, P2R, G1, G2, G3, PHINT) were selected for optimization based on sensitivity analysis, while thermal stress thresholds (THOT, TCLDP, TCLDF) were excluded due to context-dependent sensitivity. These coefficients translate genetic information into physiological behavior, enabling in-silico exploration of genotype × environment interactions.

Fig 2 — Parameter interdependencies create a high-dimensional optimization landscape where sensitivity analysis guides parameter selection for genetic algorithm exploration. Source: Original diagram created by the author using AI-assisted tools. Microscopy images used for illustration are reprinted from Fan et al. (2023) [31], Liu et al. (2016) [32], and Yang et al. (2025) [33] under a CC BY license, with permission from the respective publishers, original copyright 2023, 2016, and 2025. All files are freely available online for use, distribution, and reproduction with proper attribution.

Plant growth progression is quantified through Growing Degree Days (GDD), representing the thermal time required for phenological advancement. This thermal accumulation framework—calculated as the daily difference between mean temperature and a base temperature below which growth ceases—provides the mechanistic basis for simulating developmental plasticity across environments [34–37].
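The thermal-time accumulation described above can be sketched as follows. This is a minimal illustration, not the CERES-Rice implementation: the function name, the default base temperature of 9 °C, and the three daily temperature pairs are assumptions introduced here, and the sketch omits the optimum and upper temperature limits that crop models typically apply.

```python
def growing_degree_days(tmin, tmax, t_base=9.0):
    """Accumulate thermal time (degree-days) from daily min/max temperatures (°C).

    t_base is a hypothetical placeholder for the base temperature.
    """
    total = 0.0
    for t_lo, t_hi in zip(tmin, tmax):
        mean_temp = (t_lo + t_hi) / 2.0
        # Development stops below the base temperature, so daily values are floored at zero.
        total += max(0.0, mean_temp - t_base)
    return total


# Three illustrative days (made-up values, not field data).
gdd = growing_degree_days(tmin=[22.0, 21.0, 8.0], tmax=[30.0, 31.0, 10.0])
print(gdd)  # 34.0 (17 + 17 + 0)
```

Coefficients expressed in GDD, such as P1 and P5, can then be read as thresholds on this running sum rather than as calendar durations.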

The genetic coefficients are functionally organized into three groups reflecting their physiological roles:

Phenological development parameters. P1 (thermal time to panicle initiation, GDD) and P5 (grain filling duration, GDD) collectively determine crop cycle length from emergence to physiological maturity. P1 governs vegetative phase duration, directly influencing anthesis timing; extended vegetative periods promote greater biomass accumulation, enhanced stem reserves, and increased spikelet number. P2O (critical photoperiod, hours) and P2R (photoperiod-induced delay, GDD) modulate developmental rate in response to daylength [38,39].

Source-sink partitioning parameters. During vegetative growth, three parameters determine yield potential: PHINT (phyllochron interval, GDD) controls leaf appearance rate and canopy expansion [40,41]; G1 defines spikelet number per panicle; and G3 (tillering coefficient) governs productive tiller formation. During grain filling, G2 (potential grain weight, g) determines individual grain sink strength.

Thermal stress parameters. Spikelet sterility is modulated by temperature extremes through THOT (heat-induced sterility threshold), TCLDF (cold-induced sterility), and TCLDP (cold-induced panicle delay). Low temperatures (15–19°C) during panicle development impair pollen formation [42–45], while high temperatures (>35°C) during flowering disrupt fertilization [46–49]. These thresholds enable simulation of climate-induced yield reductions critical for adaptation assessment.

The strong interdependencies among parameters—where changes in vegetative duration (P1) cascade through tillering (G3), spikelet formation (G1), and grain filling dynamics (G2, P5)—create a high-dimensional optimization challenge that requires systematic computational exploration, motivating the sensitivity analysis and genetic algorithm framework described in the following sections.

Model input data: Environment and cultivars

Process-based modeling captures genotype-by-environment interactions through a defined data architecture: environmental drivers as inputs, genetic coefficients as modulators, and phenotypic traits as outputs. This structure enables mechanistic simulation of how genotypes respond differentially to environmental conditions—a framework validated here through field experimentation where 21 cultivars were characterized across diverse soil-climate combinations.

This research is based on rainfed field experiments conducted between 2012 and 2014 on the west coast of Africa, in Casamance and Eastern Senegal, including Ziguinchor, Sédhiou, and Kolda [50]. The study examines the crop’s response to environmental factors by integrating soil properties and climatic variables with a database of growth responses, such as yield and biomass measurements, across 21 varieties.

Soil data: Soil properties were analyzed through laboratory tests at a depth of 30 cm. In contrast, the satellite-based SoilGrids database provides soil characteristics at three depth levels (15 cm, 30 cm, and 100 cm), which are used as input variables for the crop growth model [51]. The properties include soil texture, essential nutrients, and other critical parameters, as outlined in Table 1. Additionally, the crop growth model requires soil hydraulic properties such as the permanent wilting point (SLLL), field capacity (SDUL), saturation (SSAT), and saturated hydraulic conductivity (SSKS), which are derived using Saxton and Rawls’ pedotransfer functions [52–55].

Table 1. Environmental inputs and phenotypic outputs for process-based modeling. Soil, climate, and crop parameters used in the crop growth model.

Soil parameters (environment)	Climate parameters (environment)	Crop observations (cultivar)
Clay content (g/kg) – SLCL	Minimum temperature (°C)	Biomass (kg/ha)
Silt content (g/kg) – SLSI	Maximum temperature (°C)	Grain yield (kg/ha)
Sand content (g/kg) – Soil Sand	Solar radiation (MJ/m²/day)	Number of grains (#/m²)
Bulk density (cg/cm³) – SBDM	Wind speed (m/s)	Number of tillers (#/m²)
Coarse fraction (cm³/dm³) – SLCF	Relative humidity (%)	Anthesis (days)
Soil nitrogen (cg/kg) – SLNI	Precipitation (mm/day)	Maturity (days)
Total organic carbon (dg/kg) – SLOC	–	–
pH (unitless) – SLHW	–	–


Climate data: The mechanistic crop model (MCM) is driven by critical meteorological inputs: minimum and maximum temperature (°C), solar radiation (MJ/m²/day), wind speed (m/s), relative humidity (%), and precipitation (mm/day), all of which are taken from the rainfed rice study [50]. For regional-scale applications, these meteorological variables are sourced from the NASA POWER dataset [56].

Crop observations: Studying crop growth in response to the environment (G×E) requires knowing how each genotype responds to environmental conditions. The rainfed rice study [50] analyzes 21 varieties and reports the crop’s response through organic matter accumulation variables (grain yield, biomass, number of grains, and number of tillers) and phenology variables (anthesis and maturity).

GMM-based environmental classification

A study conducted in Casamance and Eastern Senegal identified twelve distinct environments using Principal Component Analysis (PCA) and a Gaussian Mixture Model (GMM) in three dimensions, considering all soil and meteorological variables [25]. Fig 3 shows that 89% of the region is represented by four environments, covering two soil types and two climatic zones. The convex hull is the polygon that encloses all given points in an environment, and the point closest to the centroid is selected as the representative sample for that environment.
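The selection of a representative sample can be illustrated with a short sketch. It assumes that the centroid is the arithmetic mean of the points assigned to an environment in the three-dimensional PCA space; the function name and the four example points are hypothetical and do not come from the study.

```python
import math


def representative_sample(points):
    """Return the point closest to the centroid of a cluster, plus the centroid."""
    n = len(points)
    dims = len(points[0])
    # Assumption: the centroid is the mean of the cluster members.
    centroid = tuple(sum(p[d] for p in points) / n for d in range(dims))
    closest = min(points, key=lambda p: math.dist(p, centroid))
    return closest, centroid


# Made-up coordinates of four sites in PCA space (PC1, PC2, PC3).
cluster = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (1.0, 1.0, 0.5)]
rep, centroid = representative_sample(cluster)
print(centroid)  # (0.75, 0.75, 0.125)
print(rep)       # (1.0, 1.0, 0.5)
```

Choosing an observed point instead of the centroid itself keeps the representative tied to a real soil-climate combination that can be simulated.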

Fig 3 — Base map layers (coastlines and administrative boundaries) were derived from publicly available datasets distributed with MATLAB Mapping Toolbox and are compatible with CC BY 4.0 licensing. Environmental classification and data layers were adapted from Correa et al. (2025), published under a CC BY license [25]. Map generated using MATLAB R2024b with Mapping Toolbox and Image Processing Toolbox.

Soil type 1, predominantly located in the eastern part of the region, is characterized by higher pH (5.91), bulk density (1.50 g/cm³), silt content (30.53 g/kg), and clay content (26.33 g/kg). It also presents intermediate hydraulic properties related to water retention, with SLLL (0.16 cm³/cm³), SDUL (0.29 cm³/cm³), SSAT (0.44 cm³/cm³), and a moderate saturated hydraulic conductivity (SSKS, 0.75 cm/h). In contrast, soil type 2 is classified as sandy (56.59 g/kg), with higher saturated hydraulic conductivity (SSKS, 1.09 cm/h) and a slightly lower pH (5.87). Due to its soil composition, soil type 2 exhibits low water retention capacity, which facilitates rapid water infiltration and drainage when saturated.

Climate Type 1 predominates in the southern region and is characterized by higher relative humidity (80.33%), lower average solar radiation (19.16 MJ/m²/day) with lower variability (3.44 MJ/m²/day), and moderate temperatures, including minimum (22.19°C), maximum (30.60°C), and average (26.39°C) values, along with greater accumulated precipitation (917 mm). Climate Type 2 is mainly located in the northern part of the study area and is distinguished by 44% lower accumulated rainfall (513.24 mm) and 18% lower average relative humidity (67.98%), along with higher average solar radiation (19.62 MJ/m²/day), and higher minimum (22.51°C), maximum (33.04°C), and average (27.78°C) temperatures compared to the southern climate. The vapor pressure deficit (VPD), calculated from temperature and humidity, was 77% higher in northern climates (1.19 kPa) compared to southern climates (0.67 kPa), indicating substantially greater atmospheric water demand that intensifies drought stress during critical reproductive stages.
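The text does not state which formulation was used to derive VPD. As a hedged illustration, the sketch below applies the Tetens-type saturation vapor pressure equation used in FAO-56 to the period-average temperature and relative humidity reported above. Using period averages instead of daily values is a simplification, and the function names are introduced here for illustration only.

```python
import math


def saturation_vapor_pressure(temp_c):
    """Saturation vapor pressure in kPa (Tetens-type equation, as in FAO-56)."""
    return 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))


def vapor_pressure_deficit(temp_c, rh_percent):
    """VPD in kPa from air temperature (°C) and relative humidity (%)."""
    return saturation_vapor_pressure(temp_c) * (1.0 - rh_percent / 100.0)


# Average temperature and relative humidity of the two climate types.
vpd_south = vapor_pressure_deficit(26.39, 80.33)
vpd_north = vapor_pressure_deficit(27.78, 67.98)
print(round(vpd_south, 2), round(vpd_north, 2))
```

Under these assumptions the sketch gives values close to the reported 0.67 kPa and 1.19 kPa, which suggests the reported contrast follows mainly from the difference in humidity combined with the warmer northern temperatures.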

Software simulation setup

All simulations were performed on a workstation equipped with an Intel Core i7-12700H processor (12th Gen, 2.30 GHz) and 32 GB RAM, running Windows 11 Professional (Build 26100.7171). CERES-Rice simulations were executed using DSSAT v4.8 [57,58]. Genetic algorithm optimization and post-processing analyses were implemented in MATLAB R2024b [59] and R v4.4.2 [60].

Sensitivity analysis of CERES-Rice model

Quantifying parameter influence on system performance is essential for process-based biological modeling, enabling identification of key drivers that govern phenotypic outcomes.

The Morris method was selected for its computational efficiency and ability to screen multiple parameters simultaneously [61,62]. Unlike local sensitivity approaches that perturb one parameter from a fixed baseline, the Morris method explores the entire parameter space through randomized trajectories, capturing potential interactions between physiological processes encoded in the model.

The method operates by generating $r$ trajectories through the 11-dimensional parameter space, where each trajectory consists of $k + 1$ points (Algorithm 1, Fig 4a). Consecutive points differ in exactly one parameter by a step size $\Delta$, allowing computation of elementary effects, that is, the change in model output attributable to each parameter perturbation. For the 11 CERES-Rice physiological coefficients ($k = 11$), each trajectory required $k + 1 = 12$ model evaluations, for a total of $r (k + 1) = r \times 12$ evaluations, substantially fewer than factorial designs while maintaining comprehensive parameter coverage.

Algorithm 1. Morris method for sensitivity analysis of CERES-Rice genetic coefficients

1: Input: Number of genetic coefficients $k = 11$, number of trajectories $r$, step size $\Delta$ (Fig 4b)

2: Output: Elementary effects for each coefficient-output combination

3: Initialize: Random base point $x^{*}$ within coefficient bounds

4: for each trajectory $t = 1$ to r do

5: Generate initial point $x^{(1)}$ from $x^{*}$ and evaluate CERES-Rice model at $x^{(1)}$

6: for $i = 2$ to $k + 1$ do

7: Modify one coefficient by $\pm \Delta$ to obtain $x^{(i)}$

8: Evaluate CERES-Rice model at $x^{(i)}$

9: Compute elementary effect for modified coefficient

10: end for

11: end for

12: Return: Distribution of elementary effects per genetic coefficient
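The trajectory design of Algorithm 1 can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the code used in the study: a linear toy function stands in for CERES-Rice, the parameter space is sampled on a four-level grid in unit coordinates, and the names `morris_screening`, `toy_model`, and `toy_bounds` are hypothetical.

```python
import random


def morris_screening(model, bounds, r=10, levels=4, seed=0):
    """Run r one-at-a-time trajectories; return elementary effects and evaluation count."""
    rng = random.Random(seed)
    k = len(bounds)
    delta = levels / (2.0 * (levels - 1))   # common choice of Morris step in unit space
    grid = [i / (levels - 1) for i in range(levels)]
    effects = {j: [] for j in range(k)}
    evaluations = 0

    def to_physical(u):
        # Map unit-cube coordinates onto the physical coefficient ranges.
        return [lo + ui * (hi - lo) for ui, (lo, hi) in zip(u, bounds)]

    for _ in range(r):
        u = [rng.choice(grid) for _ in range(k)]   # random starting point
        y_prev = model(to_physical(u))             # baseline evaluation
        evaluations += 1
        for j in rng.sample(range(k), k):          # perturb each coefficient once
            # Step up when possible, otherwise step down, to stay inside [0, 1].
            step = delta if u[j] + delta <= 1.0 + 1e-9 else -delta
            u[j] += step
            y_new = model(to_physical(u))
            evaluations += 1
            effects[j].append((y_new - y_prev) / step)
            y_prev = y_new
    return effects, evaluations


def toy_model(x):
    # Hypothetical stand-in for a crop model output; x[1] has no influence.
    return 3.0 * x[0] + 0.5 * x[2]


toy_bounds = [(0.0, 10.0), (0.0, 1.0), (5.0, 15.0)]
effects, n_eval = morris_screening(toy_model, toy_bounds, r=10)
mu_star = {j: sum(abs(e) for e in effects[j]) / len(effects[j]) for j in effects}
print(n_eval)   # 40, i.e. r * (k + 1) with r = 10 and k = 3
print(mu_star)
```

With three toy parameters each trajectory costs four evaluations; with the 11 CERES-Rice coefficients the same design costs 12 evaluations per trajectory. The mean absolute elementary effect of the inert parameter is zero, which is the screening behavior the method is meant to provide.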

Parameter bounds and step sizes (Fig 4b) were defined based on physiological ranges reported for rice cultivars, ensuring that perturbations remained within biologically plausible limits. Each parameter was perturbed across its full range, from early-maturing to late-maturing phenotypes for phenological traits (P1, P5), and from low to high tillering capacity for reproductive traits (G3).

To ensure robust sensitivity estimates, 20 independent replications were performed with randomized parameter sequences, providing confidence intervals for parameter rankings (S1 Table).

The Relative Sensitivity Index (RSI) was computed to standardize sensitivity across output variables with different units and magnitudes. For each parameter-output combination, RSI quantifies the mean absolute change in output relative to the maximum observed change (Eq 1):

$$RSI = \frac{1}{n} \sum_{i = 1}^{n} \frac{| \Delta Y_{i} |}{\max (| \Delta Y_{i} |)} \tag{1}$$

where $\Delta Y_{i}$ represents the change in model output (grain yield, biomass, anthesis, maturity) resulting from the i-th elementary effect. RSI values range from 0 (no influence) to 1 (maximum influence), enabling direct comparison of parameter importance across outputs with different units (kg/ha for yield, days for phenology).
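A small numerical example may help to read Eq 1. The sketch below assumes that the maximum in the denominator is taken over all elementary effects recorded for a given output, across every parameter; the equation leaves this scope implicit, so this is one possible reading. The parameter labels reuse CERES-Rice names, but the output changes are invented for illustration.

```python
def rsi_by_parameter(effects):
    """Relative Sensitivity Index per parameter for one model output.

    effects maps a parameter name to its list of output changes (delta Y).
    Assumption: the normalizing maximum is shared by all parameters of this output.
    """
    largest = max(abs(d) for values in effects.values() for d in values)
    if largest == 0.0:
        return {name: 0.0 for name in effects}
    return {
        name: sum(abs(d) / largest for d in values) / len(values)
        for name, values in effects.items()
    }


# Made-up yield changes (kg/ha) from three elementary effects per parameter.
yield_effects = {
    "PHINT": [100.0, -80.0, 90.0],
    "P1": [40.0, -60.0, 50.0],
    "G2": [5.0, -10.0, 0.0],
}
rsi = rsi_by_parameter(yield_effects)
print({name: round(value, 2) for name, value in rsi.items()})
# {'PHINT': 0.9, 'P1': 0.5, 'G2': 0.05}
```

Because every term is divided by the same maximum, the index stays within [0, 1] regardless of the unit of the output.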

This standardized metric enables identification of which physiological traits—developmental timing, canopy formation, or reproductive capacity—most strongly govern system performance, providing the foundation for targeted parameter optimization.

Genetic algorithm optimization

The Genetic Algorithm (GA) is an artificial intelligence approach based on heuristic search for optimization problems. Here it is coupled with the CERES-Rice crop growth model to characterize, for each environment, the ideotype that maximizes grain conversion efficiency and water use efficiency by evolving a population of candidate genetic-coefficient sets over multiple generations. The algorithm starts by initializing a population of individuals, each representing a candidate set of genetic parameters for the crop growth model. It evaluates the fitness of each individual through the harvest index (HI) and water use efficiency (WUE), which are combined in the fitness function. Selection favors the best-performing individuals, which then produce the next generation through crossover and mutation. Mutation introduces diversity into the population, which helps prevent premature convergence.

Fitness Metric: HI-WUE Integrated Index.

To guide ideotype optimization and evaluate each set of genetic crop parameter combinations, an integrated efficiency index (HI-WUE) is employed as the fitness metric. This index combines two critical agronomic performance indicators: the Harvest Index (HI) and Water Use Efficiency (WUE), as defined in Eq 3. The HI quantifies the efficiency of assimilate partitioning by representing the ratio of grain yield to total biomass. In contrast, the WUE reflects the efficiency of water utilization, expressed as the amount of grain produced per unit of accumulated evapotranspiration. The HI-WUE index enables an integrated assessment of genotypic performance in terms of both yield potential and resource use efficiency.

To ensure equal contribution of both metrics to the fitness function, WUE was normalized to [0,1] scale using physiologically grounded bounds from the literature:

$$WUE_{norm} = \frac{WUE - WUE_{min}}{WUE_{max} - WUE_{min}} \tag{2}$$

where $WUE_{min}$ and $WUE_{max}$ represent physiological bounds (2 and 15 kg ha⁻¹ mm⁻¹, respectively) reported for aerobic rice under drought adaptation [63]. The integrated fitness function is then:

$$\text{HI-WUE} = HI + WUE_{norm} \tag{3}$$

This formulation ensures balanced optimization with equal weighting (50%–50%) between grain conversion efficiency and water use efficiency.
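The fitness computation of Eqs 2 and 3 can be written compactly. The sketch below uses the bounds stated in the text (2 and 15 kg ha⁻¹ mm⁻¹); the function name and the example yield, biomass, and evapotranspiration values are illustrative assumptions, not simulation outputs from the study.

```python
WUE_MIN = 2.0    # kg ha^-1 mm^-1, lower bound stated in the text
WUE_MAX = 15.0   # kg ha^-1 mm^-1, upper bound stated in the text


def hi_wue_fitness(grain_yield, biomass, evapotranspiration):
    """Integrated HI-WUE index from yield (kg/ha), biomass (kg/ha) and ET (mm)."""
    hi = grain_yield / biomass                            # harvest index
    wue = grain_yield / evapotranspiration                # grain per unit of water used
    wue_norm = (wue - WUE_MIN) / (WUE_MAX - WUE_MIN)      # Eq 2
    return hi + wue_norm                                  # Eq 3


# Illustrative values only.
fitness = hi_wue_fitness(grain_yield=4400.0, biomass=8000.0, evapotranspiration=500.0)
print(round(fitness, 3))  # 1.073 (HI = 0.55, WUE = 8.8, normalized WUE = 0.523)
```

Since both terms are expected to lie between 0 and 1, the index ranges from 0 to 2. A WUE outside the stated bounds would push the normalized term outside [0, 1], because no clipping is applied in this sketch.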

Algorithm 2. Genetic Algorithm for ideotyping in CERES-Rice

1: Initialize population P with Num_ind individuals

2: Set global best ind_best and min_dist_global to infinity

3: for generation = 1 to Num_P do

4: Evaluate fitness of each individual in P

5: for each individual i in P do

6: if fitness of individual i is better than current best then

7: Update ind_best and min_dist_global

8: end if

9: end for

10: P ← Select_parents(P)

11: P ← Crossover_parents(P)

12: P ← Mutation(P, Thr_mut)

13: Track and plot dist_hist and dist_hist_mean

14: end for

15: Save ind_best and plot final fitness results

The Genetic Algorithm for ideotyping in the CERES-Rice crop model is outlined in Algorithm 2. It iterates over multiple generations, optimizing genetic-based crop parameters to enhance resource efficiency and yield performance across different environments. Throughout this process, the algorithm refines the population by selecting parents, applying crossover, and introducing mutations to evolve toward the optimal ideotype. Key performance metrics, including maximum fitness, mean population fitness, and the best genotype’s fitness, are continuously tracked to assess progress and convergence.

The algorithm uses the following parameters: the number of generations, set to 40; the initial population size, set to 15; and the mutation probability, set to 0.7. These parameters ensured adequate exploration of the 8-dimensional parameter search space (P1, P5, P2R, PHINT, P2O, G1, G2, G3). The high mutation rate facilitated broad search space coverage, while convergence analysis (S1 Fig) demonstrates fitness stabilization by generations 5–13 across environments, confirming successful identification of optimal parameter combinations.

Each individual in the GA represents a potential solution, corresponding to a specific combination of genetic crop parameters for the CERES-Rice crop model. The individual is defined as $X_{i} = \{P1, P5, P2R, PHINT, P2O, G1, G2, G3\}$, where each parameter takes a value within a predefined range, as shown in Fig 4b.

The GA employs roulette wheel selection for parent selection, arithmetic crossover for recombination, and adaptive mutation targeting high-sensitivity parameters identified through sensitivity analysis. Detailed mathematical formulations and implementation of these operators are provided in Supplementary Methods.
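The overall loop of Algorithm 2 with these operators can be sketched as follows. This is a simplified illustration under several assumptions: a toy quadratic function replaces the CERES-Rice simulation, the mutation probability is applied once per child, and mutation perturbs one randomly chosen gene with Gaussian noise instead of the sensitivity-guided adaptive scheme described in the Supplementary Methods. All function names are hypothetical.

```python
import random


def run_ga(fitness, bounds, pop_size=15, generations=40, p_mut=0.7, seed=1):
    """Maximize fitness over box-bounded parameters; return best, its score, history."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    best, best_fit, history = None, float("-inf"), []

    for _ in range(generations):
        scores = [fitness(ind) for ind in population]
        for ind, score in zip(population, scores):
            if score > best_fit:                     # track the global best
                best, best_fit = list(ind), score
        history.append(best_fit)

        # Roulette wheel: selection probability proportional to shifted fitness.
        floor = min(scores)
        weights = [s - floor + 1e-9 for s in scores]
        children = []
        while len(children) < pop_size:
            mum, dad = rng.choices(population, weights=weights, k=2)
            alpha = rng.random()                     # arithmetic crossover weight
            child = [alpha * a + (1.0 - alpha) * b for a, b in zip(mum, dad)]
            if rng.random() < p_mut:                 # mutate one gene, clipped to bounds
                j = rng.randrange(len(bounds))
                lo, hi = bounds[j]
                child[j] = min(hi, max(lo, child[j] + rng.gauss(0.0, 0.1 * (hi - lo))))
            children.append(child)
        population = children
    return best, best_fit, history


def toy_fitness(x):
    # Hypothetical stand-in for the HI-WUE index returned by a crop simulation.
    target = [0.3, 0.7]
    return 2.0 - sum((xi - ti) ** 2 for xi, ti in zip(x, target))


toy_bounds = [(0.0, 1.0), (0.0, 1.0)]
best, best_fit, history = run_ga(toy_fitness, toy_bounds)
print(best, best_fit)
```

The defaults mirror the settings reported above (15 individuals, 40 generations, mutation probability 0.7). Because the best individual is stored outside the population, the tracked best fitness can only stay constant or improve from one generation to the next, which is the pattern a convergence plot would show.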

Similarity analysis with field-characterized cultivars

Fig 5 presents 21 rice cultivars from a rainfed experiment conducted in the region [50], classified into three genetic groups: indica, japonica, and hybrid. Each cultivar is characterized by eight genetic coefficients (P1, P5, P2R, PHINT, P2O, G1, G2, G3). The optimized ideotypes were compared against these cultivars to identify genotypic similarities and inform breeding recommendations.

Similarity between ideotypes and field-characterized cultivars was assessed using three complementary distance metrics: Euclidean distance measuring direct separation in parameter space, Manhattan distance summing absolute differences along each dimension, and Cosine distance assessing angular similarity between coefficient vectors. These metrics capture different aspects of genetic proximity—geometric distance, coordinate-wise deviation, and directional alignment—providing robust cultivar ranking through metric consensus.

The similarity index for each ideotype-cultivar pair was calculated as the normalized average across all distance metrics (Eq 4). Distance values were inverted and scaled to yield similarity scores in the interval [0, 1], where higher values indicate greater genetic proximity.

$$\text{Similarity} = 1 - \frac{\bar{d}_{ij}}{\max (\bar{d})} \tag{4}$$

where ${\bar{d}}_{i j}$ represents the average distance between ideotype i and cultivar j across all metrics, and $max (\bar{d})$ is the maximum average distance observed.
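Eq 4 can be illustrated for a single ideotype. The sketch makes two assumptions that the text does not spell out: coefficient vectors are already scaled to comparable ranges, and each distance metric is divided by its own maximum before averaging so that no metric dominates. The cultivar labels A, B, and C and their coordinates are invented.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def similarity_index(ideotype, cultivars):
    """Similarity of each cultivar to one ideotype, following Eq 4."""
    metrics = (euclidean, manhattan, cosine_distance)
    mean_dist = {name: 0.0 for name in cultivars}
    for metric in metrics:
        dist = {name: metric(ideotype, vec) for name, vec in cultivars.items()}
        largest = max(dist.values())
        for name in cultivars:
            # Assumption: scale each metric by its maximum before averaging.
            mean_dist[name] += dist[name] / largest / len(metrics)
    d_max = max(mean_dist.values())
    return {name: 1.0 - d / d_max for name, d in mean_dist.items()}


# Made-up, pre-scaled coefficient vectors.
ideotype = [0.5, 0.5, 0.5]
cultivars = {"A": [0.5, 0.5, 0.6], "B": [0.9, 0.1, 0.5], "C": [0.1, 0.9, 0.9]}
sim = similarity_index(ideotype, cultivars)
print({name: round(value, 2) for name, value in sim.items()})
```

By construction the most distant cultivar receives a similarity of 0, so the index expresses proximity relative to the set of cultivars being compared, not an absolute distance.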

Frequency analysis quantified how often each cultivar appeared among the top-ranked matches across all ideotypes and metrics, identifying cultivars with consistent proximity to multiple optimized phenotypes. Principal Component Analysis (PCA) was applied to the combined dataset of ideotypes and cultivars to visualize genetic distances in reduced dimensional space, with Euclidean distance in PCA space providing an independent validation of similarity rankings.

Algorithm 3 presents the structured approach for identifying the most genetically similar cultivars. The similarity index and frequency serve as complementary metrics: similarity quantifies genetic proximity, while frequency indicates relevance across the ideotype population.

Algorithm 3. Identification of the most similar cultivars

1: Input: Virtual cultivar dataset $V$, real cultivar dataset $R$, distance metrics $D$

2: Output: Frequency analysis and similarity index visualization

▷ Step 1: Compute distances and find closest cultivars

3: for each metric $d \in D$ do

4: for each virtual cultivar $v \in V$ do

5: Compute distances: $dist[v, r] \leftarrow d(v, r), \forall r \in R$

6: Sort $R$ by $dist[v, r]$ in ascending order

7: Select the top-k closest cultivars and store in $T[v, d]$

8: end for

9: end for

▷ Step 2: Aggregate closest cultivars for frequency analysis

10: for each virtual cultivar $v \in V$ do

11: Concatenate $T[v, :]$ into a single list of cultivars

12: Compute frequency of each cultivar

13: Sort cultivars by frequency in descending order

14: end for

▷ Step 3: Compute normalized similarity index

15: for each virtual cultivar $v \in V$ do

16: for each real cultivar $r \in T[v, :]$ do

17: Compute similarity: $\text{similarity}[v, r] = 1 - \frac{\text{avg\_distance}[v, r]}{\max(\text{avg\_distance})}$

18: end for

19: Normalize similarity scores relative to the highest value

20: end for

21: End Algorithm
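Steps 1 and 2 of Algorithm 3 can be sketched as a top-k count. In this illustration the counts are pooled over all virtual cultivars and metrics, which is one way to obtain the overall frequency described in the text. The virtual and real cultivar vectors are made up, and the labels A, B, and C are hypothetical.

```python
import math
from collections import Counter


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norms


METRICS = {
    "euclidean": lambda a, b: math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b))),
    "manhattan": lambda a, b: sum(abs(x - y) for x, y in zip(a, b)),
    "cosine": cosine_distance,
}


def top_k_frequency(virtual, real, k=2):
    """Count how often each real cultivar is among the k closest, over all metrics."""
    counts = Counter()
    for v in virtual:
        for metric in METRICS.values():
            # Rank real cultivars by increasing distance to this virtual cultivar.
            ranked = sorted(real, key=lambda name: metric(v, real[name]))
            counts.update(ranked[:k])
    return counts


# Made-up, pre-scaled coefficient vectors.
virtual = [[0.5, 0.5, 0.5], [0.6, 0.4, 0.5]]
real = {"A": [0.5, 0.5, 0.6], "B": [0.9, 0.1, 0.5], "C": [0.1, 0.9, 0.9]}
freq = top_k_frequency(virtual, real, k=2)
print(freq.most_common())
```

A cultivar that ranks among the closest matches under every metric and for every virtual cultivar reaches the maximum count, here 2 virtual cultivars × 3 metrics = 6. This is how frequency complements the similarity index: it rewards consistency across metrics, not proximity under a single one.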

Results

This section presents the key findings of the optimization of the CERES-Rice crop model. Sensitivity analysis quantifies the impact of genetic-based crop parameters on biomass accumulation, grain yield, grain number, tiller count, and critical phenological stages, including anthesis and maturity. The ideotyping optimization process determines the optimal combination of genetic parameters, while similarity analysis, based on field-characterized cultivars, provides targeted recommendations for selecting the most promising progeny and potential genetic modifications to enhance resilience across diverse environments in the region.

Sensitivity analysis of genetic parameters

Sensitivity analysis, conducted across 20 independent replications with randomized parameter sequences, quantified the influence of 11 genetic coefficients on six model outputs (Fig 6). Parameter rankings were robust (mean 95% CI width = 0.04; S1 Table).

Fig 6 — Phenological parameters (P1, PHINT) dominate developmental timing and biomass accumulation, while reproductive parameters (G3, G2) govern yield component formation. Thermal stress parameters (TCLDP, TCLDF) exhibit negligible sensitivity under the studied conditions. Relative Sensitivity Index (RSI) for CERES-Rice model outputs: a) Biomass, b) Grain Yield, c) Number of Grains, d) Number of Tillers, e) Anthesis, f) Maturity. Values represent means across 20 independent replications; complete statistics with 95% CI in S1 Table.

Biomass and yield sensitivity.

Biomass accumulation was primarily governed by parameters controlling canopy development and vegetative phase duration. PHINT (phyllochron interval) exhibited the highest sensitivity (RSI = 0.70 ± 0.02), reflecting its direct regulation of leaf appearance rate and consequent light interception capacity. P1 (vegetative thermal time) ranked second (RSI = 0.55 ± 0.04), determining the duration of the photosynthetically active phase. G3 (tillering coefficient) demonstrated moderate influence (RSI = 0.48 ± 0.05) through its effect on tiller-derived leaf area contribution. P2O and P5 contributed moderately (RSI = 0.45 and 0.42, respectively). Grain-related parameters G1 and G2 exhibited negligible sensitivity (RSI $<$ 0.07), indicating that sink traits do not exert feedback regulation on source accumulation within the CERES-Rice model structure.

Grain yield exhibited a more distributed sensitivity pattern, reflecting its dependence on both source and sink processes. G3 demonstrated the highest sensitivity (RSI = 0.59 ± 0.04), as tiller number directly determines panicle density and yield potential. P1 ranked second (RSI = 0.46 ± 0.04), influencing yield through vegetative biomass accumulation and spikelet differentiation. THOT (heat-induced sterility threshold) exhibited moderate sensitivity (RSI = 0.45 ± 0.06), indicating potential yield reduction under elevated temperature conditions during flowering. The number of grains was most sensitive to G2 (potential grain weight, RSI = 0.64 ± 0.03) and G3 (RSI = 0.59 ± 0.04), confirming the predominance of sink-related parameters. The number of tillers was predominantly controlled by G3 (RSI = 0.93 ± 0.01), followed by P1 (RSI = 0.88 ± 0.02) and PHINT (RSI = 0.84 ± 0.03), demonstrating strong genetic determination of tillering capacity.

Phenological timing.

Anthesis timing was governed by phenological parameters with minimal estimation uncertainty. P1 exhibited the highest sensitivity across all output-parameter combinations (RSI = 0.93 ± 0.02), followed by PHINT (RSI = 0.86 ± 0.02) and P2O (RSI = 0.79 ± 0.12). Reproductive parameters (G1, G2, G3) exhibited zero sensitivity for anthesis, consistent with the model structure wherein flowering date is independent of sink-related traits.

Maturity timing displayed a broader sensitivity distribution, with P2O ranking highest (RSI = 0.64 ± 0.08) followed by P1 (RSI = 0.57 ± 0.19). The elevated variability observed for P1 sensitivity on maturity (CI width = 0.38) compared to anthesis (CI width = 0.03) reflects uncertainty accumulation through the grain-filling phase.

Thermal stress parameters.

Cold stress parameters TCLDP and TCLDF exhibited near-zero sensitivity across all outputs (RSI ≈ 0.00), indicating that cold-induced spikelet sterility was not a limiting factor under the thermal conditions characteristic of Senegal (minimum temperatures 22–23°C during the reproductive phase). THOT demonstrated moderate sensitivity exclusively for grain yield and grain number (RSI ≈ 0.45), suggesting localized susceptibility to heat stress during anthesis that merits consideration under projected climate change scenarios.

The narrow confidence intervals (mean CI width = 0.04) across 20 replications confirm robust parameter rankings, supporting the selection of P1, P2O, P2R, P5, G1, G2, G3, and PHINT as optimization targets for the genetic algorithm.
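The relation between the reported "mean ± value" and the CI width can be illustrated with a short sketch. It assumes a Student-t interval over the 20 replications, which the text does not state explicitly; the RSI values below are synthetic and the helper name `mean_ci` is hypothetical. The reported P1 sensitivity on maturity (0.57 ± 0.19, CI width = 0.38) is consistent with the width being twice the half-width, as computed here.

```python
import math
import statistics

T_CRIT_DF19 = 2.093  # two-sided 95% Student-t critical value, 19 degrees of freedom

def mean_ci(values, t_crit=T_CRIT_DF19):
    """Return (mean, half_width, full_width) of a t-based confidence interval."""
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))  # standard error
    half = t_crit * sem
    return mean, half, 2.0 * half

# Synthetic RSI values for one parameter-output pair across 20 replications.
rsi = [0.70 + 0.01 * ((i % 5) - 2) for i in range(20)]
mean, half, width = mean_ci(rsi)
```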

Ideotype optimization through genetic algorithm

The genetic algorithm explored a phenotypic landscape of 5,364 virtual cultivars across 40 generations, searching for genetic coefficient combinations that maximize the HI-WUE index, a composite metric integrating harvest index and water use efficiency. Each evaluation corresponds to a unique genotype-environment combination, which illustrates the potential of coupling process-based modeling with AI optimization.

Optimization was conducted across four distinct environments identified through GMM-based environmental classification [25], collectively representing 89% of the Casamance and Eastern Senegal cultivation area. These environments capture the critical climate-soil gradients constraining rainfed rice production: southern humid zones (Env 1, 3: precipitation ∼815 mm) versus northern drought-prone regions (Env 2, 4: precipitation ∼540 mm), crossed with soil water retention capacity ranging from high (Env 1: SDUL = 0.30 cm³/cm³) to low (Env 4: SDUL = 0.23 cm³/cm³).

Efficiency-yield correlations.

Fig 7 illustrates the relationship between the HI-WUE index and agronomic performance across all evaluated genotypes. Correlation analysis revealed a clear functional hierarchy: traits directly incorporated in the fitness function—harvest index (R² = 0.88–0.97) and water use efficiency (R² = 0.86–0.93)—exhibited strong positive correlation with HI-WUE. Grain yield demonstrated consistently strong correlation (R² = 0.78–0.86, p $<$ 0.001), confirming that optimizing physiological efficiency translates to productivity gains. Number of grains showed moderate, environment-dependent correlation (R² = 0.39–0.60), reflecting sink contribution to yield formation.

In contrast, biomass (R² $<$ 0.04), leaf area index (R² $<$ 0.05), root density (R² $<$ 0.02), and phenological timing (R² $<$ 0.08) showed negligible correlation with HI-WUE within the optimization. This pattern is consistent with the sensitivity analysis, in which the grain-related parameters G1 and G2 had negligible influence on biomass accumulation (RSI $<$ 0.07). Complete correlation statistics are provided in S2 Table (Supplementary Material).
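For reference, the R² values discussed here are squared Pearson correlations (see S2 Table). A minimal sketch of that computation follows; the function name `r_squared` and the four-point series are hypothetical and chosen only to show one strongly correlated and one uncorrelated output.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)

# Synthetic values, not taken from the study.
hi_wue = [0.2, 0.4, 0.6, 0.8]
grain_yield = [1000.0 + 3000.0 * h for h in hi_wue]  # exactly linear in HI-WUE
biomass = [9000.0, 9500.0, 9500.0, 9000.0]           # unrelated to HI-WUE

r2_yield = r_squared(hi_wue, grain_yield)
r2_biomass = r_squared(hi_wue, biomass)
```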

Environment-specific convergence dynamics.

Convergence analysis revealed optimization dynamics shaped by environmental constraints (S1 Fig, Supplementary Material). Southern environments with higher precipitation showed contrasting patterns: Env 1 (highest water retention: SDUL = 0.30 cm³/cm³) required 23 generations to reach 95% of maximum fitness, while Env 3 (sandy soil with high organic carbon) converged fastest at generation 10, achieving the highest fitness value (HI-WUE = 0.973).

Northern environments under drought stress (Env 2: 557 mm; Env 4: 525 mm precipitation) both converged at generation 20 with similar fitness values (0.955 and 0.948, respectively). These environments yielded identical optimal genetic coefficients despite differing soil properties (SDUL: 0.28 vs 0.23 cm³/cm³).
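The convergence criterion used above (first generation reaching 95% of maximum fitness) can be expressed as a small helper. This sketch assumes the input is the best fitness recorded at each generation, that fitness is positive, and that generations are counted from 1; the function name and the fitness curve are hypothetical.

```python
def generations_to_fraction(best_fitness, fraction=0.95):
    """First 1-based generation whose fitness reaches fraction * max fitness."""
    target = fraction * max(best_fitness)
    for generation, fitness in enumerate(best_fitness, start=1):
        if fitness >= target:
            return generation
    raise ValueError("target never reached (fitness must be positive)")

# Synthetic best-fitness curve, not taken from S1 Fig.
curve = [0.50, 0.70, 0.85, 0.92, 0.95, 0.96]
converged_at = generations_to_fraction(curve)
```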

Two distinct adaptive strategies.

Fig 8 presents the predicted performance of optimized ideotypes. Two distinct strategies emerged from the optimization:

Strategy A, extended growth (Env 1, 30% of cultivation area): Under favorable conditions combining high soil water retention with adequate precipitation, the optimal ideotype featured extended grain filling duration (P5 = 372 GDD) and moderate phyllochron (PHINT = 74 GDD). This strategy achieved the highest yield (4,837 kg/ha) with HI = 0.54 and WUE = 6.17 $\mathrm{kg\,ha^{-1}\,mm^{-1}}$ across a 116-day cycle.

Strategy B, shortened cycle (Env 2–4, 59% of cultivation area): Under water-limited conditions, the algorithm converged toward reduced grain filling duration (P5 = 207–248 GDD) and similar phyllochron (PHINT = 72–74 GDD), completing the crop cycle in 100–103 days. These ideotypes achieved yields of 3,743–4,213 kg/ha with harvest index of 0.55–0.58 and water use efficiency of 5.84–5.97 $\mathrm{kg\,ha^{-1}\,mm^{-1}}$.

Reproductive coefficient stability.

Across all environments, reproductive coefficients remained stable (G1 ≈ 62, G2 ≈ 0.025 g, G3 ≈ 0.90) while phenological parameters varied with environmental conditions (S3 Table, Supplementary Material).

Similarity and affinity of optimized ideotypes

Each variety and ideotype is characterized by a set of genetically-driven crop growth parameters: P1, P5, P2R, PHINT, P2O, G1, G2, and G3. Fig 9-a presents a 3D visualization of the Principal Component Analysis (PCA) conducted on these genetic parameters, which accounts for 85.8% of the total variance. The four optimized ideotypes (ID1–ID4), representing the highest-performing genetic configurations for each environment, are shown in red. Field-validated cultivars are classified into three genetic groups: indica (green), japonica (blue), and hybrids (orange). Dotted lines connect each ideotype to its four nearest cultivars based on Euclidean distance in PCA space. Fig 9-b depicts the similarity scores and frequencies of these varieties across all distance metrics, illustrating the alignment of genetic traits.

A single variety, WAB56-50, achieved the highest global similarity (70.7%) and appeared consistently across all four ideotypes, making it the most promising candidate for breeding programs. DKAP2 ranked second with 67.2% average similarity and was identified as the nearest cultivar for Ideotypes 2, 3, and 4 based on PCA distance.

For Ideotype 1 (favorable environment), the top matches were WAB56-50 (77.9%), RD 23 (74.0%), and NERICA17 (73.5%). Ideotypes 2 and 4 (drought-dominated environments) converged to identical genetic configurations, with DKAP2 (72.8%), WABC165 (67.4%), and WAB56-50 (66.9%) as the closest cultivars. Ideotype 3 showed intermediate characteristics with WAB56-50 (71.1%), NERICA17 (68.9%), and DKAP2 (66.0%) as top matches.
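As a consistency check, the global similarity reported for WAB56-50 matches the arithmetic mean of its four per-ideotype scores. The short sketch below assumes that "global similarity" is this unweighted mean, which the text does not define explicitly; the variable names are hypothetical.

```python
# Per-ideotype similarity of WAB56-50 as reported in the text (percent).
# ID2 and ID4 share one value because the two ideotypes are identical.
wab56_50 = {"ID1": 77.9, "ID2": 66.9, "ID3": 71.1, "ID4": 66.9}

global_similarity = sum(wab56_50.values()) / len(wab56_50)  # unweighted mean
gap = 100.0 - global_similarity  # remaining distance to the ideotypes
```

Under this assumption the mean rounds to 70.7% and the remaining gap to 29.3%.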

The complete similarity matrix between all ideotypes and field-validated cultivars is presented in Fig 10. Red boxes highlight the top five cultivars with highest average similarity across all ideotypes. Three varieties appeared in two or more ideotype rankings: WAB56-50 (all 4 ideotypes), DKAP2 (3 ideotypes), and RD 23 (2 ideotypes). The convergence of Ideotypes 2 and 4 suggests that drought stress is the dominant selective pressure in these environments, leading to similar optimal genetic configurations centered around shortened reproductive phases and enhanced water use efficiency.

Fig 11 presents the parameter-by-parameter comparison between optimized ideotypes and the five highest-affinity cultivars. The 22–30% genetic gap separating current cultivars from computational optima is not uniformly distributed: phenological parameters (P1, P5, P2R) show the largest divergence, while reproductive coefficients (G1, G2, G3) are already near-optimal. This hierarchy defines breeding priorities—sink traits can be maintained through marker-assisted selection while phenological adaptation requires targeted crossing with environment-specific donors.

Discussion

Three findings structure this discussion: the biological validation provided by sensitivity analysis, the adaptive strategies revealed by optimization, and the breeding roadmap defined by similarity analysis. Each analytical layer informed the next, progressively translating computational predictions into actionable recommendations.

Sensitivity analysis as biological validation

The hierarchical sensitivity patterns observed—phenological parameters governing development, reproductive parameters controlling yield components, and thermal parameters showing context-dependent influence—serve as internal validation of the model’s physiological coherence. A process-based model that failed to reproduce these expected functional separations would indicate structural deficiencies in its biological representation. This interpretability distinguishes process-based approaches from purely statistical or black-box methods, where predictor influence can be quantified but its biological meaning depends on external interpretation rather than explicit model structure.

The near-zero sensitivity of cold stress parameters (TCLDP, TCLDF) under Senegalese conditions, contrasted with moderate heat stress sensitivity (THOT), demonstrates that sensitivity analysis identifies which biological processes are active versus inactive in specific environmental contexts. This diagnostic capacity extends beyond parameter ranking to inform model applicability across environmental gradients—critical for process-based modeling of living systems under variable conditions. The narrow confidence intervals achieved through 20 replications (mean CI width = 0.04) confirm that sensitivity rankings are robust features of the model structure rather than random fluctuations, ensuring reliable parameter selection for subsequent optimization.

From parameter influence to phenotypic optimization

The sensitivity results directly informed the optimization strategy: the eight parameters exhibiting highest sensitivity indices (P1, P5, P2R, P2O, G1, G2, G3, PHINT) were selected for genetic algorithm exploration, while thermal stress parameters were excluded based on their negligible influence under Senegalese conditions. This principled parameter selection reduced search space dimensionality while preserving the degrees of freedom most influential for ideotype performance.

The genetic algorithm evaluated 5,364 virtual cultivars across 40 generations. Strong correlations between the HI-WUE fitness index and grain yield (R² = 0.78–0.86, p $<$ 0.001) validate the framework’s capacity to translate physiological efficiency into productivity gains. Two distinct adaptive strategies emerged: extended growth cycles (P5 = 372 GDD, 116 days) achieving maximum yields (4,837 kg/ha) in favorable southern environments, and shortened drought-escape cycles (P5 = 207–248 GDD, 100–103 days) maintaining high efficiency (HI: 0.55–0.58) in water-limited northern regions.

The climatic contrast between regions extends beyond precipitation totals. Vapor pressure deficit (VPD)—the driving force for transpiration—reaches 1.19 kPa in northern environments compared to 0.67 kPa in the south, representing 77% greater atmospheric water demand (Supplementary Note: VPD Calculation). This elevated VPD accelerates soil water depletion during the growing season, explaining why phenological escape rather than extended growth optimizes performance under northern conditions.
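The VPD comparison can be reproduced approximately with a short sketch. The saturation vapour pressure function below uses the Tetens-type equation from FAO-56, which is an assumption here; the method actually used is described in the Supplementary Note and may differ. The function names are hypothetical.

```python
import math

def saturation_vapor_pressure(temp_c):
    """Saturation vapour pressure in kPa at air temperature temp_c (deg C)."""
    return 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))

def vpd(temp_c, rh_percent):
    """Vapour pressure deficit in kPa from temperature and relative humidity."""
    return saturation_vapor_pressure(temp_c) * (1.0 - rh_percent / 100.0)

# Relative increase between the two VPD values reported in the text.
north, south = 1.19, 0.67  # kPa
relative_increase = (north - south) / south
```

With the rounded values from the text the relative increase is close to 0.78, in line with the reported figure of about 77%; the small difference may come from rounding of the VPD values.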

The convergence of Environments 2 and 4 to identical genetic coefficients despite differing soil properties (SDUL: 0.28 vs 0.23 cm³/cm³) reveals that precipitation deficit dominates over soil water-holding capacity in determining optimal phenological strategy. This finding simplifies breeding program targeting: a single drought-adapted cultivar can serve 59% of the Casamance and Eastern Senegal cultivation area. The stability of reproductive coefficients (G1 ≈ 62, G2 ≈ 0.025 g, G3 ≈ 0.90) across all environments, while phenological parameters varied substantially, provides a mechanistic basis for breeding: sink traits can be selected universally, whereas phenological adaptation requires environment-specific tuning.

Bridging computational ideotypes to breeding targets

Optimization identifies optimal phenotypes; translating these into breeding targets requires mapping the genetic distance between computational ideotypes and existing germplasm. The 70.7% global similarity achieved by WAB56-50, the highest among the reference cultivars, indicates that even the closest cultivar must traverse 29.3% of the genetic parameter space to reach the ideotypic configuration. This quantified gap is not a limitation but a roadmap: it states the breeding challenge in measurable terms.

The distinct clustering of optimized ideotypes in PCA space (Fig 9a), where the first three components explain 85.8% of total variance (PC1: 57.2%, PC2: 16.8%, PC3: 11.9%), reveals that optimal genetic configurations are not represented in current germplasm. This finding carries dual implications: existing cultivars, despite decades of empirical selection, have not converged toward computationally identified optima; and substantial genetic gains remain accessible through targeted parameter modification. The consistency between PCA distance rankings (WAB56-50: d = 0.79 to ID1; DKAP2: d = 1.08–1.28 to ID2–4) and similarity scores computed from multiple metrics (Euclidean, Manhattan, Cosine) provides cross-validation that identified breeding candidates are robust to methodological choice.

WAB56-50 and DKAP2 emerge as complementary breeding candidates: WAB56-50 (japonica) aligns with the extended-growth strategy for favorable environments (Ideotype 1, 30% of cultivation area), while DKAP2 (hybrid) matches the drought-escape phenology required for water-limited zones (Ideotypes 2–4, 59% of area). The parameter-by-parameter comparison reveals a clear hierarchy for breeding intervention: phenological parameters require major adjustment (P1: +29%, P2R: +56%, P5: −21%), while reproductive parameters and the critical photoperiod are already near-optimal (G1: +4%, G3: −1%, P2O: +3%). This decoupling, first identified in sensitivity analysis, confirmed in optimization convergence, and now quantified against existing germplasm, enables accelerated breeding: sink traits are maintained through marker-assisted selection, and phenological adaptation is achieved through targeted crossing with environment-specific donors.

Model boundaries and future directions

The framework’s predictive capacity operates within CERES-Rice boundaries. The model captures dominant agronomic relationships but simplifies dynamic source-sink feedback, root architectural plasticity, and leaf-level physiological processes. The negligible correlation between biomass and HI-WUE (R² $<$ 0.04) suggests that optimized ideotypes achieve efficiency through improved assimilate partitioning rather than enhanced source capacity—a pattern that must be interpreted within these structural constraints. Future developments integrating these mechanisms could refine ideotype predictions under conditions where belowground traits and source-sink signaling determine yield stability.

The similarity analysis is constrained by the 21-cultivar reference panel. Expanding this panel to include additional African germplasm—particularly drought-tolerant Sahelian landraces—could identify cultivars with higher similarity to water-limited ideotypes. Furthermore, linking genetic coefficients to quantitative trait loci (QTL) would enable marker-assisted selection toward ideotypic configurations, bridging the phenotypic parameters used here with genomic breeding tools.

The 22–30% genetic gap between existing cultivars and optimized ideotypes quantifies both opportunity and challenge. Conventional breeding requires 10–15 years to traverse such distances through recurrent selection. The computational framework demonstrated here—sensitivity analysis informing optimization, optimization identifying targets, similarity analysis mapping routes—compresses the discovery phase, allowing breeding resources to focus on the irreducible biological timescales of crossing, selection, and field validation. This acceleration, demonstrated here for drought-stressed rice, extends to any biological system where physiological processes can be modeled and traits measured.

Conclusion

AI-driven optimization of biological processes

This study demonstrates that AI-driven optimization, coupled with process-based modeling, can transform how we approach complex biological systems. The framework compressed the discovery phase by evaluating 5,364 virtual phenotypes across 40 generations. This acceleration does not replace biological understanding; it amplifies it. The genetic algorithm did not operate as a black box; it explored a fitness landscape defined by explicit physiological mechanisms, ensuring that identified optima are biologically interpretable and experimentally tractable.

The integration of sensitivity analysis, optimization, and similarity analysis creates a pipeline where each computational layer informs the next: mechanistic understanding guides parameter selection, optimization reveals adaptive strategies, and similarity analysis maps implementation routes through existing biological material. This architecture—pattern recognition constrained by mechanistic structure—represents a paradigm for applying AI to living systems without sacrificing interpretability.

Case study findings

Applied to drought-stressed rice across four contrasting environments in Senegal, the framework revealed three actionable insights:

(i) Phenological parameters (P1, P5, P2R, PHINT) govern environmental adaptation and require local calibration, while reproductive parameters (G1, G2, G3) remain stable across conditions—enabling a two-track breeding strategy of universal sink selection with environment-specific developmental tuning.
(ii) Precipitation deficit overrides soil water-holding capacity as the dominant selective pressure. Ideotypes 2 and 4 converged to identical genetic coefficients despite differing soil properties, demonstrating that a single drought-adapted cultivar can serve 59% of the regional cultivation area.
(iii) WAB56-50 (70.7% similarity) and DKAP2 (67.2%) emerged as complementary breeding candidates, with a quantified 22–30% genetic gap defining the selection pressure required to achieve computational optima—transforming vague improvement goals into measurable breeding targets.

Modeling living systems: Limitations and horizons

Living systems do not follow simple rules. They adapt, compensate, and exhibit emergent behaviors that challenge simplified modeling approaches. CERES-Rice captures dominant agronomic relationships but simplifies source-sink feedback, root plasticity, and molecular stress responses. These simplifications constrain prediction where belowground traits and cellular signaling determine outcomes—precisely where the frontier lies.

Platforms like DSSAT—integrating models such as CERES-Rice—have served as frameworks for decades, enabling agronomic research worldwide. Yet widespread adoption has paradoxically constrained scientific progress: users apply these models as established tools rather than testable hypotheses, accepting default parameterizations without questioning underlying assumptions. Biological expertise without computational scrutiny accepts model outputs uncritically; engineering expertise without biological insight optimizes parameters blindly. The frontier lies where both converge—yet this intersection remains largely unexplored. Model boundaries stay unexamined, gaps unquantified, limitations unaddressed.

Cellular systems offer the next horizon. Proliferation rates, metabolic fluxes, stress responses—measured with increasing precision through advances in microscopy, high-throughput phenotyping, and single-cell technologies—generate data streams that process-based models have yet to integrate. The convergence of observation technologies with AI-driven pattern recognition creates unprecedented potential: data revealing what equations cannot capture and mechanistic frameworks ensuring biological coherence.

The principles demonstrated here are scale-independent. What changes across biological scales is not the analytical logic but the resolution of observation. Bridging this gap—from crop canopy to cellular architecture, from field phenotyping to subcellular dynamics—represents both the challenge and the opportunity for quantitative biology driven by technological advances and AI in the coming decade.

Supporting information

S1 File. Supplementary Material.

Supporting figure and additional details (includes S1 Fig with panels A–D: genetic algorithm convergence across four environments).

(PDF)


S1 Table. Relative Sensitivity Index (RSI) from 20 Morris method replications.

Bold indicates the highest sensitivity per output.

(PDF)


S2 Table. Pearson correlation (R²) between the HI–WUE index and phenotypic outputs across four environments.

Statistical significance: *** p $<$ 0.001, ** p $<$ 0.01, * p $<$ 0.05, ns p≥0.05. Correlations with R² $<$ 0.10 are considered practically negligible despite statistical significance due to large sample sizes (n>1,300 per environment).

(PDF)


S3 Table. Optimal genetic coefficients and predicted crop performance for each environment.

Env1: high water retention, high precipitation (30% area); Env2: medium retention, low precipitation (18%); Env3: medium retention, high precipitation (21%); Env4: low retention, low precipitation (20%). Values represent the best individual identified across 40 generations of genetic algorithm optimization.

(PDF)


Acknowledgments

The author expresses deep gratitude to the School of Engineering at Pontificia Universidad Javeriana for providing a full Ph.D. scholarship and to the UMR-AGAP Institute (Genetic Improvement and Adaptation of Mediterranean and Tropical Plants) for institutional support and access to research facilities. The author also thanks CIRAD for providing the Ph.D. scholarship that significantly contributed to this research. Special recognition is given to Michael Dingkuhn for his valuable guidance and generosity in sharing expertise in crop modelling and scientific communication. The author also thanks Edward Gerardeaux for providing crop parameters and observational data, as detailed in Gerardeaux et al. (2021), and Loyola Rodríguez Perez for helpful feedback and insightful discussions in plant physiology during meetings at the Plant Physiology Laboratory (Pontificia Universidad Javeriana, Bogotá).

Data Availability

All relevant data are within the paper and its Supporting information files. Additional simulation outputs, environmental input files, and all code for replication are publicly available from the Zenodo repository (https://doi.org/10.5281/zenodo.18094655). The GitHub repository contains organized code to replicate all figures and tables: https://github.com/EdgarStevenC/Crop-Growth-Modelling. This study used publicly available datasets: soil data from SoilGrids (https://soilgrids.org), climate data from NASA POWER (https://power.larc.nasa.gov), and crop observations from Gérardeaux et al. (2021), who obtained all necessary permits for the original data collection.

Funding Statement

This research was funded by the Agropolis Fondation through the CropModAdapt project (Contract No. 2201-026, 2023–2024), and by the ClimBeR initiative – France–CGIAR Action Plan on Climate Change (ICARDA Agreement No. 200303, 2023–2024). The funders had no role in study design, data analysis, decision to publish, or preparation of the manuscript.

References

1.Arvanitis KG, Symeonaki EG. Agriculture 4.0: the role of innovative smart technologies towards sustainable farm management. TOASJ. 2020;14(1):130–5. doi: 10.2174/1874331502014010130 [DOI] [Google Scholar]
2.Zhai Z, Martínez JF, Beltran V, Martínez NL. Decision support systems for agriculture 4.0: survey and challenges. Computers and Electronics in Agriculture. 2020;170:105256. doi: 10.1016/j.compag.2020.105256 [DOI] [Google Scholar]
3.Noshita K, Murata H, Kirie S. Model-based plant phenomics on morphological traits using morphometric descriptors. Breed Sci. 2022;72(1):19–30. doi: 10.1270/jsbbs.21078 [DOI] [PMC free article] [PubMed] [Google Scholar]
4.Colorado JD, Calderon F, Mendez D, Petro E, Rojas JP, Correa ES, et al. A novel NIR-image segmentation method for the precise estimation of above-ground biomass in rice crops. PLoS One. 2020;15(10):e0239591. doi: 10.1371/journal.pone.0239591 [DOI] [PMC free article] [PubMed] [Google Scholar]
5.Jimenez-Sierra DA, Correa ES, Benítez-Restrepo HD, Calderon FC, Mondragon IF, Colorado JD. Novel feature-extraction methods for the estimation of above-ground biomass in Rice Crops. Sensors (Basel). 2021;21(13):4369. doi: 10.3390/s21134369 [DOI] [PMC free article] [PubMed] [Google Scholar]
6.Correa ES, Calderon F, Colorado JD. GFkuts: a novel multispectral image segmentation method applied to precision agriculture. In: 2020 Virtual Symposium in Plant Omics Sciences (OMICAS). 2020. p. 1–6. 10.1109/omicas52284.2020.9535659 [DOI]
7.Correa ES, Calderon FC, Colorado JD. A Novel Multi-camera Fusion Approach at Plant Scale: From 2D to 3D. SN COMPUT SCI. 2024;5(5). doi: 10.1007/s42979-024-02849-7 [DOI] [Google Scholar]
8.Correa ES, Parra CA, Vizcaya PR, Calderon FC, Colorado JD. Complex Object Detection Using Light-Field Plenoptic Camera. Communications in Computer and Information Science. Springer International Publishing. 2022. p. 119–33. 10.1007/978-3-031-07005-1_12 [DOI] [Google Scholar]
9.McMullen MD, Kresovich S, Villeda HS, Bradbury P, Li H, Sun Q, et al. Genetic properties of the maize nested association mapping population. Science. 2009;325(5941):737–40. doi: 10.1126/science.1174320 [DOI] [PubMed] [Google Scholar]
10.Araus JL, Cairns JE. Field high-throughput phenotyping: the new crop breeding frontier. Trends Plant Sci. 2014;19(1):52–61. doi: 10.1016/j.tplants.2013.09.008 [DOI] [PubMed] [Google Scholar]
11.Farooq MA, Gao S, Hassan MA, Huang Z, Rasheed A, Hearne S, et al. Artificial intelligence in plant breeding. Trends Genet. 2024;40(10):891–908. doi: 10.1016/j.tig.2024.07.001 [DOI] [PubMed] [Google Scholar]
12.Correa ES. Runoff Potential Index (RPI): 3D modelling of surface-driven hydrological dynamics for drought resilience. Sci Rep. 2026;16(1):4509. doi: 10.1038/s41598-025-34699-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
13.Fatichi S, Pappas C, Ivanov VY. Modeling plant–water interactions: an ecohydrological overview from the cell to the global scale. WIREs Water. 2015;3(3):327–68. doi: 10.1002/wat2.1125 [DOI] [Google Scholar]
14.Wagner GP, Altenberg L. PERSPECTIVE: COMPLEX ADAPTATIONS AND THE EVOLUTION OF EVOLVABILITY. Evolution. 1996;50(3):967–76. doi: 10.1111/j.1558-5646.1996.tb02339.x [DOI] [PubMed] [Google Scholar]
15.Pasley H, Brown H, Holzworth D, Whish J, Bell L, Huth N. How to build a crop model. A review. Agron Sustain Dev. 2022;43(1). doi: 10.1007/s13593-022-00854-9 [DOI] [Google Scholar]
16.Yin X, Struik PC. Modelling the crop: from system dynamics to systems biology. J Exp Bot. 2010;61(8):2171–83. doi: 10.1093/jxb/erp375 [DOI] [PubMed] [Google Scholar]
17.Muller B, Martre P. Plant and crop simulation models: powerful tools to link physiology, genetics, and phenomics. J Exp Bot. 2019;70(9):2339–44. doi: 10.1093/jxb/erz175 [DOI] [PubMed] [Google Scholar]
18.Stöckle CO, Donatelli M, Nelson R. CropSyst, a cropping systems simulation model. European Journal of Agronomy. 2003;18(3–4):289–307. doi: 10.1016/s1161-0301(02)00109-0 [DOI] [Google Scholar]
19.Ritchie J, Singh U, Godwin D, Bowen W. Cereal growth, development and yield. Understanding options for agricultural production. 1998. p. 79–98.
20.Boote KJ, Sau Sau F, Porter CH, Dzotsi K, Tollenaar M, Kumudini SV. Testing of crop models for accurate predictions of evapotranspiration and crop water use. 2014.
21.Amiri E, Rezaei M, Rezaei EE, Bannayan M. Evaluation of Ceres-Rice, Aquacrop and Oryza2000 Models in Simulation of Rice Yield Response to Different Irrigation and Nitrogen Management Strategies. Journal of Plant Nutrition. 2014;37(11):1749–69. doi: 10.1080/01904167.2014.888750 [DOI] [Google Scholar]
22.Donald CM. The breeding of crop ideotypes. Euphytica. 1968;17(3):385–403. doi: 10.1007/bf00056241 [DOI] [Google Scholar]
23.Chardon F, Noël V, Masclaux-Daubresse C. Exploring NUE in crops and in Arabidopsis ideotypes to improve yield and seed quality. J Exp Bot. 2012;63(9):3401–12. doi: 10.1093/jxb/err353 [DOI] [PubMed] [Google Scholar]
24.Redden R. New Approaches for Crop Genetic Adaptation to the Abiotic Stresses Predicted with Climate Change. Agronomy. 2013;3(2):419–32. doi: 10.3390/agronomy3020419 [DOI] [Google Scholar]
25.Correa ES, Calderon FC, Colorado JD. Ml-enhanced mechanistic crop modeling to address noise-induced uncertainty for drought environmental monitoring in rice. Discov Food. 2025;5(1). doi: 10.1007/s44187-025-00611-3 [DOI] [Google Scholar]
26.Tardieu F, et al. Plant phenomics, from sensors to knowledge. Nature Biotechnology. 2017;35:453–62. [DOI] [PubMed] [Google Scholar]
27.Hammer G, Messina C, Wu A, Cooper M. Biological reality and parsimony in crop models—why we need both in crop improvement!. in silico Plants. 2019;1(1). doi: 10.1093/insilicoplants/diz010 [DOI] [Google Scholar]
28.Jubery TZ, Ganapathysubramanian B, Gilbert ME, Attinger D. In silico design of crop ideotypes under a wide range of water availability. Food and Energy Security. 2019;8(3). doi: 10.1002/fes3.167 [DOI] [Google Scholar]
29.Li Z, Ding Q, Zhang W. A Comparative Study of Different Distances for Similarity Estimation. Communications in Computer and Information Science. Springer Berlin Heidelberg. 2011. p. 483–8. doi: 10.1007/978-3-642-18129-0_75 [DOI] [Google Scholar]
30.Kokare M, Chatterji BN, Biswas PK. In: 2003. 571–5.
31.Fan P, Xu J, Wang Z, Liu G, Zhang Z, Tian J, et al. Phenotypic differences in the appearance of soft rice and its endosperm structural basis. Front Plant Sci. 2023;14:1074148. doi: 10.3389/fpls.2023.1074148 [DOI] [PMC free article] [PubMed] [Google Scholar]
32.Liu M, Li H, Su Y, Li W, Shi C. G1/ELE Functions in the Development of Rice Lemmas in Addition to Determining Identities of Empty Glumes. Front Plant Sci. 2016;7:1006. doi: 10.3389/fpls.2016.01006 [DOI] [PMC free article] [PubMed] [Google Scholar]
33.Yang C, Sheng H, Kolbinson KT, Shaterian H, Ashe P, Gao P, et al. A stomata imaging and segmentation pipeline incorporating generative AI to reduce dependency on manual groundtruthing. Plant Methods. 2025;21(1):148. doi: 10.1186/s13007-025-01451-z [DOI] [PMC free article] [PubMed] [Google Scholar]
34.Grigorieva E, Matzarakis A, de Freitas C. Analysis of growing degree-days as a climate impact indicator in a region with extreme annual air temperature amplitude. Clim Res. 2010;42(2):143–54. doi: 10.3354/cr00888 [DOI] [Google Scholar]
35.McMaster G. Growing degree-days: one equation, two interpretations. Agricultural and Forest Meteorology. 1997;87(4):291–300. doi: 10.1016/s0168-1923(97)00027-0 [DOI] [Google Scholar]
36.Fraisse CW, Paula-Moraes SV. Degree-Days: Growing, Heating, and Cooling. EDIS. 2018;2018(2). doi: 10.32473/edis-ae428-2018 [DOI] [Google Scholar]
37.Zhou G, Wang Q. A new nonlinear method for calculating growing degree days. Sci Rep. 2018;8(1):10149. doi: 10.1038/s41598-018-28392-z [DOI] [PMC free article] [PubMed] [Google Scholar]
38.Yan W. Simulation and Prediction of Plant Phenology for Five Crops Based on Photoperiod×Temperature Interaction. Annals of Botany. 1998;81(6):705–16. doi: 10.1006/anbo.1998.0625 [DOI] [Google Scholar]
39.Yang W, Wu T, Zhang X, Song W, Xu C, Sun S, et al. Critical Photoperiod Measurement of Soybean Genotypes in Different Maturity Groups. Crop Science. 2019;59(5):2055–61. doi: 10.2135/cropsci2019.03.0170 [DOI] [Google Scholar]
40.Cao W, Moss DN. Temperature Effect on Leaf Emergence and Phyllochron in Wheat and Barley. Crop Science. 1989;29(4):1018–21. doi: 10.2135/cropsci1989.0011183x002900040038x [DOI] [Google Scholar]
41.Frank AB, Bauer A. Phyllochron differences in Wheat, Barley, and Forage Grasses. Crop Science. 1995;35(1):19–23. doi: 10.2135/cropsci1995.0011183x003500010004x [DOI] [Google Scholar]
42.Gunawardena TA, Fukai S, Blamey FPC. Low temperature induced spikelet sterility in rice. II. Effects of panicle and root temperatures. Australian Journal of Agricultural Research. 2003;54(10):947–56. doi: 10.1071/ar03076 [DOI] [Google Scholar]
43.Zeng Y, Zhang Y, Xiang J, Uphoff NT, Pan X, Zhu D. Effects of low temperature stress on Spikelet-related parameters during anthesis in Indica-Japonica hybrid rice. Front Plant Sci. 2017;8:1350. doi: 10.3389/fpls.2017.01350 [DOI] [PMC free article] [PubMed] [Google Scholar]
44.Shimono H, Okada M, Kanda E, Arakawa I. Low temperature-induced sterility in rice: evidence for the effects of temperature before panicle initiation. Field Crops Research. 2007;101(2):221–31. doi: 10.1016/j.fcr.2006.11.010 [DOI] [Google Scholar]
45.Shimono H, Hasegawa T, Moriyama M, Fujimura S, Nagata T. Modeling spikelet sterility induced by low temperature in rice. Agronomy Journal. 2005;97(6):1524–36. doi: 10.2134/agronj2005.0043 [DOI] [Google Scholar]
46.Xu Y, Chu C, Yao S. The impact of high-temperature stress on rice: challenges and solutions. The Crop Journal. 2021;9(5):963–76. doi: 10.1016/j.cj.2021.02.011 [DOI] [Google Scholar]
47.Satake T, Yoshida S. High Temperature-Induced Sterility in Indica Rices at Flowering. Jpn J Crop Sci. 1978;47(1):6–17. doi: 10.1626/jcs.47.6 [DOI] [Google Scholar]
48.Jagadish SVK, Craufurd PQ, Wheeler TR. High temperature stress and spikelet fertility in rice (Oryza sativa L.). J Exp Bot. 2007;58(7):1627–35. doi: 10.1093/jxb/erm003 [DOI] [PubMed] [Google Scholar]
49.Ren H, Bao J, Gao Z, Sun D, Zheng S, Bai J. How rice adapts to high temperatures. Front Plant Sci. 2023;14:1137923. doi: 10.3389/fpls.2023.1137923 [DOI] [PMC free article] [PubMed] [Google Scholar]
50.Gérardeaux E, Falconnier G, Gozé E, Defrance D, Kouakou P-M, Loison R, et al. Adapting rainfed rice to climate change: a case study in Senegal. Agron Sustain Dev. 2021;41(4). doi: 10.1007/s13593-021-00710-2 [DOI] [Google Scholar]
51.Hengl T, Mendes de Jesus J, Heuvelink GBM, Ruiperez Gonzalez M, Kilibarda M, Blagotić A. SoilGrids250m: global gridded soil information based on machine learning. 2017. https://soilgrids.org [DOI] [PMC free article] [PubMed]
52.Romero CC, Hoogenboom G, Baigorria GA, Koo J, Gijsman AJ, Wood S. Reanalysis of a global soil database for crop and environmental modeling. Environmental Modelling & Software. 2012;35:163–70. doi: 10.1016/j.envsoft.2012.02.018 [DOI] [Google Scholar]
53.Alaya I, Masmoudi MM, Lagacherie Ph, Coulouma G, Jacob F, Ben Mechlia N. Performance of Saxton and Rawls pedotransfer functions for estimating soil water properties in the Cap Bon Region-Northern Tunisia. Water and land security in drylands. Springer; 2017. p. 77–85. doi: 10.1007/978-3-319-54021-4_8 [DOI] [Google Scholar]
54.Sung CTB, Iba J. Accuracy of the Saxton-Rawls method for estimating the soil water characteristics for mineral soils of Malaysia. Pertanika J Trop Agric Sci. 2010;33(2):297–302. [Google Scholar]
55.Rawls WJ, Brakensiek DL, Saxton K. Estimation of soil water properties. Transactions of the ASAE. 1982;25(5):1316–20. [Google Scholar]
56.Hodges R, et al. NASA POWER (Prediction Of Worldwide Energy Resources) Data. 2017.
57.Jones JW, Hoogenboom G, Porter CH, Boote KJ, Batchelor WD, Hunt LA, et al. The DSSAT cropping system model. European Journal of Agronomy. 2003;18(3–4):235–65. doi: 10.1016/s1161-0301(02)00107-7 [DOI] [Google Scholar]
58.Hoogenboom G, Porter CH, Boote KJ, Shelia V, Wilkens PW, Singh U, et al. The DSSAT crop modeling ecosystem. Burleigh Dodds Series in Agricultural Science. Burleigh Dodds Science Publishing. 2019. p. 173–216. doi: 10.19103/as.2019.0061.10 [DOI] [Google Scholar]
59.MathWorks Inc. MATLAB version 24.2.0.2712019 (R2024b). 2024.
60.R Core Team. R: A Language and Environment for Statistical Computing. 2024.
61.Morris MD. Factorial Sampling Plans for Preliminary Computational Experiments. Technometrics. 1991;33(2):161–74. doi: 10.1080/00401706.1991.10484804 [DOI] [Google Scholar]
62.Pang Z, O’Neill Z, Li Y, Niu F. The role of sensitivity analysis in the building performance analysis: A critical review. Energy and Buildings. 2020;209:109659. doi: 10.1016/j.enbuild.2019.109659 [DOI] [Google Scholar]
63.Fukai S, Mitchell J. Factors determining water use efficiency in aerobic rice. Crop and Environment. 2022;1(1):24–40. doi: 10.1016/j.crope.2022.03.008 [DOI] [Google Scholar]

PLoS One. doi: 10.1371/journal.pone.0343530.r001

Decision Letter 0

Paulo Eduardo Teodoro

8 Sep 2025

Dear Dr. Correa,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Oct 23 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org . When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols . Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols .

We look forward to receiving your revised manuscript.

Kind regards,

Paulo Eduardo Teodoro, Dr.

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. In your Methods section, please provide additional information regarding the permits you obtained for the work. Please ensure you have included the full name of the authority that approved the field site access and, if no permits were required, a brief statement explaining why.

3. Please update your submission to use the PLOS LaTeX template. The template and more information on our requirements for LaTeX submissions can be found at http://journals.plos.org/plosone/s/latex.

4. We note that the grant information you provided in the ‘Funding Information’ and ‘Financial Disclosure’ sections does not match. When you resubmit, please ensure that you provide the correct grant numbers for the awards you received for your study in the ‘Funding Information’ section.

5. Please know it is PLOS ONE policy for corresponding authors to declare, on behalf of all authors, all potential competing interests for the purposes of transparency. PLOS defines a competing interest as anything that interferes with, or could reasonably be perceived as interfering with, the full and objective presentation, peer review, editorial decision-making, or publication of research or non-research articles submitted to one of the journals. Competing interests can be financial or non-financial, professional, or personal. Competing interests can arise in relationship to an organization or another person. Please follow this link to our website for more details on competing interests: http://journals.plos.org/plosone/s/competing-interests

6. Thank you for uploading your study's underlying data set. Unfortunately, the repository you have noted in your Data Availability statement does not qualify as an acceptable data repository according to PLOS's standards. At this time, please upload the minimal data set necessary to replicate your study's findings to a stable, public repository (such as figshare or Dryad) and provide us with the relevant URLs, DOIs, or accession numbers that may be used to access these data. For a list of recommended repositories and additional information on PLOS standards for data deposition, please see https://journals.plos.org/plosone/s/recommended-repositories.

7. We note that Figure 1 in your submission contains copyrighted images. All PLOS content is published under the Creative Commons Attribution License (CC BY 4.0), which means that the manuscript, images, and Supporting Information files will be freely available online, and any third party is permitted to access, download, copy, distribute, and use these materials in any way, even commercially, with proper attribution. For more information, see our copyright guidelines: http://journals.plos.org/plosone/s/licenses-and-copyright.

We require you to either present written permission from the copyright holder to publish these figures specifically under the CC BY 4.0 license, or remove the figures from your submission:

a. You may seek permission from the original copyright holder of Figure 1 to publish the content specifically under the CC BY 4.0 license. We recommend that you contact the original copyright holder with the Content Permission Form (http://journals.plos.org/plosone/s/file?id=7c09/content-permission-form.pdf) and the following text: “I request permission for the open-access journal PLOS ONE to publish XXX under the Creative Commons Attribution License (CCAL) CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). Please be aware that this license allows unrestricted use and distribution, even commercially, by third parties. Please reply and provide explicit written permission to publish XXX under a CC BY license and complete the attached form.” Please upload the completed Content Permission Form or other proof of granted permissions as an "Other" file with your submission. In the figure caption of the copyrighted figure, please include the following text: “Reprinted from [ref] under a CC BY license, with permission from [name of publisher], original copyright [original copyright year].”

b. If you are unable to obtain permission from the original copyright holder to publish these figures under the CC BY 4.0 license or if the copyright holder’s requirements are incompatible with the CC BY 4.0 license, please either i) remove the figure or ii) supply a replacement figure that complies with the CC BY 4.0 license. Please check copyright information on all replacement figures and update the figure caption with source information. If applicable, please specify in the figure caption text when a figure is similar but not identical to the original image and is therefore for illustrative purposes only.

8. We note that Figure 2 in your submission contains map images which may be copyrighted. All PLOS content is published under the Creative Commons Attribution License (CC BY 4.0), which means that the manuscript, images, and Supporting Information files will be freely available online, and any third party is permitted to access, download, copy, distribute, and use these materials in any way, even commercially, with proper attribution. For these reasons, we cannot publish previously copyrighted maps or satellite images created using proprietary data, such as Google software (Google Maps, Street View, and Earth). For more information, see our copyright guidelines: http://journals.plos.org/plosone/s/licenses-and-copyright.

We require you to either present written permission from the copyright holder to publish these figures specifically under the CC BY 4.0 license, or remove the figures from your submission:

a. You may seek permission from the original copyright holder of Figure 2 to publish the content specifically under the CC BY 4.0 license. We recommend that you contact the original copyright holder with the Content Permission Form (http://journals.plos.org/plosone/s/file?id=7c09/content-permission-form.pdf) and the following text: “I request permission for the open-access journal PLOS ONE to publish XXX under the Creative Commons Attribution License (CCAL) CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). Please be aware that this license allows unrestricted use and distribution, even commercially, by third parties. Please reply and provide explicit written permission to publish XXX under a CC BY license and complete the attached form.” Please upload the completed Content Permission Form or other proof of granted permissions as an "Other" file with your submission. In the figure caption of the copyrighted figure, please include the following text: “Reprinted from [ref] under a CC BY license, with permission from [name of publisher], original copyright [original copyright year].”

b. If you are unable to obtain permission from the original copyright holder to publish these figures under the CC BY 4.0 license or if the copyright holder’s requirements are incompatible with the CC BY 4.0 license, please either i) remove the figure or ii) supply a replacement figure that complies with the CC BY 4.0 license. Please check copyright information on all replacement figures and update the figure caption with source information. If applicable, please specify in the figure caption text when a figure is similar but not identical to the original image and is therefore for illustrative purposes only.

The following resources for replacing copyrighted map figures may be helpful:

USGS National Map Viewer (public domain): http://viewer.nationalmap.gov/viewer/
The Gateway to Astronaut Photography of Earth (public domain): http://eol.jsc.nasa.gov/sseop/clickmap/
Maps at the CIA (public domain): https://www.cia.gov/library/publications/the-world-factbook/index.html and https://www.cia.gov/library/publications/cia-maps-publications/index.html
NASA Earth Observatory (public domain): http://earthobservatory.nasa.gov/
Landsat: http://landsat.visibleearth.nasa.gov/
USGS EROS (Earth Resources Observatory and Science (EROS) Center) (public domain): http://eros.usgs.gov/#
Natural Earth (public domain): http://www.naturalearthdata.com/

9. If the reviewer comments include a recommendation to cite specific previously published works, please review and evaluate these publications to determine whether they are relevant and should be cited. There is no requirement to cite these works unless the editor has indicated otherwise.

Reviewers' comments:

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: No

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #1: No

Reviewer #2: Yes

Reviewer #3: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: No

**********

Reviewer #1: Review Report

Title: Mechanistic crop modelling and AI for ideotype optimization: Crop-scale advances to enhance yield and water use efficiency

The manuscript presents a novel integration of mechanistic crop modeling (CERES-Rice) and AI (Genetic Algorithm) for ideotype optimization. It is methodologically sound, well-organized, and offers valuable insights for precision agriculture.

Major Comments with Line References

1. Line 51–56: Justify the selection of 1,884 virtual cultivars and 5,692 runs; clarify stopping criteria and convergence.

2. Lines 208–215: Genetic Algorithm parameters (population size = 15, generations = 20) seem low; justify or consider alternatives.

3. Algorithm 2: No mention of overfitting control; suggest cross-validation or bootstrapping.

4. Line 52 & 417: Number of environments (n=4) may limit generalizability; validate across more diverse settings.

5. Fig. 6 & Lines 327–366: Add confidence intervals or significance testing to support sensitivity rankings.

6. Fig. 7 & 8: Report statistical significance of correlation results.

7. Line 423–426: Promising cultivar claims based on similarity metrics alone; field validation needed.

8. Line 457: Present PCA variance (e.g., 86.05%) in tabular form.

9. Lines 1–60 (Abstract): Too dense; trim to emphasize key findings and novelty.

10. Lines 545–560 (Conclusion): Avoid repetition; stress broader implications for breeding.

11. Line 376–393: Break long paragraphs for better readability.

12. Lines 20–40: Include foundational references (e.g., Donald’s ideotype concept).

13. Line 68–87: Avoid excessive reuse of [32]; clarify citation purpose.

14. Line ~500: Ensure ethics statement is clearly included in manuscript.

15. Line 536–541: Briefly discuss potential future ethical considerations.

General Notes

16. Ensure all figures (e.g., 7 & 8) are fully labeled with units.

17. Include summary statistics (e.g., mean ± SD) in figure captions.

18. Add version numbers of DSSAT, Python packages, etc.

19. Include a reproducibility checklist or flowchart as supplementary material.

Final Recommendation: Major Revision

The study is innovative and promising but requires additional methodological justification, statistical clarity, and formatting refinements before acceptance.

Reviewer #2: Dear author,

On the whole, I enjoyed reading your paper, and the detailed modelling and statistical approach taken. I do however feel beyond the results there is not enough insight (yet) in the discussion, and would like you to expand that a little bit, and make the paper less descriptive. See my comments below.

Methods:

So regarding the AI model, basically the starting population is 166? (Delta 1 to 6 and 11 parameters?).

But how does the AI know the WUE and HI? Does the input data include the model output values of WUE and HI that correspond to each of the Fig 4B combinations?

Maybe clarify the above a little bit please.

Line 247: the fitness factor here is HI-WUE?

Why are 5 methods of dissimilarity testing needed? You also mention “virtual cultivars”. I thought only one virtual cultivar, the best from the Genetic Algorithm output, is compared to the 21 “real” cultivars? I feel some extra clarification is required here too.

Results:

Fig 6: the RSI axis label is sometimes superimposed on the axis label.

Maybe it is possible to make the Results shorter and incorporate some conclusion statements after each section? For example, at the end of the Results, what characterizes the sum of the changes required in the cultivars? Later flowering and panicle initiation etc., in simpler words.

Discussion:

Line 499: Why would lower humidity be better for photosynthesis?

Line 511: Why was the ideotype better at conserving water, do you think?

So far Discussion is just rehashing results.

Line 534-535: When you say “critical for aligning crop development with environmental conditions”, what do you mean exactly? How does the alignment help? Please expand on this and explain the mechanisms. There is very little physiology in the discussion so far.

Line 548: Phenological parameters being influential is not a groundbreaking finding, but you need to incorporate this knowledge with the environment and use it to explain physiological responses that are specific to environment or genotype.

I also think you need to discuss, a bit, WUE itself and how it can be counterproductive, and how maybe incorporating with HI can be beneficial.

Finally, I understand the focus of this paper is methodological but as shown by my comments above you need a more physiologically grounded story in your conclusion and discussion so we know how your modelling approach contributed to our understanding of what’s required for growth in the region of your study. For example, what other traits, other than the ones you tested, do you think can be influential to improve crops in your region? Use your modelling and the significant parameters to ruminate on that and incorporate a more physiological understanding specific to your environment.

Reviewer #3: My overall recommendation: Major Revision.

1. Is the manuscript technically sound, and do the data support the conclusions?

Partly. Technically promising, but key methodological gaps (incomplete Morris design specifics, no multi-seed GA convergence evidence, ad-hoc HI+0.1·WUE objective without robustness/Pareto analysis, and a problematic similarity normalization) mean the current analyses don’t fully support the strength of the conclusions.

2. Has the statistical analysis been performed appropriately and rigorously?

No.

Why:

• Morris sensitivity analysis is under-specified (no numeric $k, p, \Delta, r$; no design matrix or distribution of elementary effects), so stability/precision can’t be assessed.

• GA optimization lacks multi-seed replication and convergence diagnostics; a stochastic optimizer needs replicate runs with summary stats (mean±SD of best fitness, variability of $\theta^*$).

• The objective $J = \text{HI} + 0.1 \times \text{WUE}$ uses an ad-hoc weight without scale normalization or sensitivity/Pareto analysis, so results may hinge on the arbitrary $\alpha$.

• The similarity analysis averages heterogeneous distance metrics without standardization and uses a non-monotone normalization (higher similarity → smaller score), with no robustness checks on rankings.

• Uncertainty reporting is limited (few CIs/SDs across sites/years or GA seeds), so inferential strength is unclear.

These gaps mean the statistical/analytical treatment isn’t yet rigorous enough to support the strength of the conclusions.
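
To make the Morris design quantities named above concrete, the following is a minimal Python sketch of how elementary effects could be computed and summarized for given $k$, $p$, $\Delta$ and $r$. It is an illustration only: `toy_model` is a hypothetical stand-in (not CERES-Rice), parameters are assumed to be scaled to [0, 1], and the choice $\Delta = p / (2(p-1))$ is one common convention rather than the manuscript's setting.

```python
import random
from statistics import mean, stdev


def toy_model(x):
    # Hypothetical stand-in for one crop-model run (NOT CERES-Rice).
    return 2.0 * x[0] + x[1] ** 2 + 0.1 * x[2]


def morris_elementary_effects(model, k, p, r, seed=0):
    """Collect elementary effects for k parameters scaled to [0, 1].

    p: number of grid levels, r: number of one-at-a-time trajectories.
    """
    rng = random.Random(seed)
    delta = p / (2.0 * (p - 1))  # assumed step size, common for even p
    # Base levels from which a +delta step stays inside [0, 1].
    levels = [i / (p - 1) for i in range(p) if i / (p - 1) + delta <= 1.0 + 1e-12]
    effects = [[] for _ in range(k)]
    for _ in range(r):
        x = [rng.choice(levels) for _ in range(k)]
        y = model(x)
        # Perturb each parameter once, in random order, along the trajectory.
        for i in rng.sample(range(k), k):
            x_new = list(x)
            x_new[i] += delta
            y_new = model(x_new)
            effects[i].append((y_new - y) / delta)
            x, y = x_new, y_new
    return effects


def summarize(effects):
    """Return (mu_star, sigma) per parameter: mean |EE| and SD of EE."""
    return [(mean([abs(e) for e in ee]), stdev(ee)) for ee in effects]


if __name__ == "__main__":
    ee = morris_elementary_effects(toy_model, k=3, p=4, r=10, seed=1)
    for i, (mu_star, sigma) in enumerate(summarize(ee)):
        print(f"parameter {i}: mu*={mu_star:.3f}, sigma={sigma:.3f}")
```

Reporting the numeric $k$, $p$, $\Delta$, $r$ together with the per-parameter distribution of elementary effects is what would allow a reader to judge the stability of the resulting ranking.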

3. Have the authors made all data underlying the findings in their manuscript fully available?

Yes

The manuscript’s Data Availability statement points to public repositories (Mendeley Data for datasets and GitHub for code) with no access restrictions, which satisfies PLOS ONE’s policy. As a best-practice enhancement (not a blocker), I recommend archiving a tagged code release with a DOI (e.g., Zenodo) and including per-figure CSVs of the data underlying plotted summaries.

4. Is the manuscript presented in an intelligible fashion and written in standard English?

No.

The manuscript is generally understandable, but it does not yet meet PLOS ONE’s “clear, correct, unambiguous” English standard. Representative issues that should be corrected at revision:

• Grammar/articles/verb agreement: “a individual point” → an individual point; “the sensitivity analysis assess …” → assesses.

• Typos/word choice: “rainfeed” → rainfed.

• Figure captions/labels: duplicated panel label in one figure (“f) Anthesis” should be Maturity); missing spaces around symbols (e.g., “0.50±0.07” → 0.50 ± 0.07).

• Units & style: non-standard units and casing (e.g., Biomass(Kg/H), #/H)—use kg ha⁻¹ and no. ha⁻¹; WUE shown as kg/mm should be kg ha⁻¹ mm⁻¹; LAI listed as mm²/mm² should be m² m⁻² (or “–”).

• Notation consistency: parameter code P2O occasionally appears as P20; acronym in a section header shows AG instead of GA.

• Equation formatting/clarity: the “normalized similarity” formula is line-broken and currently maps higher similarity to smaller values—both the typesetting and the monotonicity should be corrected.

With a focused language/units pass and caption clean-up, the paper can reach the required standard.
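
The monotonicity point raised above can be illustrated with a short Python sketch in which the smallest distance always maps to the highest similarity score, and each metric is rescaled before averaging. This is a sketch under stated assumptions, not the manuscript's formula: the function names are hypothetical, trait vectors are assumed to be non-zero and already standardized, and min-max rescaling per metric is one possible choice.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a, b):
    # Assumes neither vector is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def to_similarity(distances):
    """Min-max rescale so the smallest distance gets 1 and the largest gets 0."""
    lo, hi = min(distances), max(distances)
    if hi == lo:
        return [1.0 for _ in distances]
    return [1.0 - (d - lo) / (hi - lo) for d in distances]


def combined_similarity(target, candidates):
    """Average per-metric similarities; each metric is rescaled before averaging."""
    per_metric = []
    for metric in (euclidean, manhattan, cosine_distance):
        dists = [metric(target, c) for c in candidates]
        per_metric.append(to_similarity(dists))
    return [sum(col) / len(col) for col in zip(*per_metric)]


if __name__ == "__main__":
    ideotype = [1.0, 2.0, 3.0]                  # hypothetical trait vector
    cultivars = [[1.0, 2.0, 3.0], [2.0, 2.0, 3.0], [3.0, 1.0, 0.5]]
    print(combined_similarity(ideotype, cultivars))
```

Because every metric is mapped to the same [0, 1] scale with the same orientation before averaging, a higher combined score consistently means a closer match.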

Please, see attached Letter to the Author and Letter to the Editor as PDF attachment files.

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy

Reviewer #1: Yes: Dr. Shahzad Akhtar

Reviewer #2: No

Reviewer #3: Yes: Ronald Maldonado Rodriguez

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/ . PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org . Please note that Supporting Information files do not need this step.

Attachment

Submitted filename: Review Report.docx

pone.0343530.s005.docx (15.7 KB, docx)

Attachment

Submitted filename: Letter to the Author - PONE-D-25-26754.pdf

pone.0343530.s006.pdf (781.7 KB, pdf)

Attachment

Submitted filename: Letter to the Editor - PONE-D-25-26754.pdf

pone.0343530.s007.pdf (881.8 KB, pdf)

PLoS One. 2026 Mar 11;21(3):e0343530. doi: 10.1371/journal.pone.0343530.r002

Author response to Decision Letter 1

21 Jan 2026

Dear Editor and Reviewers,

I sincerely thank the reviewers for their thorough evaluation and constructive feedback. Their insightful comments have significantly improved the clarity, rigor, and presentation of this manuscript. I have carefully addressed each point raised, incorporating substantial revisions to the methodology description, statistical analysis, and figure quality.

A comprehensive point-by-point response document is attached as "Response to Reviewers" file, detailing all modifications with specific line references. All changes are highlighted in green throughout the revised manuscript.

SUMMARY OF MAJOR REVISIONS:

Reviewer 1 - Sensitivity Analysis:

- Added 20 replications with 95% confidence intervals (mean CI width = 0.04)

- Included Table S1 with complete RSI statistics

- Enhanced Figure 6 with diagrams showing uncertainty bounds
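
As an illustration of the replication-based uncertainty reporting listed above, the following Python sketch computes a mean and a 95% confidence interval for one sensitivity index across 20 replications. It assumes a Student t interval (critical value 2.093 for 19 degrees of freedom); the procedure actually used in the revision is not stated in this letter, and the sample values are synthetic.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 95% Student t critical value for df = 19 (n = 20 replications).
T_CRIT_DF19 = 2.093


def mean_ci95(values, t_crit=T_CRIT_DF19):
    """Return (mean, lower, upper) of a t-based 95% CI.

    The default t_crit is only valid for n = 20; pass another value otherwise.
    """
    n = len(values)
    m = mean(values)
    half_width = t_crit * stdev(values) / sqrt(n)
    return m, m - half_width, m + half_width


if __name__ == "__main__":
    # Synthetic index values for one parameter over 20 replications.
    rsi = [0.50 + 0.01 * ((i % 5) - 2) for i in range(20)]
    m, lo, hi = mean_ci95(rsi)
    print(f"mean={m:.3f}, 95% CI=[{lo:.3f}, {hi:.3f}], width={hi - lo:.3f}")
```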

Reviewer 2 - Fitness Function:

- Clarified WUE normalization using literature-based physiological bounds (2-15 kg/ha/mm)

- Added Equation 4 with explicit normalization formula

- Justified equal weighting (50%-50%) for HI and WUE components
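
The normalization and weighting summarized above might look like the following Python sketch. Equation 4 is not reproduced in this letter, so the min-max form, the clipping to [0, 1], and the function names are assumptions; only the bounds (2 and 15 kg/ha/mm) and the equal weights come from the summary.

```python
WUE_MIN = 2.0    # kg/ha/mm, lower physiological bound quoted in the response
WUE_MAX = 15.0   # kg/ha/mm, upper physiological bound quoted in the response


def normalize_wue(wue, lo=WUE_MIN, hi=WUE_MAX):
    """Min-max scale WUE to [0, 1]; clipping outside the bounds is an assumption."""
    scaled = (wue - lo) / (hi - lo)
    return min(1.0, max(0.0, scaled))


def fitness(harvest_index, wue, w_hi=0.5, w_wue=0.5):
    """Equal-weight combination of HI (already in [0, 1]) and normalized WUE."""
    return w_hi * harvest_index + w_wue * normalize_wue(wue)


if __name__ == "__main__":
    # Hypothetical candidate: HI = 0.45, WUE = 9.0 kg/ha/mm.
    print(fitness(0.45, 9.0))
```

Putting both terms on a [0, 1] scale before weighting means the 50%-50% split reflects the intended balance rather than the raw units of WUE.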

Reviewer 3 - Similarity Analysis:

- Methodology with three distance metrics (Euclidean, Manhattan, Cosine)

- Quantified genetic gap (22-30%) with breeding implications

- Added Figure 9 combining PCA and top cultivar matches

Journal Requirements (4.1-4.9):

- Data repository migrated to Zenodo (DOI: 10.5281/zenodo.18094654)

- Figure copyright permissions documented (CC BY sources cited)

- Competing interests and funding statements updated

- All supplementary materials formatted per PLOS guidelines

Best regards,

Edgar S. Correa

Attachment

Submitted filename: 1.Response to Reviewers 29_12_2025.pdf

pone.0343530.s008.pdf (496.2 KB, pdf)

PLoS One. doi: 10.1371/journal.pone.0343530.r003

Decision Letter 1

Paulo Eduardo Teodoro

4 Feb 2026

Dear Dr. Correa,

Please submit your revised manuscript by Mar 21 2026 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

• A letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

• A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

• An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

We look forward to receiving your revised manuscript.

Kind regards,

Paulo Eduardo Teodoro, Dr.

Academic Editor

PLOS One

Journal Requirements:

If the reviewer comments include a recommendation to cite specific previously published works, please review and evaluate these publications to determine whether they are relevant and should be cited. There is no requirement to cite these works unless the editor has indicated otherwise.

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

Reviewer #3: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Partly

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: No

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: No

**********

Reviewer #1: Dear Editor,

I have reviewed the revised manuscript and note a clear improvement in overall quality and presentation. The major issues have been satisfactorily addressed. A few minor points remain, specifically related to editorial polishing, clarification of ideotype interpretation versus direct breedability, and minor figure and terminology consistency, which the authors may incorporate.

From my side, the manuscript is acceptable. I leave the incorporation and verification of these minor points, as well as the final acceptance decision, to the Editor. If there are additional reviewer comments, the manuscript may be accepted after their completion.

Kind regards,

Reviewer #2: Thanks for your very thorough response. I think you have quite a nice little paper now with sufficient methodological and mechanistic novelty.

Reviewer #3: Dear Author,

Thank you for submitting the revised manuscript (dated 16 January 2026, 27 pages). The revision is moving in the right direction, and the inclusion of funding, competing interests, and a stable public archive strengthens compliance. However, I recommend publication only after the corrections below are completed, as several items still limit reproducibility and full compliance with PLOS ONE formatting requirements.

A) Required corrections (major — reproducibility and methodological rigor)

1) Resolve the GA generation count contradiction (reproducibility blocker)

The manuscript currently reports two different values for the GA run length:

• “generations, set to 20” [Page: 10, Line: 237–239]

• “explored … across 40 generations” [Page: 13, Line: 340–342]

• “across 40 generations” repeated in the Conclusion [Page: 20, Line: 546–547]

Action required: Please reconcile the true number of generations (20 vs 40) and ensure consistency across Methods, Results, Conclusion, figure captions, supplementary files, and the archived scripts/configs.

2) Provide the explicit objective function (HI–WUE) as a reproducible equation

The manuscript states that HI and WUE are integrated into the fitness function [Page: 9, Line: 223–224], and refers to “HI–WUE” as a composite metric [Page: 13, Line: 342–343], but the exact objective equation (including units, scaling/normalization, and weighting) is not clearly provided in the manuscript text.

Action required: Add the explicit objective function equation in the Methods and define:

• how HI and WUE are computed (with units),

• any normalization bounds (if used),

• the weights used (if claiming equal contribution, show it explicitly in the equation),

• and a brief justification for the chosen formulation.

3) Add multi-seed GA replication and uncertainty reporting (stochastic robustness)

The manuscript includes a convergence statement (“Convergence analysis revealed…”) [Page: 15, Line: 369–370], but does not report:

• the number of independent GA runs (random seeds),

• variability of best fitness across seeds (mean ± SD, or median/IQR),

• or stability/variability of the optimized parameter vector(s) across seeds.

Action required: Because the GA is stochastic, please run multiple seeds per environment and report summary statistics of fitness and parameter stability. Single-run convergence descriptions are insufficient to support robust conclusions.
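The summary statistics requested here (mean ± SD, median/IQR, and parameter stability across seeds) can be computed with the Python standard library. The sketch below is illustrative only: the fitness values, the coefficient values, and the function names are hypothetical, and it assumes one best-fitness value and one optimized coefficient vector are retained per random seed.

```python
import statistics


def summarize_fitness(best_fitness_per_seed):
    """Mean, SD, median, and IQR of the best fitness reached by each independent run."""
    values = list(best_fitness_per_seed)
    if len(values) < 2:
        raise ValueError("need at least two independent runs")
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    return {
        "n_seeds": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
        "median": statistics.median(values),
        "iqr": q3 - q1,
    }


def parameter_spread(best_vectors):
    """Per-coefficient sample SD of the optimized values across seeds."""
    names = list(best_vectors[0].keys())
    return {
        name: statistics.stdev([run[name] for run in best_vectors])
        for name in names
    }


# Hypothetical results from five seeds in one environment
fitness = [0.61, 0.63, 0.60, 0.62, 0.64]
vectors = [
    {"P1": 500.0, "G1": 55.0},
    {"P1": 510.0, "G1": 55.0},
    {"P1": 495.0, "G1": 56.0},
    {"P1": 505.0, "G1": 54.0},
    {"P1": 500.0, "G1": 55.0},
]
print(summarize_fitness(fitness))
print(parameter_spread(vectors))
```

A small fitness SD combined with a large spread in some coefficients would suggest that several distinct coefficient vectors reach similar fitness, which matters when the optimized vector is interpreted as a breeding target.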

4) Complete Morris design specification for full reproducibility

The Methods describe the Morris “step size Δ” conceptually [Page: 8, Line: 191–193] and mention 20 replications [Page: 8, Line: 200–202], but the Morris design should be fully specified in the Methods (e.g., k, p, numeric Δ, r, and how ranges were chosen) rather than relying on figure references.

Action required: Add one concise Methods sentence explicitly stating k, p, numeric Δ, and r, and how parameter ranges were defined.
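To make the four design quantities concrete, the sketch below shows how k (number of inputs), p (grid levels), Δ (step), and r (trajectories) determine a Morris screening run. It is a simplified illustration under stated assumptions, not the manuscript's implementation: Δ = p / (2(p − 1)) is the choice commonly recommended for even p, inputs are assumed scaled to the unit interval, steps are taken in the increasing direction only, and the toy model stands in for CERES-Rice. The values k = 11, p = 4, and r = 20 in the run-count example are hypothetical placeholders.

```python
import random


def morris_delta(p):
    """Step size commonly used with an even number of grid levels p."""
    return p / (2.0 * (p - 1))


def n_model_runs(k, r):
    """Each of the r trajectories needs k + 1 model evaluations."""
    return r * (k + 1)


def elementary_effects(model, k, p, r, seed=0):
    """Return, for each input, the list of elementary effects from r trajectories."""
    rng = random.Random(seed)
    delta = morris_delta(p)
    # Base levels from which a +delta step stays inside [0, 1]
    base_levels = [i / (p - 1) for i in range(p) if i / (p - 1) + delta <= 1.0 + 1e-12]
    effects = [[] for _ in range(k)]
    for _ in range(r):
        x = [rng.choice(base_levels) for _ in range(k)]
        y = model(x)
        for i in rng.sample(range(k), k):  # perturb one input at a time, in random order
            x_next = list(x)
            x_next[i] += delta
            y_next = model(x_next)
            effects[i].append((y_next - y) / delta)
            x, y = x_next, y_next
    return effects


def mu_star(effects):
    """Mean absolute elementary effect per input (overall influence)."""
    return [sum(abs(e) for e in ee) / len(ee) for ee in effects]


def toy_model(x):
    """Hypothetical stand-in for a crop model output."""
    return 2.0 * x[0] - x[2]


print(morris_delta(4), n_model_runs(k=11, r=20))
print(mu_star(elementary_effects(toy_model, k=3, p=4, r=5)))
```

Stating k, p, Δ, and r in one sentence fixes the total simulation budget, r(k + 1), which is what a reader needs to rerun the screening.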

5) Fix parameter notation inconsistency (P2O vs P20)

The manuscript’s parameter vector lists “P20” [Page: 10, Line: 244–245]. Ensure this notation is consistent across the manuscript, figures/tables, and code, and matches the intended CERES-Rice parameter name.

Action required: Standardize the parameter label everywhere (and verify that the repository code uses the same naming).

6) Clarify/standardize WUE units

The Results report WUE in the form “WUE = 6.17 kg/mm” [Page: 16, Line: 385–387], while yield is expressed as kg/ha in the same section [Page: 16, Line: 385–386].

Action required: Express WUE in a standard agronomic form (commonly kg ha⁻¹ mm⁻¹) or explicitly define what “kg/mm” means (area basis, computation method, and conversion).
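The unit question can be illustrated with a short calculation. The sketch assumes WUE is defined as grain yield per unit area divided by a water depth; which water term is used (rainfall, evapotranspiration, or another quantity) is not stated in this report and is left as an input. The function names and example values are hypothetical.

```python
def wue_kg_per_ha_mm(yield_kg_ha, water_mm):
    """WUE in kg ha^-1 mm^-1: yield per hectare divided by water depth in mm."""
    if water_mm <= 0:
        raise ValueError("water depth must be positive")
    return yield_kg_ha / water_mm


def to_kg_per_m3(wue_kg_ha_mm):
    """Convert kg ha^-1 mm^-1 to kg m^-3 (1 mm of water over 1 ha is 10 m^3)."""
    return wue_kg_ha_mm / 10.0


# Hypothetical season: 5000 kg/ha grain yield and 500 mm of water
wue = wue_kg_per_ha_mm(5000.0, 500.0)
print(wue, to_kg_per_m3(wue))
```

Writing the area basis explicitly removes the ambiguity of "kg/mm", because the same number means something different if the yield refers to a plot, a hectare, or a single plant.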

B) Required corrections (PLOS ONE formatting / style)

7) Convert in-text citations to the required square-bracket numeric format

PLOS ONE requires numeric citations in square brackets in the text. The manuscript currently uses parentheses for numeric citations, e.g., “… what they seek (22).” [Page: 3, Line: 38–40]

Action required: Adjust LaTeX citation settings so the manuscript uses [22] style consistently and verify citation order/numbering throughout.

8) References section: ensure strict PLOS formatting and completeness

The reference list begins at the end of the line-numbered portion [Page: 23, Line: 653] and continues through the final pages [Page: 23–27, Line: n/a] (note: the manuscript does not display line numbers on these pages).

Action required: Please ensure:

• reference entries are complete (authors, title, journal, year, volume, pages),

• DOI/URL formatting is consistent and correct,

• journal names/abbreviations are consistent (including correct capitalization),

• and the bibliography style matches PLOS ONE requirements.

C) Figures / maps licensing and caption transparency (PLOS compliance)

9) Map/base-layer provenance must be explicit for CC BY compatibility

A figure caption states content was “adapted … under CC BY” and notes it was generated using MATLAB Mapping Toolbox [Page: 7, Line: n/a]. For PLOS CC BY 4.0 compliance, authors must identify the provenance/licensing of all map layers and base data used.

Action required: For any maps or adapted figures, update the captions to explicitly state the base map/data sources and their licenses, confirming they are compatible with PLOS’s CC BY publication license.

Closing

Once these corrections are implemented, the manuscript will be much stronger and can be reconsidered for publication. The most urgent items are the GA 20 vs 40 generations inconsistency [Page: 10, Line: 237–239; Page: 13, Line: 340–342; Page: 20, Line: 546–547], the explicit objective function definition [Page: 9, Line: 223–224; Page: 13, Line: 342–343], and the multi-seed GA robustness reporting [Page: 15, Line: 369–370].

Sincerely,

Reviewer

**********

If the peer review history is published, it will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Dr Shahzad Akhtar

Reviewer #2: No

Reviewer #3: No

**********

To ensure your figures meet our technical requirements, please review our figure guidelines: https://journals.plos.org/plosone/s/figures

You may also use PLOS’s free figure tool, NAAS, to help you prepare publication quality figures: https://journals.plos.org/plosone/s/figures#loc-tools-for-figure-preparation.

NAAS will assess whether your figures meet our technical requirements by comparing each figure against our figure specifications.

PLoS One. 2026 Mar 11;21(3):e0343530. doi: 10.1371/journal.pone.0343530.r004

Author response to Decision Letter 2

5 Feb 2026

Dear Editor and Reviewers,

Thank you for your careful review and constructive suggestions. The manuscript has been revised to address all comments thoroughly. Key updates include:

• Clarification of methodological details, including consistent reporting of 40 GA generations and an explicit formulation of the HI–WUE objective function.

• Enhanced reproducibility and robustness description, including specification of the Morris sensitivity analysis implementation and correction of parameter notation (P2O).

• Standardization of units and formatting, such as expressing water use efficiency as kg (ha·mm)⁻¹ and updating in-text numeric citations and the reference list to conform with PLOS ONE’s style.

• Improved figure captions to explicitly indicate data provenance and licensing for map and environmental figures, ensuring CC BY 4.0 compliance.

• Verification and adjustment of references and bibliography style in accordance with PLOS ONE requirements.

All revisions are reflected in the manuscript and documented in the detailed point-by-point response. I appreciate the reviewers’ time and feedback, which have strengthened the clarity and presentation of the research.

Sincerely,

Edgar S. Correa

Attachment

Submitted filename: 1. Response to Reviewers 04_02_2026.pdf

pone.0343530.s009.pdf^{(183.2KB, pdf)}

PLoS One. doi: 10.1371/journal.pone.0343530.r005

Decision Letter 2

Paulo Eduardo Teodoro

8 Feb 2026

Decoding Living Systems: Reassessing Crop Model Frontiers via Biological Dynamics and Optimized Phenotype

PONE-D-25-26754R2

Dear Dr. Correa,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice will be generated when your article is formally accepted. Please note, if your institution has a publishing partnership with PLOS and your article meets the relevant criteria, all or part of your publication costs will be covered. Please make sure your user information is up-to-date by logging into Editorial Manager® and clicking the ‘Update My Information’ link at the top of the page. For questions related to billing, please contact billing support.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Paulo Eduardo Teodoro, Dr.

Academic Editor

PLOS One

Additional Editor Comments (optional):

Reviewers' comments:

PLoS One. doi: 10.1371/journal.pone.0343530.r006

Acceptance letter

Paulo Eduardo Teodoro

PONE-D-25-26754R2

PLOS One

Dear Dr. Correa,

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS One. Congratulations! Your manuscript is now being handed over to our production team.

At this stage, our production department will prepare your paper for publication. This includes ensuring the following:

* All references, tables, and figures are properly cited

* All relevant supporting information is included in the manuscript submission

* There are no issues that prevent the paper from being properly typeset

You will receive further instructions from the production team, including instructions on how to review your proof when it is ready. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few days to review your paper and let you know the next and final steps.

Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

You will receive an invoice from PLOS for your publication fee after your manuscript has reached the completed accept phase. If you receive an email requesting payment before acceptance or for any other service, this may be a phishing scheme. Learn how to identify phishing emails and protect your accounts at https://explore.plos.org/phishing.

If we can help with anything else, please email us at customercare@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Professor Paulo Eduardo Teodoro

Academic Editor

PLOS One

Associated Data


Supplementary Materials

S1 File. Supplementary Material.

Supporting figure and additional details (includes S1 Fig with panels A–D: genetic algorithm convergence across four environments).

(PDF)

pone.0343530.s001.pdf^{(254.9KB, pdf)}

S1 Table. Relative Sensitivity Index (RSI) from 20 Morris method replications.

Bold indicates the highest sensitivity per output.

(PDF)

pone.0343530.s002.pdf^{(60.1KB, pdf)}

S2 Table. Pearson correlation (R²) between the HI–WUE index and phenotypic outputs across four environments.

(PDF)

pone.0343530.s003.pdf^{(74.8KB, pdf)}

S3 Table. Optimal genetic coefficients and predicted crop performance for each environment.

(PDF)

pone.0343530.s004.pdf^{(88.4KB, pdf)}



Data Availability Statement

References

1. Arvanitis KG, Symeonaki EG. Agriculture 4.0: the role of innovative smart technologies towards sustainable farm management. TOASJ. 2020;14(1):130–5. doi: 10.2174/1874331502014010130

2. Zhai Z, Martínez JF, Beltran V, Martínez NL. Decision support systems for agriculture 4.0: survey and challenges. Computers and Electronics in Agriculture. 2020;170:105256. doi: 10.1016/j.compag.2020.105256

3. Noshita K, Murata H, Kirie S. Model-based plant phenomics on morphological traits using morphometric descriptors. Breed Sci. 2022;72(1):19–30. doi: 10.1270/jsbbs.21078

4. Colorado JD, Calderon F, Mendez D, Petro E, Rojas JP, Correa ES, et al. A novel NIR-image segmentation method for the precise estimation of above-ground biomass in rice crops. PLoS One. 2020;15(10):e0239591. doi: 10.1371/journal.pone.0239591

5. Jimenez-Sierra DA, Correa ES, Benítez-Restrepo HD, Calderon FC, Mondragon IF, Colorado JD. Novel feature-extraction methods for the estimation of above-ground biomass in rice crops. Sensors (Basel). 2021;21(13):4369. doi: 10.3390/s21134369

6. Correa ES, Calderon F, Colorado JD. GFkuts: a novel multispectral image segmentation method applied to precision agriculture. In: 2020 Virtual Symposium in Plant Omics Sciences (OMICAS). 2020. p. 1–6. doi: 10.1109/omicas52284.2020.9535659

7. Correa ES, Calderon FC, Colorado JD. A novel multi-camera fusion approach at plant scale: from 2D to 3D. SN Comput Sci. 2024;5(5). doi: 10.1007/s42979-024-02849-7

8. Correa ES, Parra CA, Vizcaya PR, Calderon FC, Colorado JD. Complex object detection using light-field plenoptic camera. Communications in Computer and Information Science. Springer International Publishing. 2022. p. 119–33. doi: 10.1007/978-3-031-07005-1_12

9. McMullen MD, Kresovich S, Villeda HS, Bradbury P, Li H, Sun Q, et al. Genetic properties of the maize nested association mapping population. Science. 2009;325(5941):737–40. doi: 10.1126/science.1174320

10. Araus JL, Cairns JE. Field high-throughput phenotyping: the new crop breeding frontier. Trends Plant Sci. 2014;19(1):52–61. doi: 10.1016/j.tplants.2013.09.008

11. Farooq MA, Gao S, Hassan MA, Huang Z, Rasheed A, Hearne S, et al. Artificial intelligence in plant breeding. Trends Genet. 2024;40(10):891–908. doi: 10.1016/j.tig.2024.07.001

12. Correa ES. Runoff Potential Index (RPI): 3D modelling of surface-driven hydrological dynamics for drought resilience. Sci Rep. 2026;16(1):4509. doi: 10.1038/s41598-025-34699-5

13. Fatichi S, Pappas C, Ivanov VY. Modeling plant–water interactions: an ecohydrological overview from the cell to the global scale. WIREs Water. 2015;3(3):327–68. doi: 10.1002/wat2.1125

14. Wagner GP, Altenberg L. Perspective: complex adaptations and the evolution of evolvability. Evolution. 1996;50(3):967–76. doi: 10.1111/j.1558-5646.1996.tb02339.x

15. Pasley H, Brown H, Holzworth D, Whish J, Bell L, Huth N. How to build a crop model. A review. Agron Sustain Dev. 2022;43(1). doi: 10.1007/s13593-022-00854-9

16. Yin X, Struik PC. Modelling the crop: from system dynamics to systems biology. J Exp Bot. 2010;61(8):2171–83. doi: 10.1093/jxb/erp375

17. Muller B, Martre P. Plant and crop simulation models: powerful tools to link physiology, genetics, and phenomics. J Exp Bot. 2019;70(9):2339–44. doi: 10.1093/jxb/erz175

18. Stöckle CO, Donatelli M, Nelson R. CropSyst, a cropping systems simulation model. European Journal of Agronomy. 2003;18(3–4):289–307. doi: 10.1016/s1161-0301(02)00109-0

19. Ritchie J, Singh U, Godwin D, Bowen W. Cereal growth, development and yield. Understanding options for agricultural production. 1998. p. 79–98.

20. Boote KJ, Sau F, Porter CH, Dzotsi K, Tollenaar M, Kumudini SV. Testing of crop models for accurate predictions of evapotranspiration and crop water use. 2014.

21. Amiri E, Rezaei M, Rezaei EE, Bannayan M. Evaluation of Ceres-Rice, Aquacrop and Oryza2000 models in simulation of rice yield response to different irrigation and nitrogen management strategies. Journal of Plant Nutrition. 2014;37(11):1749–69. doi: 10.1080/01904167.2014.888750

22. Donald CM. The breeding of crop ideotypes. Euphytica. 1968;17(3):385–403. doi: 10.1007/bf00056241

23. Chardon F, Noël V, Masclaux-Daubresse C. Exploring NUE in crops and in Arabidopsis ideotypes to improve yield and seed quality. J Exp Bot. 2012;63(9):3401–12. doi: 10.1093/jxb/err353

24. Redden R. New approaches for crop genetic adaptation to the abiotic stresses predicted with climate change. Agronomy. 2013;3(2):419–32. doi: 10.3390/agronomy3020419

25. Correa ES, Calderon FC, Colorado JD. ML-enhanced mechanistic crop modeling to address noise-induced uncertainty for drought environmental monitoring in rice. Discov Food. 2025;5(1). doi: 10.1007/s44187-025-00611-3

26. Tardieu F, et al. Plant phenomics, from sensors to knowledge. Nature Biotechnology. 2017;35:453–62.

27. Hammer G, Messina C, Wu A, Cooper M. Biological reality and parsimony in crop models—why we need both in crop improvement! in silico Plants. 2019;1(1). doi: 10.1093/insilicoplants/diz010

28. Jubery TZ, Ganapathysubramanian B, Gilbert ME, Attinger D. In silico design of crop ideotypes under a wide range of water availability. Food and Energy Security. 2019;8(3). doi: 10.1002/fes3.167

29. Li Z, Ding Q, Zhang W. A comparative study of different distances for similarity estimation. Communications in Computer and Information Science. Springer Berlin Heidelberg. 2011. p. 483–8. doi: 10.1007/978-3-642-18129-0_75

30. Kokare M, Chatterji BN, Biswas PK. In: 2003. 571–5.

31. Fan P, Xu J, Wang Z, Liu G, Zhang Z, Tian J, et al. Phenotypic differences in the appearance of soft rice and its endosperm structural basis. Front Plant Sci. 2023;14:1074148. doi: 10.3389/fpls.2023.1074148

32. Liu M, Li H, Su Y, Li W, Shi C. G1/ELE functions in the development of rice lemmas in addition to determining identities of empty glumes. Front Plant Sci. 2016;7:1006. doi: 10.3389/fpls.2016.01006

33. Yang C, Sheng H, Kolbinson KT, Shaterian H, Ashe P, Gao P, et al. A stomata imaging and segmentation pipeline incorporating generative AI to reduce dependency on manual groundtruthing. Plant Methods. 2025;21(1):148. doi: 10.1186/s13007-025-01451-z

34. Grigorieva E, Matzarakis A, de Freitas C. Analysis of growing degree-days as a climate impact indicator in a region with extreme annual air temperature amplitude. Clim Res. 2010;42(2):143–54. doi: 10.3354/cr00888

35. McMaster G. Growing degree-days: one equation, two interpretations. Agricultural and Forest Meteorology. 1997;87(4):291–300. doi: 10.1016/s0168-1923(97)00027-0

36. Fraisse CW, Paula-Moraes SV. Degree-days: growing, heating, and cooling. EDIS. 2018;2018(2). doi: 10.32473/edis-ae428-2018

37. Zhou G, Wang Q. A new nonlinear method for calculating growing degree days. Sci Rep. 2018;8(1):10149. doi: 10.1038/s41598-018-28392-z

38. Yan W. Simulation and prediction of plant phenology for five crops based on photoperiod×temperature interaction. Annals of Botany. 1998;81(6):705–16. doi: 10.1006/anbo.1998.0625

39. Yang W, Wu T, Zhang X, Song W, Xu C, Sun S, et al. Critical photoperiod measurement of soybean genotypes in different maturity groups. Crop Science. 2019;59(5):2055–61. doi: 10.2135/cropsci2019.03.0170

40. Cao W, Moss DN. Temperature effect on leaf emergence and phyllochron in wheat and barley. Crop Science. 1989;29(4):1018–21. doi: 10.2135/cropsci1989.0011183x002900040038x

41. Frank AB, Bauer A. Phyllochron differences in wheat, barley, and forage grasses. Crop Science. 1995;35(1):19–23. doi: 10.2135/cropsci1995.0011183x003500010004x

42. Gunawardena TA, Fukai S, Blamey FPC. Low temperature induced spikelet sterility in rice. II. Effects of panicle and root temperatures. Australian Journal of Agricultural Research. 2003;54(10):947–56. doi: 10.1071/ar03076

43. Zeng Y, Zhang Y, Xiang J, Uphoff NT, Pan X, Zhu D. Effects of low temperature stress on spikelet-related parameters during anthesis in Indica-Japonica hybrid rice. Front Plant Sci. 2017;8:1350. doi: 10.3389/fpls.2017.01350

44. Shimono H, Okada M, Kanda E, Arakawa I. Low temperature-induced sterility in rice: evidence for the effects of temperature before panicle initiation. Field Crops Research. 2007;101(2):221–31. doi: 10.1016/j.fcr.2006.11.010

45. Shimono H, Hasegawa T, Moriyama M, Fujimura S, Nagata T. Modeling spikelet sterility induced by low temperature in rice. Agronomy Journal. 2005;97(6):1524–36. doi: 10.2134/agronj2005.0043

46. Xu Y, Chu C, Yao S. The impact of high-temperature stress on rice: challenges and solutions. The Crop Journal. 2021;9(5):963–76. doi: 10.1016/j.cj.2021.02.011

47. Satake T, Yoshida S. High temperature-induced sterility in Indica rices at flowering. Jpn J Crop Sci. 1978;47(1):6–17. doi: 10.1626/jcs.47.6

48. Jagadish SVK, Craufurd PQ, Wheeler TR. High temperature stress and spikelet fertility in rice (Oryza sativa L.). J Exp Bot. 2007;58(7):1627–35. doi: 10.1093/jxb/erm003

49. Ren H, Bao J, Gao Z, Sun D, Zheng S, Bai J. How rice adapts to high temperatures. Front Plant Sci. 2023;14:1137923. doi: 10.3389/fpls.2023.1137923

50. Gérardeaux E, Falconnier G, Gozé E, Defrance D, Kouakou P-M, Loison R, et al. Adapting rainfed rice to climate change: a case study in Senegal. Agron Sustain Dev. 2021;41(4). doi: 10.1007/s13593-021-00710-2

51. Hengl T, Mendes de Jesus J, Heuvelink GBM, Ruiperez Gonzalez M, Kilibarda M, Blagotić A. SoilGrids250m: global gridded soil information based on machine learning. 2017. https://soilgrids.org

52. Romero CC, Hoogenboom G, Baigorria GA, Koo J, Gijsman AJ, Wood S. Reanalysis of a global soil database for crop and environmental modeling. Environmental Modelling & Software. 2012;35:163–70. doi: 10.1016/j.envsoft.2012.02.018

53. Alaya I, Masmoudi MM, Lagacherie Ph, Coulouma G, Jacob F, Ben Mechlia N. Performance of Saxton and Rawls pedotransfer functions for estimating soil water properties in the Cap Bon Region-Northern Tunisia. Water and land security in drylands. Springer; 2017. p. 77–85. doi: 10.1007/978-3-319-54021-4_8

54. Sung CTB, Iba J. Accuracy of the Saxton-Rawls method for estimating the soil water characteristics for mineral soils of Malaysia. Pertanika J Trop Agric Sci. 2010;33(2):297–302.

55. Rawls WJ, Brakensiek DL, Saxton KE. Estimation of soil water properties. Transactions of the ASAE. 1982;25(5):1316–20.

56. Hodges R, et al. NASA POWER (Prediction Of Worldwide Energy Resources) Data. 2017.

57. Jones JW, Hoogenboom G, Porter CH, Boote KJ, Batchelor WD, Hunt LA, et al. The DSSAT cropping system model. European Journal of Agronomy. 2003;18(3–4):235–65. doi: 10.1016/s1161-0301(02)00107-7

58. Hoogenboom G, Porter CH, Boote KJ, Shelia V, Wilkens PW, Singh U, et al. The DSSAT crop modeling ecosystem. Burleigh Dodds Series in Agricultural Science. Burleigh Dodds Science Publishing. 2019. p. 173–216. doi: 10.19103/as.2019.0061.10

59. MathWorks Inc. MATLAB version 24.2.0.2712019 (R2024b). 2024.

60. R Core Team. R: A Language and Environment for Statistical Computing. 2024.

61. Morris MD. Factorial sampling plans for preliminary computational experiments. Technometrics. 1991;33(2):161–74. doi: 10.1080/00401706.1991.10484804

62. Pang Z, O’Neill Z, Li Y, Niu F. The role of sensitivity analysis in the building performance analysis: a critical review. Energy and Buildings. 2020;209:109659. doi: 10.1016/j.enbuild.2019.109659

63. Fukai S, Mitchell J. Factors determining water use efficiency in aerobic rice. Crop and Environment. 2022;1(1):24–40. doi: 10.1016/j.crope.2022.03.008



Case study findings

Modeling living systems: Limitations and horizons

Supporting information

Acknowledgments

Data Availability

Funding Statement

References

