Heliyon. 2024 Nov 29;10(24):e40836. doi: 10.1016/j.heliyon.2024.e40836

Crop yield prediction in agriculture: A comprehensive review of machine learning and deep learning approaches, with insights for future research and sustainability

Md Abu Jabed a,b, Masrah Azrifah Azmi Murad a,
PMCID: PMC11667600  PMID: 39720079

Abstract

The agriculture sector is confronted with numerous challenges in the quest for accurate crop yield estimation, which is essential for efficient resource management and mitigating food scarcity in a rapidly growing global population. This research paper delves into the application of advanced Artificial Intelligence (AI) techniques to enhance crop yield estimation in the context of diverse agricultural challenges. Through a systematic literature review and analysis of relevant studies, this paper explores the role of AI methods, such as Machine Learning (ML) and Deep Learning (DL), in addressing the complexities posed by geographical variations, crop diversity, and cultivation areas. The review identifies a wealth of AI-powered solutions employed in crop yield prediction, emphasizing the importance of precise environmental and agricultural data. Key factors contributing to accurate estimation include temperature, rainfall, soil type, humidity, and various vegetation indices, such as NDVI, EVI, LAI, and NDWI. The research paper also examines the algorithms frequently utilized in the machine learning domain, including Random Forest (RF), Artificial Neural Networks (ANN), and Support Vector Machine (SVM). In the realm of deep learning, Convolutional Neural Networks (CNN), Long-Short Term Memory (LSTM), and Deep Neural Networks (DNN) emerge as promising candidates. The findings of this study shed light on the transformative potential of advanced AI techniques in improving crop yield estimation accuracy, ultimately enhancing agricultural planning and resource management. By addressing the challenges posed by geographical diversity, crop heterogeneity, and changing environmental conditions, AI-driven models offer new avenues for sustainable agriculture in an ever-evolving world. This research paper provides valuable insights and directions for future studies, highlighting the critical role of AI in ensuring food security and sustainability in agriculture.

Index terms: Systematic literature review, Crop yield production, Decision support system, Machine learning, Deep learning

1. Introduction

Agriculture, a focus of scientific and technological innovation, has seen significant transformations due to technology (Montero et al., 2020). Crop yield estimation is a critical aspect, depending on various factors, including agrometeorological variables like soil properties, climate, and irrigation [1]. Potential and actual yields, as well as the yield gap, are essential concepts in this context [1].

Traditionally, crop performance assessments relied on observations by experts, influencing decision-making and livelihoods [2]. However, errors in these predictions can lead to food shortages. Modern precision agriculture faces challenges in predicting crop production, with many effective models proposed [3]. Addressing these challenges requires extensive datasets [4] and a continuous drive for improved accuracy [5].

Machine learning techniques are being widely adopted in agriculture to monitor and predict environmental factors [6,7]. These techniques build prediction models using data and have applications across various domains (Udousoro, 2020). However, developing accurate and interpretable models in crop production prediction remains challenging due to multifaceted influences [8]. Various methods, including field surveys, statistical models, crop growth models, and remote sensing, have been employed in crop yield prediction [9].

Accurate crop yield prediction is essential at national and regional levels, enabling informed decisions by farmers [4,10]. Multiple methods exist for forecasting agricultural yields, necessitating extensive datasets due to the complex nature of agricultural production influenced by climate and soil factors [4]. The choice between predictive and descriptive machine learning models depends on the research context [3,11].

This study conducted a systematic literature review (SLR) to explore the use of machine learning and deep learning in predicting crop yields. The SLR aims to identify research gaps and provide guidance to practitioners and academics interested in this field [3]. Relevant papers were retrieved from electronic databases and synthesized following established SLR methodology to address the research questions. The organization of this study is as follows: Section II delves into the background, Section III offers an overview of crop yield prediction, Section IV presents the results and discussion, Section V introduces practical applications, Section VI focuses on recommendations for future research, and Section VII serves as the conclusion of the paper.

2. Section II. Related work

In this section, we summarize the existing review articles on yield estimation. Prioritizing crop production enhancement is vital for efficient decision-making at national and regional levels. Various methods exist for forecasting agricultural yields, and it is a challenging task in precision agriculture, with several effective models introduced [4]. A comprehensive understanding of crop production growth patterns is essential for farmers. This review article delves into the research on crop production prediction utilizing machine learning and deep learning techniques found in the literature.

Chlingaryan et al. [12] reviewed nitrogen status estimation in agriculture using machine learning to improve crop yield predictions. They anticipate the development of cost-effective agricultural solutions through advances in sensing technologies and machine learning, with the emergence of hybrid systems based on machine learning approaches.

Liakos et al. [13] reviewed the application of machine learning in agriculture, analyzing publications related to water resources, livestock, crop, and land management. Their study highlights the potential benefits of machine learning technologies for the agricultural sector.

Young [14] examined essential methods in official statistics, remote sensing, and surveys for crop yield forecasting, highlighting prediction uncertainties and research gaps. However, the review does not cover prevalent machine learning techniques or specific yield models for different crops, potentially limiting its relevance to some readers.

Elavarasan et al. [15] conducted a review of papers focusing on machine learning models for forecasting agricultural production using meteorological factors. The research indicates the need for further investigation into additional factors influencing crop productivity.

Kamilaris & Prenafeta-Boldú [16] surveyed 40 research initiatives using deep learning in agriculture and food production. They found that deep learning is highly effective, outperforming traditional image processing methods, and can address diverse agricultural challenges with high accuracy.

Beulah [17] assessed various data mining approaches used in crop production prediction and concluded that data mining techniques can be applied to address this issue.

Koirala et al. [18] focused on deep learning for fruit identification and localization in agriculture, advocating for standardized metrics in model comparison. They highlighted the efficacy of deep learning methods and recommended CNN detectors, deep regression, and LSTM for fruit load assessment.

van Klompenburg et al. [3] conducted a systematic review of crop yield prediction techniques, focusing on information extraction without in-depth analysis or recommendations. They identified common deep learning algorithms used in the field, with CNN, LSTM, and DNN being the most preferred methods for crop yield prediction.

Häni et al. [19] presented a system for apple yield estimation through deep learning-based fruit detection and counting. Their study compared semi-supervised and deep learning methods, revealing that Gaussian mixture models outperformed deep learning methods like U-NET, Fast R-CNN, and CNN in most datasets for yield detection.

Zhang et al. [20] explored Deep Learning (DL) for agricultural tasks in dense scenes, showing its efficiency and accuracy in activities like recognition and yield estimation. DL, particularly Convolutional Neural Networks (CNNs), outperformed alternative methods in dense agricultural scenes, emphasizing its importance in computer vision tasks.

Dharani et al. [21] explore the use of deep learning methods, emphasizing their role in crop yield prediction for agriculture. They propose that artificial intelligence can improve crop management and yield forecasting, with recurrent neural networks and hybrid networks like RNN-LSTM showing superior accuracy in predictions compared to other networks.

Darwin et al. [22] review deep learning and computer vision in smart agriculture for crop yield estimation, highlighting automation benefits in image analysis and remote sensing. They conclude that using deep learning with machine vision improves the accuracy of automated agricultural systems.

Maheswari et al. [23] investigate deep learning-based semantic segmentation for fruit yield estimation in orchards, emphasizing its advantages over traditional methods. They address challenges in fruit yield estimation and suggest that DL-based techniques outperform traditional approaches by incorporating human cognition into the architecture.

Monteiro et al. [24] provide a brief overview of scientific and technological tools in precision agriculture and their applications in crop and livestock farming. They emphasize resource optimization and how precision agriculture can meet food demand while ensuring sustainability.

Rashid et al. [25] conducted a comprehensive review on machine learning-based palm oil yield prediction, highlighting the need for diverse features and prediction techniques in future research. They note the common use of ML algorithms like LR, RF, and NN, as well as some DL models in crop yield forecasting.

Benos et al. [26] assess current scholarly literature on machine learning in agriculture, providing insights for stakeholders interested in its potential advantages. The study highlights the use of various sensors on satellites, ground vehicles, and aerial vehicles for gathering reliable input data for machine learning in agriculture.

Hasan et al. [27] provide an overview of deep learning-based algorithms for weed identification and classification in agriculture, covering data collection, dataset preparation, deep learning techniques, and assessment metrics approaches.

Modi et al. [1] conduct a comprehensive investigation into non-destructive techniques for yield prediction, focusing on data collection, pre-processing, properties, techniques, and outcomes, while identifying commonly used methods and recommending model integration for improved accuracy.

Muruganantham et al. [28] highlight the benefits of using deep learning in crop yield forecasting and recommend remote sensing technologies based on data requirements and influencing factors. The study acknowledges challenges like enhancing model accuracy, practical application, and addressing the opacity of deep learning models.

Oikonomidis et al. [29] conduct a systematic review on deep learning applications in crop yield prediction, analyzing motivations, target crops, methods, features, and data sources in the identified papers.

Bouguettaya et al. [30] review the use of deep learning techniques for crop and plant classification from UAV-based remote sensing imagery, emphasizing the importance of robust tools like Convolutional Neural Networks (CNN) and the potential of combining UAV data and deep learning for accurate crop classification.

Ojo & Zahid [31] explore the recent developments, challenges, and future prospects of using deep learning in controlled environment agriculture (CEA). They highlight DL applications in CEA, challenges, research directions, and commonly used models, focusing on CNNs and RNN-LSTM for time series forecasting in CEA.

The purpose of this systematic literature review is to analyze the influence of environmental factors on crop growth and identify research gaps in machine learning and deep learning technology. It explores the benefits of using machine learning and deep learning for crop yield prediction, identifies appropriate remote sensing technologies, and considers factors affecting crop yield, offering fresh insights into current research.

3. Section III. Methodology

3.1. Review protocol

This study conducts a systematic literature review (SLR) on crop yield estimation, focusing solely on journal articles. Its purpose is to identify research gaps in the machine learning and deep learning methods within this specific field [28]. The review not only encompasses all journal research but also aligns with our study's research questions. Such comprehensive literature reviews are vital for evaluating theories and data accuracy within a field [32]. Article selection adheres to PRISMA guidelines [33], using databases like Scopus, Google Scholar, Science Direct, PubMed, Web of Science, Mendeley Research Networks, and Wiley. Pertinent research is screened and evaluated based on inclusion and quality standards [3]. All pertinent data is collected from these studies and synthesized to address our research questions.

3.2. Research questions

The following research questions are developed to guide the systematic review:

  1. What machine learning and deep learning techniques are employed for crop yield prediction?

  2. What features or variables are utilized for crop yield prediction through machine learning and deep learning methods?

  3. What criteria and methodologies are applied to evaluate crop yield prediction?

  4. What obstacles exist in crop yield prediction using machine learning and deep learning techniques?

Q1 helps us assess both the benefits and drawbacks of employing machine learning and deep learning techniques for crop yield prediction. Q2 assists us in gaining insight into the diverse factors that impact the utilization of machine learning and deep learning methods in crop yield prediction. Q3 provides valuable insights into the assessment criteria and methods applied in the context of crop yield prediction. Q4 facilitates our comprehension of the constraints and difficulties associated with existing approaches.

3.3. Technique for article searches

The systematic literature review's objective forms the foundation for the article search methodology. The search is carried out with a specific emphasis on the key concepts relevant to this review's parameters. The primary focus of this review is shaped by the terms "machine learning or deep learning" and "crop yield prediction." The scope is restricted to publications from January 2018 to April 2023. Upon identifying a pertinent study, an examination of its references was conducted to identify any additional studies missed in the initial search. This iterative process continued until no more relevant studies were uncovered.

The exclusion criteria employed during the analysis of retrieved publications included the determination of whether they were survey papers or general review articles. The subsequent section addresses the topics associated with the excluded publications.

Exclusion Criteria. The studies underwent a comprehensive assessment and were systematically categorized according to predefined exclusion criteria. These criteria were set to delineate the boundaries of the systematic review and to eliminate studies that did not meet the relevant criteria. The exclusion criteria (EC) are listed below:

First criterion: Articles pertaining to agriculture but not specifically focused on crop yield prediction.

Second criterion: Publications that have been previously acquired or are duplicates.

Third criterion: Studies without full-text accessibility.

Fourth criterion: Articles authored in languages other than English, conference papers, book chapters, reviews, surveys, Master's theses, or PhD dissertations.

Fifth criterion: Publications published before the year 2018.

Following the application of the exclusion criteria, a total of 176 articles were ultimately chosen. Subsequently, the removal of duplicate entries from the selected databases led to the evaluation of 115 unique articles. It is worth emphasizing that only journal articles met the inclusion criteria for this review. The methodology employed in this review adheres to the Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) guidelines, as illustrated in Fig. 1 (PRISMA, accessed on February 1, 2021).

Fig. 1. Flowchart displaying search results.

Table 1 displays the initial tally of retrieved papers, the quantity that met the selection criteria, and the total number of articles remaining after the removal of duplicates. Notably, a substantial portion of the papers originated from the Scopus, Science Direct, and Google Scholar databases. The primary focus of data analysis revolved around assessing adherence to the exclusion criteria. In the data synthesis stage, all accumulated information was compiled and integrated.

Table 1. Distribution of articles by database.

| Database | Articles initially retrieved | Articles remaining after exclusion criteria | Articles remaining after removing duplicates |
| --- | --- | --- | --- |
| Science Direct | 87 | 38 | 26 |
| Web of Science | 68 | 12 | 8 |
| Scopus | 255 | 72 | 53 |
| Google Scholar | 294 | 33 | 20 |
| PubMed | 36 | 9 | 5 |
| Mendeley Research Networks | 20 | 5 | 1 |
| Wiley | 35 | 7 | 2 |
| Total | 795 | 176 | 115 |
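The totals reported in Table 1 can be checked with simple arithmetic. The short sketch below re-enters the per-database counts from the table and sums each screening stage; the dictionary and function names are illustrative only and are not part of the review protocol.

```python
# Counts copied from Table 1:
# (initially retrieved, after exclusion criteria, after removing duplicates)
table1 = {
    "Science Direct": (87, 38, 26),
    "Web of Science": (68, 12, 8),
    "Scopus": (255, 72, 53),
    "Google Scholar": (294, 33, 20),
    "PubMed": (36, 9, 5),
    "Mendeley Research Networks": (20, 5, 1),
    "Wiley": (35, 7, 2),
}


def column_totals(rows):
    """Sum each of the three screening stages across all databases."""
    return tuple(sum(stage) for stage in zip(*rows.values()))


totals = column_totals(table1)
print(totals)  # (795, 176, 115)
```

The three sums match the "Total" row of the table.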

3.4. Approaches used in literature discussion

We have identified and highlighted a total of 115 research papers published in various journals from 2018 to April 2023, all of which are relevant to our study. These papers have been meticulously compiled into a comprehensive Literature Review Table. This table encompasses essential details, including the databases utilized, author names, publication years, data sources, features employed, methodologies applied, measured attributes, research objectives, findings, and any limitations noted in these articles.

Furthermore, we have dedicated an Appendix section that presents a detailed analysis of the 20 most closely related papers from the aforementioned selection. These papers offer a more in-depth exploration of the subject matter for reference and further examination.

The Appendix (Literature Review Table) offers an overview of the various approaches employed for predicting crop yields, encompassing machine learning, deep learning, and hybrid methods. Unique approaches like Extreme Learning Machine (ELM), Stochastic Gradient Descent (SGD), Sequential Minimal Optimization Regression (SMOreg), Elastic Net (EN), Interaction Regression (IR), Deep Recurrent Q-Network (DRQN), Cubist, and YieldNet are discussed. Frequently used machine learning algorithms include Random Forest (RF), Support Vector Machine (SVM), and Artificial Neural Networks (ANN). K-Nearest Neighbors (K-NN), Decision Trees (DT), and Multi Linear Regression (MLR) are also common choices.

In the realm of deep learning, CNN, DNN, and LSTM are frequently employed for crop yield prediction. Deep learning falls under the broader umbrella of machine learning and is gaining significant attention for addressing crop yield prediction challenges. Additionally, these deep learning approaches are often combined, such as in CNN-LSTM, RNN-LSTM, and CNN-RNN multilevel deep learning systems with multiple layers, along with multimodal fusion techniques. Transfer learning (TL) is another method by which pre-trained deep learning models can be adapted for accurate crop yield prediction across various agricultural settings.

Notably, when remote sensing data is incorporated, simple neural networks are less commonly used for agricultural yield forecasting. Data pre-processing is a prevalent component integrated into the majority of these methodologies.

3.4.1. Machine learning approach

In contrast to rule-based methods, machine learning utilizes statistical analysis to detect data patterns. Traditional machine learning approaches like SVM, LR, RF, K-NN, and DT are trained on labeled datasets for predicting new labels. RF, SVM, and ANN have shown excellent performance in crop yield prediction [34]. RF is an efficient method widely used in agricultural studies [35–40]. SVM is suitable for yield prediction using weather and MODIS-based vegetation indices [41]. ANNs are often used for crop production prediction models [42]. Machine learning predictors for classification include PSO-SVM, KNN, and RF [43].

Conventional machine learning methods require pre-processing stages, including data pre-processing, feature extraction, and feature selection, to effectively train algorithms [44]. A study [45] combined six algorithms to enhance crop identification based on soil type. ML models demonstrate strong performance using 10-fold cross-validation [46].
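As an illustration of the 10-fold cross-validation mentioned above, the following is a minimal sketch using only the Python standard library. The data are synthetic, the variable names are hypothetical, and a univariate least-squares line stands in for whichever ML model a given study actually evaluates.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (a stand-in for any ML model)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def cross_validate(xs, ys, k=10):
    """Train on k-1 folds, test on the held-out fold, and return each fold's RMSE."""
    scores = []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        sq_err = [(ys[i] - (a + b * xs[i])) ** 2 for i in held_out]
        scores.append((sum(sq_err) / len(sq_err)) ** 0.5)
    return scores


# Synthetic data (hypothetical): seasonal rainfall in mm vs. yield in t/ha.
rng = random.Random(42)
rainfall = [rng.uniform(300, 900) for _ in range(100)]
yield_t_ha = [1.0 + 0.004 * r + rng.gauss(0, 0.2) for r in rainfall]

fold_rmse = cross_validate(rainfall, yield_t_ha, k=10)
print(f"mean RMSE over {len(fold_rmse)} folds: {sum(fold_rmse) / len(fold_rmse):.3f}")
```

Every sample is used for testing exactly once, so the averaged score is less sensitive to one particular train/test split than a single hold-out estimate.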

3.4.2. Deep learning-based crop yield prediction

Deep learning (DL), a subset of machine learning, excels at analyzing labeled and unstructured data [47,48]. It's widely applied in agriculture due to its ability to handle large datasets, discover variable correlations, and employ nonlinear functions [49,50]. Deep learning outperforms traditional methods in feature extraction, crucial for agricultural yield prediction [48].

Notably, Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM) and Deep Neural Network (DNN) are the most widely used deep learning methods in crop yield prediction. LSTM, a variant of Recurrent Neural Networks (RNN), is particularly valuable for capturing time-dependent information [51]. Deep neural networks consist of multiple nonlinear layers, extracting information at each level [52]. Discovering nonlinear associations between input and response variables is a key role of deep neural networks, often with various hidden layers [52].

Enhancing deep learning algorithms' performance involves using techniques like stochastic gradient descent (SGD), batch normalization, and dropout. Some of these deep learning methods are briefly outlined below:

Deep Neural Networks (DNN). DNN methods, apart from the number of hidden layers, closely resemble traditional ANN algorithms [53]. Both DNN and ANN networks typically feature multiple fully connected hidden layers [54]. In contrast, other deep learning algorithms like CNN incorporate various types of layers, including convolutional and pooling layers [3].

Convolutional Neural Networks (CNN). CNN comprises fundamental units arranged between input and output layers, including convolutional, pooling, and activation layers [30,55]. In the convolution layer, local filters perform convolution operations on input data, while the pooling layer generates reduced-dimensional data through operations like max-pooling and average-pooling. The activation layer's nonlinear operations enhance CNN's ability for nonlinear fitting [55]. CNN updates weights using Backpropagation (BP), similar to the Backpropagation Neural Network (BPNN) approach.
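The three layer types described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a trainable network: the 5×5 input patch and the 2×2 filter are hypothetical values, and the filter is fixed instead of being learned by backpropagation.

```python
def conv2d_valid(image, kernel):
    """Slide the filter over the image (no padding, stride 1).

    As in most deep learning libraries, the filter is applied without
    flipping, which is strictly a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Activation layer: replace negative responses with zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool2d(feature_map, size=2):
    """Pooling layer: keep the maximum of each non-overlapping size x size block."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j] for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Hypothetical 5x5 image patch and 2x2 filter, for illustration only.
patch = [
    [1, 2, 0, 1, 3],
    [0, 1, 2, 3, 1],
    [1, 0, 1, 2, 2],
    [2, 1, 0, 1, 0],
    [0, 2, 1, 0, 1],
]
kernel = [[1, 0], [0, -1]]

features = relu(conv2d_valid(patch, kernel))  # 4x4 feature map
pooled = max_pool2d(features, size=2)         # 2x2 reduced map
print(pooled)  # [[0, 1], [0, 2]]
```

The convolution turns the 5×5 patch into a 4×4 feature map, and pooling reduces it to 2×2, which is the dimensionality reduction the text refers to.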

Long Short-Term Memory (LSTM). LSTM is a type of Recurrent Neural Network (RNN) that uses gradient-based algorithms to learn time-dependent data. It has a chain structure consisting of an input layer, one or more LSTM layers, and an output layer [25,28].
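To make the chain structure concrete, the sketch below implements one time step of an LSTM cell in its standard gate formulation and applies it along a short sequence. It is a scalar toy version under stated assumptions: the weights are untrained, hypothetical values, and the input series is an invented vegetation-index sequence, not data from any reviewed study.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a scalar LSTM cell (standard gate formulation)."""

    def gate(name, activation):
        w_x, w_h, b = params[name]
        return activation(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)        # share of the old cell state to keep
    i = gate("input", sigmoid)         # share of the candidate to write
    g = gate("candidate", math.tanh)   # candidate value for the cell state
    o = gate("output", sigmoid)        # share of the cell state to expose
    c = f * c_prev + i * g             # updated cell state (long-term memory)
    h = o * math.tanh(c)               # updated hidden state (output)
    return h, c


def run_lstm(sequence, params):
    """Chain the cell over a sequence, carrying (h, c) from step to step."""
    h, c = 0.0, 0.0
    for x in sequence:
        h, c = lstm_step(x, h, c, params)
    return h


# Hypothetical, untrained weights (w_x, w_h, bias), chosen only for illustration.
params = {
    "forget": (0.5, 0.1, 1.0),
    "input": (0.8, -0.2, 0.0),
    "candidate": (1.2, 0.3, 0.0),
    "output": (0.6, 0.4, 0.0),
}

# Hypothetical vegetation-index values over a growing season.
ndvi_series = [0.21, 0.35, 0.58, 0.72, 0.66, 0.40]
final_hidden = run_lstm(ndvi_series, params)
print(round(final_hidden, 4))
```

In a yield model, the final hidden state would typically be passed to an output layer that maps it to a yield estimate.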

Recurrent Neural Networks (RNN). RNNs are a type of neural network that processes data in sequences [56]. An RNN is an artificial neural network in which the temporal dependencies between nodes are represented by a directed graph [57]. RNNs are useful for sequence modeling because they retain information from earlier steps, which allows them to make better use of the order in sequential data [50].

Transfer Learning (TL). Transfer learning is a machine learning technique where a model trained on one task is repurposed on a second related task. It is particularly useful when working with limited labeled data because it allows us to leverage knowledge from one domain and apply it to another [58]. In deep learning, transfer learning involves taking a pre-trained model (trained on a large dataset) and fine-tuning it on a new dataset or task [58,59]. This process can significantly speed up training and improve the performance of the model, especially when the new task has limited available data [59].

3.4.3. Miscellaneous approaches in crop yield prediction

Various classification and regression techniques have proven successful in agricultural production prediction, including Linear Regression (LR) [60], Naive Bayes (NB) [61], Extreme Gradient Boosting (XGBoost) [62–65], Gradient Boosting (GB) (Anbananthen et al., 2022), Interaction Regression [8], AdaBoost [66], Logistic Regression (LR) [61], Levenberg–Marquardt (LM) [67], Extreme Learning Machine (ELM) [68], Deep Learning Multi-Layer Perceptron (DLMLP) [69], Deep Recurrent Q-Network (DRQN) [70], General Regression Neural Networks (GRNN) [71,72], Deep Convolutional Regression Network (DCRN) [73], Light Gradient Boosting Machine (LightGBM) [74], Linear Discriminant Analysis (LDA) (Mupangwa et al., 2020), Multi-parametric Deep Neural Network (MDNN) (Kalaiarasi and Anbarasi, 2021), Spatial-Spectral-Temporal Neural Network (SSTNN) [75], YieldNet [76], ensemble methods [36,45,77–81], Stochastic Gradient Descent (SGD) [82], and Cubist [83].

3.4.4. Hybrid approach to crop yield prediction

Combining the strengths of various machine learning and deep learning algorithms is a common practice. Anbananthen et al. [36] integrated gradient boosting, random forest, and LASSO regression for localized crop yield forecasting. Raja et al. [80] integrated RF, Bagging, K-NN, SVM, DT, and NB to improve prediction accuracy for cereals, potatoes, and energy crops. Sajid et al. [84] combined LR, Lasso Regression, RF, XGBoost, and LightGBM for maize yield prediction. Srivastava et al. [85] integrated CNNs and FCNNs for winter wheat yield prediction. Oikonomidis et al. [65] combined CNN-LSTM, CNN-DNN, CNN-XGBoost, and CNN-RNN for soybean yield prediction. Khaki et al. [57] integrated CNN-RNN for corn and soybean yield prediction. Olofintuyi et al. [50] used CNN-RNN with LSTM for cocoa yield prediction. Bali & Singla [49] combined RNN-LSTM for wheat yield prediction. Shook et al. [86] integrated SVR-RBF for soybean yield prediction. Jui et al. [87] used DRS-RF for tea yield prediction. Neural network-based crop yield prediction models have shown promising accuracy ([47,65,81,88,89]; Wang et al., 2018). However, their black-box nature hinders interpretability [90]. To address the interpretability problem in the future, deep learning-based techniques can be utilized to solve fractional models [91,92] and differential equations [93] in the field of crop yield prediction.
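One of the simplest ways to combine models is to average the predictions of several base learners. The sketch below illustrates that idea only; it is not the integration scheme of any study cited above, and the three toy learners, the function names, and the data are all hypothetical.

```python
def fit_mean(xs, ys):
    """Baseline learner: always predicts the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_linear(xs, ys):
    """Univariate least-squares line."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return lambda x: my + slope * (x - mx)


def fit_nearest(xs, ys):
    """1-nearest-neighbour learner: returns the target of the closest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def fit_ensemble(learners, xs, ys):
    """Train every base learner on the same data and average their predictions."""
    models = [fit(xs, ys) for fit in learners]
    return lambda x: sum(m(x) for m in models) / len(models)


# Hypothetical toy data: fertilizer rate (arbitrary units) vs. yield (t/ha).
rate = [1, 2, 3, 4, 5]
yield_t_ha = [2.1, 3.9, 6.2, 8.1, 9.8]

ensemble = fit_ensemble([fit_mean, fit_linear, fit_nearest], rate, yield_t_ha)
print(round(ensemble(3.5), 3))
```

More elaborate schemes, such as weighted averaging or stacking, replace the plain mean with weights or a second-level model fitted to the base predictions.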

3.5. Performance evaluation metrics used in crop yield prediction analysis

The evaluation of model outputs against actual data is a crucial step in estimation. This section discusses the commonly used evaluation metrics and techniques.

Performance Evaluation Metrics. Evaluation metrics are essential for assessing model performance and for distinguishing between different learning models [70]. Key performance metrics for regression models include the mean absolute error (MAE), mean squared error (MSE), root mean square error (RMSE), coefficient of determination (R-squared), and mean absolute percentage error (MAPE). MAE gives the average magnitude of the errors in a set of forecasts and is defined as the arithmetic mean of the absolute deviations between the predicted and actual observations [94]. MSE is the mean of the squared deviations between predicted and actual values; it indicates how closely the regression line fits the data points and is used to assess the estimator's performance [70]. RMSE, the square root of MSE, estimates the standard deviation of the residuals (prediction errors) and indicates how tightly the data are concentrated around the line of best fit [95]. The coefficient of determination (R-squared) assesses how well the regression model fits the data, expressed as the share of variance in the observations that the model explains relative to a baseline that always predicts the mean [96]. MAPE is the average of the absolute percentage errors, that is, how far the model's predictions deviate from the corresponding observed values in relative terms [70].
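The regression metrics named above follow standard definitions, which can be written out directly. The sketch below computes them for a small set of hypothetical observed and forecast yields; the function name and the numbers are illustrative only.

```python
import math


def regression_metrics(actual, predicted):
    """Compute MAE, MSE, RMSE, R-squared, and MAPE (in percent)."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)                     # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)    # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    # MAPE is undefined when an actual value is zero; yields here are positive.
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2, "MAPE": mape}


# Hypothetical observed and forecast yields in t/ha.
observed = [2.0, 4.0, 6.0, 8.0]
forecast = [2.5, 3.5, 6.5, 7.0]

metrics = regression_metrics(observed, forecast)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

For these values the sketch yields MAE = 0.625, MSE = 0.4375, RMSE ≈ 0.6614, R-squared = 0.9125, and MAPE ≈ 14.58 %.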

For machine learning-based classification algorithms in crop yield prediction, evaluation includes accuracy [40,80], precision (Li et al., 2018; Sivanantham et al., 2022), recall [59,97], sensitivity [34,43], specificity [80,97], and F1 Score (Kalaiarasi and Anbarasi, 2021; Mupangwa et al., 2020). Classification accuracy remains the most widely used and informative metric for classification problems.
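The classification metrics listed above can all be derived from the four cells of a binary confusion matrix. The following sketch assumes a hypothetical two-class problem ("high" versus "low" yield) with invented labels; note that sensitivity and recall are two names for the same quantity.

```python
def classification_metrics(actual, predicted, positive="high"):
    """Derive common metrics from the binary confusion matrix."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0        # also called sensitivity
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical labels for ten fields: 4 truly "high" yield, 6 truly "low".
actual = ["high"] * 4 + ["low"] * 6
predicted = ["high", "high", "high", "low", "high", "low", "low", "low", "low", "low"]

scores = classification_metrics(actual, predicted)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

In this example accuracy is 0.8 while precision, recall, and F1 are 0.75, which shows how accuracy alone can look more favourable than the class-specific metrics when the classes are unevenly sized.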

4. Section IV. Results and discussion

The papers chosen for the review are evaluated and condensed.

Fig. 2 exhibits the number of articles published spanning from 2018 to April 2023. Evidently, there has been notable advancement in research related to predicting crop yields.

Fig. 2. Distribution of articles.

4.1. Q1: approaches used in literature discussion

For the first research question (Q1), previous studies have employed a range of classification and regression algorithms for crop yield prediction. Among the selected 115 articles, various machine learning and deep learning methods have been summarized. Unique methods include Interaction Regression (IR), Elastic Net (EN), Extreme Learning Machine (ELM), Cubist, and YieldNet. Commonly used machine learning algorithms are Random Forest (RF), Support Vector Machine (SVM), and Artificial Neural Network (ANN), while Extreme Gradient Boosting (XGBoost), Decision Trees (DT), and Multi Linear Regression (MLR) are also prevalent, as depicted in Fig. 3. Among deep learning algorithms, Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and Deep Neural Network (DNN) are frequently utilized for crop yield prediction. Deep learning falls under the umbrella of machine learning and is increasingly promising in this field. These methods are often combined, such as CNN-LSTM and multilevel deep learning systems, particularly when utilizing remote sensing data. Notably, simple neural networks are less commonly used for crop production prediction. Many of these approaches incorporate data pre-processing. Fig. 3 illustrates the distribution of crop yield prediction publications based on machine learning and deep learning methods.

Fig. 3. Most used machine learning and deep learning approaches.

4.1.1. Approaches discussion

Fig. 3 presents the most commonly employed methods as identified from the selected articles.

Random forest (RF) is one of the most popular machine learning techniques for predicting crop yields, as seen in Fig. 3. It is widely used because of its capacity to identify the most informative details in the data. An advantage of the RF technique is that it is robust to collinearity among variables, a problem that typically affects conventional LR models. According to Fig. 3, the second most used algorithm is the Artificial Neural Network (ANN). Its artificial neurons allow it to process large volumes of data, identify patterns, and draw conclusions, and ANN has been used extensively in agricultural yield prediction. The third most used algorithm is the support vector machine (SVM); its regression variant, SVR, performed significantly better than K-NN, BNN, and LR in one study [98].

Additionally, Fig. 3 shows the convolutional neural network (CNN), which is one of the most widely used deep learning techniques for forecasting agricultural productivity. Its ability to locate the crucial details in the data makes it popular. Nevavuori et al. [89] investigated crop yield prediction using a distinctive profile of temperature and photoperiod, creating a region-specific deep learning model. For agricultural production modeling, Oikonomidis et al. [65] created CNN-DNN, CNN-RNN, CNN-LSTM, and CNN-XGBoost. The CNN-DNN model outperformed the others.

Results from prior crop yield prediction work by Z. Zhang et al. [99] showed that the ML and DL approaches outperformed LR, mainly because LASSO was able to extract the dynamic correlations between the variables and the target predictor [100,101]. LSTM, which has also been employed extensively in crop production prediction, is distinguished from other deep learning techniques by its ability to learn time-dependent information. X. Wang et al. [102] studied how model accuracy varied with different combinations of soil, meteorological, and remote sensing data. The most popular algorithm is not always the best one. Both classifiers and clustering methods are employed in the chosen papers. In the publications that use clustering, the input is image data; hence those publications relate to machine vision rather than to ML on numerical datasets.

ML and DL techniques such as RF, SVM, DT, ANN, XGBoost, Classification, LSTM, CNN and DNN have been applied for yield estimation. These approaches are based on statistical analysis and regression. It is quite clear that regression-based methods are mainly used for yield estimation [1].

Strengths and Weaknesses of Some Popular Algorithms. Agricultural data often contain noisy features and outliers, which regularization can help mitigate. Regularized classifiers like linear SVM outperform non-regularized methods like LDA, and SVM's decision rule is a reliable and low-variance linear function in kernel space [103,104,105]. Low variability is crucial for minimizing classification errors, because agricultural data are inherently unstable over time [53]. SVM performs well with small training sets and high-dimensional feature vectors [12,106]. Random Forest (RF) is an ensemble of tree-structured classifiers that offers fast operation and good generalization performance, making it well suited to finding non-linear patterns [107]. RF requires minimal data preprocessing and performs well at feature selection, which supports better yield predictions. Convolutional Neural Networks (CNNs) are essential in computer vision and remote sensing, and are capable of handling various retrieval tasks, including soil moisture retrieval [55,108]. Unlike conventional fully connected neural networks, CNNs use convolutional layers with local connectivity, making them highly effective for processing spatial data.
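The point that regularization tempers the influence of noisy features and outliers can be illustrated with a small sketch. It assumes ridge (L2) regularization on a one-feature linear model without an intercept, which is an illustrative choice and not a method taken from the reviewed studies; the function name `ridge_slope` and all data values are hypothetical.

```python
def ridge_slope(x, y, lam):
    """Closed-form slope of a one-feature, no-intercept ridge regression.

    Minimizes sum((y_i - w * x_i)**2) + lam * w**2, which gives
    w = sum(x_i * y_i) / (sum(x_i**2) + lam).
    """
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)


# Hypothetical standardized feature (e.g. a rainfall anomaly) and yield anomaly.
# The last observation is an outlier that inflates the unregularized slope.
x = [-1.0, -0.5, 0.0, 0.5, 1.0]
y = [-0.9, -0.6, 0.1, 0.4, 3.0]

w_unregularized = ridge_slope(x, y, lam=0.0)  # ordinary least squares
w_regularized = ridge_slope(x, y, lam=2.0)    # penalty shrinks the slope

print(w_unregularized, w_regularized)
```

With the penalty switched on, the fitted slope is pulled toward zero, so a single extreme observation has less leverage over the model.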

Algorithm's Ability to Cope with Specific Problems. High-dimensional feature vectors in agricultural data can be effectively classified using techniques like Support Vector Machines (SVM). To handle sequences of high-dimensional data, dynamic classifiers can be beneficial, particularly when dealing with multiple time segments. Other methods like Random Forest (RF), Principal Component Analysis (PCA), and correlation coefficients can also manage high-dimensional datasets. For improved performance, combining classifiers, especially when dealing with non-stationary datasets, can reduce variance. Combining Linear Discriminant Analysis (LDA) and SVM often leads to better results. Additionally, Artificial Neural Networks (ANN) are frequently used in agricultural production and vegetation prediction due to their ability to extract complex, dynamic, and non-linear patterns, particularly with remote sensing data [12]. Convolutional Neural Networks (CNN) are ideal for image data with substantial training samples, enabling tasks like nutrient level estimation, crop yield mapping, and disease identification. For object recognition applications, advanced CNN architectures such as Faster R-CNN, R-FCN, and SSD are highly effective. Regression-based algorithms like RF, ANN, and Support Vector Regression (SVR) are successful for crop yield prediction, and model predictability can be enhanced by ensembling multiple algorithms [84].
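The closing remark about ensembling multiple algorithms can be sketched as a simple unweighted average of several regressors' predictions. This is a minimal illustration and not the ensembling scheme of [84]; the three "models" are hypothetical stand-ins for fitted RF, ANN, and SVR regressors, and the feature name `rainfall_mm` and all coefficients are invented for the example.

```python
from statistics import mean


def ensemble_predict(models, features):
    """Average the predictions of several regressors (unweighted ensemble)."""
    return mean(model(features) for model in models)


# Hypothetical stand-ins for fitted yield models; each returns yield in t/ha.
def rf_model(features):
    return 3.0 + 0.004 * features["rainfall_mm"]


def ann_model(features):
    return 2.5 + 0.005 * features["rainfall_mm"]


def svr_model(features):
    return 3.4 + 0.003 * features["rainfall_mm"]


models = (rf_model, ann_model, svr_model)
field = {"rainfall_mm": 500.0}

individual = [model(field) for model in models]
combined = ensemble_predict(models, field)

print(individual, combined)
```

The averaged prediction always lies between the lowest and highest individual predictions, which is the mechanism by which combining models damps the variance of any single one. Weighted averaging or stacking are common alternatives when some models are known to be more reliable.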

4.2. Q2: features used in crop yield prediction

Regarding research question two (Q2), the features fed into the machine learning and deep learning algorithms in the selected papers were explored and synthesized. Table 2 provides an overview of the features that have been examined for predicting crop yield. Since crop yield data is a crucial component of crop yield prediction, it is used in all publications.

Table 2.

Features used for crop yield prediction.

Feature | No. of times used | Feature | No. of times used
Rainfall | 50 | Soil fertility | 1
Temperature (max-min) | 69 | Snow water equivalent | 3
Humidity | 27 | Yield | 22
Soil Moisture | 5 | APAR | 1
NDVI | 40 | GNDVI | 3
EVI | 24 | GCVI | 2
LAI | 12 | LSWI | 2
FPAR | 3 | SAVI | 6
Solar radiation | 10 | Irrigation | 11
NIR | 2 | Number of tanks | 1
DVI | 1 | Number of tube wells and open wells | 2
Average temperature | 4 | Wind speed | 6
Precipitation | 19 | SPI/GDO/LST | 1
Soil type | 34 | Soil pH | 11
Season | 4 | RGB | 7
Area | 10 | Cultivation practices | 1
TPI | 1 | Wind direction | 2
Vapor pressure | 5 | Crop | 8
Wet day frequency | 1 | Sample ration (SR) | 2
Sentinel-2/2Y-102D/GRD/SAR (Remote sensing) | 1 | WDRVI/GPP/GRVI/ARD | 1
Evapotranspiration | 10 | Satellite images | 1
Seed variety | 3 | CI | 1
Fertilizers (Nitrogen, Phosphorus, Potash) | 23 | Crop Phenology | 1
Pesticides | 1 | NDWI | 11
DESIS | 1 | SWIR-1, SWIR-2 | 3
SIF | 2 | |

Among the features considered, soil type, maximum and minimum temperature, humidity, and rainfall are among the most frequently used (Table 2). Nevertheless, certain studies introduce features tailored to their specific contexts. These variables include precipitation forecasts, normalized difference vegetation index (NDVI), fertilizer, soil pH, enhanced vegetation index (EVI), normalized difference water index (NDWI), irrigation, photoperiod, gamma radiation, MODIS-EVI, leaf area index (LAI), and crop data. In addition, some studies have included nutrients such as calcium, magnesium, potassium, sulfur, nitrogen, and boron as part of their feature selection. The frequently employed features do not always rely on the same types of information. For instance, temperature may be represented as average temperature, maximum temperature, or minimum temperature, as illustrated in the work of van Klompenburg et al. [3]. These studies primarily aim to predict the specific type of crop.

In Table 3, we have structured feature groups to represent essential attributes, providing a more organized view of the independent variables. These attributes span multiple domains, covering moisture, nutrients, field management, soil characteristics, and crop-related details. Notably, the most frequently utilized feature categories revolve around soil properties, meteorological data, weather conditions, and crop yield information.

Table 3.

Grouped features.

Group | Name of the features | No. of times used (GroupWise)
Soil Data | Soil Moisture (5), Soil type (34), Area (10), Soil pH (11), Soil fertility (1) | 61
Meteorological Data/Weather Conditions | Rainfall (50), Temperature (max-min) (69), Humidity (27), Average temperature (4), Precipitation (19), Wet day frequency (1), Evapotranspiration (10), Vapor pressure (5), Wind speed (6), Wind direction (2), Solar radiation (10), Snow water equivalent (3) | 206
Crop yield Information | Pesticides (1), Crop (8), Yield data (22), Crop Phenology (1), Season (4) | 36
Images | RGB (7), Satellite images (1) | 8
Vegetation Indices | NDVI – normalized difference vegetation index (40), EVI – enhanced vegetation index (24), LAI – leaf area index (12), FPAR – fraction of photosynthetically active radiation (3), Sentinel-2 generated VIs and topographic data (1), NIR – near-infrared (2), DVI – difference vegetation index (1), TPI – Topographic Position Index (1), FPAR – fraction of photosynthetically active radiation (3), APAR – absorbed photosynthetically active radiation (1), GNDVI – Green Normalized Difference Vegetation Index (3), GCVI – green chlorophyll vegetation index (2), LSWI – Land Surface Water Index (2), SAVI – Soil Adjusted Vegetation Index (6), DESIS – DLR Earth Sensing Imaging Spectrometer (1), SIF – Stress intensification factor (2), CI – Composite Index (1), SPI/GDO/LST (1), WDRVI/GPP/GRVI/ARD (1), NDWI (11), SWIR-1, SWIR-2 (3) | 121
Farm Management | Fertilizers (Nitrogen, Phosphorus, Potash) (Kg) (23), Seed variety (3), Number of tanks (1), number of tube wells and open wells (2), Cultivation practices (1), Irrigation (11) | 41
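As a quick consistency check, the group-wise totals in Table 3 can be recomputed from the per-feature counts. The sketch below only transcribes the numbers printed in the table; the variable names are illustrative. Note that FPAR (3) is listed twice in the Vegetation Indices row, and the reported total of 121 matches only when both entries are counted.

```python
# Per-feature usage counts transcribed from Table 3, in the order listed there.
# FPAR (3) appears twice in the Vegetation Indices row and is entered twice
# here to mirror the table as printed.
groups = {
    "Soil Data": [5, 34, 10, 11, 1],
    "Meteorological Data/Weather Conditions": [50, 69, 27, 4, 19, 1, 10, 5, 6, 2, 10, 3],
    "Crop yield Information": [1, 8, 22, 1, 4],
    "Images": [7, 1],
    "Vegetation Indices": [40, 24, 12, 3, 1, 2, 1, 1, 3, 1, 3, 2, 2, 6, 1, 2, 1, 1, 1, 11, 3],
    "Farm Management": [23, 3, 1, 2, 1, 11],
}

# Group-wise totals as reported in the last column of Table 3.
reported = {
    "Soil Data": 61,
    "Meteorological Data/Weather Conditions": 206,
    "Crop yield Information": 36,
    "Images": 8,
    "Vegetation Indices": 121,
    "Farm Management": 41,
}

totals = {name: sum(counts) for name, counts in groups.items()}

for name, total in totals.items():
    status = "matches" if total == reported[name] else "differs from"
    print(f"{name}: computed {total} {status} reported {reported[name]}")
```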

The 'Soil Data' group covers soil moisture, soil type, pH value, and production area. The 'Humidity' category includes various aspects related to precipitation, humidity levels, expected precipitation, and precipitation forecasts. 'Farm Management' covers the decisions farmers make regarding field modifications, including nutrient management, irrigation, and fertilizer management. Additionally, data from solar sources can be used to derive parameters related to irradiance or temperature.

Environmental factors play a crucial role in crop yield prediction as they directly influence plant growth, development, and productivity. Understanding the complex interactions between environmental variables and crop yields is essential for developing accurate prediction models. Several key environmental factors include weather variables such as temperature, precipitation, humidity, and solar radiation, which significantly impact crop growth and development. Soil properties like texture, nutrient content, pH, and water-holding capacity also affect crop growth by influencing root development, nutrient uptake, and water availability. Additionally, topography, including geographical features such as elevation, slope, and aspect, can influence microclimates and water drainage, thereby affecting local growing conditions and crop yields.

Data Collection using Remote Sensing. Crop production is influenced by environmental factors, weather patterns, diseases and other parameters [28]. Ground observation, remote sensing, global positioning systems, and on-site surveys serve as methods for monitoring environmental conditions, additional parameters, and crop growth. Obtaining data over a large area through ground observation or traditional methods poses challenges, resulting in less precise and variable outcomes [12]. To overcome this constraint, remote sensing is increasingly being used for crop monitoring. Remote sensing is the process of collecting and analyzing information about the Earth and its components via instruments placed in the atmosphere or on satellites, eliminating the need for physical contact [57]. Compared to approaches like field surveys, remote sensing can generate a substantial quantity of data [109]. Computing vegetation indices (VIs) is one of the most important reasons for using optical remote sensing to acquire agricultural data [28]. Examples of VIs include the normalized difference vegetation index (NDVI), normalized difference water index (NDWI), enhanced vegetation index (EVI), green normalized difference vegetation index (GNDVI), fraction of photosynthetically active radiation (FPAR), and soil adjusted vegetation index (SAVI).
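Several of the indices named above are simple band-ratio formulas, sketched below using their commonly cited definitions. The reviewed studies may use other variants: NDWI in particular has a green/NIR form (used here) and an NIR/SWIR form, and the source does not say which one each study adopted. The EVI coefficients are the ones commonly used for MODIS, SAVI uses the customary soil factor L = 0.5, and the reflectance values are hypothetical.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index."""
    return (nir - red) / (nir + red)


def gndvi(nir, green):
    """Green normalized difference vegetation index."""
    return (nir - green) / (nir + green)


def ndwi(green, nir):
    """Normalized difference water index, green/NIR form (assumed variant)."""
    return (green - nir) / (green + nir)


def savi(nir, red, soil_factor=0.5):
    """Soil adjusted vegetation index with soil brightness factor L."""
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)


def evi(nir, red, blue):
    """Enhanced vegetation index with the commonly used MODIS coefficients."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


# Hypothetical surface reflectances (0 to 1) for a vegetated pixel.
nir, red, green, blue = 0.50, 0.10, 0.12, 0.05

indices = {
    "NDVI": ndvi(nir, red),
    "GNDVI": gndvi(nir, green),
    "NDWI": ndwi(green, nir),
    "SAVI": savi(nir, red),
    "EVI": evi(nir, red, blue),
}

for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```

In practice the same arithmetic is applied element-wise to whole raster bands, so the functions carry over unchanged to array inputs.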

4.2.1. Effect of vegetation indices and environmental factors

Vegetation indices are designed to enhance sensitivity to vegetation characteristics while minimizing interference from factors like soil reflectance and atmospheric conditions. Zhao et al. [110] established a function linking vegetation indices to crop coefficients (Kc) for efficient irrigation management and water conservation. Vegetation indices find extensive use in crop yield estimation, protein analysis, biomass assessment, weed control, and fertilizer management [102]. Frequently employed vegetation indices include the normalized difference vegetation index (NDVI), enhanced vegetation index (EVI), leaf area index (LAI), green normalized difference vegetation index (GNDVI), normalized difference water index (NDWI), and soil adjusted vegetation index (SAVI). Research has explored how the relationship between remotely sensed data and crop yield changes over the growing season, with varying associations at different growth stages (Shiu and Chuang, 2019; Lin et al., 2019). Earlier studies often concentrated on specific crops and years, with some vegetation indices showing high accuracy in predicting crop yields, particularly for corn [94,111,112,113]. Crop yield prediction combines vegetation indices with environmental variables like canopy temperature, water stress, humidity, nutrients, and soil data [102]. Crop yield prediction is complex, and there has been limited research on which individual features significantly affect predictions.

To estimate corn yield, X. Wang et al. [102] combined vegetation indices such as the normalized difference vegetation index (NDVI) and the absorbed photosynthetically active radiation (APAR) with environmental variables including canopy surface temperature and the water stress index. Further characteristics such as humidity, nutrients, and soil data are also used to forecast crop yields. Because so many features are already used in agricultural yield prediction, little research has been done to identify the individual features that have a significant impact on it. A more comprehensive exploration is therefore required to gain deeper insight into the variables and factors influencing crop production predictions.

4.3. Q3: third research question

In previous studies on crop yield prediction, a wide array of performance evaluation metrics has been employed, as depicted in Fig. 4. Remarkably, RMSE stands out as the most frequently utilized metric in crop yield prediction algorithms. Specifically, RMSE, R-squared, MAE, and classification accuracy were utilized as performance evaluation metrics in the papers by Joshua et al. [71], Pham et al. [114], Sajid et al. [84], and Shahhosseini et al. [81] respectively. However, the diverse use of evaluation metrics across these studies has posed challenges in comparing similar crop yield prediction models. As noted by Rashid et al. [25], this diversity complicates direct model comparisons. To address this issue, it is recommended that a consistent and systematic approach or a single standardized metric be adopted to quantify specific crop production prediction models. This would enable more straightforward and meaningful comparisons between different crop yield prediction algorithms.
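For reference, the regression metrics named in this review (MSE, RMSE, MAE, MAPE, and R-squared) can be written in a few lines using their standard definitions. The sketch below is illustrative only; the function names and the observed and predicted yields (assumed to be in t/ha) are hypothetical and not taken from any reviewed study.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the yield."""
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true, y_pred):
    """Mean absolute error, in the same units as the yield."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error; undefined if any observed value is 0."""
    return 100.0 * sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted yields (t/ha).
observed = [3.0, 4.0, 5.0]
predicted = [2.5, 4.0, 5.5]

scores = {
    "MSE": mse(observed, predicted),
    "RMSE": rmse(observed, predicted),
    "MAE": mae(observed, predicted),
    "MAPE": mape(observed, predicted),
    "R2": r_squared(observed, predicted),
}

for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

RMSE and MAE are expressed in yield units while MAPE and R-squared are unitless, which is one reason results reported with different metrics, or for crops with very different yield levels, are hard to compare directly.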

Fig. 4. Metrics for assessing performance.

4.4. Q4: challenges discussion

This summary discusses the challenges and findings related to crop yield prediction based on a review of 115 articles, including 68 using machine learning (ML) methods and 47 using deep learning (DL) methods. Key points include:

Challenges in Crop Yield Prediction. The primary challenge identified is the complexity of creating a functional model. Model accuracy tends to improve with larger and more diverse datasets. Implementing these models in farm management systems poses another difficulty, and accuracy increases when local parameters are incorporated.

Data Insufficiency. Several studies mentioned the problem of insufficient data availability. Despite limited data, some methods still showed promise, but further testing with more diverse data is recommended. Integrating data from various sources is also suggested for improvement.

Variability in Features and Scope. The selected publications vary in the features they use, which depends on the depth and quantity of investigation, study scope, region, and crop. The choice of attributes is influenced by dataset availability and research objectives. Notably, more features do not always lead to better yield prediction performance.

Algorithm Diversity. Various studies employed a range of algorithms, making it challenging to determine an optimal model. However, some machine learning models were more commonly used than others.

Data Integration and Environmental Factors. The consensus among studies is that incorporating appropriate data, especially environmental factors, is crucial for accuracy. Most articles used data from limited sources, and few considered environmental factors, which raised questions about their accuracy.

5. Section V: Practical applications

Crop yield prediction models have numerous practical applications in agriculture. Farmers can utilize these models to optimize the allocation of resources such as water, fertilizer, and pesticides, thereby maximizing yields and minimizing input costs. Additionally, by anticipating potential yield fluctuations due to environmental factors, farmers can implement risk management strategies to mitigate the impact of adverse weather conditions or other environmental stressors. Accurate yield predictions also enable stakeholders to make informed decisions regarding market planning, trade policies, and commodity pricing, contributing to more efficient and resilient agricultural markets. Furthermore, policymakers can use crop yield prediction models to develop evidence-based policies and programs aimed at promoting food security, sustainable agriculture, and rural development.

6. Section VI: recommendations for future research

The need for further research is emphasized, particularly in identifying the most accurate approaches or methods for crop yield prediction by considering environmental factors. Selecting suitable datasets, thorough pre-processing, and efficient regression model training are crucial for obtaining the best results, especially given the continuously changing environmental conditions in different districts. Transfer learning and domain adaptation techniques can help overcome the challenge of limited labeled data by transferring knowledge from related domains or pre-trained models [58,59]. Fine-tuning pre-trained models on agricultural datasets can significantly improve model performance, especially in cases where labeled data is scarce [115]. Developing techniques to improve the interpretability and explainability of DL models is crucial for gaining trust and acceptance among farmers and stakeholders.

Integrating data from multiple sources, including remote sensing, IoT devices, satellite imagery, and weather data, can provide a more comprehensive understanding of agricultural systems. Multimodal data fusion techniques can help extract valuable insights and improve the performance of ML and DL models. Collaboration among researchers, farmers, agricultural extension services, and policymakers is essential for advancing the field of agricultural ML and DL. Open data sharing initiatives can help address the challenge of data scarcity by making agricultural datasets more accessible to the research community.

Developing robust ML and DL models that can generalize well across different environmental conditions, cropping systems, and geographic regions is crucial for their widespread adoption. Robustness can be improved through techniques such as data augmentation, ensemble learning, and model regularization. Designing ML and DL models with the end-user in mind is essential for their adoption and impact. Engaging with farmers and stakeholders throughout the model development process can help ensure that the models address real-world challenges and provide actionable insights.

Overall, the summary highlights the challenges and areas for improvement in crop yield prediction, emphasizing the importance of data quality, feature selection, and environmental factors in enhancing prediction accuracy. Furthermore, ML and DL have the potential to revolutionize agriculture, enabling more sustainable and efficient farming practices while ensuring food security and environmental operation.

7. Section VII: Conclusion

To address the global challenge of feeding a growing population, adopting advanced agricultural technology is crucial. Agricultural practitioners require timely and accurate guidance for crop yield prediction. Our review reveals that the selected articles on yield prediction with machine learning and deep learning techniques use different features, depending in particular on data accessibility and the scope of the research. Models with more attributes do not consistently perform better, and feature selection depends on dataset availability and research objectives. Different algorithms have been used across studies, with some used far more often than others. Crop type, geographical location, and intensity levels vary among the studies. To determine the most effective model, specific feature selection algorithms should be tested with high-performing models. Traditional machine learning frameworks like random forest (RF), support vector machine (SVM), and neural network (NN) show promise.

The review explores 47 articles on deep learning algorithms. Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), and Deep Neural Networks (DNN) are commonly adopted. These techniques, along with pixel categorization and remote sensing data, show promise in crop yield prediction.

Combining quantitative and qualitative data sources, such as temperature, soil type, and remote sensing indices like NDVI, enhances prediction accuracy. Integrating remote sensing data, image processing, and deep learning can improve accuracy in predicting crop yields over large regions.

Various models, including crop models, machine learning, and remote sensing models, employ weight-dependent yield estimation. Count-based assessments using image processing and remote sensing modeling face challenges like object clutter. Common performance metrics include RMSE, R2, MAE, MSE, and MAPE, which measure prediction accuracy. The study sets the foundation for future advancements in crop yield prediction, with a focus on developing deep learning-based models. It emphasizes the need for testing specific feature selection algorithms with high-performing models and addressing the challenges of dynamic variables in crop prediction.

Overall, the review highlights the importance of advanced technology, data integration, and machine learning and deep learning techniques in improving crop yield prediction accuracy, contributing to addressing global food security challenges.

CRediT authorship contribution statement

Md. Abu Jabed: Writing – review & editing, Writing – original draft. Masrah Azrifah Azmi Murad: Writing – review & editing, Supervision.

Data availability statement

Data will be made available on request.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

APPENDIX. Literature Review Table (Selected Publications)

Databases Authors Data Source Features Methods Measure attributes Objectives Findings Limitations
Web of Science [85] 271 counties in Germany from 1999 to 2019 Weather, soil, and crop phenology, yield value CNN, CNN-FCNN, RF, KNN, LASSO and Ridge Regression, SVR, XGBoost, DNN, Regression tree. MAE, RMSE and correlation coefficient matrices (r) To benchmark winter wheat yield prediction in Germany with advanced machine learning techniques.
To introduce a 1-D convolutional neural network (CNN) for enhanced accuracy.
CNN model outperformed all other baseline models used for winter wheat yield prediction Sophisticated models are required.
Black box nature;
PubMed. [65] 9 states of the United States of America (USA)
Khaki and Wang [88], and Khaki, Wang, and Archontoulis [57]
weather, soil, and farm management data XGBoost, CNN-XGBoost, CNN-DNN, CNN-RNN, and CNN-LSTM Efficiency (Features Selection)
R2, RMSE, MSE
To leverage the strengths of each approach within a hybrid model to optimize the effectiveness of crop yield prediction. The hybrid CNN-DNN model performs exceptionally well with an R2 of 0.87, RMSE of 0.266, MSE of 0.071, and MAE of 0.199, surpassing all other models. XGBoost achieved the second-highest performance while executing faster than deep learning models. The Corn Belt soybean crop dataset in the U.S. is a commonly used resource in various studies.
Results may differ with other datasets when applying the proposed models.
Scopus [116] Google Earth Engine (GEE) cloud, and spectral matching techniques (SMTs). Landsat-8 30 m and MODIS 250 m data, (ARD for Landsat 30 m data, ARD for MODIS 250 m data, SWIR 1 & 2, Thermal Infrared (TIR), RGB, NDVI, NDVIMVCi, EVI, NDWI) RF Accuracy
R2, RMSE
To develop cropland products in South Asia using Landsat-8 30 m, MODIS 250, and Google Earth Engine (GEE) with machine learning and spectral matching techniques (SMTs) for food and water security. The irrigated vs. rainfed 30m product showed 79.8 % overall accuracy, with 79 % for irrigated and 74 % for rainfed cropland. The cropping intensity product had an 85.3 % overall accuracy, with 88 % for single cropping, 85 % for double cropping, and 67 % for triple cropping. Crop type mapping ranged from 72 % to 97 % accuracy, explaining 63–98 % variability when compared to national statistics. Absence of reference data for specific croplands
Complex mapping over vast areas
Science Direct [69] Sentinel-1 satellite, Sentinel-2 satellite of the European Space Agency (ESA) SAR, optical and field data. Ordinary Least Squares Regression (OLS), RF, DT, KNN, Ridge regressor, SVM and DLMLP model Accuracy and performance evaluation (R2, MAE and RMSE) To predict soil moisture, EC (salinity), and SOC (organic carbon) levels through diverse machine learning methods and subsequently assessing their respective accuracies for comparison. Significant correlations exist between soil health parameters and crop yield (R2 values: 0.79 for soil moisture, 0.88 for EC, and 0.51 for SOC). The DLMLP model achieved good accuracy, even without the need to validate soil health parameters. Limited to soil data only.
Web of Science [86] GitHub link (https://github.com/tryambakganguly/Yield-Prediction-Temporal-Attention). Weather data Stacked LSTM and Temporal Attention model, Support Vector Regression with Radial Basis Function kernel (SVR-RBF) and least absolute shrinkage and selection
operator (LASSO) regression
RMSE, MAE, and coefficient of determination or R-squared (R2) To enhance the interpretability of prediction results in multivariate time-series forecasting without compromising accuracy. The deep learning model demonstrated significantly superior performance when contrasted with the USDA model, which relies on domain knowledge. Less factors used
Limited data
Scopus [87] POWER Data Access Viewer v2.0.0 MERRA-2 database, https://power.larc.nasa.gov/data-access-viewer/(accessed on 1 January 2022). satellite-derived hydro-meteorological data between 1981 and 2020 DRS–RF hybrid model and SVM as feature selection approach Prediction Performance
Correlation coefficient (C), RRMSE, MAPE
To design an innovative hybrid machine learning model that combines Random Forest (RF) with Dragonfly Optimization (DR) and Support Vector Regression (SVR) to predict tea yield in Bangladesh. The DRS–RF hybrid model outperformed standalone machine learning methods, yielding the lowest relative error rate of 11 % in tea yield forecasting. Advanced model can be applied for better prediction.
Scopus [79] Orchard, Google Earth Engine [Sentinel 2 and Landsat satellite images, including the normalized difference vegetation index (NDVI) and normalized difference water index (NDWI)] Field Data Collection: irrigation, fertilization, phytosanitary treatment,
daily data on temperature, precipitation, humidity, wind speed, and solar radiation,
Spectral data acquisition: NDVI, NDWI (2015–2019)
linear algorithms such as linear regression, lasso regression, ridge regression.
Ensemble learning techniques such as CatBoost Regressor, Light Gradient Boosting Machine, Random Forest Regressor, Extreme Gradient Boosting, etc
Accuracy
MAE, MSE
To develop a state-of-the-art solution for forecasting citrus fruit crop yields prior to harvest, integrating machine learning algorithms trained on historical agricultural data with spectral data obtained from satellite imagery. The orthonormal automatic pursuit algorithm achieved favorable prediction scores with MAE of 0.2489 and MSE of 0.0843. Field data collection in precision agriculture is challenging, requiring specialized sensors and technical teams for data quality control.
Historical and open-source databases are scarce.
Scopus [40] Paprika smart farm (greenhouse paprika data in the year October to December 2019 and January to July 2020) in South Korea Environmental and solar energy factors RF, SVM, and gradient boosting machine (GBM), Accuracy precision,
recall,
f1-score
kappa
To assess and contrast linear classification methods and various machine learning models in terms of their predictive accuracy for paprika growth, taking into account environmental factors and solar energy data across two beds. RF can effectively predict paprika growth with an accuracy of 0.88. Limited data
Small area
Scopus [88] Syngenta and the Analytics Society of INFORMS, United States and Canada Crop genotype, yield performance, and environment (weather and soil) DNN, Lasso, shallow neural networks (SNN), and regression tree (RT) Accuracy
RMSE (Reduced to 11 % of the average yield and 46 % of the standard)
To conduct feature selection using the trained DNN model, effectively reducing the input space dimension without a substantial decrease in prediction accuracy. DNN(W) and DNN(S) exhibited comparable strong performance, surpassing DNN(G), underscoring the greater influence of environmental factors (weather and soil) on crop yield variability than genotype. Advanced model s (Hybrid) needed with explanations
Scopus [57] Corn Belt in the United States., National Agricultural Statistics
Service of the United States (USDA-NASS, 2019), Daymet (Thornton et al., 2018) and Gridded Soil Survey
Geographic Database for the United States (gSSURGO, 2019).
Yield performance, management, weather, and soil CNN, RNN, RF, deep fully connected neural networks (DFNN) and the least absolute shrinkage and selection operator (LASSO) Performance
RMSE
To introduce a deep learning framework that employs Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) to predict crop yields using environmental data and agricultural management practices. The proposed method (CNN-RNN) outperformed than other Methods.
The new model (CNN-RNN) achieved a root-mean-square-error (RMSE) 9 % and 8 % of their respective average yields.
More accurate weather prediction models need to be considered
Scopus [117] Sentinel-1 satellites. Nitrogen rate applied, precipitation, slope, elevation, topographic position index (TPI), aspect, and two radar backscatter coefficients AdaBoost stacked autoencoder (SAE), three-dimensional CNN (3D-CNN), CNN-Late Fusion (CNN-LF), RF, Bayesian multiple linear regression (BMLR), and multiple linear regression (MLR). Accuracy
RMSE, RMedSE, Pearson correlation coefficient (r) and structural similarity (SSIM)
To propose a new method for predicting crop yield using Convolutional Neural Networks (CNNs) and multi-channel input raster. The proposed approach yields superior predictions when compared to five other methods. Data collected during early winter wheat season.
Proposed method may demand substantial computational resources.
No permission to share farmers' data.
Scopus [95] SRDI, BARC, BBS, and BMD (Aus, Aman, Boro, jute, wheat, and potato) rainfall, max-min temperatures, and humidity, TSP, diammonium phosphate (DAP), and MOP, HL, MHL, MLL, LL, and VLL. soil types, soil moisture, texture, consistency, and reaction Logistic regression (LR), SVM, RF, DNN Accuracy (MSE) To identify the optimal approach for selecting crop yield effectively. The Neural Network excels over other methods, maintaining an average prediction accuracy of 96.06 % across six different crops. Extend the model in different region
Science Direct [118] (2) Google Earth Engine platform, Agricultural Yearbook of the provinces and county-level statistics bureaus (http://www.stats.gov.cn) Crop planting areas, county-level and field-level yield, climate, satellite, soil properties, and spatial information RF, DNN, 1D-CNN, LSTM Accuracy
RMSE, R2
To showcase a scalable, straightforward, and cost-effective method for accurately estimating crop yields across different scales. DNN and RF models had relatively good performance at the field level, with mean R2 values of 0.71 and 0.66 and RMSE values of 1127 kg/ha and 956 kg/ha, respectively. Applicable only to winter wheat in China.
Google Earth Engine data may lack key variables.
Limited to four models.
Short time frame analyzed.
Scopus [119] European Space Agency’s (ESA’s) Wheat yield data and auxiliary data, Sentinel-2 and ZY-1 02D remote sensing imagery LSTM, RF, GBDT, and SVR Accuracy
RMSE, R2
To predict winter wheat crop yields using remote sensing data and machine learning methods. The LSTM model outperformed SVR, RF, and GBDT, with an R2 of 0.93, capturing the temporal link between satellite data and winter wheat yield. Utilized Sentinel-2 and ZY-1 02D data exclusively, without exploring other data types.
Did not account for weather and environmental factors in yield estimation.
Science Direct [120] Moderate Resolution Imaging Spectroradiometer (MODIS) Version 6 (Terra/MOD09A1) 8-day composite surface reflectance product with a spatial resolution of 0.5 km. Time-series intervals (data shapes) of the SIs (S-inputs) at an 8-day temporal resolution, meteorological data (M-inputs) and four geospatial information data (G-inputs) (LSWI and the VIs (NDVI or EVI)) LSTM and one-dimensional convolutional neural network (1D-CNN) Accuracy
MSE, RMSE, R2, NSE
To propose an approach for forecasting rice yields with pixel-level precision, achieved by combining crop and deep learning models with satellite data in both South and North Korea. The proposed model exhibited strong performance [R2 = 0.859, Nash-Sutcliffe model efficiency = 0.858, root mean squared error = 0.605 Mg ha−1], highlighting distinct spatial patterns in rice yields across South and North Korea. Depends on satellite data, and prediction accuracy can be influenced by data quality and availability.
Science Direct [118] (1) GEE (Google Earth Engine) platform, Agricultural Yearbook of each county (http://www.stats.gov.cn) from 2001 to 2015 (unit: kg/ha). Satellite vegetation indexes, meteorological indexes, and soil properties, Google Earth Engine (GEE) platform Least Absolute Shrinkage and Selection Operator (LASSO) regression, RF, LSTM networks Accuracy
R2, RMSE
To investigate the timely prediction of rice yield across a vast area using publicly accessible multi-source data. LSTM (with R2 ranging from 0.77 to 0.87 and RMSE from 298.11 to 724 kg/ha) performed better than the other models. Could be improved by combining crop models, more detailed farming management data, and input variables of higher spatiotemporal resolution, such as daily weather and 10 m resolution Sentinel-2 data.
Scopus [80] Farming Community, India soil and environmental characteristics including Avg. soil temperature, avg. air temperature, minimum and maximum air temperature, rainfall, and air humidity RF, Bagging, K-NN, SVM, DT, NB
Feature Selection
MRFE, RFE, Boruta
accuracy (ACC), specificity (S), recall (R), precision (P), F1 score, MAE, log loss (LL), and area under the curve (AUC). To predict crop yield by leveraging feature selection techniques and classifiers that consider the agricultural environment's attributes.
The Ensemble technique (MRFE, with RF) outperforms the current classification method in terms of prediction accuracy. The data was collected manually and is not accessible publicly.
The proposed model's scalability was not addressed or discussed.
Science Direct. [39] Fergana valley, Central Asia, soil grids (https://soilgrids.org/) Sentinel-2 generated VIs, environmental data, soil data, field data, and topographic data LR, DT and RF regression models with scikit-learn Accuracy
R2
To create a machine learning-based yield estimation and prediction model.
To evaluate and compare the efficacy of different regression techniques, such as Decision Trees (DT), Linear Regression (LR), and Random Forest (RF).
With an overall accuracy of 93 %, RF has the highest yield prediction accuracy compared with LR and DT. More spectral features, data, and algorithms are needed to improve the prediction accuracy.
Science Direct [11] State of Maharashtra previous years' climate, soil, and yield, (District-wise parameters like temperature, precipitation, humidity, soil type, crop type, season, area of the field) Multiple Linear Regression, DTR, GBR, Elastic Net, Lasso, Ridge, Partial Least Squares Regression, and feature engineering-based LSTM Accuracy
R2, MAE, RMSE
To propose a machine learning model that can make predictions based on various soil and environmental variables. The feature-engineered LSTM model outperforms other models, achieving an accuracy of 86.3 % and demonstrating the lowest mean absolute error and root mean square error. The higher error rate is attributed to the dynamic environmental changes in the districts.
Less favorable features.
PubMed [49] meteorological department of Punjab, statistical abstract of Punjab issued by Economic advisor to Government, Punjab Climatic factors like minimum and maximum temperature, minimum and maximum relative humidity, rainfall, evaporation, wind direction and speed and solar radiation RNN-LSTM, ANN, RF and Multivariate Linear Regression (MVLR) Efficacy
RMSE MAE MSE
To discover an effective deep learning method for predicting wheat crop yields. The RNN-LSTM model proved efficient. Study conducted in a small area.
Using solely statistical models for crop yield prediction.
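The studies summarized above report their results mainly as RMSE, RMSE expressed as a percentage of the average yield, R2, and the Nash-Sutcliffe efficiency (NSE). The following is a minimal sketch of how these metrics are commonly computed, not the code used in any of the reviewed papers. The function names and the sample yield values are hypothetical and chosen only for illustration. It also assumes that R2 may be reported as the squared Pearson correlation, which is one common convention and which differs from the NSE whenever predictions are biased.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error, in the same units as the yields."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def relative_rmse(observed, predicted):
    """RMSE expressed as a percentage of the mean observed yield."""
    mean_obs = sum(observed) / len(observed)
    return 100.0 * rmse(observed, predicted) / mean_obs


def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 - SS_res / SS_tot (1.0 is a perfect fit)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observations and predictions."""
    mean_obs = sum(observed) / len(observed)
    mean_pred = sum(predicted) / len(predicted)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    return cov / math.sqrt(var_obs * var_pred)


# Hypothetical yields in t/ha, made up for this example only.
observed = [3.0, 4.0, 5.0, 6.0]
predicted = [2.5, 4.5, 5.5, 5.5]

print("RMSE:", round(rmse(observed, predicted), 3))
print("RMSE as % of mean yield:", round(relative_rmse(observed, predicted), 2))
print("NSE:", round(nse(observed, predicted), 3))
print("Squared Pearson r:", round(pearson_r(observed, predicted) ** 2, 3))
```

On these made-up values the RMSE is 0.5 t/ha (about 11 % of the mean yield), the NSE is 0.8, and the squared Pearson correlation is about 0.83. The gap between the last two shows why a paper can report slightly different R2 and NSE values for the same model.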

References

  • 1.Modi A., Sharma P., Saraswat D., Mehta R. Review of crop yield estimation using machine learning and deep learning techniques. Scalable Computing. 2022;23(2):59–79. doi: 10.12694/scpe.v23i2.2025. [DOI] [Google Scholar]
  • 2.Goel N., Kaur S., Kumar Y. AI, Edge and IoT-Based Smart Agriculture. Elsevier Inc; 2022. Chapter 23 - machine learning-based remote monitoring and predictive analytics system for crop and livestock. [DOI] [Google Scholar]
  • 3.van Klompenburg T., Kassahun A., Catal C. Crop yield prediction using machine learning: a systematic literature review. Comput. Electron. Agric. 2020;177(July) doi: 10.1016/j.compag.2020.105709. [DOI] [Google Scholar]
  • 4.Xu X., Gao P., Zhu X., Guo W., Ding J., Li C., Zhu M. Design of an integrated climatic assessment indicator (ICAI) for wheat production: a case study in Jiangsu Province, China. Ecol. Indicat. 2019;101(July 2018):943–953. doi: 10.1016/j.ecolind.2019.01.059. [DOI] [Google Scholar]
  • 5.Filippi P. An Approach to Forecast Grain Crop Yield Using Multi-Layered, Multi-Farm Data Sets and Machine Learning. 2019. [DOI] [Google Scholar]
  • 6.Azuaje F. Review of "Data mining: practical machine learning tools and techniques" by Witten and Frank. January 2006. 2014. [DOI]
  • 7.Jin Y., Wang H., Sun C. Data-driven evolutionary optimization. Stud. Comput. Intell. (SCI) 2021;975:103–143. doi: 10.1007/978-3-030-74640-7. [DOI] [Google Scholar]
  • 8.Ansarifar J., Wang L., Archontoulis S.V. An interaction regression model for crop yield prediction. Sci. Rep. 2021;11(1):1–14. doi: 10.1038/s41598-021-97221-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Paudel D., Boogaard H., de Wit A., Janssen S., Osinga S., Pylianidis C., Athanasiadis I.N. Machine learning for large-scale crop yield forecasting. Agric. Syst. 2021;187(December 2020) doi: 10.1016/j.agsy.2020.103016. [DOI] [Google Scholar]
  • 10.Cedric L.S., Adoni W.Y.H., Aworka R., Zoueu J.T., Mutombo F.K., Krichen M., Kimpolo C.L.M. Crops yield prediction based on machine learning models: case of West African countries. Smart Agricultural Technology. 2022;2(December 2021) doi: 10.1016/j.atech.2022.100049. [DOI] [Google Scholar]
  • 11.Iniyan S., Akhil Varma V., Teja Naidu C. Crop yield prediction using machine learning techniques. Adv. Eng. Software. 2023;175(October 2022) doi: 10.1016/j.advengsoft.2022.103326. [DOI] [Google Scholar]
  • 12.Chlingaryan A., Sukkarieh S., Whelan B. Machine learning approaches for crop yield prediction and nitrogen status estimation in precision agriculture: a review. Comput. Electron. Agric. 2018;151(November 2017):61–69. doi: 10.1016/j.compag.2018.05.012. [DOI] [Google Scholar]
  • 13.Liakos K.G., Busato P., Moshou D., Pearson S., Bochtis D. Machine learning in agriculture: a review. Sensors. 2018;18(8):1–29. doi: 10.3390/s18082674. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Young L.J. Agricultural crop forecasting for large geographical areas. Annual Review of Statistics and Its Application. 2019;6(August 2018):173–196. doi: 10.1146/annurev-statistics-030718-105002. [DOI] [Google Scholar]
  • 15.Elavarasan D., Vincent D.R., Sharma V., Zomaya A.Y., Srinivasan K. Forecasting yield by integrating agrarian factors and machine learning models: a survey. Comput. Electron. Agric. 2018;155(October):257–282. doi: 10.1016/j.compag.2018.10.024. [DOI] [Google Scholar]
  • 16.Kamilaris A., Prenafeta-Boldú F.X. Deep learning in agriculture: a survey. Comput. Electron. Agric. 2018;147(July 2017):70–90. doi: 10.1016/j.compag.2018.02.016. [DOI] [Google Scholar]
  • 17.Beulah R. A survey on different data mining techniques for crop yield prediction. International Journal of Computer Sciences and Engineering. 2019;7(1):738–744. doi: 10.26438/ijcse/v7i1.738744. [DOI] [Google Scholar]
  • 18.Koirala A., Walsh K.B., Wang Z., McCarthy C. Deep learning – method overview and review of use for fruit detection and yield estimation. Comput. Electron. Agric. 2019;162(January):219–234. doi: 10.1016/j.compag.2019.04.017. [DOI] [Google Scholar]
  • 19.Häni N., Roy P., Isler V. A comparative study of fruit detection and counting methods for yield mapping in apple orchards. J. Field Robot. 2020;37(2):263–282. doi: 10.1002/rob.21902. [DOI] [Google Scholar]
  • 20.Zhang Q., Liu Y., Gong C., Chen Y., Yu H. Applications of deep learning for dense scenes analysis in agriculture: a review. Sensors. 2020;20(5):1–33. doi: 10.3390/s20051520. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Dharani M.K., Thamilselvan R., Natesan P., Kalaivaani P.C.D., Santhoshkumar S. Review on crop prediction using deep learning techniques. J. Phys. Conf. 2021;1767(1) doi: 10.1088/1742-6596/1767/1/012026. [DOI] [Google Scholar]
  • 22.Darwin B., Dharmaraj P., Prince S., Popescu D.E., Hemanth D.J. Recognition of bloom/yield in crop images using deep learning models for smart agriculture: a review. Agronomy. 2021;11(4):1–22. doi: 10.3390/agronomy11040646. [DOI] [Google Scholar]
  • 23.Maheswari P., Raja P., Apolo-Apolo O.E., Pérez-Ruiz M. Intelligent fruit yield estimation for orchards using deep learning based semantic segmentation techniques—a review. Front. Plant Sci. 2021;12(June):1–18. doi: 10.3389/fpls.2021.684328. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Monteiro A., Santos S., Gonçalves P. Precision agriculture for crop and livestock farming—brief review. Animals. 2021;11(8):1–18. doi: 10.3390/ani11082345. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Rashid M., Bari B.S., Yusup Y., Kamaruddin M.A., Khan N. A comprehensive review of crop yield prediction using machine learning approaches with special emphasis on palm oil yield prediction. IEEE Access. 2021;9:63406–63439. doi: 10.1109/ACCESS.2021.3075159. [DOI] [Google Scholar]
  • 26.Benos L., Tagarakis A.C., Dolias G., Berruto R., Kateris D., Bochtis D. Machine learning in agriculture: a comprehensive updated review. Sensors. 2021;21(11):1–55. doi: 10.3390/s21113758. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Hasan A.S.M.M., Sohel F., Diepeveen D., Laga H., Jones M.G.K. A survey of deep learning techniques for weed detection from images. Comput. Electron. Agric. 2021;184(December 2020) doi: 10.1016/j.compag.2021.106067. [DOI] [Google Scholar]
  • 28.Muruganantham P., Wibowo S., Grandhi S., Samrat N.H., Islam N. A systematic literature review on crop yield prediction with deep learning and remote sensing. Rem. Sens. 2022;14(9) doi: 10.3390/rs14091990. [DOI] [Google Scholar]
  • 29.Oikonomidis A., Catal C., Kassahun A. Deep learning for crop yield prediction: a systematic literature review. N. Z. J. Crop Hortic. Sci. 2023;51(1):1–26. doi: 10.1080/01140671.2022.2032213. [DOI] [Google Scholar]
  • 30.Bouguettaya A., Zarzour H., Kechida A., Taberkit A.M. Deep learning techniques to classify agricultural crops through UAV imagery: a review. Neural Comput. Appl. 2022;34(12):9511–9536. doi: 10.1007/s00521-022-07104-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Ojo M.O., Zahid A. Deep learning in controlled environment agriculture: a review of recent advancements, challenges and prospects. Sensors. 2022;22(20) doi: 10.3390/s22207965. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Tranfield D., Denyer D., Smart P. Towards a methodology for developing evidence-informed management knowledge by means of systematic review. Br. J. Manag. 2003;14(3):207–222. doi: 10.1111/1467-8551.00375. [DOI] [Google Scholar]
  • 33.Page M.J., McKenzie J.E., Bossuyt P.M., Boutron I., Hoffmann T.C., Mulrow C.D., Shamseer L., Tetzlaff J.M., Akl E.A., Brennan S.E., Chou R., Glanville J., Grimshaw J.M., Hróbjartsson A., Lalu M.M., Li T., Loder E.W., Mayo-Wilson E., McDonald S.…Moher D. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. Int. J. Surg. 2021;88(March) doi: 10.1016/j.ijsu.2021.105906. [DOI] [PubMed] [Google Scholar]
  • 34.Sairamkumar S. Design of ANN based machine learning method for crop prediction. 2021;3(3):223–239. [Google Scholar]
  • 35.Amankulova K., Farmonov N., Mukhtorov U., Mucsi L. Sunflower crop yield prediction by advanced statistical modeling using satellite-derived vegetation indices and crop phenology. Geocarto Int. 2023;38(1) doi: 10.1080/10106049.2023.2197509. [DOI] [Google Scholar]
  • 36.Anbananthen K.S.M., Subbiah S., Chelliah D., Sivakumar P., Somasundaram V., Velshankar K.H., Khan M.K.A.A. An intelligent decision support system for crop yield prediction using hybrid machine learning algorithms. F1000Research. 2021;10:1–18. doi: 10.12688/f1000research.73009.1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Khanal S., Fulton J., Klopfenstein A., Douridas N., Shearer S. Integration of high resolution remotely sensed data and machine learning techniques for spatial prediction of soil properties and corn yield. Comput. Electron. Agric. 2018;153(January):213–225. doi: 10.1016/j.compag.2018.07.016. [DOI] [Google Scholar]
  • 38.Prasad N.R. Crop yield prediction in cotton for regional level using random forest approach. Spatial Information Research. 2020 doi: 10.1007/s41324-020-00346-6. [DOI] [Google Scholar]
  • 39.Singh M., Choudhary K., Paringer R., Kupriyanov A. Machine learning for yield prediction in Fergana Valley, Central Asia. Journal of the Saudi Society of Agricultural Sciences. 2022;xxxx doi: 10.1016/j.jssas.2022.07.006. [DOI] [Google Scholar]
  • 40.Venkatesan S., Lim J., Cho Y. A Crop Growth Prediction Model Using Energy Data Based on Machine Learning in Smart Farms. 2022;2022 doi: 10.1155/2022/2648695. [DOI] [PMC free article] [PubMed] [Google Scholar] [Retracted]
  • 41.Ju S., Lim H., Ma J.W., Kim S., Lee K., Zhao S., Heo J. Optimal county-level crop yield prediction using MODIS-based variables and weather data: a comparative study on machine learning models. Agric. For. Meteorol. 2021;307(July) doi: 10.1016/j.agrformet.2021.108530. [DOI] [Google Scholar]
  • 42.Hara P., Piekutowska M., Niedbała G. Selection of independent variables for crop yield prediction using artificial neural network models with remote sensing data. Land. 2021;10(6) doi: 10.3390/land10060609. [DOI] [Google Scholar]
  • 43.Gupta S., Geetha A., Sankaran K.S., Zamani A.S., Ritonga M., Raj R., Ray S., Mohammed H.S. Machine Learning- and Feature Selection-Enabled Framework for Accurate Crop Yield Prediction. 2022;2022 [Google Scholar]
  • 44.Fu S., Chen D., He H., Liu S., Moon S., Peterson K.J., Shen F., Wang L., Wang Y., Wen A., Zhao Y., Liu H., Clinic M., Clinic M. HHS Public Access. 2021;1–41 doi: 10.1016/j.jbi.2020.103526.Clinical. [DOI] [Google Scholar]
  • 45.Waikar V.C., Thorat S.Y., Ghute A.A., Rajput P.P., Shinde M.S. Crop prediction based on soil classification using machine learning with classifier ensembling. International Research Journal of Engineering and Technology. 2020;7(May):4857–4861. www.irjet.net [Google Scholar]
  • 46.Batool D., Shahbaz M., Asif H.S., Shaukat K., Alam T.M., Hameed I.A., Ramzan Z., Waheed A., Aljuaid H., Luo S. A hybrid approach to tea crop yield prediction using simulation models and machine learning. Plants. 2022;11(15):1925. doi: 10.3390/plants11151925. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Elavarasan D., Vincent P.M.D. Crop Yield Prediction Using Deep Reinforcement Learning Model for Sustainable Agrarian Applications. 2020;8 doi: 10.1109/ACCESS.2020.2992480. [DOI] [Google Scholar]
  • 48.Islam T., Chisty T.A., Chakrabarty A. IEEE Region 10 Humanitarian Technology Conference. 2019. A deep neural network approach for crop selection and yield prediction in Bangladesh. R10-HTC, 2018-Decem(1) [DOI] [Google Scholar]
  • 49.Bali N., Singla A. Deep learning based wheat crop yield prediction model in Punjab region of north India. Appl. Artif. Intell. 2021;35(15):1304–1328. doi: 10.1080/08839514.2021.1976091. [DOI] [Google Scholar]
  • 50.Olofintuyi S.S., Olajubu E.A., Olanike D. An ensemble deep learning approach for predicting cocoa yield. Heliyon. 2023;9(4) doi: 10.1016/j.heliyon.2023.e15245. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Alibabaei K., Gaspar P.D., Lima T.M. Crop yield estimation using deep learning based on climate big data and irrigation scheduling. Energies. 2021;14(11):1–21. doi: 10.3390/en14113004. [DOI] [Google Scholar]
  • 52.Lecun Y., Bengio Y., Hinton G. Deep learning. Nature. 2015;521(7553):436–444. doi: 10.1038/nature14539. [DOI] [PubMed] [Google Scholar]
  • 53.Mokhtar A., El-Ssawy W., He H., Al-Anasari N., Sammen S.S., Gyasi-Agyei Y., Abuarab M. Using machine learning models to predict hydroponically grown lettuce yield. Front. Plant Sci. 2022;13(March):1–10. doi: 10.3389/fpls.2022.706042. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Cai Y., Guan K., Peng J., Wang S., Seifert C., Wardlow B., Li Z. A high-performance and in-season classification system of field-level crop types using time-series Landsat data and a machine learning approach. Remote Sensing of Environment. 2018;210(April 2017):35–47. doi: 10.1016/j.rse.2018.02.045. [DOI] [Google Scholar]
  • 55.Yuan Q., Shen H., Li T., Li Z., Li S., Jiang Y., Xu H., Tan W., Yang Q., Wang J., Gao J., Zhang L. Deep learning in environmental remote sensing: achievements and challenges. Remote Sensing of Environment. 2020;241(March 2019) doi: 10.1016/j.rse.2020.111716. [DOI] [Google Scholar]
  • 56.Engen M., Sandø E., Sjølander B.L.O., Arenberg S., Gupta R., Goodwin M. Farm-scale crop yield prediction from multi-temporal data using deep hybrid neural networks. Agronomy. 2021;11(12):1–31. doi: 10.3390/agronomy11122576. [DOI] [Google Scholar]
  • 57.Khaki S., Wang L., Archontoulis S.V. A CNN-RNN Framework for Crop Yield Prediction. 2020;10(January):1–14. doi: 10.3389/fpls.2019.01750. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Coulibaly S., Kamsu-Foguem B., Kamissoko D., Traore D. Deep neural networks with transfer learning in millet crop images. Comput. Ind. 2019;108:115–120. doi: 10.1016/j.compind.2019.02.003. [DOI] [Google Scholar]
  • 59.Krishnamoorthy N., Narasimha Prasad L.V., Pavan Kumar C.S., Subedi Bharat, Haftom Baraki Abraha S.V. Rice leaf diseases prediction using deep neural networks with transfer learning. Environ. Res. 2021;198(April) doi: 10.1016/j.envres.2021.111275. [DOI] [PubMed] [Google Scholar]
  • 60.Servia H., Pareeth S., Michailovsky C.I., de Fraiture C., Karimi P. Operational framework to predict field level crop biomass using remote sensing and data driven models. Int. J. Appl. Earth Obs. Geoinf. 2022;108 doi: 10.1016/j.jag.2022.102725. [DOI] [Google Scholar]
  • 61.Venugopal A., Aparna S., Mani J., Mathew R., Williams P.V. Crop yield prediction using machine learning algorithms. 2021;9(13):87–91. [Google Scholar]
  • 62.Bouras E.H., Jarlan L., Er-Raki S., Balaghi R., Amazirh A., Richard B., Khabba S. Cereal yield forecasting with satellite drought-based indices, weather data and regional climate indices using machine learning in Morocco. Rem. Sens. 2021;13(16) doi: 10.3390/rs13163101. [DOI] [Google Scholar]
  • 63.Cao J., Wang H., Li J., Tian Q., Niyogi D. Improving the forecasting of winter wheat yields in northern China with machine learning–dynamical hybrid subseasonal-to-seasonal ensemble prediction. Rem. Sens. 2022;14(7) doi: 10.3390/rs14071707. [DOI] [Google Scholar]
  • 64.Huber F., Yushchenko A., Stratmann B., Steinhage V. Extreme gradient boosting for yield estimation compared with deep learning approaches. Comput. Electron. Agric. 2022;202 doi: 10.1016/j.compag.2022.107346. [DOI] [Google Scholar]
  • 65.Oikonomidis A., Catal C., Kassahun A. Hybrid deep learning-based models for crop yield prediction. Appl. Artif. Intell. 2022;36(1) doi: 10.1080/08839514.2022.2031823. [DOI] [Google Scholar]
  • 66.Rajinikanth T.V., Sri N.T., Saikrishna A.Y. Agriculture Crop Yield Analysis and Prediction using Feature Selection based Machine Learning Techniques. 2022;8958(2):99–108. doi: 10.35940/ijeat.B3942.1212222. [DOI] [Google Scholar]
  • 67.Amaratunga V., Wickramasinghe L., Perera A., Jayasinghe J., Rathnayake U., Zhou J.G. Artificial neural network to estimate the paddy yield prediction using climatic data. Math. Probl Eng. 2020;2020 doi: 10.1155/2020/8627824. [DOI] [Google Scholar]
  • 68.Kouadio L., Deo R.C., Byrareddy V., Adamowski J.F. Artificial intelligence approach for the prediction of Robusta coffee yield using soil fertility properties. Comput. Electron. Agric. 2018;155(October):324–338. doi: 10.1016/j.compag.2018.10.014. [DOI] [Google Scholar]
  • 69.Tripathi A., Tiwari R.K., Tiwari S.P. A deep learning multi-layer perceptron and remote sensing approach for soil health based crop yield estimation. Int. J. Appl. Earth Obs. Geoinf. 2022;113(April) doi: 10.1016/j.jag.2022.102959. [DOI] [Google Scholar]
  • 70.Elavarasan D., Vincent P M D.R., Srinivasan K., Chang C.-Y. A hybrid CFS filter and RF-RFE wrapper-based feature extraction for enhanced agricultural crop yield prediction modeling. Agriculture. 2020;10(9):400. doi: 10.3390/agriculture10090400. [DOI] [Google Scholar]
  • 71.Joshua S.V., Priyadharson A.S.M., Kannadasan R., Khan A.A., Lawanont W., Khan F.A., Rehman A.U., Ali M.J. Crop yield prediction using machine learning approaches on a wide spectrum. Comput. Mater. Continua (CMC) 2022;72(3):5663–5679. doi: 10.32604/cmc.2022.027178. [DOI] [Google Scholar]
  • 72.Joshua V., Priyadharson S.M., Kannadasan R. Exploration of machine learning approaches for paddy yield prediction in eastern part of Tamilnadu. Agronomy. 2021;11(10) doi: 10.3390/agronomy11102068. [DOI] [Google Scholar]
  • 73.Talasila V. Analysis and Prediction of Crop Production in Andhra Region Using Deep Convolutional Regression Network. 2020;13(5):1–9. doi: 10.22266/ijies2020.1031.01. [DOI] [Google Scholar]
  • 74.Cao J., Zhang Z., Tao F., Zhang L., Luo Y., Han J., Li Z. Identifying the contributions of multi-source data for winter wheat yield prediction in China. Rem. Sens. 2020;12(5):1–22. doi: 10.3390/rs12050750. [DOI] [Google Scholar]
  • 75.Qiao M., He X., Cheng X., Li P., Luo H., Zhang L., Tian Z. Crop yield prediction from multi-spectral, multi-temporal remotely sensed imagery using recurrent 3D convolutional neural networks. Int. J. Appl. Earth Obs. Geoinf. 2021;102(April) doi: 10.1016/j.jag.2021.102436. [DOI] [Google Scholar]
  • 76.Khaki S., Pham H., Wang L. Simultaneous corn and soybean yield prediction from remote sensing data using deep transfer learning. Sci. Rep. 2021:1–14. doi: 10.1038/s41598-021-89779-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 77.Iniyan S., Jebakumar R. Mutual information feature selection (MIFS) based crop yield prediction on corn and soybean crops using multilayer stacked ensemble regression (MSER) Wireless Pers. Commun. 2022;126(3):1935–1964. doi: 10.1007/s11277-021-08712-9. [DOI] [Google Scholar]
  • 78.Kundu S.G., Ghosh A., Kundu A., G P G. A ml-ai enabled ensemble model for predicting agricultural yield. Cogent Food Agric. 2022;8(1) doi: 10.1080/23311932.2022.2085717. [DOI] [Google Scholar]
  • 79.Moussaid A., El Fkihi S., Zennayi Y., Lahlou O., Kassou I., Bourzeix F., El Mansouri L., Imani Y. Machine learning applied to tree crop yield prediction using field data and satellite imagery: a case study in a citrus orchard. Informatics. 2022;9(4) doi: 10.3390/informatics9040080. [DOI] [Google Scholar]
  • 80.Raja S.P., Sawicka B., Stamenkovic Z., Mariammal G. Crop prediction based on characteristics of the agricultural environment using various feature selection techniques and classifiers. IEEE Access. 2022;10(March):23625–23641. doi: 10.1109/ACCESS.2022.3154350. [DOI] [Google Scholar]
  • 81.Shahhosseini M., Hu G., Huber I., Archontoulis S.V. Coupling machine learning and crop modeling improves crop yield prediction in the US Corn Belt. Sci. Rep. 2021;123456789:1–15. doi: 10.1038/s41598-020-80820-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 82.Pejak B., Lugonja P., Antić A., Panić M., Pandžić M., Alexakis E., Mavrepis P., Zhou N., Marko O., Crnojević V. Soya yield prediction on a within-field scale using machine learning models trained on sentinel-2 and soil data. Rem. Sens. 2022;14(9):1–22. doi: 10.3390/rs14092256. [DOI] [Google Scholar]
  • 83.Chen X., Feng L., Yao R., Wu X., Sun J., Gong W. Prediction of maize yield at the city level in China using multi-source data. Rem. Sens. 2021;13(1):1–17. doi: 10.3390/rs13010146. [DOI] [Google Scholar]
  • 84.Sajid S.S., Shahhosseini M., Huber I., Hu G., Archontoulis S.V. County-scale crop yield prediction by integrating crop simulation with machine learning models. Front. Plant Sci. 2022;13(November):1–16. doi: 10.3389/fpls.2022.1000224. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 85.Srivastava A.K., Safaei N., Khaki S., Lopez G., Zeng W. Winter wheat yield prediction using convolutional neural networks from environmental and phenological data. Sci. Rep. 2022:1–14. doi: 10.1038/s41598-022-06249-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 86.Shook J., Gangopadhyay T., Wu L., Ganapathysubramanian B., Id S.S., Singh A.K. Crop yield prediction integrating genotype and weather variables using deep learning. 2021;1–19 doi: 10.1371/journal.pone.0252402. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 87.Jui S.J.J., Ahmed A.A.M., Bose A., Raj N., Sharma E., Soar J., Chowdhury M.W.I. Spatiotemporal hybrid random forest model for tea yield prediction using satellite-derived variables. Rem. Sens. 2022;14(3):1–18. doi: 10.3390/rs14030805. [DOI] [Google Scholar]
  • 88.Khaki S., Wang L. Crop Yield Prediction Using Deep Neural Networks. 2019;10(May):1–10. doi: 10.3389/fpls.2019.00621. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 89.Nevavuori P., Narra N., Lipping T. Crop yield prediction with deep convolutional neural networks. Comput. Electron. Agric. 2019;163(June) doi: 10.1016/j.compag.2019.104859. [DOI] [Google Scholar]
  • 90.Rudin C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 2019;1(5):206–215. doi: 10.1038/s42256-019-0048-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 91.Elbadri M., Abdoon M.A., Berir M., Almutairi D.K. A symmetry chaotic model with fractional derivative order via two different methods. Symmetry. 2023;15(6):1–12. doi: 10.3390/sym15061151. [DOI] [Google Scholar]
  • 92.Guma F.E.L., Badawy O.M., Berir M., Abdoon M.A. Numerical analysis of fractional-order dynamic dengue disease epidemic in Sudan. Journal of the Nigerian Society of Physical Sciences. 2023;5(2):1–6. doi: 10.46481/jnsps.2023.1464. [DOI] [Google Scholar]
  • 93.Abdoon M.A. Programming first integral method general formula for the solving linear and nonlinear equations. Appl. Math. 2015;6(3):568–575. doi: 10.4236/am.2015.63051. [DOI] [Google Scholar]
  • 94.Ali M., Deo R.C., Downs N.J., Maraseni T. Multi-stage committee based extreme learning machine model incorporating the influence of climate parameters and seasonality on drought forecasting. Comput. Electron. Agric. 2018;152(July):149–165. doi: 10.1016/j.compag.2018.07.013. [DOI] [Google Scholar]
  • 95.Chakrabarty A., Mansoor N., Uddin M.I., Al-Adaileh M.H., Alsharif N., Alsaade F.W. Prediction approaches for smart cultivation: a comparative study. Complexity. 2021;2021 doi: 10.1155/2021/5534379. [DOI] [Google Scholar]
  • 96.Pant J., Pant R.P., Kumar M., Pratap D., Pant H. Analysis of agricultural crop yield prediction using statistical techniques of machine learning. Mater. Today: Proc. 2021;xxxx:1–5. doi: 10.1016/j.matpr.2021.01.948. [DOI] [Google Scholar]
  • 97.Patil A., Patil P., Kokate P.S. Crop prediction system using machine learning algorithms. International Journal of Advancements in Engineering and Technology (IJAET). 2020;1(1):1–8. [Google Scholar]
  • 98.Abbas F., Afzaal H., Farooque A.A., Tang S. Crop yield prediction through proximal sensing and machine learning algorithms. Agronomy. 2020;10(7) doi: 10.3390/AGRONOMY10071046. [DOI] [Google Scholar]
  • 99.Zhang Z., Jin Y., Chen B., Brown P. California almond yield prediction at the orchard level with a machine learning approach. Front. Plant Sci. 2019;10(July):1–18. doi: 10.3389/fpls.2019.00809. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 100.Cai Y., Guan K., Lobell D., Potgieter A.B., Wang S., Peng J., Xu T., Asseng S., Zhang Y., You L., Peng B. Integrating satellite and climate data to predict wheat yield in Australia using machine learning approaches. Agric. For. Meteorol. 2019;274(February):144–159. doi: 10.1016/j.agrformet.2019.03.010. [DOI] [Google Scholar]
  • 101.Feng P., Wang B., Liu D.L., Waters C., Yu Q. Incorporating machine learning with biophysical model can improve the evaluation of climate extremes impacts on wheat yield in south-eastern Australia. Agric. For. Meteorol. 2019;275(May):100–113. doi: 10.1016/j.agrformet.2019.05.018. [DOI] [Google Scholar]
  • 102.Wang X., Huang J., Feng Q., Yin D. Winter wheat yield prediction at county level and uncertainty analysis in main wheat-producing regions of China with deep learning approaches. 2020. [DOI]
  • 103.Gan H., She Q., Ma Y., Wu W., Meng M. Generalization improvement for regularized least squares classification. Neural Comput. Appl. 2019;31:1045–1051. doi: 10.1007/s00521-017-3090-9. [DOI] [Google Scholar]
  • 104.Zhang Y., Qin Q., Ren H., Sun Y., Li M. Optimal Hyperspectral Characteristics Determination for Winter Wheat Yield Prediction. n.d.:1–18. doi: 10.3390/rs10122015. [DOI]
  • 105.Zoppis I., Mauri G., Dondi R. Encyclopedia Of Bioinformatics And Computational Biology: ABC of Bioinformatics (Vols. 1–3) Elsevier Ltd; 2018. Kernel methods: support vector machines. [DOI] [Google Scholar]
  • 106.Kang Y., Ozdogan M., Zhu X., Ye Z., Hain C., Anderson M. Comparative assessment of environmental variables and machine learning algorithms for maize yield prediction in the US Midwest. Environ. Res. Lett. 2020;15(6) doi: 10.1088/1748-9326/ab7df9. [DOI] [Google Scholar]
  • 107.Yousefi D.B M., Mohd Rafie A.S., Abd Aziz S., Azrad S., Mazmira Mohd Masri M., Shahi A., Marzuki O.F. Classification of oil palm female inflorescences anthesis stages using machine learning approaches. Information Processing in Agriculture. 2021;8(4):537–549. doi: 10.1016/j.inpa.2020.11.007.
  • 108.Wang T., Liang J., Liu X. Soil moisture retrieval algorithm based on TFA and CNN. IEEE Access. 2019;7:597–604. doi: 10.1109/ACCESS.2018.2885565.
  • 109.Maya Gopal P.S., Bhargavi R. Performance evaluation of best feature subsets for crop yield prediction using machine learning algorithms. Appl. Artif. Intell. 2019;33(7):621–642. doi: 10.1080/08839514.2019.1592343.
  • 110.Zhao J., Yang X., Sun S. Constraints on maize yield and yield stability in the main cropping regions in China. Eur. J. Agron. 2018;99(August 2017):106–115. doi: 10.1016/j.eja.2018.07.003.
  • 111.Haghverdi A., Washington-Allen R.A., Leib B.G. Prediction of cotton lint yield from phenology of crop indices using artificial neural networks. Comput. Electron. Agric. 2018;152(July):186–197. doi: 10.1016/j.compag.2018.07.021.
  • 112.Panek E., Gozdowski D. Analysis of relationship between cereal yield and NDVI for selected regions of Central Europe based on MODIS satellite data. Remote Sens. Appl.: Society and Environment. 2020;17(August 2019) doi: 10.1016/j.rsase.2019.100286.
  • 113.Vallentin C., Harfenmeister K., Itzerott S., Kleinschmit B., Conrad C., Spengler D. Suitability of satellite remote sensing data for yield estimation in northeast Germany. Precision Agriculture. 2022;23(1).
  • 114.Pham H.T., Awange J. Evaluation of three feature dimension reduction techniques for machine learning-based crop yield prediction models. 2022:1–18.
  • 115.Pan S.J., Yang Q. A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 2010;22(10):1345–1359. doi: 10.1109/TKDE.2009.191.
  • 116.Gumma M.K., Thenkabail P.S., Panjala P., Teluguntla P., Yamano T., Mohammed I. Multiple agricultural cropland products of South Asia developed using Landsat-8 30 m and MODIS 250 m data using machine learning on the Google Earth Engine (GEE) cloud and spectral matching techniques (SMTs) in support of food and water security. GIScience Remote Sens. 2022;59(1):1048–1077. doi: 10.1080/15481603.2022.2088651.
  • 117.Morales G., Sheppard J.W., Hegedus P.B., Maxwell B.D. Improved yield prediction of winter wheat using a novel two-dimensional deep regression neural network trained via remote sensing. Sensors. 2023;23(1) doi: 10.3390/s23010489.
  • 118.Cao J., Zhang Z., Luo Y., Zhang L., Zhang J., Li Z., Tao F. Wheat yield predictions at a county and field scale with deep learning, machine learning, and Google Earth Engine. Eur. J. Agron. 2021;123(December 2020) doi: 10.1016/j.eja.2020.126204.
  • 119.Cheng E., Zhang B., Peng D., Zhong L., Yu L., Liu Y., Xiao C., Li C., Li X., Chen Y., Ye H., Wang H., Yu R., Hu J., Yang S. Wheat yield estimation using remote sensing data based on machine learning approaches. Front. Plant Sci. 2022;13(December):1–16. doi: 10.3389/fpls.2022.1090970.
  • 120.Jeong S., Ko J., Yeom J.M. Predicting rice yield at pixel scale through synthetic use of crop and deep learning models with satellite data in South and North Korea. Sci. Total Environ. 2022;802 doi: 10.1016/j.scitotenv.2021.149726.

Data Availability Statement

Data will be made available on request.


Articles from Heliyon are provided here courtesy of Elsevier
