Application of Machine Learning in Hydraulic Fracturing: A Review

Yulin Ma; Man Ye

doi:10.1021/acsomega.4c11342

2025 Mar 14;10(11):10769–10785. doi: 10.1021/acsomega.4c11342


PMCID: PMC11947802 PMID: 40160741

Abstract

Hydraulic fracturing is a widely used technology for increasing oil and gas production, and accurate prediction of postfracturing production capacity is key to the efficient development of oil and gas fields. However, the multiplicity and asymmetry of reservoir parameters, together with the highly nonlinear nature of fluid flow, often make it challenging to predict production behavior by semianalytical modeling and numerical simulation. Drawing on research into the application of machine learning (ML) methods in hydraulic fracturing, this paper analyzes the limitations and applicability of classical ML algorithms and combined models, summarizes the practical applications of ML in hydraulic fracturing operations, and discusses how ML algorithms can assist hydraulic fracturing analysis and improve postfracturing production rates. Finally, the development of interpretable modeling methods based on knowledge embedding and knowledge discovery is identified as both a challenge and a future direction for hydraulic fracturing research.

1. Introduction

Hydraulic fracturing involves injecting a high-energy pressurized fracturing fluid into a reservoir to enhance hydrocarbon extraction and ultimate recovery by forming a network of highly conductive fractures around the borehole. This fracture network not only improves the hydraulic conductivity of the reservoir rock but also increases the surface area conducive to hydrocarbon production.

In 1947, the first hydraulic fracturing treatment was performed on a gas well operated by Pan American Oil Company in the Hugoton field. This operation, conducted in a vertical well, created a simple bi-wing fracture to increase gas production. Since then, hydraulic fracturing has become a commonly used method to stimulate the productivity of oil and gas wells, and the extensive fracture networks formed in multistage fracturing have significantly boosted hydrocarbon production, making it economically viable for the oil and gas industry to exploit vast hydrocarbon resources in previously untapped tight unconventional reservoirs.¹ As the global oil and gas sector has evolved, hydraulic fracturing technology has become increasingly sophisticated. Its development began with small-scale fracturing with minor sand addition to remove near-wellbore damage. Over time, sand volumes and treatment scale gradually increased to improve the hydraulic conductivity of low-permeability reservoirs, leading to medium-sized fracturing techniques. The technology was later applied to medium- and high-permeability reservoirs, where tip screen-out fracturing substantially improved reservoir conductivity. Eventually, hydraulic fracturing evolved into large-scale applications across reservoir systems. Throughout the ongoing development of oil fields, failed wells and wells with declining production continue to pose challenges. To address these issues and enhance production, refracturing technology has been proposed and implemented. Additionally, for shale gas production, multistage fracturing, a targeted treatment method based on reservoir characteristics, has been developed, offering precise targeting and effective fracturing results.

Production prediction after hydraulic fracturing is a prerequisite for the detailed design of hydraulic fracturing parameters, and accurate, rapid prediction of shale gas production has been a focus and a challenge of research worldwide. Traditional production prediction methods are based on existing production data. In 1945, Arps² classified production decline into three types (exponential, hyperbolic, and harmonic) in order to determine the decline parameters and predict future performance, and proposed a type-curve matching method that has been referenced in almost all subsequent decline curve studies. The power-law exponential (PLE) decline method, which is more general than the Arps relations, was developed by Ilk et al.³ to analyze the decline pattern of shale gas reservoirs. Because fluid flow differs between artificial and natural fractures, predictions by the PLE method and the Arps decline curve often underestimate hydrocarbon production. For both hydraulic fracturing simulation and reservoir simulation, production capacity predicted by numerical simulation is often inaccurate because of the extremely high uncertainty in fluid transport during fracturing and the complexity of the reservoir, the fracture properties, and other factors. In recent years, with the rapid development of artificial intelligence (AI), data-driven techniques have been widely used in natural language processing, image recognition, medicine, materials science, and petroleum engineering.⁴⁻⁸ As hydraulic fracturing has matured, the quantity and quality of logging data have increased, providing a favorable foundation for implementing ML in hydraulic fracturing operations.
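
To make the decline relations mentioned above concrete, the following minimal Python sketch evaluates the Arps exponential, hyperbolic, and harmonic declines together with a PLE decline. The function names, parameter values, and time units are illustrative assumptions and are not taken from the cited studies; the formulas are the commonly quoted forms, in which b = 0 gives exponential decline and b = 1 gives harmonic decline.

```python
import math

def arps_rate(t, qi, di, b):
    """Arps decline rate q(t): b = 0 exponential, 0 < b < 1 hyperbolic, b = 1 harmonic."""
    if b == 0:
        return qi * math.exp(-di * t)
    return qi / (1.0 + b * di * t) ** (1.0 / b)

def ple_rate(t, qi, d_inf, d_hat, n):
    """Power-law exponential decline: q(t) = qi * exp(-d_inf * t - d_hat * t**n)."""
    return qi * math.exp(-d_inf * t - d_hat * t ** n)

# Hypothetical parameters for illustration only (arbitrary rate units, time in months).
qi, di = 1000.0, 0.10
for b, name in [(0.0, "exponential"), (0.5, "hyperbolic"), (1.0, "harmonic")]:
    rates = [round(arps_rate(t, qi, di, b), 1) for t in (0, 12, 24)]
    print(name, rates)
print("PLE", [round(ple_rate(t, qi, 0.001, 0.3, 0.5), 1) for t in (0, 12, 24)])
```

For the same initial rate and initial decline, the exponential curve falls fastest and the harmonic curve slowest, which is why the choice of b strongly affects long-term forecasts.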

Meanwhile, rapidly advancing ML techniques have attracted extensive attention from researchers, particularly for their strong performance in predicting production capacity after hydraulic fracturing. It is therefore essential to review and organize existing ML methods in hydraulic fracturing operations. This review focuses on the application of ML methods in hydraulic fracturing operations, summarizes lessons learned from previous studies, and explores future directions for accurate ML-based hydraulic fracturing prediction. Figure 1 shows the workflow of the entire paper. Section 2 introduces the ML methods applied to hydraulic fracturing prediction and analyzes in detail the benefits, drawbacks, and limitations of the data-driven methods and their combined models. Section 3 examines the specific applications of ML methods in hydraulic fracturing operations and summarizes insights gained from real-world oilfield cases. Section 4 highlights challenges in current research and proposes future directions for ML-based hydraulic fracturing prediction. Finally, Section 5 summarizes the entire work.

In the oil and gas industry, accurate production capacity forecasting is very important for decision-making, and data analysis plays a key role in the construction and evaluation of forecasting models. Table 1 summarizes the process for forecasting oil and gas production capacity, which comprises four main steps: data collection, data preprocessing, model building, and production forecasting. Because hydraulic fracturing data are nonlinear and multivariate, several issues deserve attention when ML is applied to them in practice.

Table 1. Procedures of Hydrocarbon Production Forecast by Data.

procedure	operation
Data Collection	Collect relevant data on oil and gas production and factors affecting production capacity
Data Preprocessing
Data Cleaning	Clean and process the data, dealing with missing values and outliers
Normalization	Elimination of numerical differences between physical variables
Feature Selection	Select the characteristic variables that affect the production capacity
Data Segmentation	Divide the data set into training and test sets
Model building
Model Selection	Select the appropriate predictive model
Model Training	Train the model using the training set
Model Evaluation	Evaluate the model performance using the test set
Production forecast	Use the trained model for oil and gas production prediction

1.1. Classification and Labeling of Hydraulic Fracturing Data

The classification and labeling of hydraulic fracturing data are crucial for selecting appropriate ML algorithms and conducting feature engineering. This work provides a detailed statistical analysis of hydraulic fracturing data. As shown in Table 2, hydraulic fracturing data are divided into four categories (geological characteristics, engineering parameters, monitoring data, and production data), and the data type of each parameter is labeled. The data are generally either numerical or categorical.

Table 2. Classification of Hydraulic Fracturing Data.

parameter	description	data type
Geological characteristics	The data describing the target formation and rock properties directly affect the fracturing design and effectiveness
Rock properties
Rock type	Such as sandstone, shale, limestone, etc	Categorical
Porosity	The ratio of pore volume to total volume in rocks affects the storage capacity of reservoirs	Numerical type
Permeability	The ability of rocks to allow fluids to pass through and affect the flow of oil and gas	Numerical type
Young’s modulus	The elastic modulus of rocks reflects their rigidity	Numerical type
Poisson’s ratio	The ratio of transverse strain to longitudinal strain of rocks under stress	Numerical type
Compressive strength	The ability of rocks to resist compressive failure	Numerical type
Natural fracture density	The number and distribution of natural fractures in the geological strata	Numerical type
Formation properties
Formation thickness	Thickness of the target formation	Numerical type
Formation pressure	The original pressure of the formation	Numerical type
Fracture pressure	The minimum pressure required to rupture the formation	Numerical type
In-situ stress	The stress distribution in the formation includes maximum horizontal principal stress, minimum horizontal principal stress, and vertical stress	Numerical type
Temperature	Temperature of the stratum	Numerical type
Reservoir properties
Hydrocarbon saturation	The proportion of pore volume occupied by oil and gas in the formation	Numerical type
Reservoir depth	Depth of target formation	Numerical type
Engineering parameters	Data describing the techniques and operating conditions used during the fracturing construction process
Properties of fracturing fluid
Fracturing fluid type	Such as clear water, gel, foam, etc.	Categorical
Viscosity	The flow resistance of fracturing fluid	Numerical type
Density	The ratio of mass to volume of fracturing fluid	Numerical type
pH value	The acidity and alkalinity of fracturing fluid	Numerical type
Additive	Such as drag reducing agents, proppant carriers, etc.	Categorical
Proppant properties
Proppant type	Such as quartz sand, ceramic particles, etc	Categorical
Proppant size	The particle size of the proppant	Numerical type
Proppant concentration	The proportion of proppant in fracturing fluid	Numerical type
Injection parameters
Injection rate	The injection rate of fracturing fluid	Numerical type
Injection pressure	Injection pressure of fracturing fluid	Numerical type
Injection volume	Total injection volume of fracturing fluid and proppant	Numerical type
Number of fracturing sections	Number of sections for hydraulic fracturing construction	Numerical type
Monitoring data	Real time data collected during the fracturing process is used to evaluate the fracturing effect and adjust construction parameters.
Microseismic data
Event location	Three dimensional coordinates of microseismic events	Numerical type
Event magnitude	The energy magnitude of microseismic events	Numerical type
Event frequency	Frequency of occurrence of microseismic events	Numerical type
Bottom hole pressure data
Real time bottomhole pressure	The pressure changes at the bottom of the well during the fracturing process	Numerical type
Rupture pressure	Pressure during formation rupture	Numerical type
Crack propagation data
Fracture length	Horizontal extension length of cracks	Numerical type
Fracture width	The width of the crack opening	Numerical type
Fracture height	Vertical extension height of cracks	Numerical type
Crack direction	The direction of crack propagation	Numerical type
Production data	The production performance of oil and gas wells after fracturing is used to evaluate the fracturing effect.
Production data
Oil and gas production	Daily oil and gas production after fracturing	Numerical type
Accumulated production	Accumulated oil and gas production after fracturing	Numerical type
Production pressure data
Wellhead pressure	Wellhead pressure of production well	Numerical type
Bottomhole flow pressure	Bottom flow pressure of production well	Numerical type
Production dynamic data
Productivity index	Production per unit pressure difference	Numerical type
Decline rate	The rate of decline in production over time	Numerical type

1.2. Data Preprocessing for Hydrocarbon Production Forecast

The quality of the training data greatly affects the predictive ability of ML models. Training data often come from field records and from numerical models. Field data recorded during hydraulic fracturing inevitably contain noise and fluctuations, and on-site recording is often incomplete. In addition, the features in training data generated by numerical models have different physical dimensions, which may lead to large differences in numerical magnitude. The model may then learn too much from variables with large values and too little from variables with small values, resulting in poor training performance. Therefore, data preprocessing is required. Table 3 summarizes the general steps of data preprocessing, which include data cleaning, normalization, correlation analysis, and data segmentation. High-quality data can be obtained by deleting or filling missing values. In feature normalization, maximum-minimum normalization and Z-score standardization are used to scale numerical data, while categorical data are usually encoded with the one-hot encoding technique. Feature selection is a crucial step in the ML process, especially for multiparameter, high-dimensional engineering data such as hydraulic fracturing data, because it can effectively remove redundant features, improve model performance, and enhance interpretability.
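
The scaling and encoding steps described above can be sketched in a few lines of standard-library Python. The helper names and the sample values (fluid volumes in m³ and rock types) are hypothetical and serve only to illustrate the arithmetic; production code would more likely rely on an established preprocessing library.

```python
import statistics

def min_max_scale(values):
    """Scale values linearly to [0, 1]; assumes the values are not all identical."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

def one_hot(labels):
    """Encode each label as a 0/1 vector over the sorted set of categories."""
    categories = sorted(set(labels))
    return [[1 if label == c else 0 for c in categories] for label in labels]

# Hypothetical feature values for illustration only.
fluid_volume_m3 = [200.0, 500.0, 1000.0]
rock_type = ["shale", "sandstone", "shale"]

print(min_max_scale(fluid_volume_m3))
print([round(z, 3) for z in z_score(fluid_volume_m3)])
print(one_hot(rock_type))
```

In practice, the scaling statistics (minimum, maximum, mean, standard deviation) should be computed on the training set only and then reused for the test set, so that no information leaks from the test data into training.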

Table 3. Main Process of Data Preprocessing.

data preprocessing	method	description
Dealing with missing values	Delete missing values	Directly delete rows and columns containing missing values when there are few missing values and they are randomly distributed.
	Interpolation filling	Suitable for filling missing values using linear interpolation, polynomial interpolation, etc. when data have temporal or spatial continuity
	Statistical values filling	Suitable for filling missing values with mean, median, or mode when the data distribution is uniform
	Prediction model filling	Suitable for predicting and filling missing values using models such as KNN when the data are complex and there are many missing values
Data Cleaning	Remove noise	Using filtering or smoothing techniques to remove noise
	Handling Outliers	Identify and handle outliers using statistical methods such as Z-score and IQR
	Standardization of Format	Unified date, unit, and other formats (such as converting psi to MPa)
Normalization	Maximum-minimum normalization	Scale the data to the range of [0, 1], applicable when the data distribution is bounded (such as fracturing fluid volume between 0 and 1000 m³)
	Z-score standardization	Convert data into a distribution with a mean of 0 and a standard deviation of 1, suitable for data distributions that are approximately normal (such as geological stress data)
	One-hot encoding	One-hot encoding is commonly used to handle features that do not have a size relationship between categories
Feature Selection	Principal Component Analysis	Feature correlation analysis
	Gray correlation analysis	Important feature ranking
	Decision tree/Random Forest	Important feature extraction
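
As a rough illustration of the missing-value and cleaning steps in Table 3, the sketch below fills gaps with the median or by linear interpolation, flags outliers with the IQR rule, and converts psi to MPa. All function names and sample values are assumptions made for this example; the 1.5 × IQR fence is a common convention rather than a value prescribed here.

```python
import statistics

def fill_with_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]

def fill_linear(values):
    """Linearly interpolate interior None entries of an evenly sampled series.
    Leading or trailing None entries are left unchanged."""
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return [v for v in values if v < q1 - k * iqr or v > q3 + k * iqr]

def psi_to_mpa(psi):
    """Unit standardization: 1 psi is approximately 0.00689476 MPa."""
    return psi * 0.00689476

# Hypothetical records for illustration only.
print(fill_with_median([1.0, None, 3.0, 100.0]))
print(fill_linear([10.0, None, None, 16.0]))
print(iqr_outliers([30, 31, 32, 33, 34, 35, 36, 90]))
print(round(psi_to_mpa(5000), 2))
```

Interpolation suits series with temporal or spatial continuity, such as pressure recorded over time, whereas median filling is the simpler choice for unordered well-level features.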

1.3. Hyperparameter Adjustment and Model Generalization Ability

Hyperparameter tuning first requires identifying the key hyperparameters. The core parameters differ between models; for example, the hyperparameters of an SVM are the regularization parameter C and the kernel function, while those of a neural network include the learning rate, batch size, and number of layers. Hyperparameters are tuned by defining a parameter space. Continuous parameters are set within a reasonable range; for example, the learning rate is set within [0.001, 0.1]. Discrete parameters are given a set of candidate values; for example, the tree depth is chosen from {3, 5, 7}. The accuracy of hydrocarbon production predictions is then measured with evaluation metrics such as MSE, RMSE, and RSE.
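
The idea of searching a parameter space can be illustrated with a small grid search. The sketch below is an assumption-laden toy: the model is a one-feature linear regression trained by gradient descent, the discrete hyperparameter is the number of epochs (standing in for something like tree depth), the data are invented, and RMSE on a validation set is used as the selection metric.

```python
import itertools
import math

def mse(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    return math.sqrt(mse(y_true, y_pred))

def fit_line(xs, ys, learning_rate, epochs):
    """Fit y = w*x + b by batch gradient descent on the MSE loss."""
    w = b = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Hypothetical, already normalized data generated from y = 2*x + 0.5.
x_train, y_train = [0.0, 0.2, 0.4, 0.6, 0.8], [0.5, 0.9, 1.3, 1.7, 2.1]
x_val, y_val = [0.1, 0.5, 0.9], [0.7, 1.5, 2.3]

# Parameter space: candidate values for each hyperparameter.
param_space = {"learning_rate": [0.001, 0.01, 0.1], "epochs": [100, 1000]}

results = []
for lr, epochs in itertools.product(*param_space.values()):
    w, b = fit_line(x_train, y_train, lr, epochs)
    score = rmse(y_val, [w * x + b for x in x_val])
    results.append((score, lr, epochs))

best_score, best_lr, best_epochs = min(results)
print("best:", best_lr, best_epochs, "validation RMSE:", best_score)
```

Exhaustive grids grow multiplicatively with the number of hyperparameters, so random or Bayesian search is often preferred when the space is large.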

The generalization ability of a model refers to its ability to perform well on unseen data and is a core indicator of the performance of ML models. The total data set is split into training, validation, and test sets, and cross-validation is then performed to evaluate the predictive performance of the model. Note that time series data must be split in chronological order.
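
For time series such as monthly production, a chronological split and an expanding-window form of cross-validation can be sketched as follows. The split fractions, fold counts, and function names are illustrative assumptions; the only property that matters is that every training index precedes every evaluation index.

```python
def chronological_split(series, train_frac=0.7, val_frac=0.15):
    """Split an ordered series into train/validation/test without shuffling."""
    n = len(series)
    train_end = round(n * train_frac)
    val_end = train_end + round(n * val_frac)
    return series[:train_end], series[train_end:val_end], series[val_end:]

def expanding_window_folds(n_samples, n_folds, min_train):
    """Yield (train_indices, test_indices); training data always precede test data."""
    fold_size = (n_samples - min_train) // n_folds
    for k in range(n_folds):
        train_end = min_train + k * fold_size
        test_end = train_end + fold_size
        yield list(range(train_end)), list(range(train_end, test_end))

# Hypothetical monthly production index 0..19 (oldest first).
months = list(range(20))
train, val, test = chronological_split(months)
print(len(train), len(val), len(test))

for train_idx, test_idx in expanding_window_folds(len(months), n_folds=3, min_train=8):
    print("train up to", train_idx[-1], "-> test", test_idx[0], "to", test_idx[-1])
```

Randomly shuffled folds would let the model see future production when predicting the past, which tends to overstate generalization ability.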

The classification and labeling of hydraulic fracturing data are crucial for us to deeply analyze data features and select the appropriate machine learning algorithms. Data preprocessing is an important step before making predictions, and the quality of the data will affect the performance and reliability of subsequent learning models. The adjustment of hyperparameters is of utmost importance in model optimization, which affects the prediction accuracy of the model.

2. ML Methods

The high complexity of fracturing data poses a significant challenge for numerical models in predicting oil and gas production, and ML models have become a new approach to hydraulic fracturing prediction. This section first summarizes the effectiveness and limitations of several types of ML models for hydraulic fracturing prediction. It then discusses combining physics-informed neural networks with domain knowledge of hydraulic fracturing to enhance the interpretability of ML.

2.1. Traditional ML Methods

ML, as a subset of AI, involves the analysis and processing of data by computer programs to enable automated learning and predictive capabilities. ML tasks can be categorized into three main types: classification, regression, and clustering. Additionally, deep learning is widely used in hydraulic fracturing operations. Deep learning, an advanced ML technique, employs neural networks with multiple layers of nonlinear processing units to learn and represent data. Traditional ML methods used for hydraulic fracturing prediction include K-Nearest Neighbor (KNN),⁹ Support Vector Machine (SVM),¹⁰ and Random Forests (RF).¹¹

The K-Nearest Neighbors algorithm is a supervised learning method, as illustrated in Figure 2. It predicts a new observation by identifying the K closest data points and assigning the new observation to the majority category among these K nearest points. Essentially, KNN uses the distances between data points to predict the category or value of a new observation. The method can identify clusters of wells similar to those in historical hydraulic fracturing data, which can be used to classify hydraulically fractured vertical wells. By comparing the specific factors that distinguish poorly productive wells from highly productive wells, KNN can effectively identify the main controlling factors of well yield and thereby help optimize hydraulic fracturing parameters. Because KNN neither assumes that the data follow a specific distribution (such as a Gaussian distribution) nor constructs global decision boundaries, but instead votes among the samples in a local neighborhood, it is particularly suitable for data sets whose class domains intersect or overlap significantly. It performs well in the automatic classification of class domains with larger sample sizes but is more prone to misclassification for class domains with smaller sample sizes. Despite its advantages, the KNN algorithm is computationally intensive, especially with large feature sets. Additionally, its prediction accuracy for rare classes is low when the sample distribution is unbalanced.
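
The majority-vote rule described above can be written compactly. The following sketch is a minimal illustration, assuming Euclidean distance, two normalized features, and invented "low"/"high" productivity labels; none of these choices come from the studies reviewed here.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical normalized features per well: (porosity, proppant concentration).
train_x = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.25),
           (0.80, 0.90), (0.90, 0.80), (0.85, 0.70)]
train_y = ["low", "low", "low", "high", "high", "high"]

print(knn_predict(train_x, train_y, (0.75, 0.80)))
print(knn_predict(train_x, train_y, (0.20, 0.20)))
```

Every prediction scans all training points, which is the source of the computational cost noted above; features should also be normalized first so that no single feature dominates the distance.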

Support Vector Machine is a classical supervised learning algorithm for solving classification and regression problems. It works by finding an optimal hyperplane in the feature space that maximizes the margin between the classes. Owing to its ability to model high-dimensional nonlinear data, its generalization from small samples, and its robustness to noise, the SVM algorithm has significant advantages in fracture prediction and gas content prediction. For instance, Shi¹² applied SVM, along with Artificial Neural Networks and Multiple Regression Analysis, to predict fractures in the An-1 and An-2 wells in the Anpeng Oilfield in the Yuyang Sag of the Nanxiang Basin. Additionally, Shi evaluated the gas content of 40 tight sandstone samples in the Tabamiao area of the Ordos Basin. The results demonstrated that SVM had clear advantages over the other methods.

The SVM algorithm is also widely used in the evaluation of hydraulic fracturing parameters. Here, we introduce the SVM algorithm workflow as an example for evaluating proppant transportation in hydraulic fracturing operations. Step 1 involves the collection and preprocessing of fracturing data (such as the proppant type, pumped proppant flow rate, and proppant concentration). The input training data are then mapped to a high-dimensional feature space. Step 2 entails determining the learning objective of the SVM. Step 3 involves training the model and making predictions based on the data. Step 4 performs error analysis to assess the model’s accuracy. The SVM method is a powerful classifier, particularly effective for handling high-dimensional data sets and nonlinear classification problems, and it exhibits good robustness.¹³ Mapping data to a high-dimensional space does not increase computational complexity due to the kernel function approach, which overcomes issues related to the curse of dimensionality and nonlinear separability.

The effectiveness of the SVM algorithm depends heavily on the choice of kernel function, and it is often necessary to try multiple kernels to achieve optimal results. Table 4 lists four kernel functions that determine the high-dimensional feature space, together with a hybrid kernel function¹⁴ that combines the Radial Basis Function (RBF) and polynomial kernels. The linear kernel is the simplest; it usually performs well and is computationally efficient on high-dimensional data such as text classification, making it suitable for large-scale data sets. The polynomial kernel controls model complexity through the polynomial order: the higher the order, the stronger the nonlinear capacity of the model, but the more prone it is to overfitting. It is suitable for nonlinear classification of data with obvious polynomial features that requires moderate complexity. The RBF kernel has strong nonlinear capacity, maps data to an infinite-dimensional space, and is suitable for most classification and regression problems. It is the default choice, especially when the data features are unclear. The sigmoid kernel resembles the activation function of neural networks. It is suitable for binary classification problems and may perform well in specific fields, such as text classification. Table 5 summarizes detailed information about SVM in oil and gas production forecasting.

Table 4. Summary of Kernel Functions.

Kernel function	mathematical expression
Linear kernel	K(x₁, x₂) = ⟨x₁, x₂⟩ + c
Polynomial kernel	K(x₁, x₂) = [γ⟨x₁, x₂⟩ + c]^d
RBF kernel	K(x₁, x₂) = exp(−γ∥x₁ − x₂∥²)
Sigmoid kernel	K(x₁, x₂) = tanh[γ⟨x₁, x₂⟩ + c]
Mixed kernel	K_mix(x₁, x₂) = m[⟨x₁, x₂⟩ + 1]^d + (1 − m) exp(−γ∥x₁ − x₂∥²)
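
The kernels named in Table 4 can be evaluated directly. The sketch below writes them in their standard forms (inner product for the linear kernel, powered inner product for the polynomial kernel, Gaussian of the squared distance for the RBF kernel, and hyperbolic tangent for the sigmoid kernel); the function names and default parameter values are assumptions for illustration.

```python
import math

def dot(x1, x2):
    return sum(a * b for a, b in zip(x1, x2))

def linear_kernel(x1, x2, c=0.0):
    return dot(x1, x2) + c

def polynomial_kernel(x1, x2, gamma=1.0, c=1.0, d=2):
    return (gamma * dot(x1, x2) + c) ** d

def rbf_kernel(x1, x2, gamma=1.0):
    squared_distance = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return math.exp(-gamma * squared_distance)

def sigmoid_kernel(x1, x2, gamma=1.0, c=0.0):
    return math.tanh(gamma * dot(x1, x2) + c)

def mixed_kernel(x1, x2, m=0.5, d=2, gamma=1.0):
    """Weighted sum of a polynomial kernel and an RBF kernel, with 0 <= m <= 1."""
    return m * (dot(x1, x2) + 1) ** d + (1 - m) * rbf_kernel(x1, x2, gamma)

# Hypothetical feature vectors for illustration only.
a, b = (1.0, 2.0), (3.0, 4.0)
for kernel in (linear_kernel, polynomial_kernel, rbf_kernel, sigmoid_kernel, mixed_kernel):
    print(kernel.__name__, kernel(a, b))
```

The weight m in the mixed kernel trades off the global behavior of the polynomial term against the local behavior of the RBF term.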

Table 5. Prediction of Oil and Gas Production by SVM method.

output	number of data set	number of features	references
core-based porosity	94 samples with core data and conventional logs	5 features include two conventional well logs and three log-derived variables	(14)
formation pressure	245 points in shale formation	7 features include real drilling surface parameters and logs data	(16)
fracture pressure	3925 points in all the formations	6 features include the pore pressure and real drilling surface parameters
hydraulic aperture	300 sets of fracture models	6 features include mechanical aperture and fracture morphology parameters	(17)
lithofacies	182 samples with both core data and conventional logs	8 derived variables	(18)
Porosity Stress Sensitivity	57 shale samples	2 principal component vectors	(19)

RFs are an ensemble learning method typically used to solve regression and classification problems. They make predictions by constructing a large number of randomized decision trees and aggregating their results. Constructing a Random-Forest-based hydraulic fracturing prediction method involves several steps. First, a random sample of the hydraulic fracturing data set is drawn for training each tree. Next, a random subset of features is picked from the complete feature set, and a decision tree is trained on the sampled data using these features. This process is repeated to build multiple decision trees. Finally, the predictions of the individual trees are combined by voting (for classification) or averaging (for regression). Schuetter et al.¹⁵ gathered a data set from shale formations in the Permian Basin and built a Random Forest model to predict production metrics for target fractured wells. Random Forest achieves high accuracy, and its trees can be trained in parallel on different data samples and features. It is robust to noise and outliers in the data and reduces the overfitting associated with a single decision tree by integrating multiple decision trees. However, because RF must build and evaluate multiple decision trees to obtain a prediction, training and prediction take longer. Additionally, its predictions are not easily interpretable.
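
The sampling-and-aggregation steps described above can be sketched with a deliberately simplified ensemble. As a stated simplification, each "tree" below is a one-split regression stump rather than a full decision tree, and the data, function names, and default settings are hypothetical; the sketch is meant only to show bootstrap sampling of rows, random selection of features, and averaging of predictions.

```python
import random
import statistics

def fit_stump(rows, targets, feature_ids):
    """Find the single (feature, threshold) split that minimizes squared error."""
    best = None
    for f in feature_ids:
        for threshold in sorted({row[f] for row in rows}):
            left = [t for row, t in zip(rows, targets) if row[f] <= threshold]
            right = [t for row, t in zip(rows, targets) if row[f] > threshold]
            if not left or not right:
                continue
            left_mean, right_mean = statistics.fmean(left), statistics.fmean(right)
            error = (sum((t - left_mean) ** 2 for t in left)
                     + sum((t - right_mean) ** 2 for t in right))
            if best is None or error < best[0]:
                best = (error, f, threshold, left_mean, right_mean)
    if best is None:  # no valid split (all sampled rows identical): predict the mean
        mean = statistics.fmean(targets)
        return (0, float("inf"), mean, mean)
    return best[1:]

def fit_forest(rows, targets, n_trees=25, n_features=1, seed=0):
    rng = random.Random(seed)
    n_rows, n_total_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        sample = [rng.randrange(n_rows) for _ in range(n_rows)]  # bootstrap sample of rows
        feature_ids = rng.sample(range(n_total_features), n_features)  # random feature subset
        forest.append(fit_stump([rows[i] for i in sample],
                                [targets[i] for i in sample], feature_ids))
    return forest

def predict_forest(forest, row):
    """Average the predictions of all stumps (regression aggregation)."""
    predictions = [left if row[f] <= thr else right for f, thr, left, right in forest]
    return statistics.fmean(predictions)

# Hypothetical normalized features (stage count, fluid volume) and production targets.
rows = [(0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.7, 0.8), (0.8, 0.7), (0.9, 0.9)]
targets = [10.0, 12.0, 11.0, 30.0, 32.0, 31.0]

forest = fit_forest(rows, targets)
print(predict_forest(forest, (0.05, 0.05)), predict_forest(forest, (0.95, 0.95)))
```

Because each stump is fitted independently of the others, the loop in `fit_forest` could be distributed across processes, which reflects the parallelism mentioned above.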

The characteristics of the three traditional ML-based hydraulic fracturing prediction methods are summarized in Table 6. KNN is highly dependent on the quality of the data samples and is only suitable for hydraulic fracturing predictions on small data sets. The Support Vector Machine is unsuitable for handling large-scale data sets in hydraulic fracturing. Random Forest requires longer training and prediction times, and its prediction results are less interpretable. Although traditional ML methods are widely used in hydraulic fracturing parameter prediction and production capacity prediction, they still involve shallow learning in feature extraction and require simple transformations of the data. Most traditional ML methods are not suitable for handling large-scale data sets. As hydraulic fracturing technology becomes more complex and the data scale for hydraulic fracturing production continues to grow, these techniques are insufficient for accomplishing complex hydraulic fracturing production predictions.

Table 6. Characteristics of Three Traditional ML-Based Hydraulic Fracturing Prediction Methods.

method	principle	limitations	applicable scenarios
KNN	Predict the test sample points based on the classification of the nearest known sample points in the surrounding area	• High dependence on the quality of data samples	Hydraulic fracturing prediction suitable for small data set samples
		• Incorrect samples and imbalanced data can lead to inaccurate prediction results
SVM	Find an optimal hyperplane to distinguish different samples	• Not suitable for handling large-scale data sets in hydraulic fracturing	Suitable for handling high-dimensional data sets and nonlinear classification problems
		• Need to try multiple kernel functions to find the optimal effect
RF	Combining multiple decision trees to integrate prediction results	• Long training and prediction time	Suitable for handling high-dimensional data, capable of handling both discrete and continuous data
		• The predicted results are not easily interpretable

2.2. Neural Network and Its Variants

Since hydraulic fracturing has become a common method to stimulate the productivity of oil and gas wells, a vast amount of fracturing data has been accumulated. The inefficient training strategies and shallow structures of traditional ML methods are inadequate for handling this massive amount of data and often neglect other influencing factors, leading to optimization difficulties. Deep learning, a branch of ML typically implemented through artificial neural networks (ANN), addresses these limitations. Figure 3 shows the network structure of the ANN algorithm. Deep learning enables the discovery of implicit dependencies and the prediction of the expected value of an objective function by effectively managing large data sets.

In the oil and gas industry, petroleum engineers employ artificial neural networks to solve fundamental problems,²⁷,²⁸ such as permeability prediction. The hydraulic fracturing industry is a prime area for ANN development due to the vast amount of data it generates. For instance, McVey et al.²⁹ utilized ANN algorithms to identify hydraulic fracturing parameters that affect natural gas production, thereby improving the effectiveness of hydraulic fracturing operations, even in the absence of high-quality reservoir data. Ibrahim et al.³⁰ gathered a data set comprising 200 well production records and completion designs from oil wells in the Niobrara Shale formation. They employed an ANN to forecast the estimated ultimate recovery of multistage hydraulically fractured wells within this formation. Neural networks are extensively used for prediction in hydraulic fracturing operations. The procedure uses a subset of the collected data to train the learning algorithm. Training involves determining the number of hidden layers and the number of nodes in each layer, the weights of the connections, and the transfer function of the neurons. By adjusting these parameters, the learning method is refined and optimized. The remaining data are then used to make predictions. Other neural networks commonly used for hydraulic fracturing prediction include Convolutional Neural Networks (CNN)³¹ and Recurrent Neural Networks (RNN).³²
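
The design choices listed above (number of hidden layers, nodes per layer, connection weights, and transfer function) map directly onto the forward pass of a fully connected network. The sketch below is a minimal illustration with randomly initialized, untrained weights; the layer sizes, input values, and function names are assumptions, and a real model would additionally need a training loop such as backpropagation.

```python
import math
import random

def init_network(layer_sizes, seed=0):
    """Create random weights and zero biases for consecutive layer pairs."""
    rng = random.Random(seed)
    return [
        {
            "weights": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
            "biases": [0.0] * n_out,
        }
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    ]

def forward(network, inputs, transfer=math.tanh):
    """Propagate inputs through the network; the output layer is kept linear (regression)."""
    activations = list(inputs)
    for index, layer in enumerate(network):
        sums = [
            sum(w * a for w, a in zip(row, activations)) + bias
            for row, bias in zip(layer["weights"], layer["biases"])
        ]
        is_output = index == len(network) - 1
        activations = sums if is_output else [transfer(s) for s in sums]
    return activations

# Hypothetical architecture: 3 inputs, two hidden layers of 4 nodes, 1 output.
network = init_network([3, 4, 4, 1])
print(forward(network, [0.2, 0.5, 0.9]))
```

Changing `layer_sizes` or passing a different `transfer` function alters the architecture without touching the forward-pass logic.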

Convolutional Neural Networks are an improved algorithm derived from ANN and are known for their stronger learning capabilities. CNNs have been applied to data processing and parameter prediction in fractured wells. For instance, Zhang et al.³³ trained a CNN to image hydraulic fractures under energized steel casing, demonstrating good generalization ability. Owing to their excellent performance in image classification, object detection, and image generation, CNNs are also widely used for microseismic event detection during hydraulic fracturing monitoring and interpretation as well as for fracture prediction. Liu et al.³⁴ used CNNs to automatically classify microseismic events in the Eagle Ford Shale, achieving efficient predictions. Jang et al.³⁵ trained a CNN with fracture network data as input and reservoir simulation results as output to estimate oil and gas production capacity in naturally fractured reservoirs. CNNs extract features from images using convolutional, pooling, and fully connected layers, and these features support the tasks listed above.
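
The convolution and pooling operations mentioned above can be illustrated on a tiny grid. In this sketch the "image" is a hypothetical 5 × 5 map with a vertical edge, the kernel is a hand-picked edge detector rather than a learned filter, and the fully connected layer is omitted; as in most deep learning libraries, the convolution is implemented as a cross-correlation.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding (cross-correlation)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def relu(feature_map):
    return [[max(0.0, v) for v in row] for row in feature_map]

def max_pool2d(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    return [
        [
            max(feature_map[i + di][j + dj]
                for di in range(size) for dj in range(size))
            for j in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for i in range(0, len(feature_map) - size + 1, size)
    ]

# Hypothetical 5x5 map: zeros on the left, ones on the right (a vertical edge).
image = [[0, 0, 1, 1, 1] for _ in range(5)]
vertical_edge_kernel = [[-1, 1], [-1, 1]]

features = relu(conv2d_valid(image, vertical_edge_kernel))
pooled = max_pool2d(features)
print(features)
print(pooled)
```

The feature map responds only where the intensity changes from left to right, and pooling keeps that response while halving the spatial resolution.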

Recurrent Neural Networks are a crucial component of deep learning, primarily used to solve time series problems. However, despite their ability to handle short-term sequential data, RNNs face limitations when dealing with longer sequences due to issues such as gradient explosion and vanishing gradients. To address these challenges, variants of RNN, including Long Short-Term Memory (LSTM)³⁶ and Gated Recurrent Unit (GRU),³⁷ have been developed. These variants are particularly effective for time series prediction problems, allowing future production to be predicted based on previous historical production data. The network structures of the RNN, LSTM, and GRU methods are illustrated in Figure 4.

Figure 4. (a) RNN, (b) GRU, and (c) LSTM network structures.

LSTM is significantly more complex than the simple RNN architecture. The core innovation behind LSTM lies in its use of a cell state and three distinct gates: the forget gate, the input gate, and the output gate. These components differentiate LSTM from a standard RNN. Although LSTM excels at mitigating the vanishing and exploding gradient problems, its intricate architecture necessitates greater memory resources and results in longer training times. The GRU architecture streamlines the LSTM design by merging the forget and input gates into a single update gate. Additionally, it consolidates the hidden state and cell state into a single state, enhancing the simplicity and efficiency of the model. This streamlined design reduces computational costs while retaining many of the benefits of the LSTM. Table 7 summarizes applications of the RNN and its variants in predicting oil and gas production. It lists the input and output parameters of each data-driven model along with its predictive effectiveness, offering valuable insights for big data analysts and petroleum engineers.
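The difference in cost between the two gate structures described above can be illustrated by counting trainable parameters for a single layer. The sketch assumes the textbook formulation with one weight matrix and one bias vector per gate or candidate block; some libraries store two bias vectors per block, which changes the totals but not the 4:3 ratio. The layer sizes are hypothetical.

```python
def block_params(input_size, hidden_size):
    """Weights acting on [input, previous hidden state] plus one bias vector."""
    return hidden_size * (input_size + hidden_size) + hidden_size

def lstm_param_count(input_size, hidden_size):
    """Forget, input, and output gates plus the candidate cell update: 4 blocks."""
    return 4 * block_params(input_size, hidden_size)

def gru_param_count(input_size, hidden_size):
    """Update and reset gates plus the candidate hidden state: 3 blocks."""
    return 3 * block_params(input_size, hidden_size)

# Hypothetical layer: 3 input features, 64 hidden units
lstm_total = lstm_param_count(3, 64)
gru_total = gru_param_count(3, 64)
```

For these sizes the GRU layer has three quarters of the parameters of the LSTM layer, consistent with its lower memory and training cost.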

Table 7. Recurrent Neural Network Based Hydraulic Fracturing Prediction Methods.

neural network	input parameters	output parameters	description	reference
LSTM	Local natural gas demand, population, temperature, US natural gas prices, global crude oil prices	Long-term natural gas demand	Three variable prediction for low natural gas demand period and five variable prediction for high natural gas demand period	(20)
H-Bi-LSTM	Bi-LSTM1: the tubing pressure, the proppant ratio, and the injection rate;	fractures shape	H-Bi LSTM outperforms LSTM and Bi LSTM	(21)
	Bi-LSTM2: 2 sequences of tubing pressure and proppant ratio in the sand-laden stage;
	Bi-LSTM3: sequence of the tubing pressure after stopping pump;
	Bi-LSTM4: the hydraulic sand fracturing operation curves obtained by Bi-LSTM1
EEMD-LSTM	The first three intrinsic mode functions obtained through empirical mode decomposition	Time series oil production	EEMD-LSTM outperforms EEMD-ANN and EEMD-SVM	(22)
Bi-GRU based on SSA	Case1: oil rate, choke size, shut-in time	Time series oil production	SSA based Bi GRU outperforms ARMA, Arps, LSTM, Arima, GRU, RNN	(23)
	Case2: choke size, shut-in time, formation parameters, fracturing parameters
A-LSTM	Production in the first four months	Production in the fifth month	A-LSTM outperforms DLSTM, NEA, ARIMA, Multi-RNN, DGRU	(24)
DLSTM	Daily oil production	Time series oil production	DLSTM outperforms RNN, GRU	(25)
LSTM	Daily shale gas production	Time series shale gas production	LSTM outperforms ARIMA, DCA	(26)


Prediction methods based on RNN systems perform better than traditional ML methods in yield prediction for hydraulic fracturing operations. Traditional ML methods typically rely on a single data point to predict yield, meaning they can estimate only the initial rate or the cumulative yield rather than the complete time series of yield data. In contrast, RNN-based methods address this issue by capturing the temporal dependencies in the data. However, when dealing with long-term data, the memory capability of RNNs for earlier time steps diminishes, making accurate predictions challenging.
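A common way to present a production history to a recurrent model is as sliding windows, each pairing a run of past values with the value to be predicted. The sketch below shows only this data preparation step; the function name, the window length, and the monthly rates are hypothetical.

```python
def make_windows(series, window, horizon=1):
    """Turn a production history into (past window, future value) training pairs."""
    pairs = []
    for start in range(len(series) - window - horizon + 1):
        past = series[start:start + window]
        target = series[start + window + horizon - 1]
        pairs.append((past, target))
    return pairs

# Hypothetical monthly production rates
monthly_rate = [100, 90, 82, 75, 70, 66]
pairs = make_windows(monthly_rate, window=4)
```

With a window of four, each pair uses four consecutive months to predict the fifth, the same input and output layout listed for A-LSTM in Table 7.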

2.3. Hybrid Network Model

Extensive research has been carried out on predicting the fracturing yield using data-driven methods. However, the accuracy of these predictions is heavily influenced by the quality and amount of field data available. Given the complexity of fracturing conditions, hybrid-model-based prediction methods show great potential in the field of hydraulic fracturing prediction. A hybrid model is an ensemble method that combines multiple individual models, aiming to improve prediction accuracy, robustness, and generalization ability by integrating the advantages of different models. Considering the complexity of nonlinear time series prediction in multifractured horizontal shale gas wells, Qin et al.³⁸ utilized the GRU’s capacity to retain long-term sequential data together with the MLP’s versatility in handling the nonlinear relationships between input and output data. The GRU-MLP neural network, as introduced by Yang et al.,³⁹ was employed to predict shale gas production from fractured wells, combining the two architectures to enhance the accuracy of production forecasts.

The workflow of the GRU-MLP combination is illustrated in Figure 5. MLP, as a generalization of the perceptron, has a superior ability to express and learn nonlinear relationships. However, MLP struggles with time series problems due to its lack of memory function.⁴⁰ Conversely, GRU, a variant designed to address the length limitations of RNN algorithms, excels at predicting time series problems. The hybrid model of GRU and MLP achieves higher accuracy within an acceptable computational time frame and can efficiently and accurately predict the production of shale gas multistage fractured horizontal wells. Because GRU has a simpler structure, fewer parameters, and faster training than LSTM, combining it with MLP may reduce computational costs while maintaining performance.

CNN specializes in extracting spatial features, and LSTM captures long-term dependencies. The hybrid CNN-LSTM network model shown in Figure 6 therefore uses CNN to extract features from multidimensional time series and LSTM to predict, from the extracted features, monthly water production during the growing period, monthly gas production during the stabilized production period, and monthly water production during the stabilized production period.⁴¹

Although ML algorithms can achieve good prediction results in hydraulic fracturing operations, differences in data size and data structure mean that no single algorithm applies to all situations. Integrating neural networks makes it possible to mine a variety of oilfield data, address prediction difficulties such as the complexity and dynamic variability of the factors affecting oilfield production, and improve the accuracy of oilfield production prediction.

2.4. Physically Informed Neural Networks

ML algorithms have produced many innovative results in the field of hydraulic fracturing prediction. However, the precision of these predictions depends largely on the quality of the input data and can vary significantly with its accuracy. Moreover, ML methods capture only correlations between variables without understanding the causality between input and output variables. This leads to a lack of physical interpretability, and the parameter correlation characteristics in ML models often deviate significantly from the actual mechanistic laws.

Combining physical knowledge with data-driven models to balance accuracy and interpretability remains a challenge. To address this, a new approach called a physically informed neural network (PINN) has been developed to enhance the interpretability of data-driven models. PINN is a novel hybrid-driven neural network model. The core idea of PINN is to incorporate the constraints of mathematical-physical models into the training process of neural networks. In existing research, the embedding of physical information in PINNs is typically achieved by modifying the loss function during network training to include a priori knowledge in the form of partial differential equations. Figure 7 illustrates the basic architecture of the PINN. The partial differential equation under consideration generally contains both a linear part and a nonlinear component, as shown below:

Automatic differentiation techniques are employed in neural networks to derive the partial differential relationships between the outputs and inputs of the model, utilizing the back-propagation algorithm based on the chain rule of differentiation. A key distinction between PINN and other neural networks lies in the construction of the loss function. In PINNs, the loss function incorporates both data-driven and physics-informed components, which can be expressed as the mean-square error (MSE) of a traditional data-driven model as follows:

The loss function of PINN consists of physical information loss, initial condition loss, and boundary condition loss:
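As a sketch of the formulation described above, one common way to write these quantities is given below. The notation ($u$ for the solution, $u_\theta$ for the network output, $\mathcal{L}$ and $\mathcal{N}$ for the linear and nonlinear operators, $f$ for a source term, and $N_d$, $N_f$ for the numbers of data and collocation points) is assumed here, and the loss terms are shown unweighted; the exact expressions used in the cited works may differ.

```latex
% Generic governing equation with a linear part L and a nonlinear part N
\mathcal{L}[u](x,t) + \mathcal{N}[u](x,t) = f(x,t), \qquad x \in \Omega,\ t \in (0,T]

% Mean-square error of a traditional data-driven model over N_d observations
\mathrm{MSE}_{\mathrm{data}} = \frac{1}{N_d}\sum_{i=1}^{N_d}\left| u_\theta(x_i,t_i) - u_i \right|^2

% Physics-informed loss: PDE residual, initial condition, and boundary condition terms
\mathrm{Loss}_{\mathrm{PINN}} = \mathrm{MSE}_{f} + \mathrm{MSE}_{\mathrm{IC}} + \mathrm{MSE}_{\mathrm{BC}},
\qquad
\mathrm{MSE}_{f} = \frac{1}{N_f}\sum_{j=1}^{N_f}\left| \mathcal{L}[u_\theta](x_j,t_j) + \mathcal{N}[u_\theta](x_j,t_j) - f(x_j,t_j) \right|^2
```

Here $\mathrm{MSE}_{\mathrm{IC}}$ and $\mathrm{MSE}_{\mathrm{BC}}$ are the mean-square mismatches between $u_\theta$ and the prescribed initial and boundary values, evaluated at points sampled on the initial time and on the boundary of $\Omega$.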

Physical information residual neural networks use a loss function to integrate petrophysical information with neural networks, thereby improving the stratigraphic assessment of wells.⁴² The cited study compares the prediction performance of a petrophysical information residual neural network with that of a traditional residual neural network and finds that the former achieves higher prediction accuracy.

Wang et al.⁴³ utilized the PINN model to determine the decline curves of shale gas wells. Their model was trained by using production data sourced from 20 wells located in the Duvernay Formation. They compared results with predictions from a fully connected neural network with different regularization weights and found that the PINN model accurately predicts the production dynamics of the test wells. Qu et al.⁴⁴ proposed a hybrid neural network model incorporating physical constraints, as shown in Figure 8, for evaluating the fracturing effectiveness of horizontal wells in tight oil and gas reservoirs. The model’s loss function comprises the traditional data-driven model loss, the loss defined by field experience, and the 2D fracture model loss. In this formulation, MSE is the loss function of the traditional ML model, loss denotes the loss defined by field experience, and the subscripts f, k, and 2D denote the fracture geometry, the fracture permeability, and the 2D fracture model, respectively.
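The idea of adding a physics penalty to a data misfit can be sketched in a few lines. The example below is not the formulation of the cited studies: the exponential decline law dq/dt = -D q is an assumed stand-in for the physical model, the derivative is a finite difference rather than automatic differentiation, and the weight `lam` and all function names are hypothetical.

```python
import math

def mse(pred, obs):
    """Data-driven term: mean squared error between predictions and observations."""
    return sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)

def physics_residual_mse(times, pred, decline):
    """Mean squared residual of dq/dt + D*q = 0, evaluated at interval midpoints."""
    residuals = []
    for k in range(len(times) - 1):
        dqdt = (pred[k + 1] - pred[k]) / (times[k + 1] - times[k])
        q_mid = 0.5 * (pred[k + 1] + pred[k])
        residuals.append(dqdt + decline * q_mid)
    return sum(r ** 2 for r in residuals) / len(residuals)

def composite_loss(times, pred, obs, decline, lam=1.0):
    """Data misfit plus a weighted physics penalty."""
    return mse(pred, obs) + lam * physics_residual_mse(times, pred, decline)

times = [0.1 * k for k in range(11)]
decline = 0.5
physical = [100.0 * math.exp(-decline * t) for t in times]   # obeys the decline law
flat = [100.0 for _ in times]                                # violates it

# Both candidates match their own "observations" exactly, so only physics differs
loss_physical = composite_loss(times, physical, physical, decline)
loss_flat = composite_loss(times, flat, flat, decline)
```

A prediction that fits the data but violates the assumed physics is penalized heavily, whereas one that follows the decline law incurs only the small finite-difference error.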

2.5. Summary of Data-Driven Models for Production Forecast

The four hydraulic fracturing prediction methods are summarized in Table 8. Although traditional ML methods have low training costs, their applicability is limited. They are generally suitable for simple hydraulic fracturing conditions due to their shallow architectures. Recurrent neural networks can predict the fracturing yield as a time series; however, their effectiveness diminishes for long-term predictions, often resulting in large errors. Additionally, the gradient explosion problem in the recurrent neural network remains unresolved. LSTM is an improved recurrent neural network designed to solve the long-term dependency problem. Its ability to learn long-term dependencies alleviates the vanishing gradient problem, but its structure is relatively complex, with high computational cost and a large number of parameters. GRU is a further simplification of LSTM, with fewer parameters and higher computational efficiency. It remains competitive with LSTM in many tasks, although it may be less powerful when dealing with very complex problems. GRU excels at capturing short-term dependencies and dynamic patterns in a time series. For example, by analyzing the relationship between the water injection rate and crack propagation, GRU can predict real-time changes in crack morphology and optimize fracturing process parameters. Hybrid models integrate multiple methods to predict hydraulic fracturing production, enabling them to handle complex fracturing conditions with high accuracy. However, this comes at the cost of increased training complexity and more computational resources.

Table 8. Comparison of Four Hydraulic Fracturing Prediction Methods.

method	principle	limitations	applicable scenarios
Traditional ML	Using statistical knowledge and hydraulic fracturing data characteristics to establish regression models	• Shallow architecture	• Short prediction time
		• Limited applicability	• Low precision requirement
		• Features must be manually selected	• Small data volume
Neural networks and their variants	Using memory units to learn the impact of historical data on the current situation	• High training costs	• High precision requirements
		• Difficulty in accurately predicting long-term production	• Large data volume
Hybrid model	Connecting several methods or running multiple methods in parallel	High computational complexity	Complex fracturing conditions requiring high prediction accuracy
PINN	Adding physical model constraints during neural network training	Poor generalization ability	Suitable for different data volumes and data subject to significant noise interference


3. Application of ML to Hydraulic Fracturing Operations

Hydraulic fracturing is a reservoir modification method used to increase the effective permeability of tight unconventional reservoirs, such as shale oil and gas. Although widely used worldwide, hydraulic fracturing operations are complex, so reservoir simulation, well completion optimization, and the enhancement of oil and gas production remain challenging. With the swift development of unconventional oil and gas reservoirs, the amount of data associated with oil extraction and production has been expanding rapidly. ML is well-suited to optimize parameter predictions using historical oil and gas data and reservoir information. It can also predict capacity, fill in missing data values before hydraulic fracturing operations, diagnose working conditions, provide risk warnings during fracturing operations, and optimize production after refracturing operations.

3.1. Application of ML to Fracturing Flowback

The fracturing flowback procedure involves removing the fracturing fluid injected into a well, along with the oil, gas, and water released from the formation, after hydraulic fracturing. This process typically includes the recovery and disposal of the fracturing fluid as well as the monitoring of parameters such as wellhead flow and pressure. The effectiveness of this process directly affects the production increase obtained from fracturing. Using dynamic flowback production data and static geological and engineering data, ML methods can optimize the flowback differential pressure and flow rate to control proppant flowback and optimize the fracturing fluid flowback rate.

Li et al.⁴⁵ optimized the nozzle diameter using an augmented residual deep learning neural network. This helps to prevent sand particles from being carried in the flowback of fracturing fluid, thereby enhancing the efficiency and cutting the costs associated with hydraulic fracturing. Guo et al.⁴⁶ studied the correlation between characteristic flowback coefficients of 214 shale gas horizontal wells in the Weiyuan area and geological and engineering parameters. They developed an ML approach to forecast the flowback curves of shale gas horizontal wells. Liu et al.⁴⁷ developed a deep learning model for predicting flowback rates in shale gas wells, using data from 286 shale gas wells located in the Weiyuan oil field.

Understanding the distribution of proppant particles in fracturing fluids is crucial to assessing the effectiveness of hydraulic fracturing operations. Detecting the fate of proppant pumped downhole can improve well landing zones, well spacing, and fracture or cluster spacing design. In situ sampling is a common method for analyzing proppant distribution but often incurs high costs and provides incomplete and inaccurate analyses. Other methods, such as semiquantitative approaches and systematic sample imaging, may lack sufficient grain size detail⁴⁸ or be biased by imaging, leading to inaccurate results.⁴⁹

ML methods can effectively address these issues. Maity et al.⁵⁰ developed a workflow for identifying proppant particle locations and classification using algorithms such as ANN, KNN, Bayesian classifiers, and SVM, which were applied on-site in the Permian Basin. Temizel et al.⁵¹ used predictive models generated by ANN and SVM algorithms to design the type of fracturing fluid, proppant material, and injection rate. Hou et al.⁵² introduced a new Proppant Filling Index (PFI) for real-time evaluation of proppant injection using GRU and RF models, enhancing the efficiency of delivering proppant to fractures and characterizing artificially fractured reservoirs. Wang et al.⁵³ employed a cross-validated recursive feature elimination technique to create models predicting production for four horizontal wells, optimizing the quantities of proppant and fracturing fluid.

3.2. Application of ML to Fracture Characterization in Hydraulic Fracturing Operations

In unconventional reservoirs, factors such as stress state, stress anisotropy, rock mechanical properties, and natural fractures control hydraulic fracture extension and significantly affect single-well capacity.⁵⁴ Additionally, formation parameters such as porosity, permeability, and saturation influence the reservoir’s oil supply capacity. Therefore, modeling fractures, characterizing fracture geometry, and investigating fracture extension factors are essential for accurate oil and gas management prediction.

Optimizing fracture parameters to enhance production is a challenge for the oil and gas industry. Large-scale hydraulic fracturing typically creates highly complex multiscale fracture networks. Various methods exist for modeling hydraulic fractures, and traditionally, optimization relies on numerical simulation methods.⁵⁵⁻⁵⁷ However, solving large-scale problems with these methods is time-consuming and requires engineers to possess expertise in geology, engineering, reservoirs, and mechanics to construct accurate simulation models. Given time constraints, finding a global optimum may not always be feasible, making manual modeling a significant obstacle to the development of large-scale reservoir simulations.⁵⁸

In recent decades, ML methods have shown great potential in hydraulic fracturing. ML has been widely used for predicting the hydraulic fracture geometry and interactions between hydraulic and natural fractures. The interaction between these fractures affects the propagation of fractures in the formation, with outcomes such as crossing, steering, and blocking altering the direction or terminating the propagation of hydraulic fractures to varying degrees. Table 9 presents various applications of ML-based methods for fracture prediction, detailing the predicted fracture form, the predicted object, the prediction method, and the prediction results.

Table 9. ML Methods for Predicting Cracks.

object	research contents	method	reference	description
Hydraulic fractures (HF)	The prediction of the fracture’s aspect ratio	ANN	(59)	ANN predictions approximate those of the particle dynamics method
	Optimize fracture half-length, number, spacing, and conductivity	RBF, MLP, KNN	(60)	RBF, MLP outperforms KNN
	The influence of reservoir pressure on the formation of hydraulic fracture	RF, ANN	(61)	ANN outperforms RF
Hydraulic fractures - natural fractures (HF-NF)	The influence of the relative angle of HF and NF, fracture gradient, NF friction coefficient, overpressure index, stress difference, formation depth and net pressure on NF slip	RF	(62)	RF and sensitivity analysis produce different but useful feature rankings
	The interaction of HF and NF	ANN	(63)	ANN is consistent with analytical solutions and numerical models


While various ML algorithms have been applied, there remains a lack of research focused on minimizing the computational load of the iterative processes involved in optimizing shale gas reservoirs. A single ML algorithm cannot effectively address all fracturing conditions. Therefore, synergizing multiple ML methods to build a hybrid model can better optimize horizontal well spacing and fractures in fracturing operations. Wang et al.⁶⁴ applied Gaussian process regression, radial basis function networks, and support vector regression to approximate the numerical simulation model, creating a hybrid model to optimize horizontal well spacing and hydraulic fracturing stage placement. Xiao et al.⁶⁰ utilized three supervised ML models (radial basis functions, KNN, and MLP) to simulate multistage fracture parameters. Nouri et al.⁶⁵ combined three intelligent systems (RBF Neural Network, MLP, and Least Squares SVM) in the Marun field reservoir in Iran, using complete logging data to determine fracture density.

3.3. Application of ML to Missing Value Filling in Hydraulic Fracturing Operations

In the oil and gas industry, data can be lost during drilling operations for various reasons, including tool malfunctions and wellbore instability. Recovering this missing logging information, such as density logs, shear acoustic logs, and sonic logs, can be costly if the drilling run must be repeated with logging instruments. The correlation similarity method is a conventional approach for predicting missing log data. By analyzing adjacent well logs, this technique forecasts the logging profile of the target logs, accounting for formation variations by following the patterns observed in the reference logs.^66,67 However, predicting missing logging data using correlation similarity methods requires calibration to fit localized data and often yields low accuracy.

With the advent of big data, ML has also been employed to predict missing logging data. Akkurt et al.⁶⁸ developed an innovative unsupervised algorithm for outlier detection, which aims to identify anomalies in density and acoustic logging data. Additionally, this algorithm can ascertain the orientation of a well using various logs. Al-Anazi et al.⁶⁹ used the nonlinear SVM approach to predict permeability distribution and classify highly inhomogeneous sandstone reservoirs. Other petrophysics-based models for generating suitable log predictions are heavily dependent on lithologic units and require significant expertise and calibration. To improve the accuracy of predicting missing values, Ali et al.⁷⁰ introduced a new method using an ML-based similarity algorithm and DNNs to predict missing shear acoustic logs. Their results were more accurate than those of traditional methods. Bukar et al.⁷¹ demonstrated a method, based on supervised learning algorithms and implemented in MATLAB, to predict sonic logging curves using ML techniques such as multivariate regression models, SVM, and decision trees.

3.4. Application of ML to Capacity Prediction in Hydraulic Fracturing

Accurately forecasting future production and estimating the ultimate recovery present one of the most significant challenges in the oil and gas industry. Traditional production forecasting methods include empirical equations, decline curve analysis, semianalytical models, and numerical simulation techniques, all of which rely on existing production data. Table 10 demonstrates the application scenarios, advantages, and drawbacks associated with decline curve analysis, numerical simulation techniques, and ML methods.

Table 10. Comparison of Three Existing Yield Forecasting Methods.

method	limitations	applicable scenarios	advantage
decline curve analysis	dependent on historical data	fracturing conditions with simple geological characteristics	direct methods for predicting production that are still in use today
numerical simulation of oil reservoirs	restricted by geological uncertainty	fracturing conditions with complex geological characteristics	mature and widely used in reservoir development and prediction
ML methods	restricted by data quality	fracturing conditions with simple and complex geological features	fast calculation speed and high calculation accuracy


Due to the high uncertainty in fluid transport during the fracturing process, the complexity of the reservoir, and the fractured nature of the underlying formation, production capacity predictions via numerical simulation are often inaccurate. Decline curve analysis can also lead to inaccurate predictions because of the differences in fluid flow between hydraulic and natural fractures, the need for experienced reservoir technicians, and the limitations of individual equations in accurately describing the entire production process. In contrast, researchers using ML methods to predict oil and gas production have greatly improved the accuracy of the production forecasts.
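For reference, decline curve analysis is often based on the Arps relations, in which a single equation with an initial rate, an initial decline rate, and an exponent describes the whole production history. The sketch below uses the Arps form as a representative example; the text does not specify which decline equations it has in mind, and the parameter values are hypothetical.

```python
import math

def arps_rate(t, qi, di, b):
    """Arps decline: exponential for b == 0, hyperbolic for 0 < b < 1, harmonic for b == 1.

    t  : time since the start of decline
    qi : initial rate
    di : initial decline rate (per unit of t)
    b  : decline exponent
    """
    if b == 0:
        return qi * math.exp(-di * t)
    return qi / (1.0 + b * di * t) ** (1.0 / b)

# Hypothetical well: 1000 units/day initially, 10 % per month initial decline
rate_hyperbolic = arps_rate(10, qi=1000.0, di=0.1, b=0.5)
rate_harmonic = arps_rate(10, qi=1000.0, di=0.1, b=1.0)
```

Because three parameters must describe every flow regime, a single curve of this kind can fit one production period well and another poorly, which is the limitation noted above.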

Gupta et al.⁷² trained neural networks and ARIMA models using historical production data, achieving an acceptable accuracy range in predicting production decline in shale gas reservoirs. Anderson et al.⁷³ introduced a comprehensive ML tool that utilizes available data from reservoir modeling, drilling, and hydraulic fracturing to predict final oil, gas, and water production. Liang et al.⁷⁴ gathered data from over 4,000 wells in Eagle Ford, encompassing more than 25 variables. They applied random forest regression to forecast gas EUR and oil EUR.

Most of the existing literature relies on neural networks, support vector machines, and random forests. These methods, however, struggle to adequately address the problem of production prediction in unconventional shale gas reservoirs due to the time-consuming and labor-intensive nature of dynamic production data. To tackle this issue, Xue et al.⁷⁵ employed a multiobjective RF to build a data-driven model for predicting the dynamic production behavior of shale gas reservoirs. This technique employs a subset of the initial data within each tree and integrates the outputs from all of the trees to reach a final decision. This method improves the model’s robustness against outliers in the data set.
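The resampling and averaging idea described above can be sketched with bootstrap aggregation of very simple trees. This is an illustration only, not the multiobjective RF of the cited study: each tree is a one-split stump on a single feature, no random feature selection is performed, and the data and function names are hypothetical.

```python
import random

def fit_stump(xs, ys):
    """Best single-threshold split on one feature, minimizing squared error."""
    best = None
    for threshold in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, lm, rm)
    if best is None:                       # all x identical: predict the mean
        mean = sum(ys) / len(ys)
        return (float("inf"), mean, mean)
    return best[1:]

def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean

def bagged_stumps(xs, ys, n_trees=25, seed=0):
    """Fit each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(xs)) for _ in xs]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps

def predict(stumps, x):
    """Integrate the outputs of all trees by averaging."""
    return sum(predict_stump(s, x) for s in stumps) / len(stumps)

# Hypothetical data: low response for x <= 4, high response for x > 4
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [10.0, 10.0, 10.0, 10.0, 20.0, 20.0, 20.0, 20.0]
stumps = bagged_stumps(xs, ys)
low, high = predict(stumps, 1), predict(stumps, 8)
```

Because each tree sees a different resample, a single outlying record influences only some of the trees, and averaging dilutes its effect.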

All of these methods typically predict production based on a single data point, meaning they can only forecast initial rate, cumulative production, or recovery, but not time series production. Deep neural networks based on LSTM offer higher accuracy in predicting the time series yield. Song et al.⁷⁶ introduced an LSTM neural network model designed for productivity prediction of fractured horizontal wells in volcanic rock reservoirs. In their approach, the model uses the oil rate within a specified time window and the corresponding choke size as the input parameters. The output generated by the model is the oil rate in the time series. The LSTM neural network’s prediction results were then compared with traditional decline curve analysis, fully connected ANN and RNN, ARMA, and ARIMA models. Zhou et al.⁷⁷ proposed a model based on deep learning, specifically the CNN-BiGRU-AM architecture, to forecast shale oil time series production. They incorporated an attention mechanism into the model to ensure the accuracy of neural network learning and compared the prediction results with those of supervised learning methods (RF, DT, and MLP) and other deep learning methods (LSTM, GRU, and Bi-LSTM) to assess the performance of the CNN-BiGRU-AM model.

3.5. Application of ML in Fracturing Condition Diagnosis and Risk Warning

Fracturing condition diagnosis and risk warning are two critical tasks in the hydraulic fracturing process, with the aim of ensuring the safety and efficiency of operations. In recent years, deep learning, particularly deep neural networks, has been increasingly applied to diagnose working conditions and provide risk warnings during construction. Utilizing real-time monitoring data from the surface or downhole, deep learning methods can diagnose various complex conditions and warn of potential construction risks.

In the area of fracturing condition diagnosis, Shen et al.⁷⁸ developed a real-time diagnostic model for fracturing conditions utilizing deep learning architectures such as CNN and U-Net. This model enabled automatic annotation, model training, and autonomous diagnosis of the initiation and termination moments of fracturing operations, the seating and sealing of bridge plugs, and the staged fracturing conditions. Yuan et al.⁷⁹ achieved intelligent recognition of events like fracturing start and stop, formation rupture, and transient closure using an Att-Bi-LSTM neural network, along with generalized learning system neural networks and BP neural networks. They also established a fracturing “stage” event recognition model based on the enhanced Unet++ network, enabling intelligent recognition of events such as ball pumping, preacid treatment, temporary fracture plugging, and sand plugging. Ramirez et al.⁸⁰ developed a diagnostic model for Instantaneous Shut-In Pressure (ISIP) using ANN and linear regression, where the ANN was trained to identify and segregate the necessary data, and linear regression predicted the ISIP value when the mud rate was zero, based on the data extracted by the ANN.
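The linear regression step of an ISIP estimate of this kind can be sketched as a straight-line fit of pressure against rate, read off at zero rate. The sketch assumes that the relevant shut-down samples have already been selected (the role the text assigns to the ANN); the sample values, units, and names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical shut-down samples: rate (m3/min) and wellhead pressure (MPa)
rate = [10.0, 8.0, 6.0, 4.0, 2.0]
pressure = [50.0, 48.0, 46.0, 44.0, 42.0]

# The intercept is the fitted pressure at zero rate, taken here as the ISIP estimate
slope, isip_estimate = fit_line(rate, pressure)
```

The quality of the estimate depends on which samples enter the fit, which is why the data selection step matters.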

Regarding fracturing risk early warning, Sun et al.⁸¹ realized real-time diagnosis of sand plugging by building a CNN and an LSTM neural network to fuse dynamic and static data features, using the inverse slope method to identify sand plugging feature data. Soroush et al.⁸² also achieved real-time diagnosis of sand plugging by employing a convolutional neural network and LSTM neural network to combine dynamic and static data features, utilizing the inverse slope method to discriminate sand plugging feature data. Hou et al.⁸³ proposed sand plugging probability characterization parameters and established a sand plugging probability prediction model based on a recurrent neural network, enabling real-time assessment of sand plugging risk probability during fracturing.

4. Challenges and Future Directions

With the development of ML methods, coupled with the continuous improvement in hydraulic fracturing data acquisition capabilities, fracturing prediction methods based on ML have become a research hotspot. The integration of ML with oil and gas development has yielded a range of innovative results, propelling the industry toward digitalization and intelligence. ML can overcome many limitations of traditional methods, reduce the time and capital costs of fracturing design, and enhance the reliability of the analysis results. However, existing research on using ML to guide fracturing design and oil and gas development has also revealed several challenges.

4.1. Challenges

1. Lack of Suitable Training Sample Libraries: The quality of data from oil and gas fields varies significantly, and comprehensive databases for ML model training are scarce. Developing hydraulic fracturing prediction methods based on ML requires continuously updating and improving fracturing data. Currently, the data used for ML training generally come from field data or simulations, which have yet to form robust training sample libraries suitable for deep learning.
2. Inadequate Targeted ML Model Selection: Selecting an appropriate ML algorithm for predicting production in unconventional reservoirs remains a significant challenge. An incorrect choice can result in prolonged training times, poor prediction accuracy, and problems with overfitting. The presence of extensive missing data or unclear data characteristics can introduce substantial biases in the prediction outcomes. Additionally, a single ML method cannot be perfectly applied to all fracturing scenarios.
3. Need for Interpretable ML in Specialized Fields: The oil and gas industry is highly specialized, necessitating interpretable ML models. Currently, data processing is essentially a “black box”, with nontransparent relationships between inputs and outputs and a lack of specialized theoretical knowledge to guide the process. The complexity of fracturing mechanisms often results in conventional modeling methods lacking transparency in decision logic and interpretability of calculation results, particularly in areas such as fracturing production capacity optimization, fracture diagnosis, and effect evaluation.
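One common way to compare candidate models and to detect overfitting before committing to one is k-fold cross-validation. The sketch below is a minimal, dependency-free illustration under stated assumptions: the two candidate models (a mean predictor and a one-feature least-squares line), the interleaved fold scheme, and the synthetic data are hypothetical and are not taken from any of the reviewed studies.

```python
from typing import Callable, List, Sequence

Model = Callable[[float], float]
Fitter = Callable[[Sequence[float], Sequence[float]], Model]


def fit_mean(xs: Sequence[float], ys: Sequence[float]) -> Model:
    """Baseline model: always predict the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Model:
    """Ordinary least-squares line with a single input feature."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return lambda x: intercept + slope * x


def cross_val_mse(fit: Fitter, xs: Sequence[float], ys: Sequence[float], k: int = 5) -> float:
    """Mean squared error averaged over k interleaved held-out folds."""
    n = len(xs)
    fold_errors: List[float] = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        model = fit(train_x, train_y)
        errors = [(model(xs[i]) - ys[i]) ** 2 for i in test_idx]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / k


# Synthetic data (assumed): a linear trend plus a small alternating perturbation.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + (0.5 if i % 2 else -0.5) for i, x in enumerate(xs)]

cv_mean = cross_val_mse(fit_mean, xs, ys)
cv_line = cross_val_mse(fit_line, xs, ys)
print(f"mean predictor CV MSE: {cv_mean:.2f}, linear fit CV MSE: {cv_line:.2f}")
```

The same held-out comparison applies to any pair of candidate algorithms; a model whose training error is far below its cross-validated error is likely overfitting.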

4.2. Future Directions

1. Future Directions in Hydraulic Fracturing Technology: Hydraulic fracturing technology is moving toward segmented cluster fracturing, with multiwell fracturing being a hot research topic. The pressure channel form is illustrated in Figure 9. To achieve global optimization of fracturing design, it is necessary to build a comprehensive data warehouse integrating geological, geophysical, drilling, completion, fracturing, and production data. ML algorithms can then be used to globally optimize the whole fracturing process, from design through construction control to flowback control, in order to enhance production. By using RNNs and their variants to extract features from time-series data, operators can diagnose complex wellbore conditions and events such as fracturing processes, bridge plug seating and sealing, and temporary plugging. Additionally, these algorithms can provide early warnings for risks such as sand plugging, casing deformation, interwell pressurization, and ground equipment failures.
2. Ensuring Model Reliability and Interpretability in Fracturing Operations: Because fracturing is a high-input, high-risk operation, increasing production through fracturing requires highly reliable model decisions. Research into model interpretability can make fracturing predictions more transparent, addressing the “black-box” problem often associated with traditional ML models. The development of modeling methods that combine physical mechanisms with data-driven approaches is a crucial and inevitable trend in fracturing research. In recent years, the Physics-Informed Neural Network (PINN) method has been successfully used to predict hydraulic fracture geometry and production, with fracture permeability and production control equations incorporated as physical constraints. Further research is expected to integrate additional important features, such as porosity and saturation, into the PINN model. This enhancement will enable the neural network, grounded in physical constraints, to predict fracturing production increases more accurately and transparently.
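The idea of adding governing equations to the training objective can be sketched as a composite loss: a data-misfit term plus a penalty on the residual of a physical equation. The sketch below is illustrative only. It uses exponential decline, dq/dt = -D q, as a stand-in constraint rather than the fracture permeability and production control equations mentioned above, and a plain Python callable in place of a neural network. The names `physics_informed_loss`, `decline_rate`, and `weight`, and all numbers, are assumptions.

```python
import math
from typing import Callable, Sequence


def physics_informed_loss(
    model: Callable[[float], float],
    t_obs: Sequence[float],
    q_obs: Sequence[float],
    t_col: Sequence[float],
    decline_rate: float,
    weight: float = 1.0,
    dt: float = 1e-3,
) -> float:
    """Data misfit plus weighted mean squared residual of dq/dt + D*q = 0."""
    data_term = sum((model(t) - q) ** 2 for t, q in zip(t_obs, q_obs)) / len(t_obs)
    residuals = []
    for t in t_col:
        # Central finite difference stands in for automatic differentiation.
        dq_dt = (model(t + dt) - model(t - dt)) / (2.0 * dt)
        residuals.append(dq_dt + decline_rate * model(t))
    physics_term = sum(r * r for r in residuals) / len(residuals)
    return data_term + weight * physics_term


D = 0.1      # assumed decline rate
Q0 = 100.0   # assumed initial production rate

# Two sparse "observations" generated from the assumed decline law.
t_obs = [0.0, 10.0]
q_obs = [Q0, Q0 * math.exp(-D * 10.0)]
t_col = [float(t) for t in range(11)]  # collocation points for the physics term


def exponential_candidate(t: float) -> float:
    return Q0 * math.exp(-D * t)


def linear_candidate(t: float) -> float:
    # Straight line through both observations: fits the data, violates the physics.
    return q_obs[0] + (q_obs[1] - q_obs[0]) * t / 10.0


loss_exp = physics_informed_loss(exponential_candidate, t_obs, q_obs, t_col, D)
loss_lin = physics_informed_loss(linear_candidate, t_obs, q_obs, t_col, D)
print(f"exponential: {loss_exp:.3e}, linear: {loss_lin:.3f}")
```

Both candidates match the two observations, so the data term alone cannot separate them; only the physics term penalizes the straight line. In a PINN the same penalty steers the network between sparse measurements.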

Figure 9. Segmented and clustered fracturing of horizontal wells, types of fracturing, and backflow after fracturing (a. Connected through fracturing fractures; b. Connected through natural fractures; c. Hydraulic fractures are directly connected to the wellbore; d. Backflow).

5. Conclusion

Given the widespread application of machine learning models in hydraulic fracturing, this Review discusses in detail the procedures, challenges, development directions, and roles of machine learning prediction technology. From a summary of successful cases, the following conclusions can be drawn.

1. This Review briefly introduces the basic concepts of hydraulic fracturing and reviews various ML methods applied to it. It also provides a detailed introduction to the classification and labeling of hydraulic fracturing data, data preprocessing, hyperparameter tuning of machine learning models, and evaluation of their generalization ability. For highly complex hydraulic fracturing data in particular, data preprocessing, classification, and labeling deserve attention.
2. The advantages, disadvantages, and limitations of traditional machine learning algorithms, neural networks and their variants, hybrid models, and physics-informed neural networks were analyzed and discussed in detail, with emphasis on their applications in fracturing flowback, fracture characterization, missing value filling, production capacity prediction, and the diagnosis of fracturing conditions and risk early warning. We also summarize some shortcomings in the use of ML for hydraulic fracturing prediction. In cases of a large amount of missing data, unstable numerical simulation modeling, or unclear data characteristics, the application of ML algorithms can lead to significant bias. Additionally, because ML methods have a limited ability to retain information across many time steps, they cannot accurately predict long-duration fracturing production.
3. We provide an outlook on the future development of ML-based hydraulic fracturing methods. The future direction involves using ML to consider the entire process of fracturing production enhancement and incorporating more fracturing features into PINN or its variants.

The authors declare no competing financial interest.

References

Barati R.; Liang J.-T. A review of fracturing fluid systems used for hydraulic fracturing of oil and gas wells. J. Appl. Polym. Sci. 2014, 131, 10.1002/app.40735.
Arps J. J. Analysis of Decline Curves. Transactions of the AIME 1945, 160, 228–247. 10.2118/945228-G.
Ilk D.; Rushing J. A.; Perego A. D.; Blasingame T. A. Exponential vs. Hyperbolic Decline in Tight Gas Sands - Understanding the Origin and Implications for Reserve Estimates Using Arps’ Decline Curves. Paper presented at the SPE Annual Technical Conference and Exhibition, Denver, Colorado, USA, September 2008. Paper Number: SPE-116731-MS. SPE, 2008. 10.2118/116731-MS (accessed August 26, 2024).
Morgan D.; Jacobs R. Opportunities and Challenges for Machine Learning in Materials Science. Annu. Rev. Mater. Res. 2020, 50, 71–103. 10.1146/annurev-matsci-070218-010015.
Olsson F. A Literature Survey of Active Machine Learning in the Context of Natural Language Processing; Swedish Institute of Computer Science, 2009. https://urn.kb.se/resolve?urn=urn:nbn:se:ri:diva-23510 (accessed August 24, 2024).
Pak M.; Kim S. A review of deep learning in image recognition. In 2017 4th International Conference on Computer Applications and Information Processing Technology (CAIPT); IEEE, 2017: pp 1–3. 10.1109/CAIPT.2017.8320684.
Shehab M.; Abualigah L.; Shambour Q.; Abu-Hashem M. A.; Shambour M. K. Y.; Alsalibi A. I.; Gandomi A. H. Machine learning in medical applications: A review of state-of-the-art methods. Computers in Biology and Medicine 2022, 145, 105458. 10.1016/j.compbiomed.2022.105458.
Tariq Z.; Aljawad M. S.; Hasan A.; Murtaza M.; Mohammed E.; El-Husseiny A.; Alarifi S. A.; Mahmoud M.; Abdulraheem A. A systematic review of data science and machine learning applications to the oil and gas industry. J. Petrol Explor Prod Technol. 2021, 11, 4339–4374. 10.1007/s13202-021-01302-2.
Cover T.; Hart P. Nearest neighbor pattern classification. IEEE Transactions on Information Theory 1967, 13, 21–27. 10.1109/TIT.1967.1053964.
Cortes C.; Vapnik V. Support-vector networks. Mach Learn 1995, 20, 273–297. 10.1007/BF00994018.
Breiman L. Random Forests. Machine Learning 2001, 45, 5–32. 10.1023/A:1010933404324.
Shi G. Superiorities of support vector machine in fracture prediction and gassiness evaluation. Petroleum Exploration and Development 2008, 35, 588–594. 10.1016/S1876-3804(09)60091-4.
Chandra M. A.; Bedi S. S. Survey on SVM and their application in image classification. International Journal of Information Technology 2021, 13 (5), 1–11. 10.1007/s41870-017-0080-1.
Zhong Z.; Carr T. R. Application of a new hybrid particle swarm optimization-mixed kernels function-based support vector machine model for reservoir porosity prediction: A case study in Jacksonburg-Stringtown oil field, West Virginia, USA. Interpretation 2019, 7, T97–T112. 10.1190/INT-2018-0093.1.
Schuetter J.; Mishra S.; Zhong M.; LaFollette R. Data Analytics for Production Optimization in Unconventional Reservoirs. In Proceedings of the 3rd Unconventional Resources Technology Conference; OnePetro, 2015. 10.15530/URTEC-2015-2167005.
A. Ahmed S.; A.A. Mahmoud S.; Elkatatny M.; Mahmoud A. A. Prediction of Pore and Fracture Pressures Using Support Vector Machine. Paper presented at the International Petroleum Technology Conference, Beijing, China, March 2019. Paper Number: IPTC-19523-MS; OnePetro, 2019. 10.2523/IPTC-19523-MS.
Sun Z.; Wang L.; Zhou J.-Q.; Wang C. A new method for determining the hydraulic aperture of rough rock fractures using the support vector regression. Engineering Geology 2020, 271, 105618. 10.1016/j.enggeo.2020.105618.
Wang G.; Carr T. R.; Ju Y.; Li C. Identifying organic-rich Marcellus Shale lithofacies by support vector machine classifier in the Appalachian basin. Computers & Geosciences 2014, 64, 52–60. 10.1016/j.cageo.2013.12.002.
Liang Z.; Jiang Z.; Wu W.; Guo J.; Wang M.; Nie Z.; Li Z.; Xu D.; Xue Z.; Chen R.; Han Y. Study and Classification of Porosity Stress Sensitivity in Shale Gas Reservoirs Based on Experiments and Optimized Support Vector Machine Algorithm for the Silurian Longmaxi Shale in the Southern Sichuan Basin, China. ACS Omega 2022, 7, 33167–33185. 10.1021/acsomega.2c03393.
Asala H. I.; Chebeir J.; Zhu W.; Gupta I.; Taleghani A. D.; Romagnoli J. A Machine Learning Approach to Optimize Shale Gas Supply Chain Networks. Paper presented at the SPE Annual Technical Conference and Exhibition, San Antonio, Texas, USA, October 2017. Paper Number: SPE-187361-MS; SPE, 2017. 10.2118/187361-MS.
Yang Z.; Yang C.; Li X.; Min C. Pattern Recognition of the Vertical Hydraulic Fracture Shapes in Coalbed Methane Reservoirs Based on Hierarchical Bi-LSTM Network. Complexity 2020, 2020 (2020), e1734048. 10.1155/2020/1734048.
Liu W.; Liu W. D.; Gu J. Forecasting oil production using ensemble empirical model decomposition based Long Short-Term Memory neural network. J. Pet. Sci. Eng. 2020, 189, 107013. 10.1016/j.petrol.2020.107013.
Li X.; Ma X.; Xiao F.; Xiao C.; Wang F.; Zhang S. Time-series production forecasting method based on the integration of Bidirectional Gated Recurrent Unit (Bi-GRU) network and Sparrow Search Algorithm (SSA). J. Pet. Sci. Eng. 2022, 208, 109309. 10.1016/j.petrol.2021.109309.
Kumar I.; Tripathi B. K.; Singh A. Attention-based LSTM network-assisted time series forecasting models for petroleum production. Engineering Applications of Artificial Intelligence 2023, 123, 106440. 10.1016/j.engappai.2023.106440.
Sagheer A.; Kotb M. Time series forecasting of petroleum production using deep LSTM recurrent networks. Neurocomputing 2019, 323, 203–213. 10.1016/j.neucom.2018.09.082.
Yang R.; Liu X.; Yu R.; Hu Z.; Duan X. Long short-term memory suggests a model for predicting shale gas production. Applied Energy 2022, 322, 119415. 10.1016/j.apenergy.2022.119415.
Hassanvand M.; Moradi S.; Fattahi M.; Zargar G.; Kamari M. Estimation of rock uniaxial compressive strength for an Iranian carbonate oil reservoir: Modeling vs. artificial neural network application. Petroleum Research 2018, 3, 336–345. 10.1016/j.ptlrs.2018.08.004.
Sami N. A.; Ibrahim D. S. Forecasting multiphase flowing bottom-hole pressure of vertical oil wells using three machine learning techniques. Petroleum Research 2021, 6, 417–422. 10.1016/j.ptlrs.2021.05.004.
McVey D. S.; Mohaghegh S.; Aminian K.; Ameri S. Identification of Parameters Influencing the Response of Gas Storage Wells to Hydraulic Fracturing With the Aid of a Neural Network. SPE Computer Applications 1996, 8, 54–57. 10.2118/29159-PA.
Ibrahim A. F.; Alarifi S. A.; Elkatatny S. Application of Machine Learning to Predict Estimated Ultimate Recovery for Multistage Hydraulically Fractured Wells in Niobrara Shale Formation. Computational Intelligence and Neuroscience 2022, 2022, e7084514. 10.1155/2022/7084514.
Li Z.; Liu F.; Yang W.; Peng S.; Zhou J. A Survey of Convolutional Neural Networks: Analysis, Applications, and Prospects. IEEE Transactions on Neural Networks and Learning Systems 2022, 33, 6999–7019. 10.1109/TNNLS.2021.3084827.
Salehinejad H.; Sankar S.; Barfett J.; Colak E.; Valaee S. Recent Advances in Recurrent Neural Networks. arXiv.org, 2017. https://arxiv.longhoe.net/abs/1801.01078v3 (accessed May 31, 2024).
Zhang R.; Sun Q.; Zhang X.; Cui L.; Wu Z.; Chen K.; Wang D.; Liu Q. H. Imaging Hydraulic Fractures Under Energized Steel Casing by Convolutional Neural Networks. IEEE Transactions on Geoscience and Remote Sensing 2020, 58, 8831–8839. 10.1109/TGRS.2020.2991011.
Liu Y.; Huff O.; Luo B.; Jin G.; Simmons J. Convolutional neural network-based classification of microseismic events originating in a stimulated reservoir from distributed acoustic sensing data. Geophysical Prospecting 2022, 70, 904–920. 10.1111/1365-2478.13199.
Jang Y.; Kwon S.; Park G.; Lee S.; Min B. Assessment of Hydrocarbon Productivity Using Convolutional Neural Network with Phase-field Approach in Naturally Fractured Reservoirs, 2020; American Geophysical Union, Fall Meeting 2020, abstract #H036-0005. https://ui.adsabs.harvard.edu/abs/2020AGUFMH036.0005J (accessed August 3, 2023).
Hochreiter S.; Schmidhuber J. Long Short-Term Memory. Neural Computation 1997, 9, 1735–1780. 10.1162/neco.1997.9.8.1735.
Cho K.; Van Merrienboer B.; Gulcehre C.; Bahdanau D.; Bougares F.; Schwenk H.; Bengio Y. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. arXiv, 2014. 10.48550/arXiv.1406.1078.
Qin X.; Hu X.; Liu H.; Shi W.; Cui J. A Combined Gated Recurrent Unit and Multi-Layer Perception Neural Network Model for Predicting Shale Gas Production. Processes 2023, 11, 806. 10.3390/pr11030806.
Yang R.; Qin X.; Liu W.; Huang Z.; Shi Y.; Pang Z.; Zhang Y.; Li J.; Wang T. A Physics-Constrained Data-Driven Workflow for Predicting Coalbed Methane Well Production Using Artificial Neural Network. SPE Journal 2022, 27, 1531–1552. 10.2118/205903-PA.
Shi Y.; Song X.; Song G. Productivity prediction of a multilateral-well geothermal system based on a long short-term memory and multi-layer perceptron combinational neural network. Applied Energy 2021, 282, 116046. 10.1016/j.apenergy.2020.116046.
Zha W.; Liu Y.; Wan Y.; Luo R.; Li D.; Yang S.; Xu Y. Forecasting monthly gas field production based on the CNN-LSTM model. Energy 2022, 260, 124889. 10.1016/j.energy.2022.124889.
Shao R.; Wang H.; Xiao L. Reservoir evaluation using petrophysics informed machine learning: A case study. Artificial Intelligence in Geosciences 2024, 5, 100070. 10.1016/j.aiig.2024.100070.
Wang H.; Wang M.; Chen S.; Hui G.; Pang Y. A novel governing equation for shale gas production prediction via physics-informed neural networks. Expert Systems with Applications 2024, 248, 123387. 10.1016/j.eswa.2024.123387.
Qu H.-Y.; Zhang J.-L.; Zhou F.-J.; Peng Y.; Pan Z.-J.; Wu X.-Y. Evaluation of hydraulic fracturing of horizontal wells in tight reservoirs based on the deep neural network with physical constraints. Petroleum Science 2023, 20, 1129–1141. 10.1016/j.petsci.2023.03.015.
Li R.; Wei H.; Wang J.; Li B.; Zheng X.; Bai W. An Artificial Intelligence Method for Flowback Control of Hydraulic Fracturing Fluid in Oil and Gas Wells. Processes 2023, 11, 1773. 10.3390/pr11061773.
Guo W.; Zhang X.; Kang L.; Gao J.; Liu Y. Investigation of Flowback Behaviours in Hydraulically Fractured Shale Gas Well Based on Physical Driven Method. Energies 2022, 15, 325. 10.3390/en15010325.
Yuyang L.; Xinhua M.; Xiaowei Z.; Wei G.; Lixia K.; Rongze Y.; Yuping S. Shale gas well flowback rate prediction for Weiyuan field based on a deep learning algorithm. J. Pet. Sci. Eng. 2021, 203, 108637. 10.1016/j.petrol.2021.108637.
Raterman K. T.; Farrell H. E.; Mora O. S.; Janssen A. L.; Gomez G. A.; Busetti S.; McEwen J.; Friehauf K.; Rutherford J.; Reid R.; Jin G.; Roy B.; Warren M. Sampling a Stimulated Rock Volume: An Eagle Ford Example. SPE Reservoir Evaluation & Engineering 2018, 21, 927–941. 10.2118/191375-PA.
Elliott S. J.; Gale J. F. W. Analysis and Distribution of Proppant Recovered From Fracture Faces in the HFTS Slant Core Drilled Through a Stimulated Reservoir. In Proceedings of the 6th Unconventional Resources Technology Conference; American Association of Petroleum Geologists: Houston, Texas, USA, 2018. 10.15530/urtec-2018-2902629.
Maity D.; Ciezobka J. Designing a robust proppant detection and classification workflow using machine learning for subsurface fractured rock samples post hydraulic fracturing operations. J. Pet. Sci. Eng. 2019, 172, 588–606. 10.1016/j.petrol.2018.09.062.
Temizel C.; Purwar S.; Abdullayev A.; Urrutia K.; Tiwari A. Efficient Use of Data Analytics in Optimization of Hydraulic Fracturing in Unconventional Reservoirs. Paper presented at the Abu Dhabi International Petroleum Exhibition and Conference, Abu Dhabi, UAE, November 2015. Paper Number: SPE-177549-MS. OnePetro, 2015. 10.2118/177549-MS.
Hou L.; Elsworth D.; Zhang F.; Wang Z.; Zhang J. Evaluation of proppant injection based on a data-driven approach integrating numerical and ensemble learning models. Energy 2023, 264, 126122. 10.1016/j.energy.2022.126122.
Wang S.; Chen S. Insights to fracture stimulation design in unconventional reservoirs based on machine learning modeling. J. Pet. Sci. Eng. 2019, 174, 682–695. 10.1016/j.petrol.2018.11.076.
Cipolla C. L.; Lewis R. E.; Maxwell S. C.; Mack M. G. Appraising Unconventional Resource Plays: Separating Reservoir Quality from Completion Effectiveness. Paper presented at the International Petroleum Technology Conference, Bangkok, Thailand, November 2011. Paper Number: IPTC-14677-MS. OnePetro, 2011. 10.2523/IPTC-14677-MS.
Keshavarzi R.; Mohammadi S. A New Approach for Numerical Modeling of Hydraulic Fracture Propagation in Naturally Fractured Reservoirs. Paper presented at the SPE/EAGE European Unconventional Resources Conference and Exhibition, Vienna, Austria, March 2012. Paper Number: SPE-152509-MS; SPE: Vienna, Austria, 2012. 10.2118/152509-MS.
Yan C.; Jiao Y.-Y.; Zheng H. A fully coupled three-dimensional hydro-mechanical finite discrete element approach with real porous seepage for simulating 3D hydraulic fracturing. Computers and Geotechnics 2018, 96, 73–89. 10.1016/j.compgeo.2017.10.008.
Zangeneh N.; Eberhardt E.; Bustin R. M. Investigation of the influence of natural fractures and in situ stress on hydraulic fracture propagation using a distinct-element approach. Can. Geotech. J. 2015, 52, 926–946. 10.1139/cgj-2013-0366.
Mohaghegh S. D.; Gaskari R.; Maysami M. Shale Analytics: Making Production and Operational Decisions Based on Facts: A Case Study in Marcellus Shale. Paper presented at the SPE Hydraulic Fracturing Technology Conference and Exhibition, The Woodlands, Texas, USA, January 2017. Paper Number: SPE-184822-MS. SPE: The Woodlands, Texas, USA, 2017: p D031S007R004. 10.2118/184822-MS.
Lapin R.; Murachev A.; Sevostianov A.; Tsvetkov D.; Kalyuzhnyuk A.; Osokina A. E. Neural networks and data-driven surrogate models for simulation of steady-state fracture growth. Materials Physics and Mechanics 2019, 42, 351–358. 10.18720/MPM.4232019_10.
Xiao C.; Zhang S.; Ma X.; Zhou T.; Li X. Surrogate-assisted hydraulic fracture optimization workflow with applications for shale gas reservoir development: a comparative study of machine learning models. Natural Gas Industry B 2022, 9, 219–231. 10.1016/j.ngib.2022.03.004.
Filippov E. V.; Zaharov L. A.; Martyushev D. A.; Ponomareva I. N. Reproduction of reservoir pressure by machine learning methods and study of its influence on the cracks formation process in hydraulic fracturing. J. Mining Institute 2022, 924–932. https://cyberleninka.ru/article/n/reproduction-of-reservoir-pressure-by-machine-learning-methods-and-study-of-its-influence-on-the-cracks-formation-process-in (accessed May 29, 2024).
Zhao P.; Gray K. E. Analytical and Machine-Learning Analysis of Hydraulic Fracture-Induced Natural Fracture Slip. SPE Journal 2021, 26, 1722–1738. 10.2118/205346-PA.
Teixeira Silveira B.; Roehl D.; Mejia Sanchez E. C. Forecasting of the interaction between hydraulic and natural fractures using an artificial neural network. J. Pet. Sci. Eng. 2022, 208, 109446. 10.1016/j.petrol.2021.109446.
Wang L.; Yao Y.; Zhao G.; Adenutsi C. D.; Wang W.; Lai F. A hybrid surrogate-assisted integrated optimization of horizontal well spacing and hydraulic fracture stage placement in naturally fractured shale gas reservoir. J. Pet. Sci. Eng. 2022, 216, 110842. 10.1016/j.petrol.2022.110842.
Nouri-Taleghani M.; Mahmoudifar M.; Shokrollahi A.; Tatar A.; Karimi-Khaledi M. Fracture density determination using a novel hybrid computational scheme: a case study on an Iranian Marun oil field reservoir. Journal of Geophysics and Engineering 2015, 12, 188–198. 10.1088/1742-2132/12/2/188.
Gardner G. H. F.; Gardner L. W.; Gregory A. R. Formation velocity and density—the diagnostic basics for stratigraphic traps. Geophysics 1974, 39, 770–780. 10.1190/1.1440465.
Greenberg M. L.; Castagna J. P. Shear-Wave Velocity Estimation in Porous Rocks: Theoretical Formulation, Preliminary Verification and Applications. Geophysical Prospecting 1992, 40, 195–209. 10.1111/j.1365-2478.1992.tb00371.x.
Akkurt R.; Conroy T. T.; Psaila D.; Paxton A.; Low J.; Spaans P. Accelerating and Enhancing Petrophysical Analysis With Machine Learning: A Case Study of an Automated System for Well Log Outlier Detection and Reconstruction. OnePetro, 2018 (accessed August 3, 2023).
Al-Anazi A.; Gates I. D. A support vector machine algorithm to classify lithofacies and model permeability in heterogeneous reservoirs. Engineering Geology 2010, 114, 267–277. 10.1016/j.enggeo.2010.05.005.
Ali M.; Jiang R.; Ma H.; Pan H.; Abbas K.; Ashraf U.; Ullah J. Machine learning - A novel approach of well logs similarity based on synchronization measures to predict shear sonic logs. J. Pet. Sci. Eng. 2021, 203, 108602. 10.1016/j.petrol.2021.108602.
Bukar I.; Adamu M. B.; Hassan U. A Machine Learning Approach to Shear Sonic Log Prediction. Paper presented at the SPE Nigeria Annual International Conference and Exhibition, Lagos, Nigeria, August 2019. Paper Number: SPE-198764-MS. OnePetro, 2019. 10.2118/198764-MS.
Gupta S.; Fuehrer F.; Jeyachandra B. C. Production Forecasting in Unconventional Resources using Data Mining and Time Series Analysis. Paper presented at the SPE/CSUR Unconventional Resources Conference – Canada, Calgary, Alberta, Canada, September 2014. Paper Number: SPE-171588-MS. OnePetro, 2014. 10.2118/171588-MS.
Anderson R. N.; Xie B.; Wu L.; Kressner A. A.; Frantz J. H.; Ockree M. A.; Brown K. G. Petroleum Analytics Learning Machine to Forecast Production in the Wet Gas Marcellus Shale. In Unconventional Resources Technology Conference; Society of Exploration Geophysicists, American Association of Petroleum Geologists, Society of Petroleum Engineers, 1–3 August 2016, San Antonio, TX, 2016: pp 132–147. 10.15530/urtec-2016-2426612.
Liang Y.; Zhao P. A Machine Learning Analysis Based on Big Data for Eagle Ford Shale Formation. Paper presented at the SPE Annual Technical Conference and Exhibition, Calgary, Alberta, Canada, September 2019. Paper Number: SPE-196158-MS. OnePetro, 2019. 10.2118/196158-MS.
Xue L.; Liu Y.; Xiong Y.; Liu Y.; Cui X.; Lei G. A data-driven shale gas production forecasting method based on the multi-objective random forest regression. J. Pet. Sci. Eng. 2021, 196, 107801. 10.1016/j.petrol.2020.107801.
Song X.; Liu Y.; Xue L.; Wang J.; Zhang J.; Wang J.; Jiang L.; Cheng Z. Time-series well performance prediction based on Long Short-Term Memory (LSTM) neural network model. J. Pet. Sci. Eng. 2020, 186, 106682. 10.1016/j.petrol.2019.106682.
Zhou G.; Guo Z.; Sun S.; Jin Q. A CNN-BiGRU-AM neural network for AI applications in shale oil production prediction. Applied Energy 2023, 344, 121249. 10.1016/j.apenergy.2023.121249.
Shen Y.; Cao D.; Ruddy K. Applications of Deep Learning Methodology in Real-Time Completion Event Recognition. Machine Learning and the Physical Sciences Workshop at the 33rd Conference on Neural Information Processing Systems (NeurIPS), Vancouver, Canada. 2019, 14.
Yuan B.; Zhao M.; Meng S.; Zhang W.; Zheng H. Intelligent identification and real-time warning method of diverse complex events in horizontal well fracturing. Petroleum Exploration and Development 2023, 50, 1487–1496. 10.1016/S1876-3804(24)60482-9.
Ramirez A.; Iriarte J. Event Recognition on Time Series Frac Data using Machine Learning–Part II. Paper presented at the SPE Liquids-Rich Basins Conference - North America, Odessa, Texas, USA, November 2019. Paper Number: SPE-197093-MS. OnePetro, 2019. 10.2118/197093-MS.
Sun J. J.; Battula A.; Hruby B.; Hossaini P. Application of Both Physics-Based and Data-Driven Techniques for Real-Time Screen-Out Prediction with High Frequency Data. Proceedings of the 8th Unconventional Resources Technology Conference; OnePetro, 2020. 10.15530/urtec-2020-3349.
Soroush H.; Belyadi H.; Kang H.; Murugesu M. P. Early Prediction and Prevention of Tip Screen-Out Using Deep Learning. Paper presented at the 56th U.S. Rock Mechanics/Geomechanics Symposium, Santa Fe, New Mexico, USA, June 2022. Paper Number: ARMA-2022-0052. OnePetro, 2022. 10.56952/ARMA-2022-0052.
Hou L.; Cheng Y.; Elsworth D.; Liu H.; Ren J. Prediction of the continuous probability of sand screenout based on a deep learning workflow. SPE Journal 2022, 27, 1520–1530. 10.2118/209192-PA.

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Data Citations

Salehinejad H.; Sankar S.; Barfett J.; Colak E.; Valaee S.. Recent Advances in Recurrent Neural Networks. arXiv.org, 2017. https://arxiv.longhoe.net/abs/1801.01078v3 (accessed May 31, 2024).
Cho K.; Van Merrienboer B.; Gulcehre C.; Bahdanau D.; Bougares F.; Schwenk H.; Bengio Y.. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. arXiv, 2014. 10.48550/arXiv.1406.1078. [DOI]

[ref1] Barati R.; Liang J.-T., A review of fracturing fluid systems used for hydraulic fracturing of oil and gas wells. J. Appl. Polym. Sci. 2014, 131, 10.1002/app.40735. [DOI] [Google Scholar]

[ref2] Arps J. J. Analysis of Decline Curves. Transactions of the AIME 1945, 160, 228–247. 10.2118/945228-G. [DOI] [Google Scholar]

[ref3] Ilk D.; Rushing J. A.; Perego A. D.; Blasingame T. A.. Exponential vs. Hyperbolic Decline in Tight Gas Sands - Understanding the Origin and Implications for Reserve Estimates Using Arps’ Decline Curves. Paper presented at the SPE Annual Technical Conference and Exhibition, Denver, Colorado, USA, September 2008. Paper Number: SPE-116731-MS. SPE, 2008. 10.2118/116731-MS (accessed August 26, 2024). [DOI]

[ref4] Morgan D.; Jacobs R. Opportunities and Challenges for Machine Learning in Materials Science. Annu. Rev. Mater. Res. 2020, 50, 71–103. 10.1146/annurev-matsci-070218-010015. [DOI] [Google Scholar]

[ref5] Olsson F.A Literature Survey of Active Machine Learning in the Context of Natural Language Processing; Swedish Institute of Computer Science, 2009. https://urn.kb.se/resolve?urn=urn:nbn:se:ri:diva-23510 (accessed August 24, 2024).

[ref6] Pak M.; Kim S.. A review of deep learning in image recognition. In 2017 4th International Conference on Computer Applications and Information Processing Technology (CAIPT); IEEE, 2017: pp 1–3. 10.1109/CAIPT.2017.8320684. [DOI] [Google Scholar]

[ref7] Shehab M.; Abualigah L.; Shambour Q.; Abu-Hashem M. A.; Shambour M. K. Y.; Alsalibi A. I.; Gandomi A. H. Machine learning in medical applications: A review of state-of-the-art methods. Computers in Biology and Medicine 2022, 145, 105458 10.1016/j.compbiomed.2022.105458. [DOI] [PubMed] [Google Scholar]

[ref8] Tariq Z.; Aljawad M. S.; Hasan A.; Murtaza M.; Mohammed E.; El-Husseiny A.; Alarifi S. A.; Mahmoud M.; Abdulraheem A. A systematic review of data science and machine learning applications to the oil and gas industry. J. Petrol Explor Prod Technol. 2021, 11, 4339–4374. 10.1007/s13202-021-01302-2. [DOI] [Google Scholar]

[ref9] Cover T.; Hart P. Nearest neighbor pattern classification. IEEE Transactions on Information Theory 1967, 13, 21–27. 10.1109/TIT.1967.1053964. [DOI] [Google Scholar]

[ref10] Cortes C.; Vapnik V. Support-vector networks. Mach Learn 1995, 20, 273–297. 10.1007/BF00994018. [DOI] [Google Scholar]

[ref11] Breiman L. Random Forests. Machine Learning 2001, 45, 5–32. 10.1023/A:1010933404324. [DOI] [Google Scholar]

[ref12] Shi G. Superiorities of support vector machine in fracture prediction and gassiness evaluation. Petroleum Exploration and Development 2008, 35, 588–594. 10.1016/S1876-3804(09)60091-4. [DOI] [Google Scholar]

[ref13] Chandra M. A.; Bedi S. S. Survey on SVM and their application in image classification[J]. International Journal of Information Technology 2021, 13 (5), 1–11. 10.1007/s41870-017-0080-1.33527094 [DOI] [Google Scholar]

[ref14] Zhong Z.; Carr T. R. Application of a new hybrid particle swarm optimization-mixed kernels function-based support vector machine model for reservoir porosity prediction: A case study in Jacksonburg-Stringtown oil field, West Virginia, USA. Interpretation 2019, 7, T97–T112. 10.1190/INT-2018-0093.1. [DOI] [Google Scholar]

[ref15] Schuetter J., Mishra S., Zhong M., LaFollette R.. Data Analytics for Production Optimization in Unconventional Reservoirs. In Proceedings of the 3rd Unconventional Resources Technology Conference; OnePetro, 2015. 10.15530/URTEC-2015-2167005. [DOI] [Google Scholar]

[ref16] A. Ahmed S.; A.A. Mahmoud S.; Elkatatny M.; Mahmoud A. A.. Prediction of Pore and Fracture Pressures Using Support Vector Machine. Paper presented at the International Petroleum Technology Conference, Beijing, China, March 2019. Paper Number: IPTC-19523-MS; OnePetro, 2019. 10.2523/IPTC-19523-MS. [DOI] [Google Scholar]

[ref17] Sun Z.; Wang L.; Zhou J.-Q.; Wang C. A new method for determining the hydraulic aperture of rough rock fractures using the support vector regression. Engineering Geology 2020, 271, 105618 10.1016/j.enggeo.2020.105618. [DOI] [Google Scholar]

[ref18] Wang G.; Carr T. R.; Ju Y.; Li C. Identifying organic-rich Marcellus Shale lithofacies by support vector machine classifier in the Appalachian basin. Computers & Geosciences 2014, 64, 52–60. 10.1016/j.cageo.2013.12.002.

[ref19] Liang Z.; Jiang Z.; Wu W.; Guo J.; Wang M.; Nie Z.; Li Z.; Xu D.; Xue Z.; Chen R.; Han Y. Study and Classification of Porosity Stress Sensitivity in Shale Gas Reservoirs Based on Experiments and Optimized Support Vector Machine Algorithm for the Silurian Longmaxi Shale in the Southern Sichuan Basin, China. ACS Omega 2022, 7, 33167–33185. 10.1021/acsomega.2c03393.

[ref20] Asala H. I.; Chebeir J.; Zhu W.; Gupta I.; Taleghani A. D.; Romagnoli J. A Machine Learning Approach to Optimize Shale Gas Supply Chain Networks. Paper presented at the SPE Annual Technical Conference and Exhibition, San Antonio, Texas, USA, October 2017. Paper Number: SPE-187361-MS; SPE, 2017. 10.2118/187361-MS.

[ref21] Yang Z.; Yang C.; Li X.; Min C. Pattern Recognition of the Vertical Hydraulic Fracture Shapes in Coalbed Methane Reservoirs Based on Hierarchical Bi-LSTM Network. Complexity 2020, 2020, e1734048. 10.1155/2020/1734048.

[ref22] Liu W.; Liu W. D.; Gu J. Forecasting oil production using ensemble empirical model decomposition based Long Short-Term Memory neural network. J. Pet. Sci. Eng. 2020, 189, 107013. 10.1016/j.petrol.2020.107013.

[ref23] Li X.; Ma X.; Xiao F.; Xiao C.; Wang F.; Zhang S. Time-series production forecasting method based on the integration of Bidirectional Gated Recurrent Unit (Bi-GRU) network and Sparrow Search Algorithm (SSA). J. Pet. Sci. Eng. 2022, 208, 109309. 10.1016/j.petrol.2021.109309.

[ref24] Kumar I.; Tripathi B. K.; Singh A. Attention-based LSTM network-assisted time series forecasting models for petroleum production. Engineering Applications of Artificial Intelligence 2023, 123, 106440. 10.1016/j.engappai.2023.106440.

[ref25] Sagheer A.; Kotb M. Time series forecasting of petroleum production using deep LSTM recurrent networks. Neurocomputing 2019, 323, 203–213. 10.1016/j.neucom.2018.09.082.

[ref26] Yang R.; Liu X.; Yu R.; Hu Z.; Duan X. Long short-term memory suggests a model for predicting shale gas production. Applied Energy 2022, 322, 119415. 10.1016/j.apenergy.2022.119415.

[ref27] Hassanvand M.; Moradi S.; Fattahi M.; Zargar G.; Kamari M. Estimation of rock uniaxial compressive strength for an Iranian carbonate oil reservoir: Modeling vs. artificial neural network application. Petroleum Research 2018, 3, 336–345. 10.1016/j.ptlrs.2018.08.004.

[ref28] Sami N. A.; Ibrahim D. S. Forecasting multiphase flowing bottom-hole pressure of vertical oil wells using three machine learning techniques. Petroleum Research 2021, 6, 417–422. 10.1016/j.ptlrs.2021.05.004.

[ref29] McVey D. S.; Mohaghegh S.; Aminian K.; Ameri S. Identification of Parameters Influencing the Response of Gas Storage Wells to Hydraulic Fracturing With the Aid of a Neural Network. SPE Computer Applications 1996, 8, 54–57. 10.2118/29159-PA.

[ref30] Ibrahim A. F.; Alarifi S. A.; Elkatatny S. Application of Machine Learning to Predict Estimated Ultimate Recovery for Multistage Hydraulically Fractured Wells in Niobrara Shale Formation. Computational Intelligence and Neuroscience 2022, 2022, e7084514. 10.1155/2022/7084514.

[ref31] Li Z.; Liu F.; Yang W.; Peng S.; Zhou J. A Survey of Convolutional Neural Networks: Analysis, Applications, and Prospects. IEEE Transactions on Neural Networks and Learning Systems 2022, 33, 6999–7019. 10.1109/TNNLS.2021.3084827.

[ref32] Salehinejad H.; Sankar S.; Barfett J.; Colak E.; Valaee S. Recent Advances in Recurrent Neural Networks. arXiv.org, 2017. https://arxiv.longhoe.net/abs/1801.01078v3 (accessed May 31, 2024).

[ref33] Zhang R.; Sun Q.; Zhang X.; Cui L.; Wu Z.; Chen K.; Wang D.; Liu Q. H. Imaging Hydraulic Fractures Under Energized Steel Casing by Convolutional Neural Networks. IEEE Transactions on Geoscience and Remote Sensing 2020, 58, 8831–8839. 10.1109/TGRS.2020.2991011.

[ref34] Liu Y.; Huff O.; Luo B.; Jin G.; Simmons J. Convolutional neural network-based classification of microseismic events originating in a stimulated reservoir from distributed acoustic sensing data. Geophysical Prospecting 2022, 70, 904–920. 10.1111/1365-2478.13199.

[ref35] Jang Y.; Kwon S.; Park G.; Lee S.; Min B. Assessment of Hydrocarbon Productivity Using Convolutional Neural Network with Phase-field Approach in Naturally Fractured Reservoirs, 2020; American Geophysical Union, Fall Meeting 2020, abstract #H036-0005. https://ui.adsabs.harvard.edu/abs/2020AGUFMH036.0005J (accessed August 3, 2023).

[ref36] Hochreiter S.; Schmidhuber J. Long Short-Term Memory. Neural Computation 1997, 9, 1735–1780. 10.1162/neco.1997.9.8.1735.

[ref37] Cho K.; Van Merrienboer B.; Gulcehre C.; Bahdanau D.; Bougares F.; Schwenk H.; Bengio Y. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. arXiv, 2014. 10.48550/arXiv.1406.1078.

[ref38] Qin X.; Hu X.; Liu H.; Shi W.; Cui J. A Combined Gated Recurrent Unit and Multi-Layer Perception Neural Network Model for Predicting Shale Gas Production. Processes 2023, 11, 806. 10.3390/pr11030806.

[ref39] Yang R.; Qin X.; Liu W.; Huang Z.; Shi Y.; Pang Z.; Zhang Y.; Li J.; Wang T. A Physics-Constrained Data-Driven Workflow for Predicting Coalbed Methane Well Production Using Artificial Neural Network. SPE Journal 2022, 27, 1531–1552. 10.2118/205903-PA.

[ref40] Shi Y.; Song X.; Song G. Productivity prediction of a multilateral-well geothermal system based on a long short-term memory and multi-layer perceptron combinational neural network. Applied Energy 2021, 282, 116046. 10.1016/j.apenergy.2020.116046.

[ref41] Zha W.; Liu Y.; Wan Y.; Luo R.; Li D.; Yang S.; Xu Y. Forecasting monthly gas field production based on the CNN-LSTM model. Energy 2022, 260, 124889. 10.1016/j.energy.2022.124889.

[ref42] Shao R.; Wang H.; Xiao L. Reservoir evaluation using petrophysics informed machine learning: A case study. Artificial Intelligence in Geosciences 2024, 5, 100070. 10.1016/j.aiig.2024.100070.

[ref43] Wang H.; Wang M.; Chen S.; Hui G.; Pang Y. A novel governing equation for shale gas production prediction via physics-informed neural networks. Expert Systems with Applications 2024, 248, 123387. 10.1016/j.eswa.2024.123387.

[ref44] Qu H.-Y.; Zhang J.-L.; Zhou F.-J.; Peng Y.; Pan Z.-J.; Wu X.-Y. Evaluation of hydraulic fracturing of horizontal wells in tight reservoirs based on the deep neural network with physical constraints. Petroleum Science 2023, 20, 1129–1141. 10.1016/j.petsci.2023.03.015.

[ref45] Li R.; Wei H.; Wang J.; Li B.; Zheng X.; Bai W. An Artificial Intelligence Method for Flowback Control of Hydraulic Fracturing Fluid in Oil and Gas Wells. Processes 2023, 11, 1773. 10.3390/pr11061773.

[ref46] Guo W.; Zhang X.; Kang L.; Gao J.; Liu Y. Investigation of Flowback Behaviours in Hydraulically Fractured Shale Gas Well Based on Physical Driven Method. Energies 2022, 15, 325. 10.3390/en15010325.

[ref47] Yuyang L.; Xinhua M.; Xiaowei Z.; Wei G.; Lixia K.; Rongze Y.; Yuping S. Shale gas well flowback rate prediction for Weiyuan field based on a deep learning algorithm. J. Pet. Sci. Eng. 2021, 203, 108637. 10.1016/j.petrol.2021.108637.

[ref48] Raterman K. T.; Farrell H. E.; Mora O. S.; Janssen A. L.; Gomez G. A.; Busetti S.; McEwen J.; Friehauf K.; Rutherford J.; Reid R.; Jin G.; Roy B.; Warren M. Sampling a Stimulated Rock Volume: An Eagle Ford Example. SPE Reservoir Evaluation & Engineering 2018, 21, 927–941. 10.2118/191375-PA.

[ref49] Elliott S. J.; Gale J. F. W. Analysis and Distribution of Proppant Recovered From Fracture Faces in the HFTS Slant Core Drilled Through a Stimulated Reservoir. In Proceedings of the 6th Unconventional Resources Technology Conference; American Association of Petroleum Geologists: Houston, Texas, USA, 2018. 10.15530/urtec-2018-2902629.

[ref50] Maity D.; Ciezobka J. Designing a robust proppant detection and classification workflow using machine learning for subsurface fractured rock samples post hydraulic fracturing operations. J. Pet. Sci. Eng. 2019, 172, 588–606. 10.1016/j.petrol.2018.09.062.

[ref51] Temizel C.; Purwar S.; Abdullayev A.; Urrutia K.; Tiwari A. Efficient Use of Data Analytics in Optimization of Hydraulic Fracturing in Unconventional Reservoirs. Paper presented at the Abu Dhabi International Petroleum Exhibition and Conference, Abu Dhabi, UAE, November 2015. Paper Number: SPE-177549-MS. OnePetro, 2015. 10.2118/177549-MS.

[ref52] Hou L.; Elsworth D.; Zhang F.; Wang Z.; Zhang J. Evaluation of proppant injection based on a data-driven approach integrating numerical and ensemble learning models. Energy 2023, 264, 126122. 10.1016/j.energy.2022.126122.

[ref53] Wang S.; Chen S. Insights to fracture stimulation design in unconventional reservoirs based on machine learning modeling. J. Pet. Sci. Eng. 2019, 174, 682–695. 10.1016/j.petrol.2018.11.076.

[ref54] Cipolla C. L.; Lewis R. E.; Maxwell S. C.; Mack M. G. Appraising Unconventional Resource Plays: Separating Reservoir Quality from Completion Effectiveness. Paper presented at the International Petroleum Technology Conference, Bangkok, Thailand, November 2011. Paper Number: IPTC-14677-MS. OnePetro, 2011. 10.2523/IPTC-14677-MS.

[ref55] Keshavarzi R.; Mohammadi S. A New Approach for Numerical Modeling of Hydraulic Fracture Propagation in Naturally Fractured Reservoirs. Paper presented at the SPE/EAGE European Unconventional Resources Conference and Exhibition, Vienna, Austria, March 2012. Paper Number: SPE-152509-MS; SPE: Vienna, Austria, 2012. 10.2118/152509-MS.

[ref56] Yan C.; Jiao Y.-Y.; Zheng H. A fully coupled three-dimensional hydro-mechanical finite discrete element approach with real porous seepage for simulating 3D hydraulic fracturing. Computers and Geotechnics 2018, 96, 73–89. 10.1016/j.compgeo.2017.10.008.

[ref57] Zangeneh N.; Eberhardt E.; Bustin R. M. Investigation of the influence of natural fractures and in situ stress on hydraulic fracture propagation using a distinct-element approach. Can. Geotech. J. 2015, 52, 926–946. 10.1139/cgj-2013-0366.

[ref58] Mohaghegh S. D.; Gaskari R.; Maysami M. Shale Analytics: Making Production and Operational Decisions Based on Facts: A Case Study in Marcellus Shale. Paper presented at the SPE Hydraulic Fracturing Technology Conference and Exhibition, The Woodlands, Texas, USA, January 2017. Paper Number: SPE-184822-MS. SPE: The Woodlands, Texas, USA, 2017: p D031S007R004. 10.2118/184822-MS.

[ref59] Lapin R.; Murachev A.; Sevostianov A.; Tsvetkov D.; Kalyuzhnyuk A.; Osokina A. E. Neural networks and data-driven surrogate models for simulation of steady-state fracture growth. Materials Physics and Mechanics 2019, 42, 351–358. 10.18720/MPM.4232019_10.

[ref60] Xiao C.; Zhang S.; Ma X.; Zhou T.; Li X. Surrogate-assisted hydraulic fracture optimization workflow with applications for shale gas reservoir development: a comparative study of machine learning models. Natural Gas Industry B 2022, 9, 219–231. 10.1016/j.ngib.2022.03.004.

[ref61] Filippov E. V.; Zaharov L. A.; Martyushev D. A.; Ponomareva I. N. Reproduction of reservoir pressure by machine learning methods and study of its influence on the cracks formation process in hydraulic fracturing. J. Mining Institute 2022, 924–932. https://cyberleninka.ru/article/n/reproduction-of-reservoir-pressure-by-machine-learning-methods-and-study-of-its-influence-on-the-cracks-formation-process-in (accessed May 29, 2024).

[ref62] Zhao P.; Gray K. E. Analytical and Machine-Learning Analysis of Hydraulic Fracture-Induced Natural Fracture Slip. SPE Journal 2021, 26, 1722–1738. 10.2118/205346-PA.

[ref63] Teixeira Silveira B.; Roehl D.; Mejia Sanchez E. C. Forecasting of the interaction between hydraulic and natural fractures using an artificial neural network. J. Pet. Sci. Eng. 2022, 208, 109446. 10.1016/j.petrol.2021.109446.

[ref64] Wang L.; Yao Y.; Zhao G.; Adenutsi C. D.; Wang W.; Lai F. A hybrid surrogate-assisted integrated optimization of horizontal well spacing and hydraulic fracture stage placement in naturally fractured shale gas reservoir. J. Pet. Sci. Eng. 2022, 216, 110842. 10.1016/j.petrol.2022.110842.

[ref65] Nouri-Taleghani M.; Mahmoudifar M.; Shokrollahi A.; Tatar A.; Karimi-Khaledi M. Fracture density determination using a novel hybrid computational scheme: a case study on an Iranian Marun oil field reservoir. Journal of Geophysics and Engineering 2015, 12, 188–198. 10.1088/1742-2132/12/2/188.

[ref66] Gardner G. H. F.; Gardner L. W.; Gregory A. R. Formation velocity and density: the diagnostic basics for stratigraphic traps. Geophysics 1974, 39, 770–780. 10.1190/1.1440465.

[ref67] Greenberg M. L.; Castagna J. P. Shear-Wave Velocity Estimation in Porous Rocks: Theoretical Formulation, Preliminary Verification and Applications. Geophysical Prospecting 1992, 40, 195–209. 10.1111/j.1365-2478.1992.tb00371.x.

[ref68] Akkurt R.; Conroy T. T.; Psaila D.; Paxton A.; Low J.; Spaans P. Accelerating and Enhancing Petrophysical Analysis With Machine Learning: A Case Study of an Automated System for Well Log Outlier Detection and Reconstruction. OnePetro, 2018 (accessed August 3, 2023).

[ref69] Al-Anazi A.; Gates I. D. A support vector machine algorithm to classify lithofacies and model permeability in heterogeneous reservoirs. Engineering Geology 2010, 114, 267–277. 10.1016/j.enggeo.2010.05.005.

[ref70] Ali M.; Jiang R.; Ma H.; Pan H.; Abbas K.; Ashraf U.; Ullah J. Machine learning - A novel approach of well logs similarity based on synchronization measures to predict shear sonic logs. J. Pet. Sci. Eng. 2021, 203, 108602. 10.1016/j.petrol.2021.108602.

[ref71] Bukar I.; Adamu M. B.; Hassan U. A Machine Learning Approach to Shear Sonic Log Prediction. Paper presented at the SPE Nigeria Annual International Conference and Exhibition, Lagos, Nigeria, August 2019. Paper Number: SPE-198764-MS. OnePetro, 2019. 10.2118/198764-MS.

[ref72] Gupta S.; Fuehrer F.; Jeyachandra B. C. Production Forecasting in Unconventional Resources using Data Mining and Time Series Analysis. Paper presented at the SPE/CSUR Unconventional Resources Conference – Canada, Calgary, Alberta, Canada, September 2014. Paper Number: SPE-171588-MS. OnePetro, 2014. 10.2118/171588-MS.

[ref73] Anderson R. N.; Xie B.; Wu L.; Kressner A. A.; Frantz J. H.; Ockree M. A.; Brown K. G. Petroleum Analytics Learning Machine to Forecast Production in the Wet Gas Marcellus Shale. In Unconventional Resources Technology Conference; Society of Exploration Geophysicists, American Association of Petroleum Geologists, Society of Petroleum Engineers, 1–3 August 2016, San Antonio, TX, 2016: pp 132–147. 10.15530/urtec-2016-2426612.

[ref74] Liang Y.; Zhao P. A Machine Learning Analysis Based on Big Data for Eagle Ford Shale Formation. Paper presented at the SPE Annual Technical Conference and Exhibition, Calgary, Alberta, Canada, September 2019. Paper Number: SPE-196158-MS. OnePetro, 2019. 10.2118/196158-MS.

[ref75] Xue L.; Liu Y.; Xiong Y.; Liu Y.; Cui X.; Lei G. A data-driven shale gas production forecasting method based on the multi-objective random forest regression. J. Pet. Sci. Eng. 2021, 196, 107801. 10.1016/j.petrol.2020.107801.

[ref76] Song X.; Liu Y.; Xue L.; Wang J.; Zhang J.; Wang J.; Jiang L.; Cheng Z. Time-series well performance prediction based on Long Short-Term Memory (LSTM) neural network model. J. Pet. Sci. Eng. 2020, 186, 106682. 10.1016/j.petrol.2019.106682.

[ref77] Zhou G.; Guo Z.; Sun S.; Jin Q. A CNN-BiGRU-AM neural network for AI applications in shale oil production prediction. Applied Energy 2023, 344, 121249. 10.1016/j.apenergy.2023.121249.

[ref78] Shen Y.; Cao D.; Ruddy K. Applications of Deep Learning Methodology in Real-Time Completion Event Recognition. Machine Learning and the Physical Sciences Workshop at the 33rd Conference on Neural Information Processing Systems (NeurIPS), Vancouver, Canada. 2019, 14.

[ref79] Yuan B.; Zhao M.; Meng S.; Zhang W.; Zheng H. Intelligent identification and real-time warning method of diverse complex events in horizontal well fracturing. Petroleum Exploration and Development 2023, 50, 1487–1496. 10.1016/S1876-3804(24)60482-9.

[ref80] Ramirez A.; Iriarte J. Event Recognition on Time Series Frac Data using Machine Learning–Part II. Paper presented at the SPE Liquids-Rich Basins Conference - North America, Odessa, Texas, USA, November 2019. Paper Number: SPE-197093-MS. OnePetro, 2019. 10.2118/197093-MS.

[ref81] Sun J. J.; Battula A.; Hruby B.; Hossaini P. Application of Both Physics-Based and Data-Driven Techniques for Real-Time Screen-Out Prediction with High Frequency Data. Proceedings of the 8th Unconventional Resources Technology Conference; OnePetro, 2020. 10.15530/urtec-2020-3349.

[ref82] Soroush H.; Belyadi H.; Kang H.; Murugesu M. P. Early Prediction and Prevention of Tip Screen-Out Using Deep Learning. Paper presented at the 56th U.S. Rock Mechanics/Geomechanics Symposium, Santa Fe, New Mexico, USA, June 2022. Paper Number: ARMA-2022-0052. OnePetro, 2022. 10.56952/ARMA-2022-0052.

[ref83] Hou L.; Cheng Y.; Elsworth D.; Liu H.; Ren J. Prediction of the continuous probability of sand screenout based on a deep learning workflow. SPE Journal 2022, 27, 1520–1530. 10.2118/209192-PA.
