ACS Omega. 2025 Jan 14;10(3):2871–2886. doi: 10.1021/acsomega.4c08679

Data-Driven Approach for the Prediction of In Situ Gas Content of Deep Coalbed Methane Reservoirs Using Machine Learning: Insights from Well Logging Data

Qian Zhang †,‡,§, Shuheng Tang †,‡,§,*, Songhang Zhang †,‡,§, Zhaodong Xi †,‡,§, Tengfei Jia †,‡,§, Xiongxiong Yang †,‡,§, Donglin Lin †,‡,§, Wenfu Yang ∥,
PMCID: PMC11780433  PMID: 39895762

Abstract

The in situ gas content is a critical determinant of the exploitation potential and recovery of coalbed methane (CBM) resources. Deep CBM resources have enormous exploitation potential, but their intricate geological conditions hinder the acquisition of in situ gas content data. To enhance the efficiency and accuracy of acquiring in situ gas content data for deep CBM, this study integrates gray relational analysis (GRA) and the genetic algorithm (GA) into the back-propagation neural network (BPNN) model, establishing a novel prediction model for the in situ gas content of deep CBM using well logging data. The results show that the multialgorithm joint model can overcome the inherent shortcomings of the BPNN. GRA effectively identifies the optimal input parameters for the BPNN model, and the GA optimizes the initial weights and thresholds of the BPNN, thereby enhancing the prediction accuracy and stability of the model. The mean square error (MSE) of the GRA-GA-BPNN joint model is 77.60% lower than that of the BPNN model. Furthermore, the reliability of the multialgorithm joint model was verified on deep CBM wells in the Ningwu Basin of North China, with an average relative error of only 4.62%. The GRA-GA-BPNN model proposed in this study exhibits high robustness and strong generalization ability. It can achieve high-precision prediction of deep CBM in situ gas content, thereby avoiding overreliance on experimental measurements, which is of considerable practical significance.

1. Introduction

In the present global energy landscape, natural gas with clean and efficient characteristics is garnering escalating attention from an expanding array of countries and regions.1,2 The effective development of natural gas contributes to the optimization and upgrading of the energy structure, thereby reducing reliance on traditional fossil fuels.3,4 Coalbed methane (CBM) is an unconventional natural gas produced and stored within coal seams. It serves as a pivotal energy source for the transition from coal to natural gas and eventually to renewable energy. The gas content of coal plays a crucial role in facilitating this transition.5

CBM resources are abundant globally, particularly in the United States, Russia, Australia, Canada, and China.6–9 Deep CBM resources are more abundant than shallow ones.10–13 The geological resources of CBM in China with depths less than 2000 m amount to 30.05 × 10¹² m³, with resources between 1000 and 2000 m depth accounting for 63%.14–16 Over two-thirds of the CBM resources in the Piceance Basin in the United States are stored at depths greater than 5000 feet (about 1524 m).17 Additionally, regions such as the Alberta Basin in Canada and the Bowen Basin and Cooper Basin in Australia also boast abundant deep CBM resources.17,18 At present, the exploration and commercial development of CBM in China mainly focuses on shallow coal reservoirs with a burial depth of less than 1000 m.3,19,20 Nevertheless, compared to shallow CBM, deep CBM exhibits superior reservoir properties, including higher gas content, lower water content, overpressured reservoir conditions, and more “continuous accumulation”,20,21 rendering it a more promising resource for future development.

In situ CBM gas content serves as a crucial reference index for selecting the “sweet spot” for resource development and forms a fundamental basis for evaluating CBM reserves and designing CBM development strategies.22–25 Methods for measuring gas content in coal seams are categorized into direct testing methods and indirect testing methods.26–28 Direct testing methods involve drilling to obtain fresh coal cores and measuring the in situ gas content of the coal seams through on-site testing.29,30 The indirect testing method entails conducting adsorption experiments on collected coal cores under in situ temperature and pressure to calculate the adsorbed gas content of the seams. It is noteworthy that both the direct and indirect methods require drilling to collect coal core samples. In the case of deep CBM reservoirs, challenges such as the difficulty of core extraction, high costs, and lengthy time consumption restrict the acquisition of in situ CBM gas content data,27 so the available data cannot meet production and research needs. Hence, it is necessary to accurately predict the deep CBM in situ gas content based on readily available large-scale data and limited in situ gas content measurement data. Currently, proposed methods for predicting CBM gas content primarily encompass the gas content-depth gradient method, coal quality-ash content gas content analogy method, geohistory evolution gas generation amount simulation method, and productivity history fitting method.27,29,31,32 However, deep CBM reservoirs typically exist under high temperature and pressure conditions, often containing a significant amount of free gas within the coal seams.16,28 These conditions result in differences in gas occurrence laws between deep and shallow coal seams, which calls the applicability of the existing models into question. Currently, there is no universally recognized mature theory or accurate calculation method for predicting in situ gas content in deep CBM reservoirs.

Geophysical logging technology is characterized by its low cost, vast data volume, and high reliability, while well logging data inherently contain abundant geological information. Numerous scholars have developed models to predict various parameters such as organic carbon content, coal body structure, coal seam permeability, and gas content using well logging data, resulting in reliable outcomes.33–37 Thus, employing well logging data to predict the in situ gas content of deep CBM reservoirs is feasible.

Machine learning possesses robust nonlinear approximation capabilities, enabling effective exploration of the intrinsic connections within vast data sets.38–41 In recent years, amid the advancement of artificial intelligence, numerous scholars have employed various algorithmic theories to predict CBM gas content, including the support vector machine (SVM), random forest (RF), and artificial neural network (ANN), among others. They have conducted generalization tests and demonstrated the benefits of precise prediction and high confidence levels.38,42,43 The back-propagation neural network (BPNN), being the most prevalent algorithmic model in artificial neural networks, offers robust nonlinear mapping capabilities, a straightforward structure, and ease of modeling.44,45 Nevertheless, prior studies have identified inherent limitations in the standalone implementation of the BPNN, including slow convergence speed and susceptibility to local optima, resulting in subpar prediction performance.41,45,46 These drawbacks are intricately linked to the selection of input factors during network construction and the variation of network hyperparameters throughout the iterative process. Optimizing the BPNN model structure by using other algorithms can effectively overcome its inherent shortcomings.

Based on the measured deep CBM gas content in the Ningwu Basin, North China, this study integrates gray relational analysis (GRA), the genetic algorithm (GA), and BPNN methods to establish a novel multialgorithm coupling (GRA-GA-BPNN) model for predicting the in situ gas content of deep CBM reservoirs utilizing geophysical logging data. The model requires only a small training sample, is less prone to becoming trapped in local extrema, and has strong generalization ability. It offers significant advantages in addressing small-sample nonlinear regression and classification problems and is well suited for geological parameter prediction and resource assessment with limited sample size.

2. Data Sources

2.1. Geological Setting

The Ningwu Basin is an important coal-bearing basin in China, located in the northern part of Shanxi Province (Figure 1), positioned between the uplifts of the Lvliang Mountains and Wutai Mountains. It is a narrow and elongated mountain basin formed by multiple tectonic movements squeezing and uplifting after the Late Paleozoic coal-forming period.47,48 The structure in the center of the basin is stable, while fractures develop in the northern and western parts. The coal-bearing strata mainly consist of the Carboniferous Taiyuan Formation and the Permian Shanxi Formation, among which the No. 9 coal seam of the Taiyuan Formation is well developed with good continuity and widespread distribution.

Figure 1. Geographical location of the study area, coal seam burial depth, and sampling points. (a) Position of Shanxi Province, (b) location of the Ningwu Basin in Shanxi Province, and (c) contour map of the burial depth of the No. 9 coal seam in the study area and well location map.

The No. 9 coal seam in the southern part of the Ningwu Basin is characterized by its large thickness and deep burial. The coal seam thickness ranges from 7.9 to 13.65 m, with an average thickness of 10.8 m; the burial depth of the coal seam ranges from 800 to 2700 m, with the areas at 1500 m and deeper accounting for 64% of the total area (Figure 1c). The mean maximum vitrinite reflectance (Ro,max) ranges from 0.88% to 1.81% and shows a good positive correlation with burial depth. The roof lithology of the coal seam mainly consists of mudstone and sandy mudstone, occasionally with limestone, and the sealing conditions vary regionally. In the northern part, primary structure coal prevails, while in the eastern steep slope zone, the coal structure is fragmented due to tectonic uplift, with well-developed cleats and fractures. The measured CBM gas content ranges from 3.44 to 22.00 m³/t, with an average of 11.4 m³/t. Due to differences in roof and floor lithology and tectonic conditions, the coal seam gas content varies significantly within the area, so a reasonable prediction model is needed to evaluate the CBM gas content accurately. The Ningwu Basin is adjacent to the eastern edge of the Ordos Basin and lies north of the Qinshui Basin, and it has abundant CBM resources with broad development prospects. Nevertheless, the deep CBM in this block remains in the experimental development stage, and regional gas content prediction is of great significance for evaluating the resources and recoverability of deep CBM.

2.2. Data Collection

In this study, 102 core samples of the No. 9 coal seam in the Ningwu Basin were collected with sampling depths greater than 1000 m. The selected samples have the following characteristics: (1) the samples cover a wide range of burial depths to ensure diverse neural network learning samples; (2) the samples are kept away from lithological interfaces to reduce the influence of surrounding strata and lithological interfaces on the samples’ logging data. Fresh coal core samples were tested for gas content according to the Chinese national standard GB/T 19559-2008. The measured CBM gas content comprises desorbed gas, residual gas, and lost gas. After coring, coal samples with a mass exceeding 800 g are quickly placed into a desorption tank, which is kept in a constant-temperature device at reservoir temperature for desorption; natural desorption continues until the average desorption volume per day is no more than 10 cm³ for seven consecutive days, at which point desorption testing is terminated to obtain the desorbed gas content. The desorbed coal samples are crushed to sizes of 2–3 cm and placed in a ball mill for residual gas content determination. The lost gas content is estimated from the correlation between the desorbed gas content of the coal seam in the initial desorption stage and the square root of time.
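The square-root-of-time estimate of lost gas mentioned above can be illustrated with a short sketch. It assumes the common direct-method form in which early cumulative desorbed volume is fitted linearly against the square root of total elapsed time and extrapolated back to time zero; the function name, the units, and the toy readings are hypothetical, and the exact procedure prescribed by GB/T 19559-2008 is not reproduced here.

```python
import math


def estimate_lost_gas(elapsed_min, desorbed_cm3, lost_time_min):
    """Fit V = k * sqrt(t0 + t) + b by least squares; lost gas is taken as -b.

    elapsed_min   -- minutes since the sample was sealed in the desorption tank
    desorbed_cm3  -- cumulative desorbed volume at those times
    lost_time_min -- t0, the assumed time between the start of desorption and sealing
    """
    xs = [math.sqrt(lost_time_min + t) for t in elapsed_min]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(desorbed_cm3) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, desorbed_cm3))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -intercept, slope


# Hypothetical early-stage readings that follow the assumed law exactly.
lost_time = 9.0
times = [0.0, 7.0, 16.0, 27.0, 40.0]
volumes = [40.0 * math.sqrt(lost_time + t) - 120.0 for t in times]
lost, slope = estimate_lost_gas(times, volumes, lost_time)
```

Only the early, approximately linear part of the desorption curve would normally be used for such a fit; which readings count as "early" is a judgment that this sketch leaves to the caller.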

Differences in well log curve responses can reflect variations in reservoir characteristics.16 Previous studies have demonstrated that the conditions of deep coal reservoirs differ from those of shallow ones.6,49 Compared to shallow coal seams, deep coal seams exhibit high temperature, high pressure, high free gas content, high methane percentage, low porosity, and low water content.16,20 This makes well logging interpretation more complex than for shallow coal reservoirs. Considering the practicality and reliability of model establishment, we selected a complete set of geophysical logging data comprising depth (depth, m) and eight types of well logs: gamma ray logging (GR, API), density logging (DEN, g/cm³), acoustic logging (AC, μs/m), compensated neutron logging (CNL, %), spontaneous potential logging (SP, mV), deep lateral resistivity logging (RD, Ω·m), shallow lateral resistivity logging (RS, Ω·m), and caliper logging (CAL, mm), giving nine candidate input variables in total. To ensure the reliability of the data used for model construction, depth correction was performed on the sample logging data before modeling. Table 1 lists the measured CBM gas content and the corresponding statistics for the various types of logging data.

Table 1. Statistics of CBM Gas Content and Well Logging Data in Coal Seams.

| parameter (a) | unit | maximum | minimum | median | average | standard deviation | skewness | kurtosis |
|---|---|---|---|---|---|---|---|---|
| depth | m | 1995.20 | 1054.50 | 1737.41 | 1541.83 | 352.63 | −0.13 | −1.70 |
| AC | μs/m | 575.00 | 257.00 | 423.58 | 424.17 | 51.52 | 0.12 | 1.69 |
| DEN | g/cm³ | 2.05 | 1.18 | 1.34 | 1.36 | 0.14 | 1.85 | 5.88 |
| CNL | % | 71.19 | 17.49 | 45.17 | 45.91 | 8.73 | 0.22 | 0.81 |
| SP | mV | 110.00 | 20.00 | 57.84 | 59.97 | 22.53 | 0.28 | −0.70 |
| CAL | mm | 49.63 | 13.10 | 25.71 | 26.16 | 4.85 | 0.82 | 4.73 |
| GR | API | 108.28 | 7.22 | 38.55 | 41.07 | 20.74 | 0.72 | 0.59 |
| RD | Ω·m | 4350.17 | 256.39 | 1906.89 | 2074.84 | 1010.70 | 0.44 | −0.69 |
| RS | Ω·m | 3506.15 | 208.42 | 1344.00 | 1188.05 | 662.78 | 1.43 | 2.17 |
| gas content | m³/t | 19.89 | 3.22 | 11.79 | 11.97 | 3.14 | 0.14 | −0.06 |

(a) Note: All parameters are statistical values of 102 sets of data.

3. Model Establishment

3.1. GRA Extracts the Feature Factors

In machine learning projects, too many variables can hinder the model’s ability to identify patterns, necessitating the reduction of data dimensionality.43,50,51 Two types of methods are commonly employed for data dimensionality reduction. The first involves altering the original data structure to extract its primary features, as exemplified by principal component analysis (PCA). The second method entails analyzing data correlations and adhering to specific rules for attribute selection or abandonment to achieve dimensionality reduction goals, as demonstrated by GRA.41,52

Well logging data exhibit a nonlinear relationship with deep CBM gas content.53 GRA is a fast and effective method for analyzing nonlinear relationships between data.40,54 Here, GRA is used to quantify the correlation between well logging data and the in situ gas content of deep CBM for feature extraction, which can improve the convergence of prediction models while retaining the original data structure. The data are first normalized using the max–min normalization method,52 and the correlation between the well logging data and gas content is then calculated according to eqs 1 and 2:

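$$\zeta_i(k) = \frac{\min_i \min_k |x_0(k) - x_i(k)| + \rho \max_i \max_k |x_0(k) - x_i(k)|}{|x_0(k) - x_i(k)| + \rho \max_i \max_k |x_0(k) - x_i(k)|} \tag{1}$$

$$r_i = \frac{1}{m} \sum_{k=1}^{m} \zeta_i(k) \tag{2}$$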

where ζi(k) is the correlation coefficient between the i-th logging indicator and the gas content at the k-th data point, ranging from 0 to 1. x0(k) is the reference sequence, namely, the gas content sequence, where k indexes the data points, k = 1, 2, 3, ..., m. xi(k) is the compared sequence, namely, the data of each type of logging, where i = 1, 2, 3, ..., n indexes the logging indicators. ρ is the resolution coefficient, ranging from 0 to 1; its role is to control the discriminability, and it is typically taken as 0.5. ri is the correlation degree of the i-th indicator.
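As a rough illustration of the GRA calculation, the sketch below normalizes each sequence to [0, 1], computes the gray relational coefficient at every data point, and averages the coefficients into a correlation degree per logging curve. It assumes the standard form of the gray relational coefficient with global minimum and maximum differences; the function names and the toy data are hypothetical and are not the study's measurements.

```python
def min_max(values):
    """Scale a sequence to [0, 1] (max-min normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def gray_relational_degrees(reference, candidates, rho=0.5):
    """Return the gray relational degree r_i of each candidate sequence.

    reference  -- list of m values (here: gas content)
    candidates -- dict of name -> list of m values (here: logging curves)
    rho        -- resolution coefficient, 0.5 as in the text
    """
    x0 = min_max(reference)
    diffs = {name: [abs(a - b) for a, b in zip(x0, min_max(seq))]
             for name, seq in candidates.items()}
    # Global extrema of the absolute differences over all curves and points.
    d_min = min(min(d) for d in diffs.values())
    d_max = max(max(d) for d in diffs.values())
    if d_max == 0.0:  # every curve coincides with the reference
        return {name: 1.0 for name in diffs}
    degrees = {}
    for name, d in diffs.items():
        coeffs = [(d_min + rho * d_max) / (dk + rho * d_max) for dk in d]
        degrees[name] = sum(coeffs) / len(coeffs)
    return degrees


# Hypothetical toy data for illustration only.
gas = [8.0, 10.0, 12.0, 14.0, 16.0]
logs = {
    "A": [300.0, 350.0, 400.0, 450.0, 500.0],  # moves exactly with gas content
    "B": [2.0, 1.2, 1.9, 1.1, 1.5],            # unrelated pattern
}
degrees = gray_relational_degrees(gas, logs)
```

In this toy case curve A tracks the reference exactly and receives a degree of 1, while curve B receives a lower degree; ranking the curves by degree is what drives the feature selection described in the text.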

Figure 2a illustrates the gray correlation between each type of well logging data and the deep CBM gas content. AC has the highest gray correlation with the gas content at 0.8694, followed by DEN, RD, RS, CNL, GR, Depth, SP, and CAL. This ranking differs from those obtained in previous analyses of the correlation between shallow CBM gas content and well logging data,10,55 indicating that the relative importance of the logging parameters is different for deep CBM. The Spearman correlation coefficient describes the monotonic (rank) correlation among the well logging variables and between the well logging variables and gas content (Figure 2b). The results show that the Spearman correlation coefficient between RD and RS is 0.93, indicating high collinearity between them, while the correlation coefficients of the other parameters are relatively low (less than 0.6). Collinearity among independent variables reduces the predictive ability of machine learning models and increases the risk of misjudgment. Therefore, this study takes the average value R of RD and RS as the characteristic factor representing resistivity. Previous studies suggest that a gray correlation degree above 0.8 indicates a strong association between independent and dependent variables;41,52,53 thus, based on the results of the gray correlation analysis and the Spearman correlation analysis, this study selects AC, DEN, R, CNL, and GR as the characteristic factors of the deep CBM gas content model based on the BPNN algorithm.

Figure 2. Relation analysis of CBM content and well logging. (a) Gray correlation degree between each type of well logging data and in situ gas content of deep CBM. (b) Spearman correlation analysis. GC represents the CBM gas content.
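The collinearity check and the merging of RD and RS into a single resistivity factor can be sketched as follows. The Spearman coefficient is computed here as the Pearson correlation of average ranks; the function names and the resistivity values are hypothetical and serve only to show the mechanics.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for j in range(pos, end + 1):
            result[order[j]] = avg
        pos = end + 1
    return result


def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical resistivity readings (ohm-m), not the study's data.
rd = [500.0, 1200.0, 2500.0, 3100.0, 4000.0]
rs = [400.0, 1000.0, 900.0, 2600.0, 3000.0]
rho_rd_rs = spearman(rd, rs)

# When two curves are strongly collinear, replace them by their average R.
r_mean = [(a + b) / 2 for a, b in zip(rd, rs)]
```

Averaging is the choice the study makes for RD and RS; dropping one of the two curves would be an alternative way to remove the redundancy.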

3.2. Determine the Structure of the BPNN

The BPNN is currently the most widely used model in artificial neural networks.56 The BPNN model is a multilayer feedforward network that uses error backpropagation to find patterns in given samples and establish mapping models, and it is used to solve nonlinear problems in which the relationship between input and output data is difficult to express with explicit functions.45 Figure 3 shows the schematic diagram of the constructed single-hidden-layer neural network topology. The learning process includes forward calculation and backpropagation. The input signal is weighted layer by layer and propagated forward through the hidden layer to the output layer. The model output error is calculated at the output layer, and the error is backpropagated layer by layer.43 According to the “gradient descent” method, the weights (w) and thresholds (b) are adjusted repeatedly during training until the error is reduced to an acceptable level.50

Figure 3. Schematic diagram of the typical three-layer BPNN topology. I, H, and O are the number of nodes in the input layer, hidden layer, and output layer, respectively.
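The forward calculation and error backpropagation described above can be illustrated with a minimal single-hidden-layer network. This sketch assumes a sigmoid hidden layer, a linear output node, and plain per-sample gradient descent; it is not the MATLAB newff/Levenberg–Marquardt setup used in the study, and all names, weights, and data are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w1, b1, w2, b2):
    """Forward pass: inputs -> sigmoid hidden layer -> single linear output."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return hidden, sum(w * h for w, h in zip(w2, hidden)) + b2[0]


def mse(samples, w1, b1, w2, b2):
    """Mean squared error of the network over all samples."""
    return sum((forward(x, w1, b1, w2, b2)[1] - t) ** 2
               for x, t in samples) / len(samples)


def train_epoch(samples, w1, b1, w2, b2, lr=0.1):
    """One epoch of per-sample gradient descent with error backpropagation."""
    for x, target in samples:
        hidden, out = forward(x, w1, b1, w2, b2)
        err = out - target                       # output-layer error
        for j, h in enumerate(hidden):
            delta = err * w2[j] * h * (1.0 - h)  # error propagated to hidden node j
            w2[j] -= lr * err * h
            b1[j] -= lr * delta
            for i, xi in enumerate(x):
                w1[j][i] -= lr * delta * xi
        b2[0] -= lr * err


# Hypothetical toy problem: 2 inputs, 3 hidden nodes, 1 output.
samples = [([0.0, 0.0], 0.2), ([0.0, 1.0], 0.5),
           ([1.0, 0.0], 0.7), ([1.0, 1.0], 1.0)]
w1 = [[0.5, -0.5], [-0.3, 0.8], [0.1, 0.2]]  # input -> hidden weights
b1 = [0.0, 0.0, 0.0]                         # hidden thresholds
w2 = [0.5, -0.5, 0.5]                        # hidden -> output weights
b2 = [0.0]                                   # output threshold
mse_before = mse(samples, w1, b1, w2, b2)
for _ in range(200):
    train_epoch(samples, w1, b1, w2, b2)
mse_after = mse(samples, w1, b1, w2, b2)
```

The starting weights are fixed by hand here; in a standard BPNN they are drawn at random, which is the source of the instability that the GA step later addresses.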

The construction of the BPNN model is implemented using the newff function in MATLAB in this study. Model construction mainly includes dividing the training set and the test set, determining the number of neurons in the input layer, output layer, and hidden layer, selecting activation functions, setting model parameters, etc. In this study, in order to mitigate the influence of human factors on the model, a random partitioning method is employed to divide the data set, and 80% (82 samples) of the 102 data sets are chosen as the training set for training the BPNN model, while 20% (20 samples) are selected as the test set for validating the accuracy of the model.
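A random 82/20 partition of the 102 samples could look like the following sketch. The function name and the fixed seed are assumptions made for reproducibility of the example; the study does not report a seed.

```python
import random


def split_indices(n_samples, n_train, seed=None):
    """Randomly partition sample indices into a training set and a test set."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return sorted(indices[:n_train]), sorted(indices[n_train:])


# 102 samples: 82 for training (about 80%) and 20 for testing (about 20%).
train_idx, test_idx = split_indices(102, 82, seed=42)
```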

Using all well logging data in Section 2.2 as input variables (9 types), we construct a BPNN prediction model for deep CBM gas content, and we construct a GRA-BPNN model based on the characteristic factors (5 types) identified in Section 3.1. The numbers of neurons in the input layers of these two models are therefore 9 and 5, respectively. The deep CBM gas content is chosen as the output variable; thus, the number of output layer neurons is 1. Previous studies have shown that a single-hidden-layer neural network can yield accurate predictions from limited data,50,57 and the number of nodes in the hidden layer can be determined using an empirical formula (eq 3).45 However, relying solely on empirical formulas cannot effectively obtain the optimal number of hidden layer nodes. Accordingly, this study employs a combination of empirical formulas and exhaustive methods to determine the optimal number of hidden layer nodes. The investigation revealed that when the number of nodes in the hidden layer falls within the range of 6–20, the model is capable of achieving convergence without overfitting, and there is a discernible optimal value within this range. The effect of the number of hidden layer neurons on the prediction accuracy is shown in Figure 4. The results show that when the number of hidden layer neurons is 12, the model mean square error (MSE) is minimized, giving a network structure of 5-12-1. The Levenberg–Marquardt (LM) algorithm is chosen as the network training algorithm, which effectively shortens the training time; once the MSE between the target and output data reaches the minimum, the algorithm stops training. Meanwhile, parameter settings such as the maximum number of iterations, activation function, and learning rate are also crucial for building the model (Table 2).

$$H = \sqrt{I + O} + a \tag{3}$$

where a is an adjusting constant between 1 and 10.
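The combination of an empirical starting range and an exhaustive search can be sketched as below. The sketch assumes the widely used rule H = sqrt(I + O) + a with a between 1 and 10; the `evaluate` callback stands in for training a network and returning its MSE, and the quadratic stand-in used here is purely hypothetical.

```python
import math


def hidden_node_candidates(n_in, n_out, a_values=range(1, 11)):
    """Candidate hidden-layer sizes from H = sqrt(I + O) + a, rounded to integers."""
    base = math.sqrt(n_in + n_out)
    return sorted({round(base + a) for a in a_values})


def exhaustive_search(sizes, evaluate):
    """Return the hidden-layer size with the lowest value of evaluate(size)."""
    return min(sizes, key=evaluate)


# Empirical candidates for 5 inputs and 1 output.
candidates = hidden_node_candidates(5, 1)

# The text scans 6 to 20 nodes; a real evaluate() would train the network and
# return its MSE. This stand-in is hypothetical and only exercises the search.
best_size = exhaustive_search(range(6, 21), lambda h: (h - 12) ** 2)
```

For five inputs and one output the empirical rule alone suggests 3 to 12 nodes, which is why the text extends the scan beyond the formula's range before settling on a value.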

Figure 4. Relationship between the number of hidden layer nodes and MSE in the GRA-BPNN model and GRA-GA-BPNN model.

Table 2. GRA-BPNN Training Model Parameter Settings.

| parameter | value | parameter | value |
|---|---|---|---|
| number of input vector groups | 5 | training function | trainlm |
| number of output vector groups | 1 | activation function | Sigmoid |
| number of hidden layers | 1 | training algorithm | LM |
| number of hidden layer nodes | 12 | network construction function | newff |
| maximum number of iterations | 1000 | learning rate | 0.01 |

3.3. GA Optimization of the BPNN Structure

The GA is an effective optimization algorithm that simulates the competition mechanism of “survival of the fittest” in biological populations, inspired by Darwin’s theory of biological evolution.54,58,59 The main feature of the GA is that it uses fitness directly as search information, so the search process is not constrained by the continuity of the function. It is therefore suitable for dealing with complex nonlinear problems that traditional search methods find difficult to solve.59,60

The BPNN structure is sensitive to factors such as initial weights, thresholds, and learning rates. The random generation of initial weights and thresholds in the traditional BPNN leads to model instability, slow convergence speed, and susceptibility to local optima.44 The main idea of the GA-optimized BPNN algorithm (GA-BPNN) is to use the global optimization capability of the GA to obtain the optimal initial weights and thresholds of the BPNN and to feed these values into the BPNN for training, thereby avoiding local minima and achieving the best predictive performance. This optimization algorithm effectively combines the advantages of the two algorithms, achieving rapid convergence together with global optimization, and its performance is significantly better than using the BPNN and GA separately.38 Figure 5 shows a typical flowchart of the GA. First, the initial population is genetically encoded. The GA then evaluates each individual (chromosome) in the initial population numerically using a fitness function and, through its genetic operations (selection, crossover, mutation), searches for the optimal individual, effectively filtering out initial individuals with low fitness and thereby reducing the impact of initial values on the model.54

  • (1)

    Selection operation. The selection operation is to select individuals from the population with a certain probability as parents for breeding offspring.54 The probability of selection is determined by fitness. The better the fitness, the greater the probability that the individual will be selected.61 The selection operation generally uses the roulette wheel method, and the selection probability Pi of each individual i is

$$P_i = \frac{F(X_i)}{\sum_{j=1}^{N} F(X_j)} \tag{4}$$

where Pi is the selection probability, N is the number of individuals in the population, and F(Xi) is the fitness function of Xi.

  • (2)

    Crossover operation. The purpose of the crossover operation is to improve the individual encoding structure using crossover operators. New individuals can be generated through crossover operations, increasing the probability of generating the optimal solution, which plays a key role in improving the global search performance of the GA.62 The real number crossover method is generally selected. The crossover between the k-th chromosome and the i-th chromosome at the j-th position, i.e., between the genes akj and aij, is performed as follows:

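$$\begin{cases} a_{kj} = a_{kj}(1 - c) + a_{ij}\,c \\ a_{ij} = a_{ij}(1 - c) + a_{kj}\,c \end{cases} \tag{5}$$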

where c is the crossover parameter, which takes values between 0 and 1.

  • (3)

    Mutation operation. In order to improve the algorithm’s search capability and maintain the diversity of the population, individuals are selected according to a certain probability, and a uniform distribution of random numbers is used to mutate (replace) a segment of the chromosome in the individual to enhance the individual’s fitness.63 The mutation operation for the j-th gene aij of the i-th individual is as follows

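$$a_{ij} = \begin{cases} a_{ij} + (a_{\max} - a_{ij})\,f(g), & r > 0.5 \\ a_{ij} + (a_{\min} - a_{ij})\,f(g), & r \le 0.5 \end{cases} \tag{6}$$

$$f(g) = r_2 \left(1 - \frac{g}{G_{\max}}\right)^{2} \tag{7}$$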

where amax and amin are the upper and lower bounds of the initial individual gene aij, respectively; r and r2 are random numbers in the interval [0,1]; g is the current iteration value; Gmax is the maximum number of evolutions.
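The three genetic operations above can be sketched as small standalone functions. The selection and crossover functions follow the roulette wheel and real number crossover as described; for mutation, the sketch assumes a bounded variant in which the gene moves toward amax when r > 0.5 and toward amin otherwise, so that the result stays inside the bounds, and this sign convention may differ from the exact form of eq 6. All function names are hypothetical.

```python
import random


def selection_probabilities(fitness):
    """Roulette wheel: P_i = F(X_i) / sum_j F(X_j)."""
    total = sum(fitness)
    return [f / total for f in fitness]


def roulette_select(fitness, rng):
    """Pick one index with probability proportional to its fitness."""
    r = rng.random()
    cumulative = 0.0
    for i, p in enumerate(selection_probabilities(fitness)):
        cumulative += p
        if r < cumulative:
            return i
    return len(fitness) - 1  # guard against floating-point round-off


def arithmetic_crossover(parent_k, parent_i, j, c):
    """Real number crossover of two chromosomes at gene position j."""
    child_k, child_i = list(parent_k), list(parent_i)
    child_k[j] = parent_k[j] * (1 - c) + parent_i[j] * c
    child_i[j] = parent_i[j] * (1 - c) + parent_k[j] * c
    return child_k, child_i


def mutate_gene(a, a_min, a_max, g, g_max, r, r2):
    """Non-uniform mutation whose step f(g) shrinks as generation g approaches g_max."""
    f = r2 * (1 - g / g_max) ** 2
    if r > 0.5:
        return a + (a_max - a) * f  # move toward the upper bound
    return a + (a_min - a) * f      # move toward the lower bound


# Example usage with hypothetical values.
rng = random.Random(1)
chosen = roulette_select([1.0, 3.0], rng)
child_a, child_b = arithmetic_crossover([0.2, -0.4, 0.9], [-0.6, 0.8, 0.1], j=1, c=0.25)
mutated = mutate_gene(0.2, -1.0, 1.0, g=10, g_max=50, r=0.7, r2=0.5)
```

Because the mutation step is scaled by (1 − g/Gmax)², early generations explore widely while late generations make only small adjustments.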

Figure 5. Schematic diagram of the typical genetic algorithm (modified with permission from ref (54). Copyright 2017 Jun Zhou et al.).

Specifically, to optimize the initial weights and thresholds of the BPNN, the GA takes them as the gene values of its individuals. The length of an individual equals the total number of weights and thresholds in the BPNN, and each gene holds the actual value of one connection weight or threshold, so that each individual forms a chromosome in the GA.61,63 A certain number of chromosomes serve as the initial population for GA training, and after iterative processes such as selection, crossover, and mutation in the GA, an optimal individual is obtained, which is then used as the initial parameter set of the BPNN for training.54,57 Table 3 shows the parameter settings for the GA. The specific steps are as follows:

  • (1)

    Establish the initial population using real number encoding. The initial population consists of four parts: the thresholds of the hidden layer, the connection weights between the hidden layer and the output layer, the connection weights between the input layer and the hidden layer, and the thresholds of the output layer:

$$S = \omega_{1n} + b_{1} + \omega_{2m} + b_{2n} \tag{8}$$

where S is the number of genes on a chromosome; ω1n is the number of weights between the input layer and the hidden layer, 5 × 12 = 60; b1 is the number of hidden layer thresholds, 12; ω2m is the number of weights between the hidden layer and the output layer, 12 × 1 = 12; and b2n is the number of output layer thresholds, 1.

  • (2)

    Establish the fitness function. Since the GA evolves in the direction of increasing fitness, the reciprocal of the RMSE between the predicted values and the expected values obtained from BPNN training is used as the fitness function value (eq 9). Therefore, it is known that the smaller the prediction error, the larger the corresponding fitness function value, indicating better adaptability.

$$F(X_i) = \frac{1}{\sqrt{\dfrac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^{2}}} \tag{9}$$

where F(Xi) is the fitness function, n is the number of samples, and ŷi and yi are the predicted and expected outputs of the model, respectively.

  • (3)

    Determine whether the population fitness meets the predetermined requirements; if not, perform selection, crossover, and mutation operations in the GA to generate a new generation of populations, until the fitness meets the requirements or the maximum set number of iterations is reached.

  • (4)

    Decode the genetic encoding of the optimal individual, input the optimal initial weights and thresholds into the BPNN, repeatedly train using normalized training samples, and finally save the trained weights, thresholds, and network structure to form the GRA-GA-BPNN model of deep CBM gas content prediction.
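Steps (1), (2), and (4) can be made concrete with a short sketch of the chromosome layout and the fitness function. The gene ordering used here (input-to-hidden weights, hidden thresholds, hidden-to-output weights, output thresholds) is an assumption for illustration, as are the function names; only the counts for the 5-12-1 network come from the text.

```python
import math


def chromosome_length(n_in, n_hidden, n_out):
    """Total number of weights and thresholds in a single-hidden-layer network."""
    return n_in * n_hidden + n_hidden + n_hidden * n_out + n_out


def decode(genes, n_in, n_hidden, n_out):
    """Split a flat gene list into (w1, b1, w2, b2) in the assumed order."""
    if len(genes) != chromosome_length(n_in, n_hidden, n_out):
        raise ValueError("chromosome length does not match the network structure")
    it = iter(genes)
    w1 = [[next(it) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [next(it) for _ in range(n_hidden)]
    w2 = [[next(it) for _ in range(n_hidden)] for _ in range(n_out)]
    b2 = [next(it) for _ in range(n_out)]
    return w1, b1, w2, b2


def fitness(predicted, expected):
    """Reciprocal of the RMSE: smaller prediction error gives larger fitness."""
    n = len(expected)
    rmse = math.sqrt(sum((p - y) ** 2 for p, y in zip(predicted, expected)) / n)
    return math.inf if rmse == 0.0 else 1.0 / rmse


# A 5-12-1 network has 60 + 12 + 12 + 1 = 85 genes per chromosome.
n_genes = chromosome_length(5, 12, 1)
w1, b1, w2, b2 = decode(list(range(n_genes)), 5, 12, 1)
score = fitness([10.0, 12.0, 14.0], [11.0, 12.0, 13.0])
```

In the full procedure, `fitness` would be evaluated on the outputs of a BPNN initialized from the decoded weights and thresholds, and the best chromosome found by the GA would seed the final training run.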

Table 3. GA Parameter Settings.

| parameter | value | parameter | value |
|---|---|---|---|
| population size | 10 | variable boundary | [−1, 1] |
| number of generations | 50 | selection rule | roulette wheel method |
| maximum number of iterations | 1000 | crossover rule/crossover rate | real number crossover/0.8 |
| iteration stop value | 10⁻⁵ | mutation rule/mutation rate | Gauss mutation/0.2 |

3.4. Multialgorithm-Coupled Model Establishment

This study uses GRA to select input parameters and the GA to optimize the initial weights and thresholds of the BPNN. This multialgorithm joint model can effectively address the shortcomings of a single model and improve predictive performance.61,62

In constructing the models, this study uses all data to establish the BPNN model, subsequently utilizes GRA for data dimensionality reduction to establish the GRA-BPNN model, and further applies the GA for network parameter optimization to establish the GRA-GA-BPNN model. The main parameters of the three models are listed in Table 4. The construction, optimization, and validation process of the GRA-GA-BPNN model is completed based on the MATLAB (R2022a) platform. The workflow of model construction is shown in Figure 6, which consists of four parts: feature extraction, model establishment, parameter optimization, and model validation.

  • (1)

    Feature extraction. Normalize the CBM gas content and well logging data to eliminate dimensional effects. Use GRA to analyze the correlation between gas content and well logging data, selecting well logging data as the feature factors for model construction.

  • (2)

    Model establishment. Divide the selected sample data into training and testing sets proportionally, and establish the GRA-BPNN model. Randomly generate initial weights and thresholds, determine the number of neurons in the input layer, hidden layer, and output layer, and set the model training parameters.

  • (3)

    Parameter optimization. Use the reciprocal of the root-mean-square error (RMSE) between actual output and expected output as the objective function, and use the GA to optimize the initial weights and thresholds of the BPNN structure. Incorporate the optimized initial weights and thresholds into the GRA-BPNN to construct the GRA-GA-BPNN model for predicting the in situ gas content of deep CBM.

  • (4)

    Model verification. Use the error between the predicted value and measured value as the verification criterion, and test the obtained GRA-GA-BPNN model with a test set; if the test passes, the model is considered reliable and can be used to predict the in situ gas content of deep CBM; if the test fails, retraining is required until the requirements are met.
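The parameter optimization step of this workflow can be illustrated with a compact GA loop. To stay short, the sketch optimizes the two parameters of a toy linear model rather than the 85 weights and thresholds of the network; the population size, number of generations, crossover and mutation rates, and bounds follow Table 3, while the elitism step, the mutation form, and all names and data are assumptions made for illustration.

```python
import math
import random


def rmse(genes, data):
    w, b = genes
    return math.sqrt(sum((w * x + b - y) ** 2 for x, y in data) / len(data))


def fitness(genes, data):
    return 1.0 / (rmse(genes, data) + 1e-12)  # small constant avoids division by zero


def run_ga(data, pop_size=10, generations=50, p_cross=0.8, p_mut=0.2,
           low=-1.0, high=1.0, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(low, high) for _ in range(2)] for _ in range(pop_size)]
    history = []
    for gen in range(generations):
        scores = [fitness(g, data) for g in pop]
        best = list(pop[scores.index(max(scores))])
        history.append(max(scores))
        # Selection: roulette wheel, probability proportional to fitness.
        children = [list(p) for p in rng.choices(pop, weights=scores, k=pop_size)]
        # Crossover: blend one random gene of each consecutive pair.
        for k in range(0, pop_size - 1, 2):
            if rng.random() < p_cross:
                j, c = rng.randrange(2), rng.random()
                u, v = children[k][j], children[k + 1][j]
                children[k][j] = u * (1 - c) + v * c
                children[k + 1][j] = v * (1 - c) + u * c
        # Mutation: move one gene toward a bound with a step that shrinks over time.
        for child in children:
            if rng.random() < p_mut:
                j = rng.randrange(2)
                step = rng.random() * (1 - gen / generations) ** 2
                bound = high if rng.random() > 0.5 else low
                child[j] += (bound - child[j]) * step
        children[0] = best  # elitism (an assumption): keep the best individual
        pop = children
    scores = [fitness(g, data) for g in pop]
    history.append(max(scores))
    return pop[scores.index(max(scores))], history


# Hypothetical target: y = 0.5 * x - 0.2 sampled at five points.
data = [(x / 4, 0.5 * (x / 4) - 0.2) for x in range(5)]
best_genes, history = run_ga(data)
```

With elitism the best fitness never decreases from one generation to the next, which makes the convergence curve easy to inspect; without it, the best individual can be lost to crossover or mutation.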

Table 4. Main Parameters of Three Models before and after Optimization.

| model | number of input variables | number of neural network layers | number of hidden layer nodes | output parameters |
|---|---|---|---|---|
| BPNN | 9 | 3 | 18 | gas content |
| GRA-BPNN | 5 | 3 | 12 | gas content |
| GRA-GA-BPNN | 5 | 3 | 12 | gas content |

Figure 6. Schematic diagram of the GRA-GA-BPNN model process.

3.5. Model Evaluation

The predictive performance of the models is described by the coefficient of determination (R²) and error indicators. R² measures the agreement between predicted and measured values; the higher it is, the better the goodness of fit. Error indicators include the mean absolute error (MAE), mean absolute percentage error (MAPE), and mean squared error (MSE) (eqs 10–13), where higher error values indicate poorer predictive ability of the model for the in situ gas content of deep CBM.

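$$R^2 = 1 - \frac{\sum_{i=1}^{m} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{m} (y_i - \bar{y})^2} \quad (10)$$

$$\mathrm{MAE} = \frac{1}{m} \sum_{i=1}^{m} \left| y_i - \hat{y}_i \right| \quad (11)$$

$$\mathrm{MAPE} = \frac{100\%}{m} \sum_{i=1}^{m} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \quad (12)$$

$$\mathrm{MSE} = \frac{1}{m} \sum_{i=1}^{m} (y_i - \hat{y}_i)^2 \quad (13)$$

where $m$ is the number of samples, $y_i$ and $\hat{y}_i$ are the measured and predicted deep CBM gas content, respectively, and $\bar{y}$ is the average of the measured deep CBM gas content.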
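The four evaluation indicators can be computed as in the following minimal sketch, which assumes the standard definitions of R2, MAE, MAPE, and MSE. The function name and the sample values are hypothetical and are not taken from the study.

```python
def regression_metrics(measured, predicted):
    """Return R2, MAE, MAPE (in percent), and MSE for paired samples."""
    m = len(measured)
    if m == 0 or m != len(predicted):
        raise ValueError("measured and predicted must be non-empty and equal in length")
    mean_y = sum(measured) / m
    # Residual and total sums of squares used by R2.
    ss_res = sum((y - p) ** 2 for y, p in zip(measured, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in measured)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": sum(abs(y - p) for y, p in zip(measured, predicted)) / m,
        "mape": 100.0 / m * sum(abs((y - p) / y) for y, p in zip(measured, predicted)),
        "mse": ss_res / m,
    }


# Hypothetical gas content values in m3/t, for illustration only.
metrics = regression_metrics([10.0, 12.0, 14.0, 16.0], [11.0, 12.0, 13.0, 16.0])
```

Note that MAPE is undefined when a measured value is zero; this is not an issue for gas content data of the magnitude reported here, but a general-purpose version would need to guard against it.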

4. Results

Figure 7 depicts the relationship between the predicted and measured values on the training sets, validation sets, and total data sets for three models: the BPNN model, the combined GRA and BPNN model (GRA-BPNN), and the combined GRA, GA, and BPNN model (GRA-GA-BPNN). The closer the data points are to the unit slope line (1:1 diagonal), the higher the accuracy of the predictions. Additionally, the color of the data points represents the model residuals, where red indicates positive residuals and blue indicates negative residuals. The darker the color, the greater the deviation between the predicted and measured values.

Figure 7. Prediction performance of BPNN, GRA-BPNN, and GRA-GA-BPNN models on the training set, testing set, and overall data sets. The red dashed line represents a 1:1 diagonal, and data points on the line indicate that the predicted value is equal to the measured value. The black solid line represents the error line between the predicted value and the measured value. The color of the data points represents the sign and magnitude of the residuals. The red dots indicate that the measured value is higher than the predicted value, while the blue dots indicate that the measured value is lower than the predicted value. (a–c) are the train set, test set, and overall set of the BPNN model, respectively; (d–f) are the train set, test set, and overall set of the GRA-BPNN model, respectively; (g–i) are the train set, test set, and overall set of the GRA-GA-BPNN model, respectively.

The data points of the BPNN model exhibit significant scattering, with the prediction accuracy spanning between −30% and 50% error lines, and all R2 values are below 0.7, among which the R2 of the validation set is only 0.54. The total data set’s MSE, MAE, and MAPE reached 3.08, 1.40, and 12.62%, respectively, indicating a significant prediction error and poor accuracy of the model (Figure 8).

Figure 8. Comparison of prediction errors of three models before and after optimization: (a) R2; (b) MSE; (c) MAE; (d) MAPE.

The predictions of the GRA-BPNN model are scattered between the −20% and 20% error lines. Compared with the BPNN model, the total data set MSE, MAE, and MAPE of this model were reduced by 45.78%, 25%, and 25.52%, respectively. Notably, the model performs very well on the training set, with an R2 of 0.94 and with MSE, MAE, and MAPE reduced by 74.91%, 51.85%, and 58.47%, respectively. On the test set, however, the error is large and the prediction performance is poor (Figure 8). Although the error of the total data set has been reduced, the low fitting accuracy on the test set indicates poor generalization of the model.

The error assessment of the test set is more reliable in evaluating the performance of the prediction model. Figure 9a compares the predicted gas content values of the three models before and after optimization with the measured values on the test set point by point. The data show that the optimized model can more accurately predict the deep CBM gas content. Figure 9b displays the distributions of residuals for different models. The error range of the BPNN model is large and relatively dispersed, indicating poor model robustness. With the addition of optimization methods, the residual gradually approaches 0 and becomes concentrated. This suggests that the proposed method of feature parameter selection (GRA) and the optimization strategy for updating network weights and thresholds (GA) are effective, as evidenced by the improvement in network convergence speed, output accuracy, and the enhancement of the model’s generalization ability and resistance to interference.

Figure 9. Performance analysis of test set prediction for BPNN, GRA-BPNN, and GRA-GA-BPNN models. (a) Point-by-point comparison of test set prediction values and measured values for the three models; (b) residual distribution of prediction values for the three models, with blue boxes indicating errors within the range of −1 to 1.

5. Discussion

5.1. Effect of Incorporating Optimization Algorithms on Predicting CBM Gas Content

5.1.1. Effect of the Number of Characteristic Factors on Prediction Accuracy

The quantity of input variables greatly influences the accuracy of the BPNN model. Insufficient input variables may cause the model to fail to converge, while an excess of variables can prolong the computational time and lead to prediction errors. Hence, it is imperative to choose an appropriate number of principal controlling factors as input variables for BPNN modeling.41,43 The integrated approach of GRA and BPNN models can efficiently harness and fully utilize information regarding deep CBM gas content in well logging data. In Section 3.1, correlation calculations and rankings between well logging data and gas content (Figure 2) were conducted. The “stepwise addition method” was employed to assess the influence of the quantity of input variables on model prediction accuracy, whereby well logging data with high correlation were given priority as input variables and the number of input variables was sequentially increased based on decreasing correlation (Table 5). Various combinations of logging parameters were used to train and adjust data sets, with the R2 and MSE of the GRA-GA-BP model serving as evaluation criteria, in order to determine the optimal parameter combination for the model.

Table 5. Number of Input Data and Corresponding Geophysical Logging Types.

number of input data | geophysical logging types
2 | AC, DEN
3 | AC, DEN, R
4 | AC, DEN, R, CNL
5 | AC, DEN, R, CNL, GR
6 | AC, DEN, R, CNL, GR, Depth
7 | AC, DEN, R, CNL, GR, Depth, SP
8 | AC, DEN, R, CNL, GR, Depth, SP, CAL

The findings indicate that as the number of input variables increases, the model’s R2 initially shows an upward trend followed by a decline, while the trend of MSE is the opposite (Figure 10). Selecting only the two parameters with the highest correlation (AC, DEN) as input variables results in a model R2 of only 0.50 and an MSE of 4.36, which is the poorest predictive performance of the model. This is because although these two logging parameters have a relatively high correlation with gas content, they do not fully capture the information regarding deep CBM gas content. When parameter R is added as an input variable to these, the model’s R2 increases to 0.79. When the model includes the first four logging parameters as input variables, its R2 increases to 0.86, thereby significantly enhancing the model’s accuracy. It is evident that as the number of independent variables increases, the information regarding deep CBM gas content in well logging data is systematically and effectively explored.

Figure 10. Relationship between R2, MSE, and the number of input variables in the GRA-GA-BP model.

However, the accuracy of the model does not necessarily increase with the addition of more input variables. When all well logging data are included, the model’s R2 is only 0.69. This is because although increasing the number of input variables can capture more information about deep CBM gas content, redundant information in the data can also decrease model accuracy.64 CAL values are closely related to stratum mechanical strength, with relatively weak coal seam strength often leading to borehole enlargement.65 In theory, gas content enrichment to some extent enhances borehole enlargement.40 However, borehole diameter enlargement is primarily influenced by the mechanical strength of the coal seam, which depends mainly on the strength of the coal rock skeleton. Removing this parameter increases the model’s R2 to 0.78. The amplitude of abnormal SP logging values is influenced by factors such as formation water and mud filtrate.66,67 High mineralization coalbed water rich in methane exhibits abnormally high SP values. However, drilling fluid can impact the SP value, making it difficult for the SP value to accurately reflect the in situ coal seam gas content.68 Removing this parameter increases the model’s R2 to 0.81.

The relationship between coal seam gas content and burial depth varies between deep and shallow zones.16,19 First, the degree of coal metamorphism is a key factor affecting CBM gas content. Theoretically, burial depth influences the extent of hydrocarbon thermal decomposition, with deeper coal seams in the same tectonic region exhibiting higher thermal maturity.69 Second, the high-temperature and high-pressure characteristics of deep coal seams significantly contrast with those of shallow ones. CBM is primarily adsorbed gas; however, with increasing burial depth, both the coal seam temperature and pressure rise. Beyond a certain critical depth, the negative impact of temperature on methane adsorption capacity outweighs the positive effect of pressure, resulting in decreased adsorbed gas content.27 Finally, the sealing of coal seams determines the preservation and escape of free gases in deep coal seams,70 which is influenced by geological structures, subsequent sealing conditions, and coal accumulating environments. In summary, the relationship between the coal seam depth and gas content is a complex one controlled by various geological conditions. Removing this parameter increases the model’s R2 to 0.93, reaching the optimum.

By selecting five logging parameters (AC, DEN, R, CNL, and GR) as the optimal independent variable combination, the GRA-GA-BP model achieves the highest R2 (0.93) and the lowest MSE (0.69) (Figure 10). This finding aligns with the number of independent variables selected by using the GRA method in Section 3.1, further demonstrating the effectiveness of the GRA method.
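The "stepwise addition method" discussed in this section can be sketched as a simple loop over growing feature subsets. In the sketch below, the `evaluate` function is a hypothetical stand-in: instead of training a GRA-GA-BP model for each subset, it looks up the R2 values reported in the text for each number of inputs. The function names are assumptions made for illustration.

```python
# Logging curves in decreasing order of correlation with gas content (Table 5).
RANKED_LOGS = ["AC", "DEN", "R", "CNL", "GR", "Depth", "SP", "CAL"]

# R2 values reported in the text for each number of inputs; used here as a
# stand-in for training and scoring a real model.
REPORTED_R2 = {2: 0.50, 3: 0.79, 4: 0.86, 5: 0.93, 6: 0.81, 7: 0.78, 8: 0.69}


def evaluate(features):
    """Hypothetical evaluator: a real workflow would train and score a model here."""
    return REPORTED_R2[len(features)]


def stepwise_addition(ranked, score, start=2):
    """Add features in ranked order and keep the subset with the best score."""
    best_features, best_score = None, float("-inf")
    history = []
    for n in range(start, len(ranked) + 1):
        subset = ranked[:n]
        s = score(subset)
        history.append((n, s))
        if s > best_score:
            best_features, best_score = subset, s
    return best_features, best_score, history


best_features, best_score, history = stepwise_addition(RANKED_LOGS, evaluate)
```

With the reported scores, the loop selects the first five curves, which matches the combination identified in the text.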

5.1.2. Effect of the GA on Prediction Accuracy

The initial weights and thresholds of the BPNN structure are usually randomly generated (Section 3.2). Because the BPNN is trained by a “gradient descent” search, this introduces significant randomness in the construction of the initial network and greatly reduces the likelihood of finding the optimal solution. According to Figure 7f, the R2 of the GRA-BPNN model for the total data set reaches 0.85. However, its training set R2 is as high as 0.94, whereas the test set R2 is only 0.79, indicating its weak generalization ability.

The weights and thresholds of the GA-improved BPNN are not randomly generated but obtained through the GA optimization module. The GA’s global optimization capability is used to obtain optimal initial weights and thresholds, which serve as the starting values for BPNN training and help prevent convergence to local optima. After introducing the GA, the resulting GRA-GA-BPNN model has higher prediction accuracy (Figures 7 and 8). The R2 values of the training set, test set, and total data set all exceed 0.92, and there is a significant reduction in error.
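As a rough consistency check on the size of this reduction, the relative decrease in MSE can be worked out from two values reported elsewhere in the text: the BPNN total data set MSE of 3.08 and the GRA-GA-BP MSE of 0.69. The calculation assumes that both values refer to comparable data sets, which the text does not state explicitly.

```latex
\frac{\mathrm{MSE}_{\mathrm{BPNN}} - \mathrm{MSE}_{\mathrm{GRA\text{-}GA\text{-}BPNN}}}{\mathrm{MSE}_{\mathrm{BPNN}}}
= \frac{3.08 - 0.69}{3.08}
= \frac{2.39}{3.08}
\approx 0.776
```

Under that assumption, the result is in line with the 77.60% MSE reduction reported in the conclusions.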

The GA optimization process is the process of finding the optimal initial weights and thresholds. As the number of iterations increases, the model fitness value continuously increases (Figure 11), indicating that the error gradually decreases. As a global optimization algorithm simulating biological evolution, the GA possesses excellent global optimization capabilities. It iteratively evolves based on a population, acquiring the optimal or approximate optimal solutions through selection, crossover, and mutation operations. The BP algorithm and GA significantly complement each other. GA’s strengths in “global search and parallel computing” effectively compensate for the high requirements of initial weights and thresholds in the BPNN, thereby overcoming its limitations in local convergence. The combination of these two algorithms can significantly enhance the algorithm performance and precision.
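The selection, crossover, and mutation loop described above can be sketched for a very small network as follows. This is an illustrative toy, not the authors' configuration: the network size, population size, mutation settings, tournament selection, elitism, and the synthetic data are all assumptions. The fitness is the reciprocal of the RMSE, as in the parameter optimization step of the workflow, and the returned vector would serve only as the starting point for subsequent back-propagation training, which is not shown.

```python
import math
import random

# Hypothetical toy data: one input and a smooth nonlinear target.
X = [i / 10 for i in range(11)]
Y = [math.sin(math.pi * x) for x in X]

HIDDEN = 3
# Input weights, hidden thresholds, output weights, and one output threshold.
N_PARAMS = 3 * HIDDEN + 1


def predict(params, x):
    """Forward pass of a 1-HIDDEN-1 network with tanh hidden units."""
    w1 = params[:HIDDEN]
    b1 = params[HIDDEN:2 * HIDDEN]
    w2 = params[2 * HIDDEN:3 * HIDDEN]
    b2 = params[-1]
    hidden = [math.tanh(w * x + b) for w, b in zip(w1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2


def fitness(params):
    """Reciprocal of the RMSE; a small constant avoids division by zero."""
    mse = sum((predict(params, x) - y) ** 2 for x, y in zip(X, Y)) / len(X)
    return 1.0 / (math.sqrt(mse) + 1e-12)


def run_ga(pop_size=30, generations=40, mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(N_PARAMS)] for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        best_per_generation.append(fitness(scored[0]))
        next_pop = [scored[0][:]]  # elitism: carry the best individual over unchanged
        while len(next_pop) < pop_size:
            # Tournament selection of two parents.
            p1 = max(rng.sample(scored, 3), key=fitness)
            p2 = max(rng.sample(scored, 3), key=fitness)
            # Single-point crossover.
            cut = rng.randrange(1, N_PARAMS)
            child = p1[:cut] + p2[cut:]
            # Gaussian mutation applied gene by gene.
            child = [g + rng.gauss(0, 0.3) if rng.random() < mutation_rate else g
                     for g in child]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness), best_per_generation


best_params, fitness_history = run_ga()
```

Because the best individual is always carried into the next generation, the best fitness in this sketch never decreases from one generation to the next, which mirrors the rising fitness curve described in the text.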

Figure 11. Fitness value of the GRA-GA-BP model with the number of iterations.

5.2. Effect of Machine Learning Algorithms on CBM Gas Content Prediction

Machine learning methods such as the random forest (RF) algorithm, the support vector machine (SVM) algorithm, and multiple linear regression (MLR) analysis are widely utilized in coal reservoir evaluation.40,42,71 To further validate the accuracy of the model improvement strategy in predicting the in situ gas content of deep CBM, prediction models based on the RF, SVM, and MLR methods were established using the five well logging curves selected by GRA (i.e., GRA-RF, GRA-SVM, and GRA-MLR). For each model, the data were divided into a training set and a testing set in an 8:2 ratio to compare the predictive performance of these models with that of the GRA-GA-BP model proposed in this study.
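A reproducible 8:2 split that can be shared across all compared models might look like the sketch below. The fixed seed, the rounding rule, and the function name are assumptions; the text does not state how the authors shuffled or rounded the split. The sample count of 102 is the number of modeling data sets mentioned in the application section.

```python
import random


def split_indices(n_samples, train_fraction=0.8, seed=42):
    """Shuffle sample indices once and split them into training and testing parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


# Reusing the same index lists for every model keeps the comparison fair.
train_idx, test_idx = split_indices(102)
```

Fixing the split once means that differences in test set error reflect the models rather than the partition.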

The RF algorithm constructs a more stable model by integrating the outputs of multiple unrelated decision trees, with the model primarily influenced by the number of trees and the maximum depth of each tree.72 In this study, the number of trees is set to 6, and the maximum depth of each tree is set to 20. The SVM algorithm maps the nonlinear separable data set Xi in the original features to a high-dimensional feature space using a nonlinear function Φ (Xi) and finds a linear hyperplane with the maximum function margin in this high-dimensional feature space to classify the data. Model parameter acquisition mainly involves the selection of the kernel function and determination of penalty factor c. In this study, the radial basis kernel function is chosen as the inner product kernel function, and c is determined to be 10 after multiple training iterations.

Figure 12 illustrates the prediction results of different models on both the training and testing sets. With the same training data, the model prediction accuracy follows this order: GRA-GA-BPNN > GRA-RF ≈ GRA-BPNN ≈ GRA-SVM > GRA-MLR. The R2 of the training set and testing set for the GRA-MLR model are both below 0.5, with the MSE exceeding 5.0, indicating significant errors in the prediction results (Figure 12a). This is because there is a nonlinear relationship between the well logging data and the deep CBM gas content. However, the MLR method is limited by the linear assumption of the model, resulting in insufficient modeling capability for variables with nonlinear relationships and hence the poor predictive performance of the GRA-MLR model for deep CBM gas content. The accuracies of the GRA-SVM and GRA-RF models on the training and testing sets are similar, with R2 values ranging between 0.85 and 0.90 for both. The MSE and MAE of the GRA-SVM model are slightly lower than those of the GRA-RF model, consistent with the findings of Guo, 2022.53

Figure 12. Relationship between predicted gas content and measured gas content of deep CBM for four machine learning models. The R2 shown is for the total data set; the black line is the unit slope line. (a) GRA-MLR model; (b) GRA-SVM model; (c) GRA-RF model; (d) GRA-GA-BPNN model.

Previous studies have achieved good results using such machine learning algorithms to predict the gas content of shallow CBM.40,53 However, the prediction of the gas content of deep CBM in this study is not ideal because the relationship between deep coal well logging data and gas content is more complex. The free gas content in shallow CBM is extremely low, and when analyzing the relationship between the well logging data and gas content, the focus is mainly on the relationship between the well logging data and adsorbed gas characteristics. However, the free gas content in deep CBM cannot be ignored, and the response characteristics of well logging data to the gas content are more complex in deep CBM than in shallow CBM; thus, a single traditional machine learning model cannot fully express the relationship between the well logging data and deep CBM gas content.

Additionally, SVM and RF algorithms have unique features. The GRA-SVM model has advantages in dealing with small sample data sets, exhibiting slightly better prediction performance on the testing set than the training set. However, deep CBM well logging data are influenced by fluctuations in logging instruments, geological features, and other environmental factors, which result in large variations in well logging data. The SVM algorithm lacks the ability to handle outliers during regression calculations, as manifested by significant prediction deviations in individual data points (Figure 12b), leading to an unsatisfactory overall predictive performance of the model. The GRA-RF model integrates multiple decision trees to enhance the generalization ability and robustness of the predictive model. However, previous studies have found that individual decision trees are susceptible to the influence of noisy data when the model data set is small, thus affecting the overall predictive performance of the model.73 In practical situations, the collected data set is always limited, which results in the predictive performance of the GRA-RF model being affected (Figure 12c).

The GRA-GA-BPNN model proposed in this study enhances the model’s generalization ability and resistance to noise interference by selecting feature parameters and optimizing the strategy for updating network weights and thresholds, thus improving the convergence speed and prediction performance of the neural network. Moreover, it exhibits high prediction accuracy, even when faced with a small amount of input data (Figure 12d).

5.3. Example Application and Evaluation

A deep CBM in situ gas content evaluation model was established using the multiple algorithms (GRA-GA-BPNN) method combined with logging data. The accuracy of the model was tested using a test set, and the differences between different models were compared to demonstrate the effectiveness of the GRA-GA-BPNN model. However, due to the possibility that the test set and training set data may be from the same well and layer, the generalization of the model cannot be verified. Therefore, based on the completion of the gas content evaluation model, a new parameter well under the same geological structure background was introduced as the validation set for generalization testing. The data from this well do not belong to the 102 sets of data used for modeling. That is, the Z8 well that did not participate in the model establishment was used for validation, thus verifying the practical application of the model (Figure 13). The No. 9 coal seam of the well lies at a depth of 1846.8–1858.3 m, with the measured gas content ranging from 7.14 to 16.3 m3/t.

Figure 13. Comparison between predicted and measured gas content in Well Z8.

The prediction results indicate that the estimated gas content from the GRA-GA-BPNN model closely matches the measured values with an average relative error of only 4.62%. Approximately 80% of the data shows a prediction error of less than 5%, and 93% of the data have a prediction error of less than 10%. The gas content curve predicted by the BPNN model exhibits the largest deviation from the measured values, and its amplitude is small, indicating its poor sensitivity to the response of logging curves along the vertical profile. In contrast, the amplitude of the gas content prediction curve of the GRA-GA-BPNN model is relatively large, indicating significant variations in the gas content along the vertical direction. This suggests that the model, with the introduction of GRA and GA optimization algorithms, possesses stronger data analysis capabilities, thus enhancing generalization and credibility. We observed that the prediction curves of the three models before and after optimization showed a decrease at the coal gangue interface (between 1852.3 and 1852.8 m) (Figure 13). This phenomenon indicates that the learning modes of the three models are consistent and also reflects the reliability of using well logging data to establish a deep CBM gas content prediction model. The measured gas content at this point is 7.14 m3/t. According to the proximate analysis, the ash yield at this location is as high as 45.19% and the average ash yield of the well is 18.19%. The sudden change in the ash yield is the reason for the low gas content at this location. Furthermore, the prediction curves of each model demonstrate significant fluctuations at lithological interfaces, which may be attributed to large variations in logging curve values at lithological interfaces, indicating insufficient or even inadequate learning of the models in this area.
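The summary statistics quoted for Well Z8 (average relative error and the share of samples below 5% and 10% error) can be computed as in the following sketch. The function name and the sample values are hypothetical; they are not the Well Z8 measurements.

```python
def relative_error_summary(measured, predicted, thresholds=(5.0, 10.0)):
    """Return the mean relative error and the share of samples below each threshold."""
    errors = [abs(p - y) / abs(y) * 100.0 for y, p in zip(measured, predicted)]
    summary = {"mean_relative_error_pct": sum(errors) / len(errors)}
    for t in thresholds:
        # Percentage of samples whose relative error is below the threshold t.
        share = sum(e < t for e in errors) / len(errors) * 100.0
        summary[f"share_below_{t:g}_pct"] = share
    return summary


# Hypothetical gas content values in m3/t, for illustration only.
measured = [10.0, 12.0, 16.0, 8.0]
predicted = [10.2, 12.9, 15.5, 9.0]
summary = relative_error_summary(measured, predicted)
```

Reporting the shares alongside the mean helps show whether a low average error hides a few large misses, such as those at lithological interfaces.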

The determination of the gas content in deep coal seams necessitates the implementation of intricate sampling and testing procedures, which are encumbered by the disadvantages of elevated sampling expenses and the complexity of measurement. This article presents a solution that is not susceptible to operator influence, thereby facilitating the prediction of gas content in deep coal seams. The method allows for the precise prediction of in situ gas content in deep CBM, thus reducing the reliance on experimental measurements and offering a significant practical application. The difficulty in obtaining samples of deep coal seams has resulted in the collection of limited data about the gas content of deep coal seams in the study area. This may limit the accuracy of the model predictions. In the future, collecting a larger amount of data may facilitate the enhancement of model training samples, thereby enabling neural networks to learn with greater precision and improving the accuracy and applicability of evaluation models.

6. Conclusions

In this study, based on deep coal reservoir logging data, the GRA and GA were introduced into the BPNN model to establish a multialgorithm coupled CBM gas content prediction model (GRA-GA-BPNN). The results indicate that the established model can effectively evaluate the in situ gas content of the deep CBM. The following conclusions were drawn:

  • (1)

    Based on the response mechanism of geophysical logging data to the gas content of deep CBM, the nonlinear correlation between the in situ gas content and the type of logging curve was evaluated using the GRA method. The model attained its highest prediction accuracy when selecting well logging data with correlation coefficients above 0.8 (AC, DEN, R, CNL, GR) as input variables; any increase or decrease in these variables would diminish the model’s predictive capability.

  • (2)

    The combined utilization of multiple algorithms proves more effective in predicting the in situ gas content of deep CBM. In comparison to the BPNN model, the GRA-GA-BPNN combined model showcased a reduction in MSE, MAE, and MAPE by 77.60%, 63.57%, and 64.82%, respectively. This reduction demonstrates the effectiveness and reliability of the optimized algorithm.

  • (3)

    The GRA-GA-BPNN combined model has demonstrated superior prediction accuracy and enhanced stability in predicting the deep CBM gas content compared with the other machine learning methods tested (RF, SVM, and MLR).

  • (4)

    This study demonstrates the feasibility and reliability of the new model in predicting the in situ gas content of deep CBM. In the future, more accurate testing techniques can be adopted to obtain the gas content of coal seams, while collecting an expanded data set to improve the model’s generalization ability and enhance its practical applicability.

Acknowledgments

This work was financially supported by the National Natural Science Foundation of China (Grant Nos. 42272197, 42430805, and 42302203).

The authors declare no competing financial interest.

References

  1. Wang H.; Chen L.; Qu Z.; Yin Y.; Kang Q.; Yu B.; Tao W. Modeling of multi-scale transport phenomena in shale gas production — A critical review. Appl. Energy 2020, 262, 114575. 10.1016/j.apenergy.2020.114575.
  2. Isaac O. T.; Pu H.; Oni B. A.; Samson F. A. Surfactants employed in conventional and unconventional reservoirs for enhanced oil recovery—A review. Energy Rep. 2022, 8, 2806–2830. 10.1016/j.egyr.2022.01.187.
  3. Hamawand I.; Yusaf T.; Hamawand S. G. Coal seam gas and associated water: A review paper. Renewable Sustainable Energy Rev. 2013, 22, 550–560. 10.1016/j.rser.2013.02.030.
  4. Wang E.; Feng Y.; Guo T.; Li M. Oil content and resource quality evaluation methods for lacustrine shale: A review and a novel three-dimensional quality evaluation model. Earth-Sci. Rev. 2022, 232, 104134. 10.1016/j.earscirev.2022.104134.
  5. Akdaş S. B.; Fişne A. A data-driven approach for the prediction of coal seam gas content using machine learning techniques. Appl. Energy 2023, 347, 121499. 10.1016/j.apenergy.2023.121499.
  6. Qin Y.; Moore T.; Shen J.; Yang Z.; Shen Y.; Wang G. Resources and geology of coalbed methane in China: a review. Int. Geol. Rev. 2018, 60 (5–6), 777–812. 10.1080/00206814.2017.1408034.
  7. Tao S.; Chen S.; Pan Z. Current status, challenges, and policy suggestions for coalbed methane industry development in China: A review. Energy Sci. Eng. 2019, 7 (4), 1059–1074. 10.1002/ese3.358.
  8. Cheng Y.; Pan Z. Reservoir properties of Chinese tectonic coal: A review. Fuel 2020, 260, 116350. 10.1016/j.fuel.2019.116350.
  9. Liu Z.; Liu D.; Cai Y.; Yao Y.; Pan Z.; Zhou Y. Application of nuclear magnetic resonance (NMR) in coalbed methane and shale reservoirs: A review. Int. J. Coal Geol. 2020, 218, 103261. 10.1016/j.coal.2019.103261.
  10. Fu X.; Qin Y.; Wang G. G. X.; Rudolph V. Evaluation of gas content of coalbed methane reservoirs with the aid of geophysical logging technology. Fuel 2009, 88 (11), 2269–2277. 10.1016/j.fuel.2009.06.003.
  11. Li S.; Tang D.; Xu H.; Yang Z. The pore-fracture system properties of coalbed methane reservoirs in the Panguan Syncline, Guizhou, China. Geosci. Front. 2012, 3 (6), 853–862. 10.1016/j.gsf.2012.02.005.
  12. Moore T. A. Coalbed methane: A review. Int. J. Coal Geol. 2012, 101, 36–81. 10.1016/j.coal.2012.05.011.
  13. Lu C.; Zhang S.; Xue D.; Xiao F.; Liu C. Improved estimation of coalbed methane content using the revised estimate of depth and CatBoost algorithm: A case study from southern Sichuan Basin, China. Comput. Geosci. 2022, 158, 104973. 10.1016/j.cageo.2021.104973.
  14. Luo X.; Zhang X.; Zhang L.; Huang G. Visualization of Chinese CBM Research: A Scientometrics Review. Sustainability 2017, 9 (6), 980. 10.3390/su9060980.
  15. Ou C.; Li C.; Zhi D.; Xue L.; Yang S. Coupling accumulation model with gas-bearing features to evaluate low-rank coalbed methane resource potential in the southern Junggar Basin, China. AAPG Bull. 2018, 102 (1), 153–174. 10.1306/03231715171.
  16. Li S.; Qin Y.; Tang D.; Shen J.; Wang J.; Chen S. A comprehensive review of deep coalbed methane and recent developments in China. Int. J. Coal Geol. 2023, 279, 104369. 10.1016/j.coal.2023.104369.
  17. Kuuskraa V. A.; Wyman R. E. Deep Coal Seams: An Overlooked Source for Long-Term Natural Gas Supplies. SPE Gas Technology Symposium; SPE, 1993; SPE-26196-MS. 10.2118/26196-MS.
  18. Bachu S.; Michael K. Possible controls of hydrogeological and stress regimes on the producibility of coalbed methane in Upper Cretaceous–Tertiary strata of the Alberta basin, Canada. AAPG Bull. 2003, 87 (11), 1729–1754. 10.1306/06030302015.
  19. Wang T.; Zhou G.; Fan L.; Zhang D.; Shao M.; Ding R.; Li Y.; Hu H.; Deng Z.; Saydut A. Full-scale pore and microfracture characterization of deep coal reservoirs: a case study of the Benxi formation coal in the Daning–Jixian block, China. Int. J. Energy Res. 2024, 2024, 5772264. 10.1155/2024/5772264.
  20. Zhang B.; Tao S.; Sun B.; Tang S.; Chen S.; Wen Y.; Ye J. Genesis and accumulation mechanism of external gas in deep coal seams of the Baijiahai uplift, Junggar basin, China. Int. J. Coal Geol. 2024, 286, 104506. 10.1016/j.coal.2024.104506.
  21. Liu Z.; Zhou H.; Chen B.; Song L.; Sun X. Integrated NMR analysis for evaluating pore-fracture structures and permeability in deep coals: a one-stop approach. Energy Fuels 2024, 38 (8), 6854–6867. 10.1021/acs.energyfuels.3c05166.
  22. Cui X.; Bustin R. M. Volumetric strain associated with methane desorption and its impact on coalbed gas production from deep coal seams. AAPG Bull. 2005, 89 (9), 1181–1202. 10.1306/05110504114.
  23. Pashin J. C. Variable gas saturation in coalbed methane reservoirs of the Black Warrior Basin: Implications for exploration and production. Int. J. Coal Geol. 2010, 82 (3), 135–146. 10.1016/j.coal.2009.10.017.
  24. Kędzior S.; Kotarba M. J.; Pękała Z. Geology, spatial distribution of methane content and origin of coalbed gases in Upper Carboniferous (Upper Mississippian and Pennsylvanian) strata in the south-eastern part of the Upper Silesian Coal Basin, Poland. Int. J. Coal Geol. 2013, 105, 24–35. 10.1016/j.coal.2012.11.007.
  25. Liu S.; Harpalani S. Evaluation of in situ stress changes with gas depletion of coalbed methane reservoirs. J. Geophys. Res.: Solid Earth 2014, 119 (8), 6263–6276. 10.1002/2014JB011228.
  26. Metcalfe R. S.; Yee D.; Seidle J. P.; Puri R. Review of Research Efforts in Coalbed Methane Recovery. SPE Asia-Pacific Conference; SPE, 1991. 10.2118/23025-MS.
  27. Hou X.; Liu S.; Zhu Y.; Yang Y. Evaluation of gas contents for a multi-seam deep coalbed methane reservoir and their geological controls: In situ direct method versus indirect method. Fuel 2020, 265, 116917. 10.1016/j.fuel.2019.116917.
  28. Xu H.; Pan Z.; Hu B.; Liu H.; Sun G. A new approach to estimating coal gas content for deep core sample. Fuel 2020, 277, 118246. 10.1016/j.fuel.2020.118246.
  29. Nazimko V. A method for measuring coalbed methane content in coal strata without the loss of the gas. Acta Geodyn. Geomater. 2018, 15, 379–393. 10.13168/AGG.2018.0028.
  30. Tian Z.; Zhou S.; Wu S.; Xu S.; Zhou J.; Cai J. Direct method to estimate the gas loss characteristics and in-situ gas contents of shale. Gondwana Res. 2024, 126, 40–57. 10.1016/j.gr.2023.09.012.
  31. Diamond W. P.; Schatzel S. J. Measuring the gas content of coal: A review. Int. J. Coal Geol. 1998, 35 (1), 311–331. 10.1016/S0166-5162(97)00040-2.
  32. Saghafi A. Discussion on determination of gas content of coal and uncertainties of measurement. Int. J. Min. Sci. Technol. 2017, 27 (5), 741–748. 10.1016/j.ijmst.2017.07.024.
  33. Ahmadi M.; Ahmad Z.; Phung L. T. K.; Kashiwao T.; Bahadori A. Estimation of water content of natural gases using particle swarm optimization method. Pet. Sci. Technol. 2016, 34 (7), 595–600. 10.1080/10916466.2016.1153655.
  34. Ge X.; Liu D.; Cai Y.; Wang Y. Gas Content Evaluation of Coalbed Methane Reservoir in the Fukang Area of Southern Junggar Basin, Northwest China by Multiple Geophysical Logging Methods. Energies 2018, 11 (7), 1867. 10.3390/en11071867. [DOI] [Google Scholar]
  35. Kang J.; Fu X.; Elsworth D.; Liang S. Vertical heterogeneity of permeability and gas content of ultra-high-thickness coalbed methane reservoirs in the southern margin of the Junggar Basin and its influence on gas production. J. Nat. Gas Sci. Eng. 2020, 81, 103455. 10.1016/j.jngse.2020.103455. [DOI] [Google Scholar]
  36. Zhang Z.; Qin Y.; Wang G.; Sun H.; You Z.; Jin J.; Yang Z. Evaluation of Coal Body Structures and Their Distributions by Geophysical Logging Methods: Case Study in the Laochang Block, Eastern Yunnan, China. Nat. Resour. Res. 2021, 30 (3), 2225–2239. 10.1007/s11053-021-09834-4. [DOI] [Google Scholar]
  37. Shi J.; Zhao X.; Zeng L.; Zhang Y.; Dong S. Identification of coal structures by semi-supervised learning based on limited labeled logging data. Fuel 2023, 337, 127191. 10.1016/j.fuel.2022.127191. [DOI] [Google Scholar]
  38. Wu Y.; Gao R.; Yang J. Prediction of coal and gas outburst: A method based on the BP neural network optimized by GASA. Process Saf. Environ. Prot. 2020, 133, 64–72. 10.1016/j.psep.2019.10.002. [DOI] [Google Scholar]
  39. Chao Z.; Dang Y.; Pan Y.; Wang F.; Wang M.; Zhang J.; Yang C. Prediction of the shale gas permeability: A data mining approach. Geomech. Energy Environ. 2023, 33, 100435. 10.1016/j.gete.2023.100435. [DOI] [Google Scholar]
  40. Yang C.; Qiu F.; Xiao F.; Chen S.; Fang Y. CBM Gas Content Prediction Model Based on the Ensemble Tree Algorithm with Bayesian Hyper-Parameter Optimization Method: A Case Study of Zhengzhuang Block, Southern Qinshui Basin, North China. Processes 2023, 11 (2), 527. 10.3390/pr11020527. [DOI] [Google Scholar]
  41. Zhang J.; Hou X.; Liu S.; Chen L.; Wang Y. New Data-Driven Method for In situ Coalbed Methane Content Evolution: A BP Neural Network Prediction Model Optimized by Grey Relation Theory and Particle Swarm. Energy Fuels 2023, 37 (14), 10344–10354. 10.1021/acs.energyfuels.3c01143. [DOI] [Google Scholar]
  42. Tang H.; Cheng J.; Wang S. Support Vector Machine Regression Model of CBM Content and Application. Proceedings - 2009 IEEE International Conference on Intelligent Computing and Intelligent Systems; IEEE, 2009; pp 99–102. 10.1109/ICICISYS.2009.5357929. [DOI] [Google Scholar]
  43. Zhao Z.; Xin H.; Ren Y.; Guo X. Application and Comparison of BP Neural Network Algorithm in MATLAB. 2010 International Conference on Measuring Technology and Mechatronics Automation; pp 590–593. 10.1109/ICMTMA.2010. [DOI] [Google Scholar]
  44. Chang T.; Chao R. Application of back-propagation networks in debris flow prediction. Eng. Geol. 2006, 85 (3), 270–280. 10.1016/j.enggeo.2006.02.007. [DOI] [Google Scholar]
  45. Cui K.; Jing X. Research on prediction model of geotechnical parameters based on BP neural network. Neural Comput. Appl. 2019, 31 (12), 8205–8215. 10.1007/s00521-018-3902-6. [DOI] [Google Scholar]
  46. Zhang S.; Wang B.; Li X.; Chen H. Research and Application of Improved Gas Concentration Prediction Model Based on Grey Theory and BP Neural Network in Digital Mine. Procedia CIRP 2016, 56, 471–475. 10.1016/j.procir.2016.10.092. [DOI] [Google Scholar]
  47. Hao H.; Li J.; Wang J.; Liu Y.; Sun Y. Distribution characteristics and enrichment model of valuable elements in coal: An example from the Nangou Mine, Ningwu Coalfield, northern China. Ore Geol. Rev. 2023, 160, 105599. 10.1016/j.oregeorev.2023.105599. [DOI] [Google Scholar]
  48. Zhang P.; Tang S.; Lin D.; Chen Y.; Wang X.; Liu Z.; Han F.; Lv P.; Yang Z.; Guan X.; Hu J.; Gao Y. Diagenesis and Diagenetic Mineral Control on Reservoir Quality of Tight Sandstones in the Permian He8 Member, Southern Ningwu Basin. Processes 2023, 11 (8), 2374. 10.3390/pr11082374. [DOI] [Google Scholar]
  49. Zhou F.; Yao G.; Tyson S. Impact of geological modeling processes on spatial coalbed methane resource estimation. Int. J. Coal Geol. 2015, 146, 14–27. 10.1016/j.coal.2015.04.010. [DOI] [Google Scholar]
  50. Wang L.; Zeng Y.; Chen T. Back propagation neural network with adaptive differential evolution algorithm for time series forecasting. Expert Syst. Appl. 2015, 42 (2), 855–863. 10.1016/j.eswa.2014.08.018. [DOI] [Google Scholar]
  51. Guo J.; Zhang Z.; Xiao H.; Zhang C.; Zhu L.; Wang C. Quantitative interpretation of coal industrial components using a gray system and geophysical logging data: A case study from the Qinshui Basin, China. Front. Earth Sci. 2023, 10, 1031218. 10.3389/feart.2022.1031218. [DOI] [Google Scholar]
  52. Wang Y. Combining grey relation analysis with FMCGDM to evaluate financial performance of Taiwan container lines. Expert Syst. Appl. 2009, 36 (2), 2424–2432. 10.1016/j.eswa.2007.12.027. [DOI] [Google Scholar]
  53. Guo Y. Selection of machine learning algorithms in coalbed methane content predictions. Appl. Geophys. 2022, 20 (4), 518–533. 10.1007/s11770-022-0997-4. [DOI] [Google Scholar]
  54. Zhou J.; Liang G.; Deng T.; Gong J.; Romagnoli J. A. Route Optimization of Pipeline in Gas-Liquid Two-Phase Flow Based on Genetic Algorithm. Int. J. Chem. Eng. 2017, 2017, 1640303. 10.1155/2017/1640303. [DOI] [Google Scholar]
  55. Deng S.; Hu Y.; Chen D.; Ma Z.; Li H. Integrated petrophysical log evaluation for coalbed methane in the Hancheng area, China. J. Geophys. Eng. 2013, 10 (3), 035009. 10.1088/1742-2132/10/3/035009. [DOI] [Google Scholar]
  56. Ye Y.; Tang S.; Xi Z.; Jiang D.; Duan Y. A new method to predict brittleness index for shale gas reservoirs: Insights from well logging data. J. Pet. Sci. Eng. 2022, 208, 109431. 10.1016/j.petrol.2021.109431. [DOI] [Google Scholar]
  57. Feng J.; Duan T.; Bao J.; Li Y. An improved Back Propagation Neural Network framework and its application in the automatic calibration of Storm Water Management Model for an urban river watershed. Sci. Total Environ. 2024, 915, 169886. 10.1016/j.scitotenv.2024.169886. [DOI] [PubMed] [Google Scholar]
  58. Zhu J.; Zhao Y.; Hu Q.; Zhang Y.; Shao T.; Fan B.; Jiang Y.; Chen Z.; Zhao M. Coalbed Methane Production Model Based on Random Forests Optimized by a Genetic Algorithm. ACS Omega 2022, 7 (15), 13083–13094. 10.1021/acsomega.2c00519. [DOI] [PMC free article] [PubMed] [Google Scholar]
  59. Fang Y.; Yao Y.; Lin X.; Wang J.; Zhai H. A feature selection based on genetic algorithm for intrusion detection of industrial control systems. Comput. Secur. 2024, 139, 103675. 10.1016/j.cose.2023.103675. [DOI] [Google Scholar]
  60. Sayyafzadeh M.; Keshavarz A. Optimisation of gas mixture injection for enhanced coalbed methane recovery using a parallel genetic algorithm. J. Nat. Gas Sci. Eng. 2016, 33, 942–953. 10.1016/j.jngse.2016.06.032. [DOI] [Google Scholar]
  61. Guo J.; Huang Y.; Li Z.; Li J.; Jiang C.; Chen Y. Performance prediction and optimization of lateral exhaust hood based on back propagation neural network and genetic algorithm. Sustain. Cities Soc. 2024, 113, 105696. 10.1016/j.scs.2024.105696. [DOI] [Google Scholar]
  62. Xue X. Prediction of daily diffuse solar radiation using artificial neural networks. Int. J. Hydrogen Energy 2017, 42 (47), 28214–28221. 10.1016/j.ijhydene.2017.09.150. [DOI] [Google Scholar]
  63. Li J.; Xia X.; Sun C.; Chen X. Estimation of time-dependent laser heat flux distribution based on BPNN improved by multiple population genetic algorithm. Int. J. Heat Mass Transfer 2024, 233, 125997. 10.1016/j.ijheatmasstransfer.2024.125997. [DOI] [Google Scholar]
  64. Tong Z.; Meng Y.; Zhang J.; Wu Y.; Li Z.; Wang D.; Li X.; Ou G. Coal structure identification based on geophysical logging data: insights from wavelet transform (WT) and particle swarm optimization support vector machine (PSO-SVM) algorithms. Int. J. Coal Geol. 2024, 282, 104435. 10.1016/j.coal.2023.104435. [DOI] [Google Scholar]
  65. Bustin A. M. M.; Bustin R. Contribution of non-coal facies to the total gas-in-place in Mannville coal measures, Central Alberta. Int. J. Coal Geol. 2016, 154–155, 69–81. 10.1016/j.coal.2015.12.002. [DOI] [Google Scholar]
  66. Banerjee A.; Chatterjee R. A methodology to estimate proximate and gas content saturation with lithological classification in coalbed methane reservoir, Bokaro Field, India. Nat. Resour. Res. 2021, 30 (3), 2413–2429. 10.1007/s11053-021-09828-2. [DOI] [Google Scholar]
  67. Bai Z.; Liu Q.; Tan M.; Bai Y.; Wu H. Interpreting coal component content in logging data by combining gray relational analysis and hybrid neural network. Interpretation 2023, 11 (4), T735–T744. 10.1190/INT-2022-0077.1. [DOI] [Google Scholar]
  68. Yan T.; Liu Z.; Xing L.; Luo Y.; Bai Y.; Huang S. Evaluation of the Gas Content of Coal Reservoirs with Geophysical Logging in Weibei Coalbed Methane Field, Southeastern Ordos Basin, China. Adv. Mater. Res. 2013, 734–737, 331–334. 10.4028/www.scientific.net/AMR.734-737.331. [DOI] [Google Scholar]
  69. Zheng Y.; Jiang B.; Ren B.; Lin H.; Tao W.; Wang S. Evaluation and analysis of methane adsorption capacity in deep-buried coal seams. Greenhouse Gases: Sci. Technol. 2022, 12 (3), 376–393. 10.1002/ghg.2149. [DOI] [Google Scholar]
  70. Tao C.; Zhansong Z.; Xueqing Z.; Jianhong G.; Hang X.; Chenyang T.; Ruibao Q.; Jie Y. Prediction model of coalbed methane content based on well logging parameter optimization. Coal Geol. Explor. 2021, 49 (3), 227–235. 10.3969/j.issn.1001-1986.2021.03.029. [DOI] [Google Scholar]
  71. Guo J.; Zhang Z.; Guo G.; Xiao H.; Zhu L.; Zhang C.; Tang X.; Zhou X.; Zhang Y.; Wang C.; Wang Q. Evaluation of coalbed methane content by using kernel extreme learning machine and geophysical logging data. Geofluids 2022, 2022, 3424367. 10.1155/2022/3424367. [DOI] [Google Scholar]
  72. Guo J.; Zhang Z.; Guo G.; Xiao H.; Zhao Q.; Zhang C.; Lv H.; Zhu Z.; Wang C. Optimized Random Forest Method for 3D Evaluation of Coalbed Methane Content Using Geophysical Logging Data. ACS Omega 2024, 9 (33), 35769–35788. 10.1021/acsomega.4c04305. [DOI] [PMC free article] [PubMed] [Google Scholar]
  73. Tavakolian M.; Najafi-Silab R.; Chen N.; Kantzas A. Modeling of methane and carbon dioxide sorption capacity in tight reservoirs using Machine learning techniques. Fuel 2024, 360, 130578. 10.1016/j.fuel.2023.130578. [DOI] [Google Scholar]

Articles from ACS Omega are provided here courtesy of American Chemical Society
