Spatial variability and uncertainty associated with soil moisture content using INLA-SPDE combined with PyMC3 probability programming

Yujian Yang; Xueqin Tong

2024 Oct 12;14:23900. doi: 10.1038/s41598-024-74624-w

PMCID: PMC11470935 PMID: 39396095

Abstract

Spatial variability and uncertainty associated with soil volumetric moisture content (SVMC) are crucial to moisture prediction accuracy, and this paper addresses them by developing a data-driven model. Grid samples of SVMC, measured by Time Domain Reflectometry (TDR), covered an approximately 3-ha field in the North China Plain, China, during the jointing growth stage of winter wheat. Bayesian inference was performed to explore the spatial heterogeneity, robustness, transparency, interpretability and uncertainty related to SVMC using Python-based PyMC3 combined with the Integrated Nested Laplace Approximation with the Stochastic Partial Differential Equation (INLA-SPDE) model. The results showed that the SVMC prediction surface, together with the lower and upper limits of the 95% credible intervals, quantified the uncertainty associated with SVMC. Owing to its flexibility and adaptability, the Cauchy prior was more robust than the Gaussian prior for SVMC prediction. The transparency and interpretability of the SVMC prediction model were revealed by MCMC (Markov chain Monte Carlo) trace plots, KDE (kernel density estimate) plots, and rank plots. The uncertainty associated with SVMC can be described explicitly using the highest-posterior density interval and the lower and upper prediction limits.

Keywords: Soil moisture content, INLA-SPDE model, Uncertainty, Transparency and interpretability, Cauchy prior

Subject terms: Ecology, Environmental sciences

Introduction

Soil moisture links soil processes, and moisture in the top surface layer (0–20 cm depth) in particular plays a fundamental role in land–atmosphere exchange processes^1–3. Better spatial estimation of surface soil moisture can help to improve the accuracy of climate prediction and support water resource decision-makers^4,5. Improving soil moisture estimation across different scales of resolution, to increase water-use efficiency and the response to environmental stress, is one of the critical challenges identified by the National Academies of Sciences, Engineering, and Medicine (NASEM)⁶.

Geostatistics has been widely applied in digital soil mapping⁷, and kriging-based techniques have become important tools for characterizing the variability of soil properties⁸. Geostatistical methods, including kriging, co-kriging, regression kriging, probability kriging and universal kriging, quantify the uncertainty of estimates while reducing investigation costs³. However, the estimation accuracy is usually limited by the density and distribution of sample sites, and these methods also cause “smoothing effects”^3,9,10. Bayesian maximum entropy (BME), a spatio-temporal geostatistical method that has been used to estimate soil texture and soil textural fractions, has been shown to be more accurate than conventional geostatistics^11,12. BME based on soft information has the advantage of a sound theoretical basis³. However, constructing soft information from limited data and achieving efficient prediction remain challenging¹³. Additionally, sequential Gaussian simulation and quantile regression forest produced more accurate uncertainty models for quantifying the spatial uncertainty of soil organic carbon stock in Hungary¹⁴. Apart from geostatistical approaches, Bayesian inference is also used to assess spatial uncertainty in digital soil mapping. This approach has tended to rely on flexible MCMC simulation, which is computationally intensive and time-consuming¹⁵. The INLA-SPDE model was developed as a computationally efficient alternative to MCMC¹⁶. As an emerging model with demonstrated potential in digital soil mapping, it has been used to map soil organic matter with a good assessment of uncertainty¹⁴. INLA-SPDE was adopted for modeling the spatio-temporal evolution of soil organic matter at the regional scale¹⁷, and the robustness of soil pH mapping with sparse datasets at the farm scale was demonstrated using the INLA-SPDE model¹⁸. The INLA-SPDE model allows soil moisture maps to be constructed in a more sensible and efficient way¹⁹. It not only provides a flexible and robust approach that takes into account spatio-temporal correlation with a description of uncertainty²⁰, but is also a computationally effective approach for latent Gaussian models, a wide and flexible class ranging from (generalized) linear mixed models to spatial and spatio-temporal models^21,22. Thus, the INLA-SPDE model was chosen in this study to analyze the spatial variability and uncertainty associated with SVMC.

Bayesian probabilistic programming is an alternative stochastic simulation technique for assessing interpretability, uncertainty and interactions using the highest-posterior density interval²³. A growing body of literature explores Bayesian probabilistic programming and discusses state-of-the-art advances in machine learning and artificial intelligence, and this approach helps to address the robustness, transparency, and interpretability of models^24–29. Theano, the core component of PyMC3, is rooted in deep learning; its use in Bayesian probabilistic programming offers unique advantages, including model flexibility, transparency, and interpretability of results derived by integrating prior and posterior probabilities from a probabilistic perspective^30,31. Additionally, MCMC trace plots, KDE plots and rank plots produced in Python can reveal the behavior of parameters in a transparent and interpretable manner, different prior distributions reflect the robustness of data prediction, and the highest-posterior density interval can explicitly describe the uncertainty associated with model variables³¹. Therefore, the robustness, transparency and interpretability of the SVMC prediction model can be addressed by PyMC3 Bayesian probabilistic programming.

TDR and gravimetric measurements remain accurate at the point scale, and soil moisture also interacts with the groundwater table depth and its variations³². Soil moisture during winter wheat growth reflects soil processes and has a large influence on wheat growth. The jointing stage is more important than other growth stages because water is needed to promote growth, and SVMC in the topsoil is an indicator of the correct volume of irrigation water during the jointing stage¹⁰. Thus, moisture in the top soil layer (0–20 cm depth) is critical to precision moisture management. Evidently, the spatial variability, transparency, interpretability and uncertainty associated with SVMC are important for improving the accuracy of data-driven SVMC models and smart irrigation.

As summarized above, we investigate the spatial variability and uncertainty associated with SVMC during the jointing growth stage of winter wheat. The paper aims to analyze spatial variability and quantify the spatial uncertainty related to SVMC using the INLA-SPDE model. Furthermore, the study explores the robustness, transparency and interpretability related to SVMC based on PyMC3 probabilistic programming.

Materials and methods

Study area and TDR field sampling

The study area (117°04.130′E, 36°42.979′N), covering approximately three hectares, is located on the north side of the Xiaoqinghe River in Shandong Province, China, an important area of the North China Plain. Grid sampling was used to measure SVMC; the north–south distance between two adjacent sampling points is about 10 m, and the east–west distance is 8 m. The study region has a semi-humid monsoon climate, and the soil is sandy clay loam. A wheat–maize rotation is practiced yearly in the crop fields of the study area.

In situ measurements of soil moisture provide invaluable information that facilitates the study of the spatial variability of soil moisture at different scales^4,33. Several techniques are available for measuring soil moisture content in situ, such as the gravimetric method, neutron probes, cosmic-ray neutron sensing, and electromagnetic techniques. TDR probes, which use electromagnetic techniques, are non-destructive and can easily be set up for automated operation with a data logger. A number of sampling sites are established in the field; at each site, a TDR reading is taken, followed by the extraction of a known volume of soil, and the wet and dry weights of this soil are determined. The volumetric water content is calculated as follows:

$$\theta_v = \frac{m_{wet} - m_{dry}}{\rho_w V_t}$$

where $m_{wet}$ and $m_{dry}$ denote the mass (g) of the wet and dry soil, respectively, $V_t$ represents the total soil volume (ml), and $\rho_w$ is the density of water (1 g/ml).
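
The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' processing code: the function name `volumetric_water_content` and the sample masses are hypothetical, and the only inputs are the wet mass, dry mass, total soil volume and water density defined in the text.

```python
def volumetric_water_content(wet_mass_g, dry_mass_g, soil_volume_ml,
                             water_density_g_per_ml=1.0):
    """Return volumetric water content as a fraction (ml water per ml soil)."""
    if soil_volume_ml <= 0:
        raise ValueError("soil volume must be positive")
    # Mass of water lost on drying, converted to a volume via the water density.
    water_volume_ml = (wet_mass_g - dry_mass_g) / water_density_g_per_ml
    return water_volume_ml / soil_volume_ml


# Hypothetical core sample: 100 ml of soil, 150 g wet and 130 g after drying.
theta = volumetric_water_content(150.0, 130.0, 100.0)
print(f"{theta:.3f} ({theta * 100:.1f}%)")
```

Multiplying the fraction by 100 gives the percentage form used for SVMC in the rest of the paper.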

In our experiment, SVMC was the ratio of the volume of water in a given volume of soil to the total soil volume, expressed as either a decimal or a percentage. At each sampling point, five SVMC measurements were made within a 1-m diameter circle. The SVMC for each sample point was displayed on the LCD (liquid crystal display) screen of the TDR instrument¹⁰. The SVMC data were saved in Excel file format for further analysis. Each SVMC measurement was geo-referenced using a Differential Global Positioning System (DGPS). In total, 231 SVMC samples were collected from the surface layer (0–20 cm) at the jointing stage of winter wheat growth in 2017.

INLA-SPDE model

Basic principles of INLA-SPDE model

A Gaussian random field (GRF), which is a collection of random variables whose observations occur in a continuous domain, is widely applicable in geostatistics, ecology, epidemiology and environmental risk assessment^21,22. Matérn covariance functions are the most common type in geostatistical models³⁴. The covariance matrix of the GRF is constructed from the Matérn covariance function, which is given as:

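$$\mathrm{Cov}\big(Z(s_i), Z(s_j)\big) = \frac{\sigma^2}{2^{\nu-1}\Gamma(\nu)} \big(\kappa \lVert s_i - s_j \rVert\big)^{\nu} K_{\nu}\big(\kappa \lVert s_i - s_j \rVert\big)$$

Here, $\lVert s_i - s_j \rVert$ denotes the Euclidean distance between point $s_i$ and point $s_j$, $\sigma^2$ denotes the marginal variance of the spatial field, $\Gamma(\cdot)$ is the Gamma function, $K_\nu$ is the modified Bessel function of the second kind, $\kappa > 0$ is the scale parameter, which is related to the range, and the mean-square differentiability of the process is determined by the smoothness parameter $\nu > 0$ ^20,22,35. A GRF with a Matérn covariance matrix therefore depends mainly on the scale parameter $\kappa$ and the smoothness parameter $\nu$.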
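
To make the roles of the scale and smoothness parameters concrete, the Matérn covariance can be evaluated numerically. The sketch below assumes the standard form $\frac{\sigma^2}{2^{\nu-1}\Gamma(\nu)}(\kappa h)^{\nu}K_{\nu}(\kappa h)$ for two points a distance $h$ apart. It uses only the Python standard library, so the Bessel function is computed from its integral representation with a simple trapezoidal rule. The function names and the parameter values are illustrative assumptions, not values estimated in this study.

```python
import math


def bessel_k(nu, x, step=0.01, t_max=30.0):
    """Modified Bessel function of the second kind K_nu(x) for x > 0.

    Uses K_nu(x) = integral_0^inf exp(-x cosh t) cosh(nu t) dt,
    approximated with the trapezoidal rule on [0, t_max].
    """
    n_steps = int(t_max / step)
    total = 0.5 * math.exp(-x)  # integrand at t = 0, with half weight
    for i in range(1, n_steps + 1):
        t = i * step
        total += math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    return total * step


def matern_covariance(distance, sigma2, kappa, nu):
    """Matérn covariance between two points separated by `distance`."""
    if distance == 0.0:
        return sigma2  # the covariance at zero distance is the marginal variance
    scaled = kappa * distance
    norm = sigma2 / (2.0 ** (nu - 1.0) * math.gamma(nu))
    return norm * scaled ** nu * bessel_k(nu, scaled)


# Illustrative values only: marginal variance 11.5, kappa 0.1 per metre, nu = 1.
for h in (0.0, 10.0, 20.0, 40.0):
    print(h, round(matern_covariance(h, 11.5, 0.1, 1.0), 3))
```

The covariance decays with distance, and a larger `kappa` makes it decay faster, which is why the scale parameter is tied to the geostatistical range.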

Let $Y(s_i)$ denote the observations of a spatial variable at locations $s_1, \ldots, s_n$, with $n = 231$; the SVMC samples are located at the set of locations $s_1, \ldots, s_n$ of the 231 sites. Here, $Z(s)$ denotes a Matérn field³⁶, where the spatial domain $D$ is a fixed subset of $\mathbb{R}^2$ and the spatial index $s$ varies continuously throughout $D$ ^20,22,37. GRF models with Matérn covariance functions can be expressed as solutions to the following SPDE:

$$(\kappa^2 - \Delta)^{\alpha/2}\big(\tau Z(s)\big) = W(s)$$

Here, $\kappa > 0$ is the spatial scale parameter, which is associated with the range in geostatistics and, in the INLA-SPDE model, is jointly affected by the mesh parameters (max.edge, cutoff, and offset)^20,22. To avoid boundary effects and sharp corners, the offset parameter is used to extend the domain of interest by a given distance^20,21. $\Delta$ is the Laplacian, $\alpha$ is a positive integer that controls the smoothness ($\nu$) of the SPDE solution, the variance is controlled by the parameter $\tau$, $Z(s)$ is a GRF, and $W(s)$ is a Gaussian spatial white noise process. The marginal variance of the Matérn covariance function is related to the SPDE through

$$\sigma^2 = \frac{\Gamma(\nu)}{\Gamma(\alpha)\,(4\pi)^{d/2}\,\kappa^{2\nu}\,\tau^{2}}$$

From this we can identify the exponential covariance as the Matérn function with $\nu = 1/2$.
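
The identification with the exponential covariance can be checked directly. The short derivation below assumes the standard Matérn form $C(h) = \frac{\sigma^2}{2^{\nu-1}\Gamma(\nu)}(\kappa h)^{\nu}K_{\nu}(\kappa h)$, with $h$ the distance between two points, and uses the known identities $\Gamma(1/2) = \sqrt{\pi}$ and $K_{1/2}(x) = \sqrt{\pi/(2x)}\,e^{-x}$:

$$C(h)\Big|_{\nu = 1/2} = \frac{\sigma^2}{2^{-1/2}\sqrt{\pi}}\,(\kappa h)^{1/2}\,\sqrt{\frac{\pi}{2\kappa h}}\;e^{-\kappa h} = \sigma^2 e^{-\kappa h}$$

The square-root factors cancel, leaving a covariance that decays exponentially with distance at rate $\kappa$.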

$$Z(s) = \sum_{k=1}^{m} \psi_k(s)\, w_k$$

Here $m$ is the number of triangulation vertices, $\psi_k(\cdot)$ represents the basis functions and $w_k$ the corresponding weights. This representation provides the link between the GRF and the Gaussian Markov random field (GMRF), as implemented in the R-INLA package^20–23. Specifically, the finite element method, which leads to a triangulated mesh with nodes and basis functions, was used to obtain an approximate solution of the SPDE. The basis function $\psi_k$ is piecewise linear on each triangle; it is equal to 1 at vertex $k$ and equal to 0 at the other vertices. The GRF is then represented as a GMRF through the basis functions defined on the triangulated mesh. The weight vector is assigned a joint Gaussian distribution that approximates the solution $Z(s)$ of the SPDE at the mesh nodes, and the approximation at the mesh nodes is transferred to other spatial locations by the basis functions. The appropriate precision matrix for the weights is sparse, because it is built from piecewise-linear basis functions defined on a triangulation of the domain of interest, whether the domain is two-dimensional or one-dimensional^20–22.
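
The role of the piecewise-linear basis functions can be illustrated on a single triangle. Inside a triangle, the three basis functions attached to its vertices are the barycentric coordinates of the point, so the field value is a weighted average of the three vertex weights. The sketch below is a plain-Python illustration under that assumption; the function names and the triangle coordinates are hypothetical and it is not the R-INLA implementation.

```python
def barycentric_weights(point, triangle):
    """Values of the three piecewise-linear basis functions at `point`.

    `triangle` is a sequence of three (x, y) vertices. Each returned weight
    equals 1 at its own vertex and 0 at the other two vertices.
    """
    x, y = point
    (x1, y1), (x2, y2), (x3, y3) = triangle
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("degenerate triangle")
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    return w1, w2, w3


def interpolate(point, triangle, vertex_values):
    """Field value at `point` as the basis-weighted sum of the vertex values."""
    weights = barycentric_weights(point, triangle)
    return sum(w * v for w, v in zip(weights, vertex_values))


# Hypothetical triangle (metres) with SVMC-like values at its vertices.
triangle = [(0.0, 0.0), (8.0, 0.0), (0.0, 10.0)]
print(barycentric_weights((2.0, 5.0), triangle))
print(interpolate((2.0, 5.0), triangle, [19.0, 21.0, 20.0]))
```

Each row of the projection matrix used later in the workflow holds exactly such weights: at most three non-zero entries per location, which is what keeps the matrices sparse.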

Key points using the INLA-SPDE model

SVMC varies continuously in space. As a spatially continuous variable, SVMC can be modeled using a GRF, and the SPDE approach implemented in the R-INLA package can be used to fit a spatial model and predict SVMC at unsampled locations.

Triangulated mesh construction

A triangulated mesh was created with the inla.mesh.2d() function of the R-INLA package, so that the finite element method could be used to obtain an approximate solution of the SPDE^21,22. The parameters of this function, such as offset, max.edge and cutoff, need to be set. Taking into account both computational cost and modeling accuracy, we specified offset = c(-0.15, 70): the value 70 gives an outer extension of size 70 around the locations, and -0.15 means that the mesh is extended by 15% of the diameter of the data range, which also avoids boundary effects and sharp corners²². We set cutoff = 1 to avoid building many small triangles, and max.edge = c(7, 20) to use small triangles within the region and larger triangles in the extension²². Once the mesh was constructed, the Matérn function defined the spatial correlation structure of the SPDE^15,17.

INLA-SPDE model construction

The INLA-SPDE model was used to predict SVMC. The components of the prediction model can be expressed as follows:

$$Y_i \sim N(\mu_i, \sigma^2), \qquad \mu_i = \beta_0 + Z(s_i)$$

Here, $N(\mu_i, \sigma^2)$ denotes a Gaussian distribution with mean $\mu_i$ and variance $\sigma^2$. The mean $\mu_i$ is expressed as the sum of the intercept $\beta_0$ and $Z(s_i)$, a zero-mean, spatially structured Gaussian random effect with Matérn covariance function that captures the spatial variability of SVMC. In this step, the parameter $\alpha$ ($\alpha = \nu + d/2$) was set in the inla.spde2.matern() function of the R-INLA package to build the SPDE model on the mesh. $\alpha$ is associated with the smoothness parameter of the process¹⁶. In our study, we set the smoothness parameter $\nu$ equal to 1 and $d = 2$, thus $\alpha = 2$.
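
As a worked check of this setting, the relations can be written out for a two-dimensional domain. The sketch below assumes the commonly used parameterization of the SPDE approach, in which $\nu = \alpha - d/2$, the marginal variance is $\sigma^2 = \Gamma(\nu) / \big(\Gamma(\alpha)(4\pi)^{d/2}\kappa^{2\nu}\tau^{2}\big)$, and the practical range is approximated by $\rho \approx \sqrt{8\nu}/\kappa$. The symbol $\rho$ is introduced here for illustration only.

$$\alpha = 2,\; d = 2 \;\Rightarrow\; \nu = \alpha - \frac{d}{2} = 1, \qquad \sigma^2 = \frac{\Gamma(1)}{\Gamma(2)\,(4\pi)\,\kappa^{2}\tau^{2}} = \frac{1}{4\pi\kappa^{2}\tau^{2}}, \qquad \rho \approx \frac{\sqrt{8}}{\kappa}$$

Under these assumptions, a smaller $\kappa$ gives a longer range, and for fixed $\kappa$ a larger $\tau$ gives a smaller marginal variance.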

Space mapping and plotting of INLA-SPDE model

The index set for the SPDE model was generated with the inla.spde.make.index() function of the R-INLA package, and a projection matrix was constructed with the inla.spde.make.A() function, passing the triangulated mesh and the coordinates, to project the GRF from the observations to the triangulation vertices^20,22. Non-informative prior distributions and the default parameters and hyperparameters were adopted in the model. A matrix of coordinates of the locations at which SVMC is predicted was then constructed, namely a grid of 50 × 50 locations built with expand.grid() from vectors of coordinates spanning the border of the study area, and the inla.stack() function was used to organize the data and projection matrices. The INLA-SPDE model formula used for Bayesian inference was specified by including the fixed and random effects. Finally, pred_mean (the posterior mean of SVMC) and pred_ll and pred_ul (the lower and upper limits of the 95% credible intervals) were computed to quantify the uncertainty associated with SVMC.
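
The 50 × 50 prediction grid described above is simply the Cartesian product of two coordinate vectors. The sketch below mirrors that construction in plain Python, with the first coordinate varying fastest as in R's expand.grid(). The function name `prediction_grid` and the field extent used in the example are hypothetical; the actual study border is not reproduced here.

```python
def prediction_grid(x_min, x_max, y_min, y_max, n=50):
    """Return n * n (x, y) prediction locations covering a bounding box."""
    xs = [x_min + i * (x_max - x_min) / (n - 1) for i in range(n)]
    ys = [y_min + j * (y_max - y_min) / (n - 1) for j in range(n)]
    # The x coordinate varies fastest, matching the row order of expand.grid().
    return [(x, y) for y in ys for x in xs]


# Hypothetical field extent in metres.
grid = prediction_grid(0.0, 160.0, 0.0, 190.0)
print(len(grid), grid[0], grid[1])
```

Each grid location then receives a posterior mean and a pair of credible limits, which is how the three prediction maps are populated.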

PyMC3 Bayesian probabilistic programming

PyMC3 probabilistic programming is a flexible platform for building complex statistical models using custom likelihood functions. Theano, which encapsulates the gradient calculations and automatic differentiation required by the No-U-Turn Sampler (NUTS) algorithm, is used for Bayesian probabilistic programming as a core component of PyMC3^31,38,40. The NUTS algorithm automates the selection of an appropriate path length, overcoming the sensitivity of Hamiltonian Monte Carlo (HMC) to parameters such as the step size and the required number of steps^31,38, and it uses a recursive simulation algorithm to identify potential candidate points heuristically³⁹. By leveraging gradient information from the log posterior density, the NUTS algorithm therefore samples from models with continuous parameters more efficiently and quickly than traditional methods^40,41. In the present study, we used the NUTS algorithm to sample from the posterior distributions.
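
The two tuning parameters that NUTS removes from the user's hands, the step size and the number of steps, are easiest to see in plain HMC. The sketch below is a minimal one-dimensional HMC sampler with a fixed trajectory length; it is not NUTS and not the PyMC3 implementation. The target density (a normal distribution with mean 20.25 and standard deviation 3.4) and all function names are illustrative assumptions.

```python
import math
import random


def leapfrog(position, momentum, grad_log_density, step_size, n_steps):
    """Simulate Hamiltonian dynamics with the leapfrog integrator."""
    momentum += 0.5 * step_size * grad_log_density(position)  # half step
    for step in range(n_steps):
        position += step_size * momentum  # full position step
        if step < n_steps - 1:
            momentum += step_size * grad_log_density(position)  # full step
    momentum += 0.5 * step_size * grad_log_density(position)  # final half step
    return position, momentum


def hmc_sample(log_density, grad_log_density, start, n_samples,
               step_size, n_steps, seed=0):
    """Draw samples with HMC using a fixed step size and number of steps."""
    rng = random.Random(seed)
    position = start
    samples = []
    for _ in range(n_samples):
        momentum = rng.gauss(0.0, 1.0)
        new_position, new_momentum = leapfrog(
            position, momentum, grad_log_density, step_size, n_steps)
        # Total energy = negative log density + kinetic energy.
        current_energy = -log_density(position) + 0.5 * momentum ** 2
        proposed_energy = -log_density(new_position) + 0.5 * new_momentum ** 2
        # Metropolis correction for the discretisation error.
        if math.log(1.0 - rng.random()) < current_energy - proposed_energy:
            position = new_position
        samples.append(position)
    return samples


# Illustrative target: normal density with mean 20.25 and sd 3.4.
MEAN, SD = 20.25, 3.4
draws = hmc_sample(
    log_density=lambda x: -0.5 * ((x - MEAN) / SD) ** 2,
    grad_log_density=lambda x: -(x - MEAN) / SD ** 2,
    start=15.0, n_samples=3000, step_size=1.0, n_steps=5, seed=0)
kept = draws[500:]  # discard warm-up draws
print(sum(kept) / len(kept))
```

With a poorly chosen `step_size` or `n_steps`, the same code mixes slowly or rejects most proposals, which is the sensitivity that NUTS is designed to avoid.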

In this work, Bayesian inference was implemented using the PyMC3 package (with Theano) on the Python platform. The NUTS algorithm explores the target distribution more efficiently and achieves faster convergence. Both the Cauchy prior and the Gaussian prior were used to analyze the robustness associated with SVMC.

Bayesian inference with the NUTS algorithm was performed using data from the 231 SVMC sampling points in the study region. For posterior analysis, PyMC3 provides plotting and summarization functions for inspecting the sampling output, and a simple posterior plot can be created using a trace plot. The KDE and trace plots of mu and sigma were obtained from the NUTS samples generated after 1100 iterations. The Gaussian prior model is expressed as follows:

$$\mu \sim \mathcal{U}(a, b), \qquad \sigma \sim \left|\mathcal{N}(0, \sigma_\sigma)\right|, \qquad Y \sim \mathcal{N}(\mu, \sigma)$$

where the parameter mu ($\mu$) follows a uniform distribution with lower and upper bounds $a$ and $b$, and the parameter sigma ($\sigma$) follows a half-normal distribution with standard deviation $\sigma_\sigma$. $Y$ follows a Gaussian distribution with parameters mu and sigma. The values of $a$, $b$ and $\sigma_\sigma$ were set according to our prior knowledge^42,43.
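
The structure of this model (a uniform prior on mu, a half-normal prior on sigma and a Gaussian likelihood) can be written as an unnormalized log posterior, which is the quantity a sampler such as NUTS explores. The sketch below uses only the Python standard library rather than PyMC3. The bounds `lower` and `upper`, the scale `sigma_scale` and the data values are hypothetical placeholders, not the settings used in the study.

```python
import math


def gaussian_log_posterior(mu, sigma, data, lower, upper, sigma_scale):
    """Unnormalised log posterior for the Gaussian model.

    mu    ~ Uniform(lower, upper)
    sigma ~ HalfNormal(sigma_scale)
    y_i   ~ Normal(mu, sigma)
    """
    if not (lower <= mu <= upper) or sigma <= 0.0:
        return -math.inf  # outside the support of the priors
    log_prior_mu = -math.log(upper - lower)
    log_prior_sigma = (0.5 * math.log(2.0 / math.pi) - math.log(sigma_scale)
                       - 0.5 * (sigma / sigma_scale) ** 2)
    log_likelihood = sum(
        -0.5 * math.log(2.0 * math.pi) - math.log(sigma)
        - 0.5 * ((y - mu) / sigma) ** 2
        for y in data)
    return log_prior_mu + log_prior_sigma + log_likelihood


# Hypothetical SVMC readings (%) and hypothetical prior settings.
readings = [18.2, 19.5, 20.1, 20.8, 22.4]
print(gaussian_log_posterior(20.0, 1.5, readings, 0.0, 50.0, 10.0))
print(gaussian_log_posterior(30.0, 1.5, readings, 0.0, 50.0, 10.0))
```

Parameter values close to the centre of the data receive a higher log posterior than distant values, and values outside the prior support are excluded outright.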

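The histogram in Fig. 1(a) shows a heavy-tailed distribution of SVMC; a heavy tail means that there are outliers far from the mean. The Cauchy prior distribution, which has heavy tails, is more effective for such data because it is not as concentrated around the mean as the Gaussian distribution⁴⁴. The Gaussian prior was therefore replaced by the Cauchy prior, and the Cauchy prior model was rewritten as follows:

$$\mu \sim \mathcal{U}(a, b), \qquad \sigma \sim \left|\mathcal{N}(0, \sigma_\sigma)\right|, \qquad \nu \sim \mathrm{Exp}(1/20), \qquad Y \sim \mathcal{T}(\mu, \sigma, \nu)$$

where $\mathcal{T}(\mu, \sigma, \nu)$ denotes the heavy-tailed distribution with location $\mu$, scale $\sigma$ and tail parameter $\nu$.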

Fig. 1 — (a) The histogram and kernel density curve of SVMC. Here, the green solid line is kernel density line of SVMC. (b) Q-Q plot of SVMC from 231 sample points. Here, the blue sample points denote observed data of SVMC, the red solid line is theoretical normal line.

The Cauchy prior model has one more parameter, $\nu$, than the Gaussian prior model. Here, $\nu$ was assigned an exponential distribution with a mean of 20⁴³; this exponential prior is weakly informative, indicating that $\nu$ should be around 20.
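
The practical effect of heavy tails can be seen by comparing log densities at an extreme observation. The sketch below assumes that the heavy-tailed distribution with the extra parameter nu is a Student's t distribution, of which the Cauchy distribution is the nu = 1 case and the Gaussian is the limit for large nu. This is an assumption made for illustration, and the location and scale values are hypothetical.

```python
import math


def normal_logpdf(y, mu, sigma):
    """Log density of a Gaussian with mean mu and standard deviation sigma."""
    z = (y - mu) / sigma
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * z * z


def student_t_logpdf(y, mu, sigma, nu):
    """Log density of a Student's t with location mu, scale sigma, nu d.o.f."""
    z = (y - mu) / sigma
    return (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
            - 0.5 * math.log(nu * math.pi) - math.log(sigma)
            - 0.5 * (nu + 1.0) * math.log(1.0 + z * z / nu))


# Hypothetical location and scale; 34.8% is the largest SVMC value reported.
mu, sigma = 20.0, 3.0
for y in (20.0, 26.0, 34.8):
    print(y,
          round(normal_logpdf(y, mu, sigma), 2),
          round(student_t_logpdf(y, mu, sigma, nu=6.0), 2))
```

Near the centre the two densities are similar, but far from the centre the heavy-tailed density is much larger, so a single extreme reading pulls the estimated mean and scale far less than under a Gaussian likelihood.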

Evaluation and validation of SVMC

The deviance information criterion (DIC), a commonly used index of model performance for the INLA-SPDE model, is based on a trade-off between the fit of the model to the data and the complexity of the model, with smaller values of DIC indicating a better model^15,17,22. Moreover, the conditional predictive ordinate (CPO) and the probability integral transform (PIT), which are effective indices for evaluating predictions, were also calculated in our study²². To assess the validity of the estimated INLA-SPDE model of SVMC, we also performed a simple residual analysis by calculating the root mean squared error (RMSE) between the observations and predictions of SVMC at the 69 validation sites.

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}$$

Here $n$ is the number of validation sites, and $\hat{y}_i$ and $y_i$ are the predicted and observed SVMC values at the corresponding validation site.
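
The RMSE is the square root of the mean squared difference between predictions and observations, and a short Python sketch makes the computation explicit. The function name `rmse` and the three example pairs are hypothetical and are not the study's validation data.

```python
import math


def rmse(predicted, observed):
    """Root mean squared error between paired predictions and observations."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical predicted and observed SVMC values (%) at three sites.
print(round(rmse([19.8, 21.4, 18.9], [20.3, 20.6, 19.5]), 3))
```

Because the errors are squared before averaging, a few large misses raise the RMSE more than many small ones.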

Results and analysis

Statistics characteristics of SVMC

Descriptive statistics (such as the variance and coefficient of variation) of SVMC based on the 231 samples collected during the jointing stage of winter wheat in 2017 were calculated with the “pastecs” package in R (version 4.3.0). The variance of SVMC (percent volumetric moisture content) is 11.533 (%²) and the standard deviation is 3.396%. SVMC ranged from 12.6% to 34.8%, with a mean of 20.25%. The coefficient of variation (CV) of SVMC was 0.168, which indicates medium variability. The asymmetry of SVMC, that is, its departure from normality, is measured by the skewness (1.0), while the peakedness of the distribution relative to the normal distribution is expressed by the kurtosis (2.31 in this study), as shown in Fig. 1(a). A Q-Q plot and a histogram were used to explore whether the data are normally distributed (a bell-shaped curve). As illustrated in Fig. 1(b), the points lie close to a straight line, indicating approximate normality, although the main departure from this line occurs at high values of SVMC. It should be noted that the histogram in Fig. 1(a) shows a heavy-tailed distribution of SVMC, which provides the basis for choosing a more reasonable prior distribution in the subsequent analysis.
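
The reported summary values are mutually consistent, which a short calculation confirms: the CV is the standard deviation divided by the mean, and the variance is the square of the standard deviation. The sketch below also shows moment-based skewness and excess kurtosis for a small hypothetical sample. The function name `describe` is an assumption, and the exact estimators used by the “pastecs” package may differ from these simple moment formulas.

```python
import statistics


def describe(values):
    """Mean, sample sd, CV, and moment-based skewness and excess kurtosis."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "sd": sd,
        "cv": sd / mean,
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,
    }


# Consistency check on the values reported in the text.
print(round(3.396 / 20.25, 3))  # CV -> 0.168
print(round(3.396 ** 2, 3))     # variance -> 11.533

# Hypothetical sample with one high reading, giving positive skewness.
print(describe([18.0, 19.0, 20.0, 20.5, 21.0, 30.0]))
```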

Spatial uncertainty associated with SVMC based on INLA-SPDE model

The constructed triangulated mesh is shown in Fig. 2. The mesh, built using the boundary of the study region (the green boundary line), has 2330 vertices.

Finally, a sequential color gradient from green (low) to red (high) was used for the maps. Three maps were drawn in the same plot with the same unit on the x-axis and the y-axis: the SVMC prediction mean, the lower limits of the 95% credible intervals, and the upper limits of the 95% credible intervals derived using the INLA-SPDE model, as illustrated in Fig. 3. The INLA-SPDE results exhibited a consistent spatial distribution of SVMC across the maps, and the predicted SVMC is spatially uneven. The posterior mean of SVMC is 20.253% with a standard deviation of 0.216%; the 2.5% and 97.5% percentiles are 19.828% and 20.677%, corresponding to the lower and upper limits of the SVMC prediction.

Fig. 3 — The SVMC posterior predictions, the lower and upper limits of 95% credible intervals. The pred_mean with the posterior mean of SVMC and pred_ll of SVMC and pred_ul of SVMC with the lower and upper limits of 95% credible intervals, respectively. Unit of horizontal scale and vertical coordinate distance is meter. The Figure was created by open-source R4.3.0 (https://www.r-project.org/) combined with R-INLA_23.06.12 (www.r-inla.org).

We repeatedly adjusted the parameter values, ran the model multiple times, and compared the results to improve prediction accuracy. For the final model, the mean CPO is 0.719, the mean PIT is 0.498, the DIC is -1106.054, and the RMSE between observed and predicted SVMC is 1.705. Figure 4 shows the scatter plot and regression line of observed and predicted SVMC for the 69 validation sites (69 red points). The regression equation is y = 2.881 + 0.829x (y denotes predicted SVMC, x represents observed SVMC; R-squared = 0.701, p-value < 0.01), which indicates good performance of the INLA-SPDE model for SVMC prediction.

Fig. 4 — Regression line (the green line) and scatter plot of observed and predicted SVMC at the 69 validation sites from the INLA-SPDE model. The unit of both axes is %.

Transparency, interpretability and uncertainty related to SVMC

Transparency, interpretability related to SVMC

As described above, the maps of the lower and upper limits of the SVMC prediction present the spatial distribution and quantify the uncertainty derived using the INLA-SPDE model. In this section, we use the Cauchy prior and the Gaussian prior to explore the transparency, interpretability, robustness and uncertainty associated with SVMC.

The marginal distributions of the parameters $\mu$ and $\sigma$ were generated using PyMC3. In the left panel of Fig. 5(a), the KDE of each of the two Markov chains was calculated by PyMC3. The KDE and trace plots distinguish the two chains: each plotted line (solid or dashed) represents a single independent chain run in parallel. The KDE and trace plots of $\mu$ follow the same distribution in both chains, which indicates convergence of the MCMC method. Similarly, the KDE and trace plots of $\sigma$ follow the same distribution, with only small (random) differences between the chains, which indicates good mixing of both chains for $\mu$ and $\sigma$ ^42,43. The trace plots in the right panel of Fig. 5(a), which show the sampled value at each iteration for both chains, resemble those of well-behaved chains.

Evaluating MCMC convergence is necessary to check whether the Gaussian prior makes sense. Rank plots are used for convergence diagnosis in combination with the effective sample size (ESS), the potential scale reduction factor $\hat{R}$, and the Monte Carlo standard error (MCSE)^44–46. In Fig. 5(b), the ranks are very close to uniform and both chains look similar to each other with no distinctive patterns, which shows good mixing of both chains and supports the Gaussian prior⁴⁷.

Figure 6(a) shows what the bi-dimensional posterior looks like, together with the marginal distributions of the parameters $\mu$, $\sigma$, and $\nu$ from the Cauchy prior.

Similarly, rank plots (Fig. 6(b)) are used to evaluate the convergence of the MCMC method.

ESS_bulk mainly assesses how well the center of the posterior distribution was resolved for the Gaussian and Cauchy priors, while ESS_tail corresponds to the minimum of the ESS at the 5th and 95th percentiles. Samples with $\hat{R}$ ⪅ 1.01 are considered safe and reasonable⁴². The summaries for the Gaussian prior and the Cauchy prior were then compared, including the mean, standard deviation (SD), 94% HDI (HDI 3% and HDI 97%), ESS, and MCSE, as shown in Tables 1 and 2.

Table 1.

Posterior summary of parameters from gaussian prior.

Parameter	Mean	SD	hdi_3%	hdi_97%	mcse_sd	ESS_bulk	ESS_tail	R_hat
mu	20.26	0.225	19.86	20.71	0.002	3819	2915	1.0
sigma	3.41	0.16	3.12	3.71	0.002	3883	3136	1.0

Table 2.

Posterior summary of parameters from cauchy prior.

Parameter	Mean	SD	hdi_3%	hdi_97%	mcse_sd	ESS_bulk	ESS_tail	R_hat
mu	20.03	0.20	19.66	20.41	0.002	2421	2354	1.0
sigma	2.71	0.22	2.31	3.11	0.004	2083	2136	1.0
nu	6.37	3.01	2.63	11.18	0.065	1914	2089	1.0
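
The R_hat column compares the variation between chains with the variation within chains. The sketch below implements the basic (non-split, non-rank-normalized) potential scale reduction factor for equally long chains. It is a simplified illustration, so its values can differ slightly from the rank-normalized split version used by current diagnostic libraries. The function name `r_hat` and the simulated chains are assumptions for illustration and are not the study's traces.

```python
import random
import statistics


def r_hat(chains):
    """Basic potential scale reduction factor for equally long chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(chain) for chain in chains]
    # W: average within-chain variance; B: between-chain variance scaled by n.
    within = statistics.fmean([statistics.variance(chain) for chain in chains])
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


rng = random.Random(42)
# Two well-mixed chains drawn from the same distribution.
good = [[rng.gauss(20.0, 0.2) for _ in range(2000)] for _ in range(2)]
# Two chains stuck around different values.
bad = [[rng.gauss(20.0, 0.2) for _ in range(2000)],
       [rng.gauss(20.6, 0.2) for _ in range(2000)]]
print(round(r_hat(good), 3), round(r_hat(bad), 3))
```

Chains that sample the same distribution give a value very close to 1, while chains centred on different values give a value clearly above the 1.01 threshold.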

Comparing the posterior summary from the Gaussian prior (Table 1) with that from the Cauchy prior (Table 2), the estimates of mu from the two models are similar, with a difference of ≈ 0.2. The estimate of sigma changes from ≈ 3.41 to ≈ 2.71, mainly because the Cauchy prior gives less weight to values far from the mean⁴². The value nu ≈ 6 indicates a Cauchy-like distribution of SVMC with heavy tails. MCMC trace plots, KDE plots and rank plots reflect the transparency and interpretability of the SVMC prediction model through both numbers and plots.

Robustness and uncertainty associated with SVMC

We then performed posterior predictive checks for the Cauchy and Gaussian priors. One hundred predictions were generated from the posterior to check how consistent the simulated SVMC values are with the measured values. In Fig. 7(a), the blue solid line, a KDE of the SVMC measurements, represents the measured data, and the semitransparent (red) lines reflect the uncertainty of the 100 predictions from the Gaussian model. The mean of the simulated values is shifted slightly to the right, and their spread is larger than that of the original SVMC measurements, which is a direct consequence of a few measurements that are separated from the bulk of the data. The Gaussian prior is a reasonable and useful representation of SVMC, but the model does not handle the heavy-tailed distribution correctly, so we explored how to obtain predictions that match the data more closely. As shown in Fig. 7(b), the Cauchy prior fits SVMC better in terms of the peak and shape of the distribution, and the predicted values are very close to the measured data, particularly in the tails. Posterior predictive checks confirmed that the Cauchy prior was more robust and gave higher prediction accuracy, because the outliers have less influence on the estimated parameters and the mean is estimated mainly from the central bulk of the measured data^43,44.

Fig. 7 — (a) Uncertainty and posterior predictive check of the Gaussian prior. The blue solid line represents the observed data and the semitransparent (red) lines are predictions from the Gaussian prior. (b) Uncertainty and posterior predictive check of the Cauchy prior. The blue solid line represents the observed data and the semitransparent (red) lines are predictions from the Cauchy prior. (c) The HDI reported for the Cauchy prior.

The uncertainty associated with SVMC can be described explicitly using the highest-posterior density interval (HDI). The posterior distribution is represented using a KDE, and the mean and the limits of the 94% HDI are shown in Fig. 7(c)^48–51. The 94% HDI is drawn as a black line at the bottom of the plot and can support decisions based on the posterior results of the Cauchy prior. A vertical (orange) line marks a reference value, and the proportions of the posterior below and above this value express the uncertainty associated with SVMC. For example, if the observed SVMC is 20.1, the vertical (orange) line is drawn at that value; 66% of the posterior lies below it and only 34% lies above it, which expresses the uncertainty associated with SVMC in probabilistic terms.
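
Both quantities described above, the HDI and the share of the posterior on either side of a reference value, can be computed directly from posterior draws. The sketch below finds the narrowest interval containing 94% of the sorted samples. The function names are assumptions, and the draws are simulated from a normal distribution with mean 20.03 and standard deviation 0.20 purely for illustration, so the printed proportions only approximate those reported for the actual trace.

```python
import math
import random


def hdi(samples, mass=0.94):
    """Narrowest interval containing the given mass of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    window = max(1, math.ceil(mass * n))  # number of draws inside the interval
    # Slide a window of that size over the sorted draws; keep the narrowest.
    start = min(range(n - window + 1),
                key=lambda i: ordered[i + window - 1] - ordered[i])
    return ordered[start], ordered[start + window - 1]


def proportion_below(samples, reference):
    """Share of posterior draws that fall below the reference value."""
    return sum(s < reference for s in samples) / len(samples)


rng = random.Random(1)
posterior_mu = [rng.gauss(20.03, 0.20) for _ in range(4000)]  # simulated draws
low, high = hdi(posterior_mu)
below = proportion_below(posterior_mu, 20.1)
print(round(low, 2), round(high, 2), round(below, 2), round(1 - below, 2))
```

Unlike an equal-tailed interval, the HDI is defined by width, so for a skewed posterior it shifts toward the region of highest density.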

Discussion

Bayesian inference is a probabilistic description built using probabilities, and using probability to model the uncertainty of SVMC is a reasonable methodological approach. Generally, the process of Bayesian inference for SVMC can be described in three steps: (1) Based on the SVMC data, a model of SVMC is designed, mainly by combining and transforming probability distributions. (2) Bayes' theorem is used to condition the model on the SVMC data, so that a posterior, which is a probability distribution of the model parameters rather than a single value, is calculated from the prior distribution and the likelihood function. (3) The convergence of the model is checked and diagnosed according to different parameters and criteria^42,44,48.

The posterior which is a balance between the prior and the likelihood has more flexibility, adaptability and simplicity and can provide posterior mean of the variables, It is clear that bayesian inference can be influenced by priors, whether non-informative priors (also known as flat, vague priors) or weakly-informative priors have the least possible amount of impact on the bayesian inference, weakly-informative priors is a better selection following the recommendations of Gelman, McElreath, Kruschke^42,44,49. For the bayesian inference of SVMC, gaussian prior distribution or cauchy prior distribution has different influence on the posterior distribution of SVMC. This uncertainty related to SVMC is a robust model of SVMC observation using bayesian inference (including cauchy prior and gaussian prior) during the winter wheat jointing growth stage. The simulation blue solid line from the cauchy prior model (as shown in Fig. 7(b)) which represents the observed data and the semitransparent (red) ones predictions is higher accuracy than gaussian prior model, the cauchy prior model performs well for MCMC simulation of SVMC, cauchy prior of SVMC predictions is more robust than that of gaussian model, the bayesian-based inference uncertainty and the highest-posterior density interval of SVMC can explicitly be revealed and described in the study, which benefits the intelligent decision-making of smart agriculture. As is known to us, the result of bayesian inference is a posterior distribution which contains all of parameters. Thus, by summarizing the posterior, we are summarizing the logical consequences of a model and data, so the prior and posterior for model parameter is an important issue, an alternative is to use the more powerful and flexible model, the Dirichlet process mixture models how to add flexibility to models by mixing simpler distributions to build more complex ones, to deeply perform the simulation and prediction for object variables. 
A limitation of our study is that the model used in our analysis incorporates only the Cauchy prior, chosen for the heavy-tailed SVMC data, and does not consider Dirichlet process mixture models for the posterior prediction of SVMC.
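The contrast between a Gaussian and a Cauchy prior, and the use of a highest-posterior density interval to summarize the posterior, can be illustrated on a simple grid. The sketch below is not the study's model: it assumes the prior is placed on a single mean parameter, uses made-up readings whose mean lies far from the prior centre, and adopts a 94% interval mass only as a convention. It shows the general mechanism by which heavy tails let the data dominate when prior and data disagree.

```python
import math

# Hypothetical setup (not the study's data): five readings whose mean is far
# from the prior centre, so the tail behaviour of the prior matters.
data = [9.0, 10.5, 9.8, 10.2, 10.5]
SIGMA = 2.0                       # assumed known observation standard deviation
PRIOR_LOC, PRIOR_SCALE = 0.0, 1.0  # assumed prior centre and scale

grid = [i * 0.01 for i in range(-500, 2001)]  # candidate values of the mean mu


def log_lik(mu):
    return sum(-0.5 * ((y - mu) / SIGMA) ** 2 for y in data)


def log_gaussian_prior(mu):
    return -0.5 * ((mu - PRIOR_LOC) / PRIOR_SCALE) ** 2


def log_cauchy_prior(mu):
    return -math.log1p(((mu - PRIOR_LOC) / PRIOR_SCALE) ** 2)


def grid_posterior(log_prior):
    """Normalised posterior weights on the grid (prior times likelihood)."""
    logs = [log_prior(mu) + log_lik(mu) for mu in grid]
    peak = max(logs)  # subtract the peak to avoid underflow
    weights = [math.exp(v - peak) for v in logs]
    total = sum(weights)
    return [w / total for w in weights]


def posterior_mean(weights):
    return sum(mu * w for mu, w in zip(grid, weights))


def hpd_interval(weights, mass=0.94):
    """Highest-posterior density interval for a unimodal grid posterior:
    keep the highest-weight grid points until they hold `mass`."""
    order = sorted(range(len(grid)), key=lambda i: weights[i], reverse=True)
    kept, acc = [], 0.0
    for i in order:
        kept.append(i)
        acc += weights[i]
        if acc >= mass:
            break
    return grid[min(kept)], grid[max(kept)]


gaussian_post = grid_posterior(log_gaussian_prior)
cauchy_post = grid_posterior(log_cauchy_prior)
for label, post in (("gaussian", gaussian_post), ("cauchy", cauchy_post)):
    lo, hi = hpd_interval(post)
    print(f"{label} prior: posterior mean {posterior_mean(post):.2f}, HPD [{lo:.2f}, {hi:.2f}]")
```

With these assumed numbers the Gaussian prior pulls the posterior mean roughly halfway back toward the prior centre, whereas the Cauchy prior leaves it close to the sample mean of the data.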

Bayesian inference is emerging as a powerful framework for expressing and understanding next-generation deep neural networks⁴⁸. Quantifying the uncertainty of soil properties and soil moisture offers unique opportunities through big data analysis and machine learning approaches^14,50. A hierarchical model structures the problem so that Bayesian inference performs better: by partially “pooling” information across groups, with the groups sharing part of the data through a hyper-prior (shrinkage estimation), it leads to more stable inference⁴⁴. Another limitation of this study is that shrinkage, over-fitting and under-fitting were not examined in the Bayesian inference of SVMC.
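The shrinkage produced by partial pooling can be made concrete with a normal-normal model in which each group mean receives a common Normal prior. The sketch below is only an illustration under strong assumptions: the field zones and readings are hypothetical, the within-zone and between-zone standard deviations are treated as known, and the grand mean of the data stands in for the hyper-prior mean. A full hierarchical model would instead place hyper-priors on these quantities and estimate them jointly.

```python
import statistics

# Hypothetical SVMC readings (volumetric %) grouped by field zone; illustrative only.
zones = {
    "north": [27.9, 28.4, 27.5, 28.8, 28.1, 27.7],
    "centre": [24.6, 25.2, 24.9],
    "south": [21.0],
}
SIGMA = 1.5  # assumed within-zone standard deviation
TAU = 2.0    # assumed between-zone standard deviation (scale of the shared prior)

all_readings = [y for ys in zones.values() for y in ys]
grand_mean = statistics.fmean(all_readings)  # stands in for the hyper-prior mean


def partial_pool(readings):
    """Posterior mean of a zone mean under a Normal(grand_mean, TAU) prior."""
    data_precision = len(readings) / SIGMA**2
    prior_precision = 1.0 / TAU**2
    zone_mean = statistics.fmean(readings)
    # Precision-weighted average of the zone's own mean and the shared mean.
    return (data_precision * zone_mean + prior_precision * grand_mean) / (
        data_precision + prior_precision
    )


pooled = {name: partial_pool(ys) for name, ys in zones.items()}
for name, ys in zones.items():
    raw = statistics.fmean(ys)
    print(f"{name}: raw mean {raw:.2f} -> partially pooled {pooled[name]:.2f} (n={len(ys)})")
```

Zones with few readings are pulled more strongly toward the shared mean than zones with many readings, which is the stabilizing effect attributed to hierarchical models in the text.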

Currently, the robustness, transparency, interpretability and uncertainty of models are seeing important developments at the cutting edge. Machine learning algorithms, deep learning algorithms and variational autoencoders, which are hierarchical probabilistic models, explain data at multiple levels and thereby accelerate learning^51–53. The INLA-SPDE model covers a class of models ranging from (generalized) linear mixed models to spatial and spatio-temporal models^17,20,22. Remote sensing (RS)-based indices may serve as effective covariates for predicting SVMC. However, we did not develop spatial predictions of SVMC that combine RS-based index covariates with machine learning algorithms and the INLA-SPDE model.

Conclusions

Open-source Bayesian inference was performed to explore the spatial heterogeneity, transparency, interpretability and uncertainty associated with SVMC using Python-based PyMC3 combined with the INLA-SPDE model. The conclusions are as follows:

1.
Spatial variability and uncertainty associated with SVMC based on the INLA-SPDE model during the jointing growth stage of winter wheat.

Maps of the SVMC prediction mean and of the lower and upper limits of the 95% credible intervals of the predictions were derived using the INLA-SPDE model. These maps exhibit a consistent spatial pattern of SVMC and describe its uneven distribution, and the maps of the lower and upper limits of the 95% credible intervals quantify the uncertainty associated with SVMC.

2.
Robustness, transparency, interpretability and uncertainty associated with SVMC based on PyMC3 probabilistic programming prediction during the jointing growth stage of winter wheat.

This paper uses Bayesian inference for its flexibility and adaptability in obtaining state-of-the-art predictive performance for SVMC. Maps of the 95% credible intervals quantify the uncertainty associated with SVMC based on the INLA-SPDE model. SVMC predictions under the Cauchy prior are more robust than those under the Gaussian prior. The transparency and interpretability of the SVMC prediction model were revealed by MCMC trace plots, KDE and rank plots. The Bayesian inference uncertainty associated with SVMC can be explicitly revealed and described using the highest-posterior density interval.

Acknowledgements

We thank our colleagues at the School of Civil Engineering and Geomatics, Shandong University of Technology, for their help and support.

Author contributions

Writing—Original Draft Preparation, Investigation, Data Curation, and Analysis, Y.Y.; Writing—Review and Editing, Investigation, Data Curation, and Analysis, X.T.

Funding

This research was funded by the project “Preliminary application of smart agriculture based on 3S technology” (project code: 4041/421024) from Shandong University of Technology, 2021.

Data availability

All data included in this study are available upon request from the corresponding author.

Declarations

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Xueqin Tong: Co-first author

References

1. Jacob, S. et al. State-of-the-art global models underestimate impacts from climate extremes. Nat. Commun. 10, 1005 (2019). https://www.nature.com/articles/s41467-019-08745-6
2. Eric, H. NASA’s new soil moisture satellite could improve forecasts. Science. 10.1126/science.aaa6407 (2015).
3. Gao, S. G. et al. Estimating the spatial distribution of soil moisture based on bayesian maximum entropy method with auxiliary data from remote sensing. Int. J. Appl. Earth Obs. 32, 54–66. 10.1016/j.jag.2014.03.003 (2014).
4. Dorigo, W. A. et al. The International Soil Moisture Network: A data hosting facility for global in situ soil moisture measurements. Hydrol. Earth Syst. Sc. 15(5), 1675–1698 (2011).
5. Dirmeyer, P. A. et al. GSWP-2: Multimodel analysis and implications for our perception of the land surface. B Am. Met. Soc. 87, 1381–1397. 10.1175/BAMS-87-10-1381 (2006).
6. National Academies of Sciences, Engineering, and Medicine. Science Breakthroughs to Advance Food and Agricultural Research by 2030 (National Academies, 2019). 10.17226/25059
7. Martínez, M. J. F., Hueso, G. P. & Ruiz, S. J. D. Topsoil moisture mapping using geostatistical techniques under different Mediterranean climatic conditions. Sci. Total Environ. 595, 400–412. 10.1016/j.scitotenv.2017.03.291 (2017).
8. Goovaerts, P. Geostatistical modelling of uncertainty in soil science. Geoderma. 103(1–2), 3–26. 10.1016/S0016-7061(01)00067-2 (2001).
9. Diggle, P. J. & Paulo, J. R. Model-Based Geostatistics (1st ed.) 12–57. Springer Series in Statistics (2007).
10. Yujian, Y., Yanbo, H., Yong, Z. & Xueqin, T. Optimal Irrigation Mode and Spatio-temporal variability characteristics of Soil Moisture Content in different growth stages of winter wheat. Water. 10(9), 1180 (2018).
11. Douaik, A., Meirvenne, M. V. & Tóth, T. Soil salinity mapping using spatio-temporal kriging and bayesian maximum entropy with interval soft data. Geoderma. 128(3–4), 234–248. 10.1016/j.geoderma.2005.04.006 (2005).
12. Christakos, G., Serre, M. L. & Kovitz, J. L. BME representation of particulate matter distributions in the state of California on the basis of uncertain measurements. J. Geophys. Res. Atmos. 106(9), 9717–9731. 10.1029/2000JD900780 (2001).
13. Chutian, Z. & Yong, Y. Can the spatial prediction of soil organic matter be improved by incorporating multiple regression confidence intervals as soft data into BME method? Catena. 178, 322–334 (2019).
14. Gábor, S. & László, P. Comparison of various uncertainty modelling approaches based on geostatistics and machine learning algorithms. Geoderma. 337, 1329–1340. 10.1016/j.geoderma.2018.09.008 (2019).
15. Poggio, L., Gimona, A., Spezia, L. & Brewer, M. J. Bayesian spatial modelling of soil properties and their uncertainty: The example of soil organic matter in Scotland using R-INLA. Geoderma. 277, 69–82. 10.1016/j.geoderma.2016.04.026 (2016).
16. Rue, H., Martino, S. & Chopin, N. Approximate bayesian inference for latent gaussian models using integrated nested Laplace approximations. J. R Stat. Soc. B. 71, 319–392 (2009).
17. Chenconghai, Y., Lin, Y., Lei, Z. & Chenghu, Z. Soil organic matter mapping using INLA-SPDE with remote sensing based soil moisture indices and Fourier transforms decomposed variables. Geoderma. 437, 116571. 10.1016/j.geoderma.2023.116571 (2023).
18. Huang, J., Malone, B. P., Minasny, B., McBratney, A. B. & Triantafilis, J. Evaluating a bayesian modelling approach (INLA-SPDE) for environmental mapping. Sci. Total Environ. 609, 621–632. 10.1016/j.scitotenv.2017.07.201 (2017).
19. Carbó, E. et al. Modeling influence of Soil properties in different gradients of Soil moisture: the case of the Valencia Anchor Station Validation Site, Spain. Remote Sens. 13, 5155. 10.3390/rs13245155 (2021).
20. Moraga, P. Spatial Statistics for data Science: Theory and Practice with R 20–208 (Chapman & Hall/CRC Data Science Series, 2023).
21. Lindgren, F., Rue, H. & Lindstrom, J. An explicit link between Gaussian fields and Gaussian Markov random fields: the SPDE approach. J. R Stat. Soc. B 423–498 (2011).
22. Virgilio Gómez Rubio. Bayesian Inference with INLA 1–257 (Chapman & Hall/CRC, 2020).
23. Maeder, P. et al. Soil fertility and Biodiversity in Organic Farming. Science. 296(5573), 1694–1697 (2002).
24. Ghahramani, Z. Probabilistic machine learning and artificial intelligence. Nature. 521, 452–459 (2015).
25. Doucet, A., Freitas, J. F. G. & Gordon, N. J. Sequential Monte Carlo Methods in Practice, 23–98. Springer (2000).
26. Tenenbaum, J. B., Kemp, C., Griffiths, T. L. & Goodman, N. D. How to grow a mind: statistics, structure, and abstraction. Science. 331, 1279–1285 (2011).
27. Neal, R. M. MCMC using hamiltonian dynamics. In Handbook of Markov Chain Monte Carlo (eds Brooks, S., Gelman, A., Jones, G. J. & Meng, X. L.) (Chapman & Hall/CRC, 2010).
28. Pekel, J. F., Cottam, A., Gorelick, N. & Belward, A. High-resolution mapping of global surface water and its long-term changes. Nature. 540, 418–422 (2016).
29. Schmidt, M. & Lipson, H. Distilling free-form natural laws from experimental data. Science. 324, 81–85 (2009).
30. Christopher, K. & Mark, B. Probabilistic programming: A review for environmental modellers. Environ. Modell Softw. 114, 40–48. 10.1016/j.envsoft.2019.01.014 (2019).
31. Guiming, W. Bayesian regression models for ecological count data in PyMC3. Ecol. Inf. 63, 101301. 10.1016/j.ecoinf.2021.101301 (2021).
32. Bradford, M. A. et al. Managing uncertainty in soil carbon feedbacks to climate change. Nat. Clim. Change. 6, 751–758 (2016).
33. Brocca, L., Melone, F., Moramarco, T., Wagner, W. & Hasenauer ASCAT soil wetness index validation through in situ and modeled soil moisture data in central Italy. Remote Sens. Environ. 114, 2745–2755 (2010).
34. Gelfand, A. E., Diggle, P. J., Fuentes, M. & Guttorp, P. Handbook of Spatial Statistics (Chapman & Hall/CRC, 2010).
35. Finn, L. & Rue, H. Bayesian spatial modelling with R-INLA. J. Stat. Softw. 63. 10.18637/jss.v063.i19 (2015).
36. Cameletti, M., Finn, L., Simpson, D. & Rue, H. Spatio-temporal modeling of particulate matter concentration through the SPDE approach. ASTA-Adv Stat. Anal. 97(2), 109–131. 10.1007/s10182-012-0196-3 (2012).
37. Carbó, E. et al. Modeling influence of Soil properties in different gradients of Soil moisture: the case of the Valencia Anchor Station Validation Site, Spain. Remote Sens. 13, 5155. 10.3390/rs13245155 (2021).
38. Christian, P. R., Cornuet, J. M., Marin, J. M. & Pillai, N. S. Lack of confidence in approximate bayesian computation model choice. P Natl. Acad. Sci. 108(37), 15112–15117 (2011).
39. Brooks, S., Gelman, A., Jones, G. & Meng, X. L. Handbook of Markov Chain Monte Carlo (Chapman & Hall/CRC Handbooks of Modern Statistical Methods, CRC, 2011).
40. Patil, A., Huard, D. & Fonnesbeck, C. PyMC: Bayesian stochastic modelling in Python. J. Stat. Softw. 35(4), 1–81 (2010).
41. Salvatier, J., Wiecki, T. V. & Fonnesbeck, C. Probabilistic programming in Python using PyMC3. PeerJ Comput. Sci. 2, e55 (2016).
42. Martin, O. Bayesian Analysis with Python: Introduction to Statistical Modeling and Probabilistic Programming Using PyMC3 and ArviZ, 2nd Edition (Packt Publishing, 2018).
43. Martin, O., Kumar, R. & Junpeng, L. Bayesian Modeling and Computation in Python (CRC, 2022).
44. Carpenter, B. et al. Stan: A probabilistic programming language. J. Stat. Softw. 76(1), 1–32 (2017).
45. Hoffman, M. D. & Gelman, A. The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. J. Mach. Learn. Res. 15(1), 1593–1623 (2014).
46. Vehtari, A. et al. Rank-Normalization, folding, localization: an improved $\widehat{R}$ for assessing convergence of MCMC. Bayesian Anal. 1–30 (2021).
47. McElreath, R. Rethinking: Statistical rethinking course and book package. R package version 1.59 (2017). https://github.com/rmcelreath/rethinking
48. Yujian, Y. & Yingqiang, S. Application of poisson process to drought prediction—the case study of Yucheng city. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XLVIII–3/W1, 73–78 (2022).
49. Davidson, P. C. (2015). http://camdavidsonpilon.github.io/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/
50. Vereecken, H., Amelung, W. & Bauke, S. L. Soil hydrology in the Earth system. Nat. Rev. Earth Environ. 3, 573–587 (2022).
51. Karianne, J., Bergen, P. A., Maarten, J. V. H. & Gregory, C. B. Machine learning for data-driven discovery in solid Earth geoscience. Science. 363(6433), eaau0323. 10.1126/science.aau0323 (2019).
52. Kingma, D. P. & Welling, M. Auto-encoding Variational Bayes. (2013). arXiv:1312.6114 [stat.ML].
53. Patel, A. B., Nguyen, M. T. & Baraniuk, R. A probabilistic framework for deep learning. Adv. Neural Inf. Process. Syst. 29, 2558–2566 (2016).
