Abstract
Soil hydraulic parameters are vital for precisely characterizing soil hydrological processes, which are critical indicators for regulating climate change effects on terrestrial ecosystems and governing feedbacks between water, energy, and carbon–nitrogen cycles. Although many studies have integrated comprehensive soil datasets, data quality and cost challenges result in data completeness deficiencies, especially for deep soil information. These gaps not only impede methodological endeavours but also constrain soil parameter-based ecosystem process studies spanning from local profiles to global earth system models. We established a soil dataset across the entire Yellow River Basin (YRB) (795,000 km2) using high-density in situ field sampling. This observation-based dataset contains records of soil texture (2924), bulk density (2798), saturated hydraulic conductivity (2782), and water retention curve parameters (1035) down to a maximum depth of 5 m. This dataset, which extends the recorded data range for deep soil hydraulic parameters, is valuable as a direct data resource for environmental, agronomical and hydrological studies in the YRB and regions with similar pedological and geological backgrounds around the world.
Subject terms: Hydrology, Ecology
Background & Summary
Soils serve as a crucial interface between atmosphere, biosphere, hydrosphere and lithosphere1,2, profoundly influencing matter and energy cycling within terrestrial ecosystems3–5. In particular, soil hydrological processes play a pivotal role in regulating the impact of climate change on terrestrial ecosystems and the feedback mechanisms between water, energy, and carbon–nitrogen cycles6–10. Soil hydraulic parameters, which are in turn largely determined by soil texture and structure, are key to accurately depicting soil hydrological processes11–13. For instance, the saturated hydraulic conductivity (Ks) is a major control of moisture movement, distribution, and fluctuations within the soil profile14,15. The matric potential, which describes the strength of the adhesive forces between soil moisture and the solid components of the soil, determines the plant availability of soil moisture. The soil water retention curve (SWRC), which defines the relationship between soil moisture content and matric potential16,17, affects a range of processes including evaporation18. Thus, the SWRC is one of the fundamental attributes that characterise soil hydraulics19.
The main methods for acquiring the aforementioned parameters are in situ sampling20 and the use of pedotransfer functions (PTFs)21–23. The considerable costs of in situ sampling have led to a growing interest in the establishment and use of PTFs9,24. However, most PTFs are developed for specific regions, and their applicability to areas with different soil and climatic conditions is limited, necessitating re-calibration based on field measurements12. The challenges associated with obtaining soil parameters not only impede methodological endeavours, such as up-scaling based on PTFs, but also limit ecosystem process studies that rely on soil parameters25. These limitations affect research at a broad range of scales, from site-level profile investigations26,27 to the calibration and parametrization of comprehensive ecosystem models2. Therefore, the accurate measurement of soil parameters is highly beneficial for assessing soil hydrological processes not just at the local scale but also for up-scaling to regional scales, thus facilitating multiscale ecohydrological process studies28.
Currently, numerous datasets, including the Florida Soil Characterization Data29, WoSIS30, and UNSODA31, aggregate a diverse range of soil parameters derived from field measurements, including the essential SWRC parameters. However, a considerable proportion of these data have vague sample point coordinates and too few data pairs to establish the SWRC, often lacking the wet end of the SWRC (water head ≤ 0.2 m)24. To address the limitations of field measurements, some studies have employed integrated PTFs to derive soil hydraulic parameters at national and global scales32,33. For example, Gupta et al.24 integrated field measurements with PTFs to extend the global applicability of soil hydraulic parameters by supplementing missing measurement data. However, the extensive datasets mentioned above still contain limited information on deep soil profiles, and soil information below a depth of 3 m is particularly scarce34. Deep soil water, which is largely mediated by vegetation35, plays an important role in enabling vegetation to withstand drought stress36,37, and water is also a key factor affecting the soil’s ability to sequester carbon38. Hence, deep soil hydraulic processes play an important role in terrestrial hydrology and soil carbon budgets39. Because soil profile heterogeneity can affect hydraulic parameters9,28 and thereby constrains the applicability of shallow soil data, it is necessary to extend the depth of investigation for soil hydraulic parameters. The compilation of deep soil profile information and the incorporation of detailed field records would serve as a valuable complement to existing soil datasets.
Given the limitations of the current datasets outlined above, the objective of this study was to use geographically precise field measurements from deep soil profiles to extend existing soil datasets with reliable deep soil property records. Furthermore, we sought to provide a quantitative foundation to facilitate the development of PTFs that rely on original data. We conducted in situ sampling across the entire Yellow River Basin (YRB). The YRB is extensive (795,000 km²), irrigating over 15% of China’s cultivated land and sustaining more than 12% of China’s population40. Furthermore, this basin encompasses most of China’s important ecological barrier belt41, including the Loess Plateau (LP), the world’s largest loess deposition region. Historically, severe soil erosion in this region has led to substantial loss of soil carbon to the ocean via the Yellow River, profoundly impacting the land carbon budget42–44. Over the past two decades, China has been one of the leading contributors to the land greening observed around the globe45, with the LP taking a prominent role through the “Grain for Green” program for ecological restoration12,46. Given the significance of the YRB for global carbon cycling, climate change, food security, and ecological stability, the investigation of soil parameters in this region is valuable not only for regional environmental and agronomic studies but also as a supplement to the current global pool of soil hydraulic datasets. Moreover, by providing soil hydraulic parameters down to a profile depth of 5 m, our dataset offers more possibilities for ecohydrological studies, including observation and modelling, that focus on deep profiles.
During three years of fieldwork (2008, 2018, and 2019), we collected a total of 2925 disturbed soil samples and 2800 undisturbed soil samples throughout the whole YRB. This extensive, high-density observation grid contains measurements of soil hydraulic properties down to a maximum depth of 5 m. The profiles were analyzed in the laboratory, and the measurements were subjected to comprehensive data quality control and cleansing processes. Furthermore, we employed the “soilhypfit” package47 in R (version 4.2.3) to fit the SWRC via the van Genuchten (VG) model. It should be noted that all SWRC records were derived from 10 pairs of corresponding soil matric potential and moisture content data, covering a broad range of matric potentials from 0.01 bar to 10 bar (Table 1). For our dataset, we finally retained 2924 records of soil texture, 2798 records of soil bulk density (BD), 2782 records of Ks, and 1035 SWRC records. All records were consolidated into a unified dataset. This dataset further provides detailed meta-information for each sample, including sampling time, coordinates, elevation, depth, and land use type. We opted to preserve as much of the observed data as possible, but assigned data quality categories that may help users to balance quantity against quality of data depending on their research objectives and requirements. This dataset will be of value as a direct resource for environmental, agronomical and hydrological studies, as well as for calibrating PTFs. Although the spatial coverage of this dataset is limited to the YRB, it fills the data gaps in this extensive region and will also provide a useful data resource for studying other regions with comparable environmental settings worldwide. Finally, this dataset extends the range of recorded data for deep soil hydraulic parameters around the world.
Methods
Study area and sampling site layout
The study area comprised the whole YRB (Fig. 1), which covers an approximate area of 795,000 km2 (95°53′–119°5′E and 32°10′–42°50′N)2,48. The Yellow River spans a length of 5464 km49, ranking as the fifth longest river in the world. We acquired disturbed and undisturbed soil samples by conducting large-scale in situ sampling in two phases. The first phase involved high-density shallow-profile sampling from April to November 2008. The second phase comprised medium-density deep-profile sampling conducted from September to December 2018 and from October to November 2019. We selected the sampling sites by overlaying a high-density sampling grid on digital maps of the sampling area. This grid partitioned the entire basin uniformly, with the centre of each grid cell serving as the initial choice of sampling site. Subsequently, the sampling locations were adjusted based on topography, soil depth, and vegetation type to increase their representativeness. Ultimately, 382 sampling sites were established in the first phase and 93 in the second phase (Fig. 1).
Fig. 1.
Spatial distribution of soil sampling sites in the Yellow River Basin.
Field methods
In the first phase, we excavated a 40 cm deep profile at each sampling point and collected disturbed and undisturbed soil samples from two layers (0–5 cm and 20–25 cm). During this stage, 764 disturbed and 764 undisturbed soil samples were collected. Undisturbed soil cores were placed into metal cylinders after collection to facilitate the subsequent measurement of soil hydraulic parameters50.
In the second phase, to facilitate deep undisturbed soil sampling (target depth: 5 m), we employed a hand-held drilling machine (CHPD78, Christie Engineering Pty Ltd., Australia). To prevent compression of the soil core, a dual-tube setup was used within the drilling pipe, with an inner retrievable tube designed to accommodate the soil cores. The core diameter was 37 mm, and the inner tube was replaced every 1 m during drilling to ensure sample integrity. To ensure sample correspondence, two boreholes (0.5 m apart) were drilled at each sampling point to retrieve the disturbed and undisturbed soil samples (Fig. 2). For the surface layer, the disturbed and undisturbed soil samples were obtained from a depth of 0.05 m. Subsequently, sampling was carried out every 20 cm starting from a depth of 0.2 m. During this phase, 2161 disturbed and 2036 undisturbed soil samples were collected. Owing to constraints related to soil depth and the soil structure in certain layers, the number of undisturbed samples was lower than that of disturbed samples. As in the first phase, the undisturbed soil samples were placed into metal cylinders after collection. To protect the samples inside the metal cylinders from disturbance, we packed them in a shockproof foam box after sampling and promptly returned them to the laboratory for storage. Ultimately, a total of 2925 disturbed soil samples and 2800 undisturbed soil samples were collected in the two phases.
Fig. 2.
Schematic of in situ soil sampling using a handheld drilling machine.
Laboratory methods
All disturbed soil samples underwent preprocessing, including weed removal, air-drying, grinding, and sieving (using a 1 mm mesh), before particle size distribution was analyzed. Mastersizer laser particle size analysers (Mastersizer 3000, Malvern Panalytical, UK) were used to determine soil particle size distribution. Subsequently, the soil particle sizes were categorised according to the United States Department of Agriculture (USDA) standards into clay particles (< 0.02 mm), silt particles (0.02–0.5 mm), and sand particles (> 0.5 mm), leading to the classification of soil texture following USDA standards51.
The undisturbed soil samples were initially immersed for 24 h to achieve full saturation. Subsequently, we determined Ks using the constant-head method52, which involves maintaining a constant water head for infiltration by means of a Mariotte bottle until a stable infiltration rate is reached. Then, the amount of water passing through the sample within a fixed time was measured to calculate Ks. Each sample was measured three times to ensure accuracy. The centrifuge and pressure plate methods are the most widely used laboratory methods for measuring the SWRC53. The two methods differ as follows. In the low-suction range, the pressure plate method yields fewer data points and a lower precision, whereas the centrifuge method provides relatively higher precision. However, the centrifuge method can be notably affected by density changes in soils with coarser textures, potentially resulting in lower precision. In the high-suction range, the pressure plate method may yield less accurate results for soils with high clay and silt contents because of inadequate drainage during the measurement process. In this case, the centrifuge method is more suitable. Considering the high silt content of most samples in this study and the time costs of the pressure plate method, we considered the centrifuge method more suitable for determining the SWRC. Using a centrifuge (CR21N, Hitachi, Japan), we set a series of rotation speeds corresponding to different suction conditions (as outlined in Table 1). After each centrifugation run at a given rotation speed under a constant temperature of 20 °C, we removed the metal cylinders from the rotor and weighed and recorded the total mass of each metal cylinder and the soil sample inside it. Then, using the final measurement of the metal cylinder and dry soil mass, we calculated the gravimetric soil water contents corresponding to the different suctions.
Prior to measuring the SWRC, the soil saturation water content (θs) was initially tested. Subsequently, the BD was assessed after oven-drying (at 105 °C for 10 h), enabling the conversion of gravimetric water content to volumetric water content.
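The conversion from weighed masses to volumetric water content described above can be sketched as follows. This is a minimal illustration, not the authors' processing code: the function and argument names are hypothetical, and a water density of 1 g/cm3 is assumed.

```python
# Minimal sketch (hypothetical names): gravimetric -> volumetric water content.
WATER_DENSITY_G_CM3 = 1.0  # assumption: density of water


def volumetric_water_content(wet_total_g, dry_total_g, cylinder_g, bulk_density_g_cm3):
    """Return volumetric water content (cm3/cm3).

    wet_total_g: cylinder + moist soil after a centrifugation step
    dry_total_g: cylinder + oven-dried soil
    cylinder_g:  empty metal cylinder
    """
    dry_soil_g = dry_total_g - cylinder_g
    if dry_soil_g <= 0:
        raise ValueError("dry soil mass must be positive")
    water_g = wet_total_g - dry_total_g
    gravimetric = water_g / dry_soil_g  # g water per g dry soil
    # theta_v = theta_g * BD / rho_w
    return gravimetric * bulk_density_g_cm3 / WATER_DENSITY_G_CM3


# Illustrative masses only (not values from the dataset)
theta_v = volumetric_water_content(300.0, 270.0, 120.0, 1.3)
```

Applying the same function to the mass recorded after each centrifugation step gives the ten water contents that enter the SWRC fit.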
Table 1.
Soil matric potential (Ψ) and corresponding water head (h) used when measuring the soil water retention curve by centrifugation.
| No. | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Ψ (bar) | 0.01 | 0.10 | 0.30 | 0.60 | 0.80 | 1.00 | 3.00 | 6.00 | 8.00 | 10.00 |
| h (mH2O) | 0.10 | 1.02 | 3.06 | 6.12 | 8.16 | 10.20 | 30.59 | 61.18 | 81.58 | 101.97 |
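The relation between the two rows of Table 1 can be checked with a short unit conversion. The sketch below assumes a water density of 1000 kg/m3 and standard gravity (9.80665 m/s2); both constants are assumptions, although they reproduce the tabulated heads to two decimals.

```python
# Convert suction in bar to an equivalent water head in metres: h = P / (rho * g)
RHO_WATER_KG_M3 = 1000.0   # assumption
GRAVITY_M_S2 = 9.80665     # assumption: standard gravity
PA_PER_BAR = 1.0e5


def bar_to_head_m(psi_bar):
    """Return the water head (m H2O) equivalent to a pressure in bar."""
    return psi_bar * PA_PER_BAR / (RHO_WATER_KG_M3 * GRAVITY_M_S2)


TABLE1_BAR = [0.01, 0.10, 0.30, 0.60, 0.80, 1.00, 3.00, 6.00, 8.00, 10.00]
heads_m = [round(bar_to_head_m(p), 2) for p in TABLE1_BAR]
```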
SWRC fitting and parameter acquisition based on the VG model
Upon obtaining soil water suction and volumetric moisture content data for each sampling point, we employed the “soilhypfit” package in R for fitting the SWRC using the “fit_wrc_hcc” function, in line with existing research24. “soilhypfit” is an R package designed for the parametric modelling of soil water retention and hydraulic conductivity data. This function allows the estimation of SWRC parameters based on the van Genuchten (VG) model17, with the constraint m = 1-1/n. The VG equation (Eq. 1) is as follows:
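$$\theta(\psi) = \theta_r + \frac{\theta_s - \theta_r}{\left[1 + \left(\alpha\,|\psi|\right)^{n}\right]^{m}}, \qquad m = 1 - \frac{1}{n} \tag{1}$$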
where θ(ψ) (m3/m3) denotes the volumetric soil water content at matric potential ψ, and θs (m3/m3) and θr (m3/m3) represent the saturated and residual water contents, respectively. The parameter α (m−1) is related to the inverse of the air entry pressure, and n is a dimensionless shape parameter of the VG model. During fitting, the “fit_wrc_hcc” function estimates the parameters of the SWRC from the respective measurements using the maximum likelihood method, optionally subject to physical constraints on the estimated parameters, and uses an optimisation algorithm from the NLopt library54 or the Shuffled Complex Evolution (SCE) algorithm55. Following existing research24, we constrained n to the range from 1.0 to 7.0 and α to the range from 0 to 100 (m−1) during the fitting process. Field capacity (FC) and permanent wilting point (PWP) are two key parameters that determine the soil water availability for plants and the maximum soil water-holding capacity56. Hence, using the SWRCs derived from the fitted VG models at each point, we calculated the volumetric water content corresponding to FC (−1/3 bar, −3.37 mH2O) and PWP (−15 bar, −152.96 mH2O) for these sampling locations57.
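To illustrate how FC and PWP follow from a set of fitted parameters, the VG relation with m = 1 − 1/n can be evaluated directly. The sketch below is not the “soilhypfit” implementation: the function names are hypothetical and the parameter values are illustrative, not taken from the dataset. Heads are given in metres so that α is in m−1.

```python
# Minimal sketch of the van Genuchten retention curve with m = 1 - 1/n.
FC_HEAD_M = 3.37      # head magnitude used for field capacity in the text
PWP_HEAD_M = 152.96   # head magnitude used for permanent wilting point


def vg_theta(head_m, theta_r, theta_s, alpha, n):
    """Volumetric water content at a given head (sign of the head is ignored)."""
    if n <= 1.0:
        raise ValueError("n must be > 1 so that m = 1 - 1/n is positive")
    m = 1.0 - 1.0 / n
    effective_saturation = (1.0 + (alpha * abs(head_m)) ** n) ** (-m)
    return theta_r + (theta_s - theta_r) * effective_saturation


def fc_and_pwp(theta_r, theta_s, alpha, n):
    """Return (FC, PWP) water contents from one set of VG parameters."""
    return (vg_theta(FC_HEAD_M, theta_r, theta_s, alpha, n),
            vg_theta(PWP_HEAD_M, theta_r, theta_s, alpha, n))


# Illustrative parameters only
fc, pwp = fc_and_pwp(theta_r=0.05, theta_s=0.45, alpha=1.0, n=1.5)
```

Because the curve decreases monotonically with suction, FC is always at least as large as PWP, and both lie between θr and θs.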
Data Records
After collating the measured and predicted soil parameters, a comprehensive soil hydraulic parameter dataset for the YRB was established. The sampling points span the entire basin horizontally, with 382 shallow-profile points at a resolution of 40 × 40 km and 93 deep-profile points at a resolution of 100 × 100 km. The dataset can be accessed at https://doi.pangaea.de/10.1594/PANGAEA.965004 (ref. 58).
All the data in the dataset, excluding the SWRC curve parameters, were derived from direct measurements. According to USDA classification, the soil texture in the dataset falls into two major categories: loamy and sandy soils (Fig. 3).
Fig. 3.

Soil texture classification of samples from the Yellow River Basin. Classification was based on USDA soil texture classification standards.
The loamy soil category includes sandy loam, loam, silt, silt loam, silty clay loam, and clay loam. The sandy soil category includes sand and loamy sand. Among them, silt loam constituted the highest proportion (74.53%), followed by sandy loam (9.40%). Each of the remaining soil texture classes constitutes less than 5% of the samples (Fig. 4).
Fig. 4.

Proportions of soil texture classes among samples from the Yellow River Basin. Sa, Sand; LoSa, Loamy Sand; SaLo, Sandy Loam; Lo, Loam; SiClLo, Silty Clay Loam; SiLo, Silt Loam; Si, Silt; and ClLo, Clay Loam.
The SWRC of each sampling point was likewise based on direct measurements, to which a VG model was fitted to derive θs, θr, and the shape parameters (α and n). The kernel density plots show that the fitted VG model parameters are generally distributed within a reasonable range (Fig. 5). We applied the necessary data cleaning and quality control procedures (see Technical Validation). To preserve the integrity of the original measurement data, we introduced the relative error range information (θs_RE_range) into the dataset to describe the quality of the SWRC parameter fitting.
Fig. 5.
Kernel density plots of van Genuchten (VG) model parameters distribution. θs represents the saturated water content (a); θr represents residual water content (b); α (c) and n (d) are the shape parameters of the VG model.
Ultimately, 2925 disturbed field soil samples and 2800 undisturbed field soil samples were collected, and most profiles covered a depth down to 5 m. This dataset comprises 31 variables and 2925 records, and ultimately contains 2924 records for soil texture, 2798 records for soil BD, 2782 records for Ks, and 1035 records for SWRC parameters after data quality control and cleaning. A detailed description of each variable is provided in Table 2. Furthermore, a graphical representation (Venn diagram) illustrates the overlap among the different measurement indicators (Fig. 6).
Table 2.
List of 31 variables in the Yellow River Basin soil hydraulic parameter dataset and their descriptions and units.
| Header | Description | Unit |
|---|---|---|
| site_id | Number of the sampling site | — |
| sample_id | Number of samples at each sampling site | — |
| longitude | Longitude coordinates based on the WGS84 system | — |
| latitude | Latitude coordinates based on the WGS84 system | — |
| elevation | Elevation of the surface of the sampling site | m |
| land_use | Land-use type of the sampling site | — |
| sampling_year | The year in which the soil sample was collected | — |
| sampling_depth | The depth at which the soil sample was collected | m |
| clay | Clay (< 0.02 mm) content in soil samples | % |
| silt | Silt (0.02–0.5 mm) content in soil samples | % |
| sand | Sand (> 0.5 mm) content in soil samples | % |
| method_particle | Method for measuring soil particle composition | — |
| soil_texture_quality | Quality level of soil particle composition measuring, the levels of “A”, “B” and “C” represent high, medium and low data quality, respectively | — |
| soil_texture_class1 | Soil texture classification based on USDA (broad categories) | — |
| soil_texture_class2 | Soil texture classification based on USDA (subclasses) | — |
| BD | Bulk density of soil | g/cm3 |
| method_BD | Method for measuring soil bulk density | — |
| Ks | Soil saturated water conductivity | cm/min |
| method_Ks | Method for measuring soil saturated water conductivity | — |
| method_SWRC | Method for measuring soil water retention curves | — |
| meaured_θs | Measured saturated soil water content from soil samples | m3/m3 |
| fit_θs | Fitted saturated soil water content based on “soilhypfit” package in R | m3/m3 |
| fit_θr | Fitted residual soil water content based on “soilhypfit” package in R | m3/m3 |
| fit_α | Fitted shape parameter of van Genuchten model based on “soilhypfit” package in R | m−1 |
| fit_n | Fitted shape parameter of van Genuchten model based on “soilhypfit” package in R | — |
| fit_m | Fitted shape parameter of van Genuchten model based on “soilhypfit” package in R | — |
| fit_FC | Field capacity predicted by fitted SWRC curve based on “soilhypfit” package in R | m3/m3 |
| fit_PWP | Permanent wilting point predicted by fitted SWRC curve based on “soilhypfit” package in R | m3/m3 |
| fit_r2 | Coefficient of determination when fitting SWRC based on “soilhypfit” package in R | — |
| θs_RE | Absolute relative error of fitted and measured saturated soil water content | % |
| θs_RE_range | Quantile range of absolute relative error of fitted and measured saturated soil water content. “Q1”, “Q3”, “Min”, and “Max” represent the “first quartile”, “third quartile”, “Q1 minus 1.5IQR”, and “Q3 plus 1.5IQR”, respectively. “-” is a symbol used to represent the range of values. | — |
Fig. 6.

Venn diagram illustrating the number of various measurement indicators derived from soil samples in the Yellow River Basin. BD represents the soil bulk density, Ks represents the soil saturated hydraulic conductivity, and vg Parameters represents the van Genuchten model parameters.
Technical Validation
Data verification and cleaning
Prior to analysis, all field-collected samples underwent a preliminary inspection to ensure that the disturbed samples were intact and unmixed and that the undisturbed samples in the metal cylinders were free from vibration-induced cracking or other damage. Subsequently, the original measurement data were subjected to thorough validation and data cleansing procedures. For BD, we eliminated sample results with BD > 2.65 g/cm3 during the quality control process59. For the particle size distribution data, with reference to existing national measurement standards and the literature24, we directly excluded samples (11 records, denoted as “Error” in the dataset) for which the sum of the particle size class contributions (clay + silt + sand) was not within 100 ± 3%. Subsequently, samples were classified based on the absolute difference between the sum of the particle size class fractions and 100% as follows: Level A (< 1%), Level B (1% ≤ difference < 2%), and Level C (2% ≤ difference < 3%).
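The grading rules above amount to a simple threshold check, sketched below. The function names are hypothetical; treating a difference of exactly 3% as “Error” is an assumption, since the text does not state how that boundary was handled.

```python
# Minimal sketch of the quality rules for texture sums and bulk density.
MAX_PLAUSIBLE_BD_G_CM3 = 2.65  # threshold stated in the text


def texture_quality(clay_pct, silt_pct, sand_pct):
    """Grade a texture record by how far clay + silt + sand deviates from 100%."""
    difference = abs(clay_pct + silt_pct + sand_pct - 100.0)
    if difference < 1.0:
        return "A"
    if difference < 2.0:
        return "B"
    if difference < 3.0:
        return "C"
    return "Error"  # assumption: a difference of exactly 3% is also excluded


def bd_is_plausible(bd_g_cm3):
    """Return False for bulk densities above the 2.65 g/cm3 cut-off."""
    return bd_g_cm3 <= MAX_PLAUSIBLE_BD_G_CM3


# Illustrative values only
grade = texture_quality(20.0, 61.5, 20.0)  # sum = 101.5 %
```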
Constraints on VG fitting
In order to assess the quality of SWRC fitting based on the VG model using the “soilhypfit” package, we computed the coefficient of determination (R2) for each model fit. As the “soilhypfit” package lacks a built-in function for directly calculating R2, we employed the following approach (Eq. 2):
$$R^{2} = 1 - \frac{\mathrm{SSE}}{\mathrm{SST}} \tag{2}$$
where SSE represents the sum of squared errors obtained from the “ssq_wc” output of the “fit_wrc_hcc” function, and SST represents the total sum of squares calculated from the variance of the measured data at each point. All VG model fits yielded R2 values above 0.9, indicating very high fitting performance.
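For readers who wish to recompute the coefficient of determination from paired measured and fitted water contents, a minimal sketch follows. Unlike the workflow described above, SSE is computed here directly from the residuals rather than read from the “ssq_wc” output; the function name and sample values are hypothetical.

```python
# Minimal sketch: R^2 = 1 - SSE / SST for one retention curve.
def r_squared(measured, fitted):
    """Coefficient of determination for paired measured and fitted values."""
    if len(measured) != len(fitted) or len(measured) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_measured = sum(measured) / len(measured)
    sse = sum((obs - fit) ** 2 for obs, fit in zip(measured, fitted))
    sst = sum((obs - mean_measured) ** 2 for obs in measured)
    if sst == 0:
        raise ValueError("measured values have zero variance")
    return 1.0 - sse / sst


# Illustrative water contents only (m3/m3)
measured_theta = [0.42, 0.35, 0.28, 0.22, 0.18]
fitted_theta = [0.41, 0.36, 0.28, 0.21, 0.18]
r2 = r_squared(measured_theta, fitted_theta)
```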
To further assess the quality of the fitted data, we attempted to retrieve field surveys of soil hydraulic parameters from the same research area for comparison. After filtering, we selected and plotted the spatial distribution of the mean hydraulic parameters within 0–5 m of the LP, which is the main body of the YRB (Fig. 7). The results show that the spatial distributions of θs, θr, α, and n all have zonal characteristics, exhibiting obvious spatial heterogeneity. Moreover, θs, α, and n have spatial distribution characteristics similar to those reported in an existing investigation60 in most areas within the LP, which supports the reliability of our survey to some extent.
Fig. 7.
Spatial distribution of the averaged soil hydraulic parameters within 0–5 m in the Chinese Loess Plateau. θs represents the saturated water content (a); θr represents residual water content (b); α (c) and n (d) are the shape parameters of the van Genuchten model. The green dots represent our sampling sites.
It should be noted that finding an investigation that matches our dataset in terms of survey range, depth, and number of sample sites remains a challenge, which limits quantitative spatial comparison. Therefore, the quality of the fitted data needs to be assessed further. In addition, the inherent limitations of the VG model when fitted to soils with high sand or clay content need to be considered. Hence, we calculated the absolute relative error (|RE|) between fitted θs and measured θs to quantitatively evaluate the fitting quality at each sample point. |RE| was calculated as follows:
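$$|RE| = \frac{\left|\theta_{s,\mathrm{fit}} - \theta_{s,\mathrm{measured}}\right|}{\theta_{s,\mathrm{measured}}} \times 100\% \tag{3}$$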
Subsequently, we identified the distribution characteristics of |RE| by calculating the quartiles of these relative errors (Fig. 8a). The first (Q1) and third (Q3) quartiles were 1.20% and 6.44%, respectively. The Q3 + 1.5IQR was 14.23%, and the Q1 – 1.5IQR was 4.95E-4%. The results indicated that there were no |RE| values lower than Q1 – 1.5IQR. Upon comparing the fitted θs and measured θs before and after outlier removal, it can be observed that the points after outlier removal are largely distributed along the 1:1 line (Fig. 8b,c). Therefore, in the final dataset, we further classified the SWRC parameters according to the quartile range of |RE|, and marked the sample points where |RE| exceeds the range of Q3 + 1.5IQR as “outliers”. To retain as much of the original data as possible, we included all the RE levels in the dataset.
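The outlier rule described above (flagging |RE| values beyond Q3 + 1.5IQR) can be sketched as follows. The function names are hypothetical, and the quartile interpolation rule used by `statistics.quantiles` is an assumption that may differ from the one used to produce the reported quartiles. Because |RE| is non-negative, a negative lower fence never flags any record.

```python
# Minimal sketch: absolute relative error of theta_s and upper-fence outliers.
import statistics


def abs_relative_error(fitted, measured):
    """|RE| in percent between fitted and measured saturated water content."""
    return abs(fitted - measured) / measured * 100.0


def tukey_fences(values):
    """Return (Q1, Q3, Q1 - 1.5*IQR, Q3 + 1.5*IQR)."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1, q3, q1 - 1.5 * iqr, q3 + 1.5 * iqr


def flag_outliers(values):
    """Mark values above the upper fence Q3 + 1.5*IQR."""
    _q1, _q3, _lower, upper = tukey_fences(values)
    return [value > upper for value in values]


# Illustrative |RE| values only (percent)
example_re = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 100.0]
flags = flag_outliers(example_re)
```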
Fig. 8.
Boxplot of the absolute relative error (|RE|) between the saturated water content (θs) fitted using the “soilhypfit” R package and measured θs (a); the distribution of measured and fitted θs for all data records (b); and the distribution of measured and fitted θs after removal of outliers of |RE| (c). The red line in the figures represents a 1:1 ratio.
By comparing the soil hydraulic parameters of the two main soil types in our datasets (loamy soil and sandy soil) after the outlier removal, we observed that for the loamy soil, all θs, θr, PWP, and FC were higher than those for sandy soil. In contrast, α, n, BD, and Ks for loamy soil were lower than those for sandy soil (Fig. 9). Furthermore, following the study by Goldberg et al.61, we further removed points with FC > 48% and PWP > 36%.
Fig. 9.
Boxplots of the distribution of hydraulic parameters for major soil texture categories. θs represents the saturated water content, θr represents the residual water content, PWP represents the permanent wilting point and FC represents the field capacity (a); α (b) and n (c) represent the shape parameters of the van Genuchten model; BD represents the bulk density (d); and Ks represents the saturated hydraulic conductivity (e).
Usage Notes
Considering that all data in this dataset originated from measurements of in situ samples, we strove to preserve the maximum number of sample test records and provided a grading system based on our quality assessment of the soil texture measurements and SWRC fitting. Our intention was to allow researchers to choose freely which data to use and to balance quantity against quality of data according to their requirements. For soil texture measurements, we suggest using the data at Level A (< 1%) with confidence, and using the data at Level B (1% ≤ difference < 2%) and Level C (2% ≤ difference < 3%) selectively, depending on specific requirements. For SWRC data, despite our efforts, some fitted parameters reached the bounds of the predefined validity range: one n parameter reached 7 and ten α parameters reached 100 m−1 (comprising 0.09% and 0.96% of the total valid SWRC count, respectively). Moreover, 236 θr values were predicted as zero owing to their inherently small actual values, and 116 |RE| values of θs were listed as outliers (comprising 22.8% and 11.2% of the total valid SWRC count, respectively). We recommend cautious use of these records. Constrained by sampling costs, the volume of the dataset remains limited. Nonetheless, we believe that this dataset, entirely based on measured data from in situ samples and encompassing soil hydraulic records down to a profile depth of 5 m, can help to fill the gaps in the pool of existing observational data, and the absence of deep soil information in particular.
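As an illustration of trading quantity against quality, records can be filtered on the two grading columns listed in Table 2. The sketch below assumes that the records have already been loaded as dictionaries keyed by the Table 2 headers; the function name, the sample records, the range label “Q1-Q3”, and the exact spelling of the outlier label are assumptions and should be checked against the downloaded file.

```python
# Minimal sketch: select records by texture grade and SWRC outlier flag.
def select_records(records, texture_levels=("A",), drop_outliers=True,
                   outlier_label="outliers"):
    """Keep records whose texture grade is accepted and, optionally,
    whose theta_s relative error is not flagged as an outlier."""
    selected = []
    for record in records:
        if record.get("soil_texture_quality") not in texture_levels:
            continue
        if drop_outliers and record.get("θs_RE_range") == outlier_label:
            continue
        selected.append(record)
    return selected


# Hypothetical records for illustration only
sample_records = [
    {"sample_id": 1, "soil_texture_quality": "A", "θs_RE_range": "Q1-Q3"},
    {"sample_id": 2, "soil_texture_quality": "A", "θs_RE_range": "outliers"},
    {"sample_id": 3, "soil_texture_quality": "C", "θs_RE_range": "Q1-Q3"},
    {"sample_id": 4, "soil_texture_quality": "B"},  # no SWRC record
]
strict = select_records(sample_records)
lenient = select_records(sample_records, texture_levels=("A", "B", "C"),
                         drop_outliers=False)
```

A strict selection suits PTF calibration, whereas a lenient one retains more records for descriptive mapping.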
Acknowledgements
This research was supported by the National Key Research and Development Program of China (No. 2019YFA0607303), the National Natural Science Foundation of China (No. U2243204, 42177306, and 41977422), the Innovation Cross Team-Key Laboratory project of the Chinese Academy of Sciences, the Youth Innovation Promotion Association CAS, the Shaanxi Province Natural Science Basic Research Program (No. 2023JC-XJ-10), and the Shaanxi Province Innovation Capability Support Plan Project (No. 2024RS-CXTD-45).
Author contributions
Yongping Tong: Investigation, Data curation, Formal analysis, Methodology, Visualization, Writing – original draft, Writing – review & editing. Yunqiang Wang: Conceptualization, Project administration, Supervision, Writing – review & editing. Jingxiong Zhou: Investigation. Xiangyu Guo: Investigation. Ting Wang: Investigation. Yuting Xu: Investigation. Hui Sun: Data curation. Pingping Zhang: Writing – review & editing. Zimin Li: Writing – review & editing. Ronny Lauerwald: Conceptualization, Supervision, Writing – review & editing.
Code availability
The code used to calculate SWRC parameters can be found on Github (https://github.com/TONGYP1116/SoilHydraulicParameter.git).
Competing interests
The authors declare that they have no competing interests. This includes, but is not limited to, financial interests, non-financial interests, patents, or any other interests that may be perceived as influencing the research or its presentation. The corresponding author affirms that this information has been discussed and agreed upon by all authors listed.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1. Paustian K, et al. Climate-smart soils. Nature. 2016;532:49–57. doi: 10.1038/nature17174.
- 2. Fatichi S, et al. Soil structure is an important omission in Earth System Models. Nat. Commun. 2020;11:522. doi: 10.1038/s41467-020-14411-z.
- 3. Jung M, et al. Recent decline in the global land evapotranspiration trend due to limited moisture supply. Nature. 2010;467:951–954. doi: 10.1038/nature09396.
- 4. Cleveland CC, et al. Patterns of new versus recycled primary production in the terrestrial biosphere. P. Natl. Acad. Sci. 2013;110:12733–12737. doi: 10.1073/pnas.1302768110.
- 5. Peng S, et al. Simulated high-latitude soil thermal dynamics during the past 4 decades. Cryosphere. 2016;10:179–192. doi: 10.5194/tc-10-179-2016.
- 6. Seneviratne SI, et al. Investigating soil moisture–climate interactions in a changing climate: A review. Earth-Sci. Rev. 2010;99:125–161. doi: 10.1016/j.earscirev.2010.02.004.
- 7. Lohse KA, Brooks PD, McIntosh JC, Meixner T, Huxman TE. Interactions between biogeochemistry and hydrologic systems. Annu. Rev. Env. Resour. 2009;34:65–96. doi: 10.1146/annurev.environ.33.031207.111141.
- 8. Green JK, et al. Large influence of soil moisture on long-term terrestrial carbon uptake. Nature. 2019;565:476–479. doi: 10.1038/s41586-018-0848-x.
- 9. Vereecken H, et al. Soil hydrology in the Earth system. Nat. Rev. Earth Env. 2022;3:573–587. doi: 10.1038/s43017-022-00324-6.
- 10. Zhu Q, Castellano MJ, Yang G. Coupling soil water processes and the nitrogen cycle across spatial scales: Potentials, bottlenecks and solutions. Earth-Sci. Rev. 2018;187:248–258. doi: 10.1016/j.earscirev.2018.10.005.
- 11. Lin H, et al. Hydropedology: Synergistic integration of pedology and hydrology. Water Resour. Res. 2006;42. doi: 10.1029/2005WR004085.
- 12. Bai X, Shao MA, Jia X, Zhao C. Prediction of the van Genuchten model soil hydraulic parameters for the 5-m soil profile in China’s Loess Plateau. CATENA. 2022;210:105889. doi: 10.1016/j.catena.2021.105889.
- 13. Wang T, et al. Evaluating climate and soil effects on regional soil moisture spatial variability using EOFs. Water Resour. Res. 2017;53:4022–4035. doi: 10.1002/2017WR020642.
- 14. Zhao C, Shao MA, Jia X, Nasir M, Zhang C. Using pedotransfer functions to estimate soil hydraulic conductivity in the Loess Plateau of China. CATENA. 2016;143:1–6. doi: 10.1016/j.catena.2016.03.037.
- 15. Usowicz B, Lipiec J. Spatial variability of saturated hydraulic conductivity and its links with other soil properties at the regional scale. Sci. Rep. 2021;11:8293. doi: 10.1038/s41598-021-86862-3.
- 16. Brooks RH, Corey AT. Hydraulic properties of porous media and their relation to drainage design. T. ASAE. 1964;7:26–0028. doi: 10.13031/2013.40684.
- 17. van Genuchten MT. A Closed-form equation for predicting the hydraulic conductivity of unsaturated soils. Soil Sci. Soc. Am. J. 1980;44:892–898. doi: 10.2136/sssaj1980.03615995004400050002x.
- 18. Ciocca F, Lunati I, Parlange MB. Effects of the water retention curve on evaporation from arid soils. Geophys. Res. Lett. 2014;41:3110–3116. doi: 10.1002/2014GL059827.
- 19. Assouline S, Tessier D, Bruand A. A conceptual model of the soil water retention curve. Water Resour. Res. 1998;34:223–231. doi: 10.1029/97WR03039.
- 20. Vereecken H, et al. On the value of soil moisture measurements in vadose zone hydrology: A review. Water Resour. Res. 2008;44. doi: 10.1029/2008WR006829.
- 21. Vereecken H, et al. Using pedotransfer functions to estimate the van Genuchten–Mualem soil hydraulic properties: A review. Vadose Zone J. 2010;9:795–820. doi: 10.2136/vzj2010.0045.
- 22. Zhang Y, Schaap MG. Weighted recalibration of the Rosetta pedotransfer model with improved estimates of hydraulic parameter distributions and summary statistics (Rosetta3). J. Hydrol. 2017;547:39–53. doi: 10.1016/j.jhydrol.2017.01.004.
- 23. Van Looy K, et al. Pedotransfer functions in Earth system science: Challenges and perspectives. Rev. Geophys. 2017;55:1199–1256. doi: 10.1002/2017RG000581.
- 24. Gupta S, et al. Global soil hydraulic properties dataset based on legacy site observations and robust parameterization. Sci. Data. 2022;9:444. doi: 10.1038/s41597-022-01481-5.
- 25. Comber A, et al. A generic approach for live prediction of the risk of agricultural field runoff and delivery to watercourses: linking parsimonious soil-water-connectivity models with live weather data apis in decision tools. Front. Sustain. Food Syst. 2019;3:42. doi: 10.3389/fsufs.2019.00042.
- 26. Hartemink AE, Minasny B. Towards digital soil morphometrics. Geoderma. 2014;230-231:305–317. doi: 10.1016/j.geoderma.2014.03.008.
- 27. Vereecken H, et al. Infiltration from the pedon to global grid scales: An overview and outlook for land surface modeling. Vadose Zone J. 2019;18:180191. doi: 10.2136/vzj2018.10.0191.
- 28. Pan T, Hou S, Liu Y, Tan Q. Comparison of three models fitting the soil water retention curves in a degraded alpine meadow region. Sci. Rep. 2019;9:18407. doi: 10.1038/s41598-019-54449-8.
- 29. Grunwald, S. Florida soil characterization data. Soil and water science department, IFAS-Institute of Food and Agriculture Science, Tech. Rep., University of Florida, website: https://www.sgrunwald.org/big-data (2020).
- 30. Batjes NH, Ribeiro E, van Oostrum A. Standardised soil profile data to support global mapping and modelling (WoSIS snapshot 2019). Earth Syst. Sci. Data. 2020;12:299–320. doi: 10.5194/essd-12-299-2020.
- 31. Nemes A, Schaap MG, Leij FJ, Wösten JHM. Description of the unsaturated soil hydraulic database UNSODA version 2.0. J. Hydrol. 2001;251:151–162. doi: 10.1016/S0022-1694(01)00465-6.
- 32. Dai Y, et al. Development of a China dataset of soil hydraulic parameters using pedotransfer functions for land surface modeling. J. Hydrometeorol. 2013;14:869–887. doi: 10.1175/JHM-D-12-0149.1.
- 33. Zhang Y, Schaap MG, Wei Z. Development of hierarchical ensemble model and estimates of soil water retention with global coverage. Geophys. Res. Lett. 2020;47:e2020GL088819. doi: 10.1029/2020GL088819.
- 34. Qiao J, Zhu Y, Jia X, Huang L, Shao M. Development of pedotransfer functions for soil hydraulic properties in the critical zone on the Loess Plateau, China. Hydrol. Processes. 2018;32:2915–2921. doi: 10.1002/hyp.13216.
- 35. Wang Y, et al. Soil moisture decline in China’s monsoon loess critical zone: More a result of land-use conversion than climate change. P. Natl. Acad. Sci. 2024;121:e2322127121. doi: 10.1073/pnas.2322127121.
- 36. Miguez-Macho G, Fan Y. Spatiotemporal origin of soil water taken up by vegetation. Nature. 2021;598:624–628. doi: 10.1038/s41586-021-03958-6.
- 37. Gao X, et al. Disentangling the impact of event- and annual-scale precipitation extremes on critical-zone hydrology in semiarid loess vegetated by apple trees. Water Resour. Res. 2023;59:e2022WR033042. doi: 10.1029/2022WR033042.
- 38. Heckman KA, et al. Moisture-driven divergence in mineral-associated soil carbon persistence. P. Natl. Acad. Sci. 2023;120:e2210044120. doi: 10.1073/pnas.2210044120.
- 39. Dawson TE, Hahm WJ, Crutchfield-Peters K. Digging deeper: what the critical zone perspective adds to the study of plant ecophysiology. New Phytol. 2020;226:666–671. doi: 10.1111/nph.16410.
- 40. Lin M, Biswas A, Bennett EM. Spatio-temporal dynamics of groundwater storage changes in the Yellow River Basin. J. Environ. Manage. 2019;235:84–95. doi: 10.1016/j.jenvman.2019.01.016.
- 41. Yin L, et al. Trade-offs and synergy between ecosystem services in National Barrier Zone. Geogr. Res. 2019;38:2162–2172. doi: 10.11821/dlyj020180578.
- 42. Ran L, et al. Spatial and seasonal variability of organic carbon transport in the Yellow River, China. J. Hydrol. 2013;498:76–88. doi: 10.1016/j.jhydrol.2013.06.018.
- 43. Yang Y, et al. Estimating soil organic carbon redistribution in three major river basins of China based on erosion processes. Soil Res. 2020;58:540–550. doi: 10.1071/SR19325.
- 44. Wang X, Ma H, Li R, Song Z, Wu J. Seasonal fluxes and source variation of organic carbon transported by two major Chinese rivers: The Yellow River and Changjiang (Yangtze) River. Global Biogeochem. Cy. 2012;26. doi: 10.1029/2011GB004130.
- 45. Chen C, et al. China and India lead in greening of the world through land-use management. Nat. Sustain. 2019;2:122–129. doi: 10.1038/s41893-019-0220-7.
- 46. Chen Y, et al. Balancing green and grain trade. Nat. Geosci. 2015;8:739–741. doi: 10.1038/ngeo2544.
- 47. Papritz, A. Soilhypfit: modelling of soil water retention and hydraulic conductivity data. R package version 0.1-7. website: https://rdrr.io/cran/soilhypfit/ (2022).
- 48. Wang W, Zhang Y, Tang Q. Impact assessment of climate change and human activities on streamflow signatures in the Yellow River Basin using the Budyko hypothesis and derived differential equation. J. Hydrol. 2020;591:125460. doi: 10.1016/j.jhydrol.2020.125460.
- 49. Xie P, et al. Spatial-temporal variations in blue and green water resources, water footprints and water scarcities in a large river basin: A case for the Yellow River basin. J. Hydrol. 2020;590:125222. doi: 10.1016/j.jhydrol.2020.125222.
- 50. Wang Y, Shao MA, Liu Z, Horton R. Regional-scale variation and distribution patterns of soil saturated hydraulic conductivities in surface and subsurface layers in the loessial soils of China. J. Hydrol. 2013;487:13–23. doi: 10.1016/j.jhydrol.2013.02.006.
- 51. Soil Science Division Staff. Soil survey manual. In USDA Handbook 18 (Government Printing Office, Washington, D.C., 2017).
- 52. Klute, A. & Dirksen, C. in Methods of Soil Analysis, 687–734, 10.2136/sssabookser5.1.2ed.c28 (SSSA Book Series, 1986).
- 53. Rahardjo H, Nong XF, Lee D, Leong EC, Fong Y. Expedited soil–water characteristic curve tests using combined centrifuge and chilled mirror techniques. Geotech. Test. J. 2018;41:207–217. doi: 10.1520/GTJ20160275.
- 54. Johnson, S. G. The NLopt Nonlinear-Optimization Package, website: https://nlopt.readthedocs.io/en/latest/ (2014).
- 55. Duan Q, Sorooshian S, Gupta VK. Optimal use of the SCE-UA global optimization method for calibrating watershed models. J. Hydrol. 1994;158:265–284. doi: 10.1016/0022-1694(94)90057-4.
- 56. Li X, Shao MA, Zhao C. Estimating the field capacity and permanent wilting point at the regional scale for the Hexi Corridor in China using a state-space modeling approach. J. Soil. Sediment. 2019;19:3805–3816. doi: 10.1007/s11368-019-02314-6.
- 57. Reynolds CA, Jackson TJ, Rawls WJ. Estimating soil water-holding capacities by linking the Food and Agriculture Organization Soil map of the world with global pedon databases and continuous pedotransfer functions. Water Resour. Res. 2000;36:3653–3662. doi: 10.1029/2000WR900130.
- 58. Tong Y, 2024. Dataset of Soil Hydraulic Parameters in the Yellow River Basin. PANGAEA. https://doi.pangaea.de/10.1594/PANGAEA.965004
- 59. Morales-Durán N, Fuentes S, Chávez C. A soil database from Queretaro, Mexico for assessment of crop and irrigation water requirements. Sci. Data. 2023;10:429. doi: 10.1038/s41597-023-02332-7.
- 60. Fang S, et al. The distribution of Van Genuchten model parameters on soil-water characteristic curves in Chinese Loess Plateau and new predicting method on unsaturated permeability coefficient of loess. PLoS ONE. 2023;18:e0278307. doi: 10.1371/journal.pone.0278307.
- 61. Goldberg, D., Gornat, B. & Rimon, D. Drip Irrigation: Principles, Design and Agricultural Practices (Drip Irrigation Scientific Publications, 1976).