Science Advances. 2024 Sep 27;10(39):eadp8149. doi: 10.1126/sciadv.adp8149

Evaluation of the lithium resource in the Smackover Formation brines of southern Arkansas using machine learning

Katherine J Knierim 1,*, Madalyn S Blondes 2, Andrew Masterson 2, Philip Freeman 2, Bonnie McDevitt 2, Amanda Herzberg 2, Peng Li 3, Ciara Mills 3, Colin Doolan 2, Aaron M Jubb 2, Scott M Ausbrooks 3, Jessica Chenault 2
PMCID: PMC11430454  PMID: 39331718

Abstract

Global demand for lithium, the primary component of lithium-ion batteries, greatly exceeds known supplies, and this imbalance is expected to increase as the world transitions away from fossil fuel energy sources. High concentrations of lithium in brines have been observed in the Smackover Formation in southern Arkansas (>400 milligrams per liter). We used published and newly collected brine lithium concentration data to train a random forest machine-learning model using geologic, geochemical, and temperature explanatory variables and create a map of predicted lithium concentrations in Smackover Formation brines across southern Arkansas. Using these predicted lithium maps with reservoir parameters and geologic information, we calculated that there are 5.1 to 19 million tons of lithium in Smackover Formation brines in southern Arkansas, which represents 35 to 136% of the current US lithium resource estimate. Based on these calculations, in 2022, 5000 tons of dissolved lithium were brought to the surface within brines as waste streams of the oil, gas, and bromine industries.


Smackover Formation brines in southern Arkansas may contain 5.1 to 19 million tons of lithium based on machine learning.

INTRODUCTION

Lithium is identified as a critical mineral because of its use in batteries and the growing importance of transitioning from fossil fuel–driven internal combustion engines to electric and hybrid vehicles (1). As of 2023, commercial-scale lithium production in the US was only from Nevada and Utah (2), but extensive lithium deposits occur throughout the US (3). The Upper Jurassic (Oxfordian) Smackover Formation (hereafter referred to as the Smackover or Smackover Formation) is an example of a laterally extensive petroleum and oilfield brine system in the Gulf of Mexico region that includes locally high concentrations of bromide (>5000 mg/liter) and lithium (>300 mg/liter), especially in southern Arkansas (3–5) (Fig. 1). Projects are ongoing to extract lithium from these Smackover Formation brines, in some cases leveraging the wastewater from commercial bromine production (6). Thus, the Smackover Formation in southern Arkansas represents a potentially important region of US lithium production to meet the demands of a growing market. In addition, lithium in brines co-produced during oil and gas production represents an opportunity to extract a commodity from what is otherwise a waste stream (7).

Fig. 1. Map showing observed lithium concentrations in brines in the Gulf Coast region.

Lithium concentrations shown for brines from the Smackover Formation and other geologic units in southern Arkansas and the Gulf Coast region. Three lithium concentrations of >500 mg/liter from southern Arkansas are shown for reference but were not used for summary statistics or modeling.

Basin and oilfield brines have received increasing attention for lithium exploration and extraction (7–10), especially in the past decade as lithium production has increased (2) and future demand is forecast to increase substantially (11). Oilfield brines may be an important lithium resource because these brines occur globally, are otherwise considered a waste product from the oil, gas, and brine industries, and, depending on the viability of direct lithium extraction technologies, would not require a large footprint similar to the evaporative processes used to concentrate basin brines (7, 12). Despite the global prevalence of oilfield brines, lithium concentrations and correlations with geologic features and other geochemical constituents vary within and across basins (4, 9, 13, 14). For example, lithium in brines is found to generally increase with total dissolved solids (9) and depth (14–16). Lithium mass yields were found to vary across the Marcellus Shale in Pennsylvania, based on variable concentration and volumes of produced waters (13). Within the Gulf Coast of the US, lithium in Smackover Formation brines was found to correlate with potassium, possibly from interaction with feldspathic minerals or clays (15, 17). These assessments of lithium provide insights into the variability, geochemical relations, and possible sources of lithium in oilfield brines. Despite these general relations, using an average concentration from brine samples for resource assessments may not fully capture the spatial variability of lithium across basins, and quantifying this variability is important for commercial lithium extraction (7).

In recent decades, machine learning has become an important tool to characterize the spatial variability of geochemical constituents in subsurface waters (18) as the algorithms can identify complex and nonlinear patterns and handle large and diverse explanatory variable datasets (19). The machine-learning models can incorporate mapped geologic explanatory variables, thus providing the models with information about geologic features that may be important predictors of groundwater chemistry (18, 20). Using machine-learning models to map mineral prospectivity is an emerging field (21), but similar research to predict shallow groundwater chemistry using machine learning has shown the ability to produce accurate maps of aquifer chemistry (18, 20, 22). The goals of this work were to generate a map of predicted lithium concentrations using machine learning—specifically random forest (RF)—and use the predictions with reservoir and geologic characteristics to determine the mass of lithium in Smackover Formation brines across southern Arkansas.

Lithium in Smackover Formation brines has been evaluated across southern Arkansas, but a spatially continuous prediction of lithium in subsurface brines and estimate of lithium mass have not been completed. Individual Smackover Formation fields within southern Arkansas have been assessed for commercial lithium production (23, 24), and the area has been included as part of a regional lithium economic assessment (25). Machine learning has been used to create spatially continuous predictions of lithium concentrations in drinking water across the US but at much shallower depths than Smackover brines (18). In addition, a classification machine-learning model was used to investigate how brine geochemistry can be used to predict lithium in the Smackover Formation, but this investigation only predicted lithium at brine sample locations (17). In this study, we use published and newly collected brine lithium concentration data to train a machine-learning model and create a spatially continuous map of predicted lithium in Smackover Formation brines across southern Arkansas using geologic, geochemical, and temperature explanatory variables. The predicted lithium concentrations can be used with geologic and reservoir characteristics—such as formation thickness, porosity, and water-to-oil ratios—to quantify the mass of lithium in brines. Although the focus of this study is the Smackover Formation, brine samples were also collected from units overlying the Smackover Formation, which provides additional context about potential mixing or lithium resources in other geologic formations that have received less attention (15).

RESULTS

Lithium concentrations ranged from 0.08 to 1700 mg/liter in brines across the Gulf Coast region, but only 3 of 544 samples were >500 mg/liter (Fig. 1). These three samples are associated with US Bureau of Mines samples collected between 1965 and 1970 and compiled in the US Geological Survey’s Produced Waters Geochemical Database (PWGD) version 3.0 (4); concentrations of >500 mg/liter could not be verified from source data. In addition, at two of the three locations, Smackover Formation brines were sampled in 2022 from nearby wells (the exact wells could not be resampled), and concentrations were found to be much lower (between 95 and 178 mg/liter). Therefore, lithium concentrations of >500 mg/liter were not used for subsequent modeling or summary statistics. Lithium concentrations within the model domain of southern Arkansas ranged from 0.27 to 477 mg/liter (Table 1 and table S1) and varied by formation (Figs. 1 and 2). Brine samples used in the RF model were collected from 1965 to 2022, with most samples collected in the 1960s and 1970s. The Smackover Formation had the highest lithium concentrations compared to other units, with a median of 110 mg/liter (Fig. 2). Although lithium concentrations were lower in units overlying the Smackover Formation, the Cotton Valley Formation contained elevated lithium concentrations up to approximately 100 mg/liter (Fig. 2 and table S1).

Table 1. Observed and predicted lithium concentrations in Smackover Formation brines in southern Arkansas.

Statistic | Observed | Predicted (wells) | Predicted (raster cells)
Size (n) | 221 | 221 | 2524
Minimum (mg/liter) | 0.27 | 3.7 | 13
Median (mg/liter) | 98 | 97 | 87
Mean (mg/liter) | 142 | 141 | 109
Maximum (mg/liter) | 477 | 408 | 389

Fig. 2. Lithium concentrations in brines by geologic formation or group in the southern Arkansas model domain.

Box plots represent the interquartile range, with whiskers extending to points within 1.5 times the interquartile range. For geologic ages and formal names, refer to the main text. Grp, group.

Lithium concentrations predicted by the RF model for all samples in the model domain ranged from 3.7 to 408 mg/liter (Table 1 and Fig. 3). The final RF machine-learning model had hyperparameters of 6 mtry (number of randomly selected explanatory variables), 4 min_n (the minimum number of observations per node), and 200 trees (the number of trees grown); final hyperparameters were found using 10-fold cross-validation tuning on 80% of the data (refer to Materials and Methods for details). Model performance was evaluated using the remaining 20% of the data (holdout data), and the final model had a root mean square error (RMSE) of 36 mg/liter and R² of 0.93. High lithium concentrations (>400 mg/liter) were slightly underpredicted (Fig. 3), which is typical of tree-based machine-learning models (26). Of concern with the small training dataset is the potential for overfitting, as model accuracy was quite high for both training and holdout data. Cross-validation tuning, evaluating a holdout dataset, and choosing a model within one SE of the model with the lowest RMSE as the final model were all meant to protect against overfitting. One mechanism to evaluate model accuracy would be increasing lithium concentration data available for model training; as of 2023, all available data were used.
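The holdout metrics and the one-SE selection described above can be illustrated with a minimal sketch. This is not the authors' workflow: the dictionary layout, the function names, and the cross-validation values are hypothetical, and R² is computed here as 1 − SS_res/SS_tot, which is an assumption (some modeling toolkits report the squared correlation instead).

```python
import math

def rmse(observed, predicted):
    """Root mean square error, in the units of the inputs (mg/liter here)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def within_one_se(cv_results):
    """Return every candidate whose mean CV RMSE is within one SE of the best.

    cv_results is a list of dicts with keys 'params', 'mean_rmse', 'std_err'
    (a hypothetical layout chosen for this sketch).
    """
    best = min(cv_results, key=lambda r: r["mean_rmse"])
    cutoff = best["mean_rmse"] + best["std_err"]
    return [r for r in cv_results if r["mean_rmse"] <= cutoff]

# Illustrative cross-validation summaries; the RMSE and SE values are made up.
cv = [
    {"params": {"mtry": 6, "min_n": 4, "trees": 200}, "mean_rmse": 40.0, "std_err": 3.0},
    {"params": {"mtry": 3, "min_n": 10, "trees": 200}, "mean_rmse": 42.5, "std_err": 2.8},
    {"params": {"mtry": 9, "min_n": 2, "trees": 500}, "mean_rmse": 47.0, "std_err": 3.5},
]
candidates = within_one_se(cv)
```

In this sketch the one-SE filter returns a set of candidate models rather than a single winner; how one model is picked from that set is a separate choice that the sketch does not attempt to reproduce.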

Fig. 3. Observed versus predicted lithium concentrations.

Data represent lithium concentration data in brines from the Smackover Formation and other geologic units (see main text for details). Training data were used to tune the RF machine-learning model, and holdout data were used to evaluate model performance.

The top five most important explanatory variables for predicting lithium were dissolved hydrogen sulfide (H2S) concentrations in Smackover Formation brines, depth of brine sample, altitude of the top of the Smackover Formation, whether a brine sample was collected from the Smackover Formation or another geologic unit, and thickness of the Smackover Formation (Fig. 4). Detailed digital three-dimensional geologic information mapped across basins is often limited in the public domain; thus, geologic information from historical maps and datasets was digitized and used as explanatory variables for the RF model (refer to the Supplementary Materials). The source of the lithium was not tested with explanatory variables in the RF model. Other geologic information may be important for predicting lithium and can be tested in subsequent modeling efforts. Shapley additive explanation (SHAP) values do provide an explanation that as H2S increases, predicted lithium concentration increases (Fig. 4), especially where the Smackover Formation depth is between 2500 and 3000 m deep. Therefore, in southern Arkansas, H2S is an important predictor of high lithium concentrations in brines.

Fig. 4. Explanatory variable importance based on calculated SHAP values.

The top 15 explanatory variables in the lithium RF machine-learning model, with the importance score shown in parentheses. Higher SHAP values correspond to a higher predicted lithium concentration, and lower SHAP values correspond to a lower predicted lithium concentration. The coloring for each explanatory variable represents the magnitude of that explanatory variable for predicting the corresponding SHAP values.

The trained RF model was used to predict lithium at the midpoint altitude of the Reynolds oolite unit of the Smackover Formation at 4-km² resolution (Fig. 5). Predicted lithium concentrations ranged from 13 to 389 mg/liter (Table 1). Lithium could not be predicted in the northeastern part of the model domain where the Smackover Formation facies was calcarenite-carbonate mudstone (27) because this facies category was not represented with any lithium observations for RF model training. Predictions were also masked along the northwestern and southern boundaries of the model domain where explanatory variables were missing. The Smackover Formation thins and pinches out to the north, so the prediction maps represent the northern limit of predicting brine chemistry in the Smackover Formation. The model could be extended south and west with additional brine samples and expansion of the geologic framework beyond the current model domain. Explanatory variables were smoothed when resampled to the 4-km² model resolution, such that the holdout accuracy of observed lithium concentrations compared to predictions at the raster cell where the sample was collected was an RMSE of 81 mg/liter.

Fig. 5. Maps of predicted lithium concentrations and uncertainty.

Maps showing spatially continuous predictions of lithium concentrations in Smackover Formation brines compared to observations at wells, range of predicted lithium concentrations between the 95th and 5th percentiles of all one-SE model predictions at each cell, and qualitative uncertainty showing the likelihood of predicted lithium greater or less than 100 mg/liter.

Uncertainty in RF model predictions, based on the difference between the 95th and 5th percentiles of predictions across all model cells using 425 one-SE models, was 1 to 83 mg/liter (Fig. 5). The model domain used a 4-km² grid cell to balance having as many cells as possible with a lithium concentration sample while minimizing cells with multiple samples. Of the 2054 cells representing the Smackover Formation, approximately 6% included at least one lithium concentration value and 2% included two to eight samples. The SD for cells with multiple samples was <1 to 193 mg/liter, with a median of 11 mg/liter. Therefore, the range of observed lithium concentrations within model cells is on the same order of magnitude as the uncertainty of the RF model predictions based on one-SE models. The uncertainty predictions were used to classify whether a model cell was likely to be greater or less than 100 mg/liter, which is a concentration cutoff used for some direct lithium extraction technologies (7). Lithium concentrations were predicted to be very likely >100 mg/liter throughout a broad swath of the Smackover Formation, which also co-occurs with field-sampled brines with observed lithium concentrations of >100 mg/liter (Fig. 5). Within the model domain, we estimate that approximately 42% of the Smackover Formation contains brines with lithium concentrations of >100 mg/liter.
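As a rough illustration of the per-cell uncertainty measure, the sketch below computes the 5th to 95th percentile range of an ensemble of predictions for one cell, along with the share of ensemble members above 100 mg/liter. The percentile interpolation rule, the function names, and the example values are assumptions; the class boundaries behind the qualitative likelihood map are not given in the text, so no classes are assigned here.

```python
def percentile(values, q):
    """Percentile (q in 0-100) using linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)

def cell_uncertainty(predictions, threshold=100.0):
    """Summarize an ensemble of predicted lithium values (mg/liter) for one cell."""
    p05 = percentile(predictions, 5)
    p95 = percentile(predictions, 95)
    share_above = sum(p > threshold for p in predictions) / len(predictions)
    return {"p05": p05, "p95": p95, "range": p95 - p05, "share_above": share_above}

# Hypothetical ensemble for a single cell; the study used 425 one-SE models.
ensemble = [80, 90, 95, 100, 105, 110, 115, 120, 125, 130, 140]
summary = cell_uncertainty(ensemble)
```

A share of ensemble members above the cutoff is one simple way to express "likely greater than 100 mg/liter"; it is offered only as a plausible reading of the qualitative uncertainty map.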

DISCUSSION

To calculate the volume of Smackover Formation brines, the thickness, porosity, and water-to-oil ratio of the unit must be known. The thickness of the Reynolds oolite unit of the Smackover Formation varied between 0 (pinching out) and over 122 m thick across the model domain (27). When sampled to the 4-km² model resolution, the thickness averaged 73 m. Porosity ranged from 0.5 to 29% in three wells in Lafayette County that were sampled between approximately 2499 and 2652 m deep (n = 363) (28). Porosity was found to be highest in distinct zones that were between 5 m and 15 m thick (28). Average reservoir porosity values ranged from 10 to 31% (29). To assess the volume of brines within the Smackover Formation, porosity was assumed to be 10, 20, and 30% across the model domain. Water-to-oil ratios in 43 reservoirs producing from the Smackover Formation ranged from 0.11 to 0.44 or between 11 and 44% brine (29). When applied to the 4-km² model resolution, 100 cells included water-to-oil ratios of <0.9 where oil reservoirs were located, and all other cells were assigned a water-to-oil ratio of 0.9 (assuming 90% brine). On the basis of these thickness and porosity ranges, brine volume ranged from 65,000 to 200,000 billion liters (410 to 1200 billion barrels). The entirety of the Reynolds oolite unit of the Smackover Formation does not have a single porosity value spatially (30) or vertically (28), such that mapped porosity would improve brine volume estimates. Lacking such information, the estimated brine volumes provide a first approximation for the region.
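The volume calculation can be sketched for a single model cell as area × thickness × porosity × brine fraction. This is a simplified reading of the text rather than the code in the companion data release; the function names are hypothetical, the water-to-oil ratio is treated as the brine fraction of pore space (as in "assuming 90% brine"), and the barrel conversion assumes a 42-gallon US oil barrel.

```python
LITERS_PER_CUBIC_METER = 1000.0
LITERS_PER_BARREL = 158.987  # assumed: US oil barrel of 42 US gallons

def cell_brine_volume_liters(cell_area_m2, thickness_m, porosity, brine_fraction):
    """Brine volume in one cell: bulk volume x porosity x brine share of pore space."""
    pore_volume_m3 = cell_area_m2 * thickness_m * porosity
    return pore_volume_m3 * brine_fraction * LITERS_PER_CUBIC_METER

def liters_to_barrels(liters):
    """Convert liters to barrels using the assumed conversion factor."""
    return liters / LITERS_PER_BARREL

# One 2-km by 2-km cell at the average thickness (73 m), 10% porosity, 90% brine.
example_liters = cell_brine_volume_liters(4.0e6, 73.0, 0.10, 0.9)
example_barrels = liters_to_barrels(example_liters)
```

Because porosity enters as a simple multiplier, tripling it from 10 to 30% triples the brine volume of every cell, which is why the porosity assumption dominates the range of the regional estimate.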

Many areas of the model domain have no oil production, and, thus, no co-produced brines, from the Smackover Formation since 1982. Most oil production since 1982 has occurred in the Magnolia field in Columbia County and brine production from two fields in Columbia and Union counties. The lowest nonzero production from a 4-km² area was 1 barrel in 2000, and the highest was 26 million barrels in 2020 (31). In total across the model domain, between 56 (1982) and 300 (in 2011) million barrels of brine have been extracted from the Smackover Formation since 1982 (fig. S3). In 2022, brine production ranged from 0 (no production) to 14.7 million barrels across the model domain and totaled 175 million barrels (fig. S3). Southern Arkansas represents an area where lithium in brines of >100 mg/liter occurs and some portion of the brines are brought to the surface as part of existing commercial oil, gas, and brine waste streams.

The mass of predicted lithium in the Reynolds oolite unit of the Smackover Formation ranged from 5.1 to 19 million tons (or 27 to 100 million tons of lithium carbonate equivalent) based on the prediction maps (Fig. 5) and porosity between 10 and 30% (Table 2). If the 5th percentile of predicted lithium concentrations were used (low concentration scenario), then lithium ranged from 5.1 to 15.2 million tons. If the 95th percentile of predicted lithium concentrations were used (high concentration scenario), then lithium ranged from 6.3 to 19 million tons. The range of estimated lithium mass represents uncertainty associated with the 90% prediction interval (5th to 95th percentiles) from the RF machine-learning model and porosity ranging from 10 to 30%. The difference in calculated lithium mass based on the range of porosity values (i.e., 10 to 30%) was greater than the difference between the 5th and 95th percentiles of predicted lithium for each porosity value (i.e., 10, 20, or 30%) (Table 2). For example, assuming a 30% porosity, the difference between the 95th percentile lithium prediction (high concentration model scenario) and 5th percentile lithium prediction (low concentration model scenario) is approximately 3.8 million tons of lithium. In contrast, assuming the median lithium prediction, the difference between 10 and 30% porosity is approximately 11.3 million tons (Table 2). Therefore, better estimates of Smackover Formation porosity could decrease the uncertainty in estimated lithium mass. Porosity measurements from individual wells, formation horizons, or reservoirs must be extrapolated to regional maps for such an estimate to be used across the model domain. Because of how porosity affects the lithium mass calculation, a companion US Geological Survey data release provides Python code to evaluate different porosity scenarios for estimating brine volumes (32).
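The mass calculation can be sketched as a sum over cells of predicted concentration times brine volume, followed by conversion to lithium carbonate equivalent (LCE). The sketch below is illustrative and is not the code from the data release (32): the function names and the two example cells are hypothetical, "tons" are assumed to be metric tons, and the LCE factor is derived from standard molar masses (Li2CO3 divided by two Li).

```python
MG_PER_METRIC_TON = 1.0e9  # assumed: "tons" in the text are metric tons

# LCE factor from molar masses: Li2CO3 (73.89 g/mol) per two Li (2 x 6.94 g/mol).
LCE_PER_LI = 73.89 / (2 * 6.94)

def lithium_mass_tons(cells):
    """Sum lithium mass over cells given (Li in mg/liter, brine volume in liters)."""
    total_mg = sum(conc * volume for conc, volume in cells)
    return total_mg / MG_PER_METRIC_TON

def to_lce_tons(lithium_tons):
    """Convert tons of lithium to tons of lithium carbonate equivalent."""
    return lithium_tons * LCE_PER_LI

# Two hypothetical cells: 100 mg/liter in 2.0e10 liters, 150 mg/liter in 1.0e10 liters.
example_tons = lithium_mass_tons([(100.0, 2.0e10), (150.0, 1.0e10)])
example_lce = to_lce_tons(example_tons)
```

Summing cell by cell matters because concentration and brine volume both vary spatially, so multiplying a domain-average concentration by the total brine volume does not necessarily give the same mass as the per-cell sum.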

Table 2. Estimated mass of lithium in Smackover Formation brines in southern Arkansas.

Lithium mass estimates were based on predicted lithium brine concentrations from low (5th percentile), median (50th percentile), and high (95th percentile) RF machine-learning models and a range of porosity values for the Reynolds oolite unit of the Smackover Formation (10, 20, and 30%). The mass of lithium brought to the surface was based on water production volumes in 2022 associated with gas, oil, or brine wells (31).

Li model* | Average predicted Li (mg/liter) | Average porosity (%) | Brine volume (liters) | Lithium (tons) | Lithium LCE* (tons) | Lithium, brought to surface (tons) | Proportion lithium brought to surface (%)
Low | 96 | 10 | 6.5 × 10^13 | 5.1 × 10^6 | 2.7 × 10^7 | 4454 | 0.09
Median | 107 | 10 | 6.5 × 10^13 | 5.7 × 10^6 | 3.0 × 10^7 | 4854 | 0.09
High | 119 | 10 | 6.5 × 10^13 | 6.3 × 10^6 | 3.4 × 10^7 | 5142 | 0.08
Low | 96 | 20 | 1.3 × 10^14 | 1.0 × 10^7 | 5.4 × 10^7 | 4454 | 0.04
Median | 107 | 20 | 1.3 × 10^14 | 1.1 × 10^7 | 6.0 × 10^7 | 4854 | 0.04
High | 119 | 20 | 1.3 × 10^14 | 1.3 × 10^7 | 6.8 × 10^7 | 5142 | 0.04
Low | 96 | 30 | 2.0 × 10^14 | 1.5 × 10^7 | 8.1 × 10^7 | 4454 | 0.03
Median | 107 | 30 | 2.0 × 10^14 | 1.7 × 10^7 | 9.0 × 10^7 | 4854 | 0.03
High | 119 | 30 | 2.0 × 10^14 | 1.9 × 10^7 | 1.0 × 10^8 | 5142 | 0.03

*LCE, lithium carbonate equivalent.

The estimated mass of lithium is dependent on the accuracy of the underlying RF model, which requires a training dataset that adequately represents the lithium resource. In addition, RF models are not an appropriate extrapolation tool because an accurate RF model will predict the range of values in the training data (33). For the lithium RF model, we relied heavily on historical data (4), with newly collected samples in 2022 targeting areas that were previously unsampled or wells with notably high lithium concentrations (32). Because machine-learning models are inherently data greedy and computer resource intensive, it is only in the past decades, and even more recently in the past few years, that mapped predictions from machine-learning models are possible (33); therefore, reliance on historical data is a requirement. Any lithium exploration efforts will have to rely on such historical data, at least in part. The mapped predictions represent lithium concentrations from 1965 to 2022 and do not consider any changes in lithium concentrations over time, as repeat samples from the same well were not available.

Uncertainty in mapped, predicted lithium concentrations has implications for lithium exploration and commercial extraction. The RF model slightly overpredicted low values and underpredicted high values (Table 1 and Fig. 3), as is typical of tree-based machine-learning algorithms (26). Underpredicting high concentrations will be of more interest to industry extracting lithium from oilfield brines, especially since the overpredicted concentrations are generally <50 mg/liter (Fig. 3). Uncertainty in mapped predictions was highest in the western part of the model domain, especially where there were no lithium concentration data (Fig. 5). Mapped predictions of lithium from machine-learning models will benefit from ongoing sampling, especially across areas that have little brine chemistry data. Despite these challenges, RF models represent an important tool to create prediction maps of critical minerals that will provide more reasonable estimates than assuming constant lithium concentrations across whole regions or basins. The RF model provides an estimate of the lithium resource in the Smackover Formation across southern Arkansas and, therefore, may not represent conditions at individual wells. This is especially important given that local-scale variation in lithium concentrations—along with other brine constituents that may interfere with direct lithium extraction technologies—can greatly affect the feasibility of commercial lithium recovery (7). In addition, the RF model is only suitable for predicting lithium in brines in the Smackover Formation, as most of the lithium concentration data used to train the model were collected from the Smackover Formation. Notably, the Cotton Valley Formation was found to have lithium concentrations up to approximately 100 mg/liter (Fig. 2), but data were insufficient to predict lithium in Cotton Valley brines.

The 2023 estimated global lithium resource was 105 million tons, of which 14 million tons were estimated in the US (2). The US total includes estimates of continental brines such as the Smackover Formation (2). Without knowing what proportion of the US total is attributed to the model domain, the lithium resource in the Smackover Formation brines within southern Arkansas would represent approximately 36 to 136% of the current (2023) US resource estimate. On the basis of the volume of brine extracted in 2022, approximately 5000 tons (less than 0.1% of the available lithium resource in the Smackover Formation) was brought to the surface within brines as waste streams of the oil, gas, and bromine industries (Table 2). Assuming 100% extraction efficiency of lithium from the brines, this mass would cover the estimated US consumption in 2022 (34). Predicted lithium concentrations and estimates of lithium mass in the subsurface represent a first approximation of the in-place resource, and future modeling may benefit from including analysis of technical recoverability.

MATERIALS AND METHODS

Study design

The mass of available lithium in brines in the Smackover Formation of southern Arkansas was quantified by (i) acquiring lithium concentrations in brines, (ii) predicting a spatially continuous map of lithium concentrations using an RF machine-learning model, and (iii) calculating the mass of lithium in the Smackover Formation and assessing the mass extracted associated with historical oil and brine production. Model resolution was a 2-km by 2-km square (4 km²) grid of 90 columns and 32 rows (Fig. 1). A 4-km² resolution was used for the model domain based on the density of lithium brine data compared to the total area of the model domain. Scripts associated with this modeling workflow are available in a US Geological Survey data release (32).
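The grid geometry can be sketched as follows, assuming projected coordinates in meters. The origin, the row orientation (row 0 at the southern edge), and the function name are hypothetical choices for illustration and are not taken from the released scripts.

```python
CELL_SIZE_M = 2000.0       # 2-km by 2-km cells (4 km² each)
N_COLS, N_ROWS = 90, 32    # grid dimensions stated in the study design

def cell_index(x_m, y_m, x_min=0.0, y_min=0.0):
    """Return (row, col) of the cell containing a point, or None if outside the grid.

    x_min and y_min are the coordinates of the assumed lower-left grid corner.
    """
    col = int((x_m - x_min) // CELL_SIZE_M)
    row = int((y_m - y_min) // CELL_SIZE_M)
    if 0 <= col < N_COLS and 0 <= row < N_ROWS:
        return row, col
    return None

total_cells = N_COLS * N_ROWS                                # cells in the full grid
total_area_km2 = total_cells * (CELL_SIZE_M / 1000.0) ** 2   # grid extent in km²
```

Assigning each brine sample to a cell in this way is what allows counts such as the number of cells with one or several lithium samples to be tallied.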

Geologic framework

The southern Arkansas shelf includes Triassic through Cretaceous sedimentary units (fig. S1) deposited in marine, near-shore marine, or coastal settings with the thickness and structure of units controlled by Triassic rifting, subsequent filling of the Gulf of Mexico, and deformation of salt (35). The Triassic-age Eagle Mills Formation unconformably overlies late Paleozoic deposits and is composed of fluvial, deltaic, and lacustrine “red beds” containing diabase and basalt sills (36). The Jurassic Louann Salt overlies the Eagle Mills Formation and is a relatively pure-phase halite with minor anhydrite (35). The Jurassic Norphlet Formation was deposited as a terrigenous siliciclastic unit and is composed of thin shale, sandstone, and gravel deposits in the north-central Gulf of Mexico (37). The Jurassic-age Smackover Formation is typically underlain by either the Norphlet Formation or the Louann Salt where the Norphlet is absent. The Smackover Formation is predominantly limestone in southern Arkansas and is divided into two informal members (38): an upper oolitic to chalky porous limestone (known informally as the Reynolds oolite) and a lower member composed of dense argillaceous limestone and dark calcareous shale (known informally as the Brown Dense) (27). Most petroleum is produced from the Reynolds oolite (39). The lower part of the Smackover Formation serves as an effective regional source rock in the onshore interior salt basins in the north central and northeastern Gulf of Mexico (40). Approximately 151 fields across southern Arkansas have produced more than 500 million barrels of oil and condensate from the Smackover Formation since production began in 1936 (41, 42). Brine extraction to produce bromine began in 1957 (43). The Buckner Member of the Upper Jurassic Haynesville Formation, composed of red shale and white to pink anhydrite, conformably overlies the Smackover Formation. It can be absent locally and may vary in thickness up to 91 m (35).

The upper Jurassic to Cretaceous section of the southern Arkansas shelf includes hundreds of meters of units with locally described oil reservoirs often in lenticular sand lenses; here, only the units that have been sampled for lithium concentrations in brines are described. There is a sharp change in lithology, indicative of a disconformity, between the underlying Upper Jurassic Louark Group and the overlying Upper Jurassic to Lower Cretaceous Cotton Valley Group, which is a nearshore red bed facies in southern Arkansas approximately 914 m thick (38). The Lower Cretaceous Nuevo Leon Group includes the Hosston Formation (red shale with interbedded lenses of white sandstone) and Sligo Formation (gray to brown shale with lenses of dense gray limestone and sandstone). Locally, the Sligo Formation includes porous oolitic limestone lenses of the Pettet Limestone Member. The Lower Cretaceous Trinity Group is subdivided, in ascending order, into the Pine Island Shale, James Limestone, Rodessa Formation, Ferry Lake Anhydrite, and Mooringsport Formation (38); brines from the James Limestone and Rodessa Formation include lithium concentration data. The James Limestone consists of a fossiliferous, dense limestone, and red and gray shale. The Rodessa Formation consists of oolitic and crystalline limestones, lenticular fine-grained sandy limestone, anhydrite, coquinoid limestones, and gray shales and contains several lenses and tongues that are productive oil reservoirs (44). Upper Cretaceous units with lithium brine samples include the Tokio Formation [coarse gray and brown cross-bedded quartz and dark gray lignitic fossiliferous clay (45)] and the Nacatoch Formation [predominantly sand with interbedded shale, clay, and calcareous deposits (46)].

The source of lithium in Smackover Formation brines is not well understood as the brines have a complex geochemistry inherited from initial seawater evaporation and subsequent interaction with rocks during brine emplacement (5, 15, 47, 48). Smackover Formation brines initially acquired high salinity from evaporation of Jurassic seawater and expulsion of fluids from the underlying Louann Salt, but bromide and lithium are both enriched relative to seawater. Correlation between lithium and boron, rubidium, and potassium along with strontium isotope signatures that are more radiogenic than Jurassic seawater suggests that the brines interacted with siliciclastic minerals during emplacement (47, 48). Lithium could be sourced from igneous material—diabase dikes and sills, eroded volcaniclastic sediments, or air fall tuff—as volcanism occurred throughout Triassic rifting and deposition of Jurassic units (5, 25, 35). Regardless of the initial source of lithium, lithium-containing brines migrated to the Smackover Formation and—similar to the oil resource—were trapped by stratigraphic or structural features (49).

Brine data

Historical brine dataset

Brine samples from the US Geological Survey’s PWGD version 3.0 were filtered for the Gulf Coast region within the states of Arkansas, Louisiana, Mississippi, Alabama, and Texas to provide regional lithium concentrations (4). This dataset was further filtered for the model domain of southern Arkansas, which included 1168 samples of which 193 included a lithium analysis. These data were mostly from research associated with the US Bureau of Mines (49), Moldovanyi and Walter (47), and Trout (50). More recently collected data associated with lithium extraction in southern Arkansas were also available in the PWGD (4) associated with ongoing projects to produce lithium commercially (23, 24). Most historical lithium data were collected from the Smackover Formation, but limited data were available from the Tokio Formation, Nuevo Leon Group (Hosston Formation), and Cotton Valley Group (table S1 and fig. S1) (32). Because the historical brine data are from multiple sources, analytical methods and detection limits for lithium varied.

Brine sampling and analysis

In August 2022, brine samples were collected from 27 oil and brine wells throughout southern Arkansas from Jurassic and Cretaceous formations to provide updated information about lithium concentrations. The focus of the sampling was on the Smackover Formation, but shallower units overlying the Smackover—including the Nacatoch Formation, Trinity Group (Rodessa Formation and James Limestone), Nuevo Leon Group (Hosston Formation), and Cotton Valley Group—were also sampled to provide information about possible brine mixing between units (Fig. 1 and fig. S1) (32).

Water sample collection protocols for total dissolved solids, anions, and major and trace cations (including lithium) followed those outlined in Blondes et al. (51). Samples were collected at the wellhead or separator in 5-gallon carboys. For any well with a known presence of hazardous H2S gas, the fluid in the carboy was purged with ultrahigh purity nitrogen (N2) gas using a custom sparging apparatus until H2S was below detection after turning off the N2. The water was then pumped from the carboy through a 0.45-μm filter using a peristaltic pump and stored in nitric acid–washed high density polyethylene bottles that were triple rinsed with sample. The cation aliquots were acidified to a pH < 2 using TraceMetal grade nitric acid. To measure the dissolved sulfide concentration of the samples, 500-ml amber glass bottles containing a known quantity of zinc acetate saturated solution as a fixative were filled with brine directly at the wellhead or separator. These samples were not sparged with N2. All samples were stored on ice and shipped to the US Geological Survey in Reston, Virginia for analysis in the BRInE Laboratory (www.usgs.gov/labs/brine-research-instrumentation-and-experimental-laboratory).

Sulfide concentrations were determined on zinc acetate fixed samples using methods similar to those detailed in Moldovanyi and Walter (47). A known volume of the fixed sample was resuspended and injected anoxically into an N2-purged reactor containing anoxic 6-N hydrochloric acid (HCl). For all samples containing >1 mg/liter of iron, HCl was prepared with stannous chloride (SnCl2) to reduce oxidation artifacts by ferric iron. The resulting H2S was purged from the reactors using N2 and carried into fresh 0.3 M zinc acetate traps. Product zinc sulfide (ZnS) was converted to silver sulfide (Ag2S) by addition of 0.3 M silver nitrate (AgNO3). Ag2S was rinsed with ammonium hydroxide (NH4OH), filtered, dried, and weighed to assess the concentration of sulfide in the original solution. Replicate measurements of standard sulfide solutions, fixed as ZnS, of 10 and 100 mg of S/liter yielded complete sulfide recovery to within ±5%.
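
The gravimetric step above reduces to a stoichiometric conversion from the weighed Ag2S mass to a sulfide concentration. A minimal sketch follows, assuming that each mole of sulfide in the injected aliquot is recovered as one mole of Ag2S; the function names and example inputs are illustrative assumptions and are not taken from the study.

```python
# Standard atomic masses (g/mol).
M_AG = 107.8682
M_S = 32.06


def sulfide_mg_per_liter(ag2s_mass_mg, sample_volume_ml):
    """Convert a weighed Ag2S precipitate to sulfide concentration (mg S/liter).

    Assumes one mole of Ag2S per mole of sulfide and that sample_volume_ml is
    the brine volume represented by the injected aliquot (hypothetical inputs).
    """
    if sample_volume_ml <= 0:
        raise ValueError("sample volume must be positive")
    molar_mass_ag2s = 2 * M_AG + M_S          # about 247.80 g/mol
    sulfur_fraction = M_S / molar_mass_ag2s   # mass fraction of S, about 0.129
    sulfur_mass_mg = ag2s_mass_mg * sulfur_fraction
    return sulfur_mass_mg / (sample_volume_ml / 1000.0)


def recovery_percent(measured_mg_per_liter, expected_mg_per_liter):
    """Recovery of a standard solution, as a percentage of the known value."""
    return 100.0 * measured_mg_per_liter / expected_mg_per_liter
```

Under this sketch, a standard of 100 mg of S/liter that returns between 95 and 105 mg of S/liter would fall inside the ±5% recovery window reported above.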

Predicted lithium concentrations

An RF machine-learning model was developed to predict lithium concentrations in Smackover Formation brines throughout southern Arkansas. The model was developed by (i) assigning explanatory variables to brine samples collected at wells, (ii) tuning the RF model to make predictions at wells and assess model performance, (iii) mapping spatially continuous predictions of lithium concentrations across the Reynolds oolite unit of the Smackover Formation in southern Arkansas, and (iv) inspecting the model for explanatory variable importance and influence. Initial model tuning used the tidymodels framework (52) in R (53) to test XGBoost, K-nearest neighbors, and RF algorithms; RF models consistently had higher accuracy and lower bias, so they were used to train the final model and predict lithium.

Explanatory variables

Explanatory variables used to tune the RF model included geologic, geochemical, and temperature information for Jurassic and Cretaceous units. The geologic framework of the model domain is expected to influence brine chemistry both spatially and with depth. Explanatory variables used to train the RF model must be mapped across the model domain to create spatially continuous predictions of lithium. Thus, spatially continuous subsurface geologic information is key, although these digital resources are often difficult to acquire. Each sample was attributed with all explanatory variables, regardless of the formation that the sample was collected from. See the Supplemental Materials for details about digitizing and creating model grid–based versions of the explanatory variables.

The final model included 11 explanatory variables: well depth and formation for the brine sample, thickness of the Hosston Formation (54–56), thickness of the Cotton Valley Formation (57–59), thickness and facies of the Buckner Member, thickness, altitude of the top, and facies of the Reynolds oolite unit of the Smackover Formation (27), bottom-hole temperature of the Smackover Formation (31), and H2S concentrations in Smackover Formation brines (4, 60). Categorical explanatory variables (facies classes and sample formation) were one hot–encoded (i.e., each category coded as a binary) for the RF model. Explanatory variables were assigned to lithium samples using Python software (61) and the rasterio package (62), which extracted the value of the explanatory variable (generally at a 100-m² resolution) at the well location. Explanatory variables were also resampled from the resolution of the source dataset to the 4-km² resolution of the model grid to map predicted lithium.
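
One-hot encoding, as described above, replaces a categorical variable with one binary column per category. The following is a minimal sketch using only the Python standard library; the sample records and column names are hypothetical and do not come from the study's dataset.

```python
def one_hot_encode(rows, column):
    """Replace one categorical column with binary indicator columns.

    rows   : list of dicts, one per brine sample
    column : name of the categorical key to expand
    """
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


# Hypothetical samples; depths and formation labels are illustrative only.
samples = [
    {"well_depth_m": 2400.0, "formation": "Smackover"},
    {"well_depth_m": 1500.0, "formation": "Cotton Valley"},
    {"well_depth_m": 2600.0, "formation": "Smackover"},
]
encoded = one_hot_encode(samples, "formation")
```

Each encoded sample carries exactly one indicator set to 1, so the formation of the sample is preserved without implying any ordering among formations.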

RF modeling

RF is a type of ensemble tree machine-learning model where many decision trees are generated using a randomly sampled subset of the explanatory variables to make a prediction (63). Predictions for each sample are generated as an average of many model predictions in the ensemble of decision trees (hence, a forest made of trees). RF model tuning, training, and evaluating model performance for predicting lithium concentration were completed in R software (53) using the tidymodels framework (52).

RF models are tuned by varying hyperparameters that control model structure and accuracy, including the number of randomly selected explanatory variables (mtry), number of trees grown (trees), and the minimum number of observations per node (min_n). The model with the combination of hyperparameters that results in the lowest RMSE between observed and predicted values is considered the most accurate model. Tuning and hyperparameter selection were completed using 80% of the lithium concentration data as a training dataset (n = 176) and 10-fold cross-validation. During 10-fold cross-validation tuning, the training data are randomly divided into 10 subsets, and each subset (10% of the training data) is used in turn as a “testing” dataset to evaluate the performance (based on RMSE) of a model fit to the remaining subsets. Hyperparameters for RF tuning ranged from 20 to 1000 trees, 3 to 8 mtry, and 2 to 10 min_n for a total of 480 models (32). The range of hyperparameters used for model tuning represents a range of values recommended for RF modeling and appropriate for the dataset size. Specifically, a small number of trees was tested to understand how model accuracy changed with the number of trees grown to predict the relatively small sample size of the lithium dataset (n = 221).
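
The data partitioning described above can be sketched in a few lines. This is an illustrative standard-library version, not the tidymodels workflow used in the study; the simple random split, the seed, and the function names are assumptions.

```python
import math
import random


def train_holdout_split(n_samples, train_fraction=0.8, seed=0):
    """Randomly split sample indices into training and holdout sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(train_fraction * n_samples)
    return indices[:n_train], indices[n_train:]


def k_fold_indices(indices, k=10):
    """Yield (analysis, assessment) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        assessment = folds[i]
        analysis = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield analysis, assessment


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# 221 lithium samples split 80/20 gives 176 training and 45 holdout samples.
train_idx, holdout_idx = train_holdout_split(221)
folds = list(k_fold_indices(train_idx, k=10))
```

For each hyperparameter combination, a model would be fit on each analysis set and scored with `rmse` on the matching assessment set; the mean and standard error of the 10 scores then summarize that combination.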

The most accurate models tend to be the most complex based on the hyperparameters (that is, a large number of trees, large mtry, and small min_n), so a model within one SE of the model with the lowest RMSE was chosen as the final model. As shown in other machine learning predictions of groundwater geochemistry (64, 65), models within one SE of the highest accuracy model (lowest RMSE) are reasonable models to use for predictions. The one-SE models quantify how model predictions vary on the basis of changes in the hyperparameters, which ultimately control model fit and accuracy. The final RF model was trained on the training dataset, and performance was evaluated using the remaining 20% of the data (holdout, n = 45).
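
The one-SE selection described above can be illustrated as follows. The tuning results are invented for the example, and the rule used to pick a single model among the one-SE candidates (fewest trees, then smallest mtry, then largest min_n) is an assumption; the text does not state how the final model was chosen within that set.

```python
def one_se_candidates(results):
    """Return tuning results whose mean RMSE is within one SE of the best.

    results: list of dicts with keys 'trees', 'mtry', 'min_n',
             'mean_rmse', and 'se' (standard error of RMSE across folds).
    """
    best = min(results, key=lambda r: r["mean_rmse"])
    threshold = best["mean_rmse"] + best["se"]
    return [r for r in results if r["mean_rmse"] <= threshold]


def pick_simplest(candidates):
    """Assumed notion of simplicity: fewer trees, smaller mtry, larger min_n."""
    return min(candidates, key=lambda r: (r["trees"], r["mtry"], -r["min_n"]))


# Hypothetical tuning results (RMSE in mg/liter); not values from the study.
results = [
    {"trees": 1000, "mtry": 8, "min_n": 2, "mean_rmse": 50.0, "se": 4.0},
    {"trees": 500, "mtry": 5, "min_n": 4, "mean_rmse": 52.5, "se": 4.2},
    {"trees": 100, "mtry": 3, "min_n": 10, "mean_rmse": 53.9, "se": 4.5},
    {"trees": 20, "mtry": 3, "min_n": 10, "mean_rmse": 61.0, "se": 5.0},
]
candidates = one_se_candidates(results)
final = pick_simplest(candidates)
```

In this example the threshold is 54.0 mg/liter, so three of the four combinations qualify and the 100-tree model is selected.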

The final RF model was also inspected to understand which explanatory variables were more important for predicting lithium concentration and how the variables influenced the prediction. Because ensemble tree machine-learning models are inherently complex, a single tree or collection of trees cannot be inspected to directly understand how the algorithm uses explanatory variables to make a prediction. SHAP values were calculated to quantify variable importance and influence using the fastshap (66) and shapviz (67) packages. For RF models, explanatory variables are generally more important when they are used earlier in tree building or used many times (19). SHAP values quantify the additive effect that an explanatory variable has on making each prediction by permuting models through many simulations, thus providing information about how each explanatory variable influences the response prediction (68, 69). In general, more positive SHAP values represent higher magnitude predictions (e.g., greater lithium concentration), and negative SHAP values represent lower magnitude values (e.g., lower lithium concentration).
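
The additive property of SHAP values can be demonstrated on a toy model. The sketch below computes exact Shapley values by enumerating every ordering of the explanatory variables, which is practical only for a handful of variables; it is not the approximate method implemented in fastshap, and the toy model, variable names, and values are hypothetical.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values for one prediction by enumerating feature orderings.

    model    : function taking a dict of feature values and returning a number
    x        : dict of feature values for the sample being explained
    baseline : dict of reference values used while a feature is "absent"
    """
    features = list(x)
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        current = dict(baseline)
        previous = model(current)
        for f in order:
            current[f] = x[f]              # reveal feature f
            value = model(current)
            totals[f] += value - previous  # marginal contribution of f
            previous = value
    return {f: totals[f] / len(orderings) for f in features}


def toy_model(v):
    """Toy stand-in for a trained model (hypothetical; not the study's RF)."""
    bonus = 40.0 if v["facies_oolite"] == 1 and v["h2s"] > 10 else 0.0
    return 50.0 + 0.05 * v["depth_m"] + bonus


x = {"depth_m": 2500.0, "facies_oolite": 1, "h2s": 30.0}
baseline = {"depth_m": 2000.0, "facies_oolite": 0, "h2s": 0.0}
phi = shapley_values(toy_model, x, baseline)
```

The values in `phi` sum to the difference between the prediction for `x` and the prediction for `baseline`, and the interaction bonus is shared equally between the two variables that jointly trigger it.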

Prediction maps

The final trained RF model and resampled grids of explanatory variables were used to make a spatially continuous prediction of lithium concentration at the midpoint altitude of the Reynolds oolite unit of the Smackover Formation. Although the model was trained on brine data from other Jurassic and Cretaceous units, predictions were only made for the Smackover Formation because much of the brine data were collected from the Smackover Formation (89%) and most of the explanatory variables represent conditions in the Smackover Formation. The depth of each cell in the lithium prediction map ranged from approximately 1066 to 3444 m deep over the model domain (32).

RF model uncertainty

Uncertainty in lithium concentration predictions was quantified using one-SE models (70). Each combination of hyperparameters from the 425 one-SE models was used to train an RF model and predict lithium across the model domain. At each cell in the model domain, the SD, 5th percentile, 50th percentile (median), and 95th percentile values were calculated from the predicted values from all the one-SE models for that cell. The SD and difference between the 95th and 5th percentiles provide quantitative measures of variability in lithium predictions and, thus, capture model uncertainty. In addition, a qualitative uncertainty was created by comparing the 5th percentile, median, and 95th percentile predictions to a threshold of 100 mg/liter for lithium, which is often cited as the minimum concentration for commercial lithium extraction using current technologies (7). If the 5th, median, and 95th percentile predictions at a cell were all >100 mg/liter, then the prediction was described as “very likely” >100 mg/liter; if all three were ≤100 mg/liter, then the prediction was described as “very likely” <100 mg/liter (32).
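
The per-cell summary and the two qualitative classes named above can be sketched as follows. The percentile definition (linear interpolation between order statistics) is an assumption, and the "intermediate" label is a placeholder for cases that this section does not define.

```python
def percentile(values, q):
    """Percentile (q in 0..100) with linear interpolation between order statistics."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values supplied")
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def classify_cell(predictions, threshold=100.0):
    """Summarize one-SE model predictions (mg/liter) for a single grid cell."""
    p5 = percentile(predictions, 5)
    p50 = percentile(predictions, 50)
    p95 = percentile(predictions, 95)
    if min(p5, p50, p95) > threshold:
        label = f"very likely > {threshold:g} mg/liter"
    elif max(p5, p50, p95) <= threshold:
        label = f"very likely < {threshold:g} mg/liter"
    else:
        label = "intermediate"  # placeholder; not defined in this section
    return {"p5": p5, "p50": p50, "p95": p95, "range_95_5": p95 - p5, "label": label}
```

Applied to the predictions from all one-SE models at a cell, `range_95_5` gives the quantitative spread and `label` gives the qualitative class.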

Lithium resource estimate

The area, thickness, porosity, and water-to-oil ratio of the Smackover Formation were used to calculate brine volume (fig. S2). Much of the oil and brine production occurs within the upper Smackover Formation (39, 42), so the thickness of the Reynolds oolite unit of the upper Smackover Formation from Akin and Graves (27) was used as the thickness for brine calculation. Because a porosity map was not available, porosity values from Smackover core plugs (28) and field averages (29) were used across the model domain. Variation in diagenesis across the south Arkansas shelf is known to control porosity (30), and any variation in the porosity values will affect the calculated mass of lithium. A companion US Geological Survey data release provides Python code to evaluate different porosity scenarios for estimating brine volume (32). Estimates of water-to-oil ratios for oil reservoirs in southern Arkansas (29) were applied to the 4-km² model grid to estimate the volume of pore space that contains water (brine). Where information about oil reservoirs was not available in the source data (29), the water-to-oil ratio was assumed to be 0.9 (90% brine) to account for possible small amounts of oil outside of traps. Water-to-oil ratios were multiplied by the area, thickness, and porosity values to calculate the volume of brine.
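
The brine volume calculation for a single grid cell can be written out as a product of the four quantities named above. This is a minimal sketch, not the code from the companion data release; it treats the water-to-oil ratio as the fraction of pore space filled with brine, and the thickness and porosity in the example are hypothetical.

```python
CELL_AREA_M2 = 4.0e6  # one 4 km² model grid cell


def brine_volume_m3(area_m2, thickness_m, porosity, water_fraction=0.9):
    """Pore volume occupied by brine in one grid cell, in cubic meters.

    water_fraction defaults to 0.9, the value assumed where no oil
    reservoir information was available.
    """
    if not 0.0 <= porosity <= 1.0:
        raise ValueError("porosity must be a fraction between 0 and 1")
    if not 0.0 <= water_fraction <= 1.0:
        raise ValueError("water_fraction must be between 0 and 1")
    return area_m2 * thickness_m * porosity * water_fraction


# Hypothetical cell: 20 m of Reynolds oolite at 15% porosity.
volume = brine_volume_m3(CELL_AREA_M2, thickness_m=20.0, porosity=0.15)
```

Re-running the function with different porosity values shows directly how sensitive the brine volume, and therefore the lithium mass, is to the porosity assumption.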

The predicted lithium concentration maps from the RF model were multiplied by the calculated brine volume to calculate the mass of lithium in the Reynolds oolite unit of the upper Smackover Formation (fig. S2). By using lithium predictions from the RF model, the mass of lithium can vary spatially across the model domain, which provides a more detailed estimate than assuming an average lithium concentration. In addition, the final RF model and uncertainty bounds (at the 5th and 95th percentile predictions of the one-SE models) were used to provide lower and upper estimates of lithium mass based on RF model uncertainty.
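
Converting a predicted concentration and a brine volume into a lithium mass is a unit conversion summed over grid cells. The sketch below assumes metric tons and uses invented cell values; the dictionary keys are illustrative names, not fields from the data release.

```python
LITERS_PER_M3 = 1000.0
MG_PER_METRIC_TON = 1.0e9


def lithium_mass_tons(concentration_mg_per_liter, brine_volume_m3):
    """Mass of dissolved lithium (metric tons) in a given brine volume."""
    milligrams = concentration_mg_per_liter * brine_volume_m3 * LITERS_PER_M3
    return milligrams / MG_PER_METRIC_TON


def total_mass_bounds(cells):
    """Sum lithium mass over cells for the lower, final, and upper predictions.

    cells: list of dicts with 'volume_m3', 'li_p5', 'li_final', 'li_p95'
           (concentrations in mg/liter).
    """
    totals = {"low": 0.0, "final": 0.0, "high": 0.0}
    for cell in cells:
        totals["low"] += lithium_mass_tons(cell["li_p5"], cell["volume_m3"])
        totals["final"] += lithium_mass_tons(cell["li_final"], cell["volume_m3"])
        totals["high"] += lithium_mass_tons(cell["li_p95"], cell["volume_m3"])
    return totals


# Hypothetical cells; volumes and concentrations are illustrative only.
cells = [
    {"volume_m3": 1.08e7, "li_p5": 150.0, "li_final": 200.0, "li_p95": 250.0},
    {"volume_m3": 5.0e6, "li_p5": 40.0, "li_final": 60.0, "li_p95": 90.0},
]
bounds = total_mass_bounds(cells)
```

Because each cell keeps its own concentration, cells with thick, porous reservoir and high predicted lithium dominate the total, which is the spatial detail an average concentration would hide.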

To provide perspective on the mass of lithium from the Smackover Formation that is currently being brought to the surface from oil, gas, or brine production wells, historical brine production data were quantified across the model domain for the past 40 years. Fluid production rates are an important consideration for the viability of lithium extraction from brines (7, 71). Annual water production volumes were acquired from the proprietary Enerdeq datasets for any wells producing from the Smackover Formation in southern Arkansas from 1982 to 2022 (31). Annual water volumes were summed for wells within each 4-km² grid cell of the model domain. In each year of extraction, the calculated mass of lithium (based on the volume of brine extracted and the predicted lithium concentration from the RF model) was assumed to be reinjected into the subsurface, as brine injection is used to dispose of wastewater. Therefore, the brine extraction and reinjection are assumed to result in little to no loss of lithium from the brines through time; this includes brines that are stripped of bromine. More research may be warranted to support this assumption because any substantial loss of lithium will affect the calculated mass of lithium in the subsurface and ratio of lithium brought to the surface.
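
The annual tally described above amounts to summing produced water by grid cell and multiplying by the predicted lithium concentration for that cell. The sketch below assumes a square grid (so a 4 km² cell is 2 km on a side), projected well coordinates in meters, and water volumes already converted to cubic meters; all inputs are hypothetical.

```python
from collections import defaultdict

CELL_SIZE_M = 2000.0  # side of a square 4 km² cell (assumed square grid)


def cell_index(x_m, y_m, origin_x=0.0, origin_y=0.0):
    """Column and row of the grid cell containing a projected coordinate."""
    return (int((x_m - origin_x) // CELL_SIZE_M), int((y_m - origin_y) // CELL_SIZE_M))


def lithium_produced_tons(well_records, predicted_li):
    """Lithium brought to the surface in one year, in metric tons.

    well_records : iterable of (x_m, y_m, water_volume_m3) for one year
    predicted_li : dict mapping cell index to predicted lithium (mg/liter)
    """
    water_by_cell = defaultdict(float)
    for x_m, y_m, volume_m3 in well_records:
        water_by_cell[cell_index(x_m, y_m)] += volume_m3
    # mg/liter * m3 * 1000 liter/m3 / 1e9 mg/ton equals mg/liter * m3 * 1e-6
    return sum(vol * predicted_li[cell] * 1e-6 for cell, vol in water_by_cell.items())


# Hypothetical wells and predictions for a single year.
records = [(500.0, 500.0, 1.0e6), (1500.0, 900.0, 1.0e6), (2500.0, 100.0, 2.0e6)]
predictions = {(0, 0): 200.0, (1, 0): 100.0}
produced = lithium_produced_tons(records, predictions)
```

In this example two wells share one cell, so their water volumes are pooled before the cell's predicted concentration is applied.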

Acknowledgments

We appreciate the cooperation of the private companies and well operators that provided access to sample brines. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the US government.

Funding: This work was supported by the US Geological Survey Water Mission Area Cooperative Matching Funds, US Geological Survey Energy Resources Program, and the Arkansas Department of Energy and Environment—Office of the State Geologist.

Author contributions: Conceptualization: K.J.K., M.S.B., A.M., and P.F. Data curation: K.J.K., P.F., and C.D. Formal analysis: K.J.K., M.S.B., A.M., P.F., B.M., A.H., A.M.J., and J.C. Investigation: K.J.K., M.S.B., A.M., P.F., C.D., and A.M.J. Methodology: K.J.K., M.S.B., A.M., P.F., B.M., A.H., C.M., C.D., S.M.A., A.M.J., and J.C. Visualization: K.J.K., and A.H. Supervision: M.S.B. and S.M.A. Writing—original draft: K.J.K., M.S.B., A.M., P.L., and C.M. Writing—review and editing: K.J.K., M.S.B., A.M., P.F., B.M., P.L., C.M., and S.M.A.

Competing interests: The authors declare that they have no competing interests.

Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. All data, code, and model results are available in a companion US Geological Survey data release (32).

Supplementary Materials

This PDF file includes:

Supplementary Text

Figs. S1 to S3

Table S1

References

sciadv.adp8149_sm.pdf (741.5KB, pdf)

REFERENCES AND NOTES

  • 1.D. C. Bradley, L. L. Stillings, B. W. Jaskula, L. Munk, A. D. McCauley, “Chapter K: Lithium” in Critical Mineral Resources of the United States—Economic and Environmental Geology and Prospects for Future Supply (US Geological Survey, 2017), vol. 1802, p. 862.
  • 2.US Geological Survey, Mineral Commodity Summaries 2024 (US Geological Survey, 2024).
  • 3.J. M. Hammarstrom, C. L. Dicken, W. C. Day, A. H. Hofstra, B. J. Drenth, A. K. Shah, A. E. McCafferty, L. G. Woodruff, N. K. Foley, D. A. Ponce, T. P. Frost, L. L. Stillings, “Focus areas for data acquisition for potential domestic resources of 11 critical minerals in the conterminous United States, Hawaii, and Puerto Rico—Aluminum, cobalt, graphite, lithium, niobium, platinum-group elements, rare earth elements, tantalum, tin, titanium, and tungsten” (USGS Numbered Series 2019-1023-B, US Geological Survey, 2020).
  • 4.M. S. Blondes, K. J. Knierim, M. R. Croke, P. A. Freeman, C. Doolan, A. S. Herzberg, J. L. Shelton, U.S. Geological Survey National Produced Waters Geochemical Database v 3.0, December 2023 (US Geological Survey, 2023).
  • 5.A. G. Collins, Geochemistry of Anomalous Lithium in Oil-Field Brines (Geological Survey, Oklahoma, 1978).
  • 6.M. Blois, ExxonMobil plans to use direct lithium extraction in Arkansas. Chem. Eng. News 101 (2023); https://cen.acs.org/energy/energy-storage-/ExxonMobil-plans-use-direct-lithium/101/i38.
  • 7.E. Knapik, G. Rotko, M. Marszałek, Recovery of lithium from oilfield brines—Current achievements and future perspectives: A mini review. Energies 16, 6628 (2023).
  • 8.E. Bunker, R. Bolton, R. Crossley, M. Broadley, A. Thomas, Lithium Exploration Tools from Source to Sink (GeoConvention, 2022); https://geoconvention.com/wp-content/uploads/abstracts/2022/73906-lithium-exploration-tools-from-source-to-sink.pdf.
  • 9.E. J. M. Dugamin, A. Richard, M. Cathelineau, M.-C. Boiron, F. Despinois, A. Brisset, Groundwater in sedimentary basins as potential lithium resource: A global prospective study. Sci. Rep. 11, 21091 (2021).
  • 10.A. Kumar, H. Fukuda, T. A. Hatton, J. H. Lienhard, Lithium recovery from oil and gas produced water: A need for a growing energy industry. ACS Energy Lett. 4, 1471–1474 (2019).
  • 11.International Energy Agency, Lithium (International Energy Agency, 2024); www.iea.org/reports/lithium.
  • 12.M. L. Vera, W. R. Torres, C. I. Galli, A. Chagnes, V. Flexer, Environmental impact of direct lithium extraction from brines. Nat. Rev. Earth Environ. 4, 149–165 (2023).
  • 13.J. Mackey, D. J. Bain, G. Lackey, J. Gardiner, D. Gulliver, B. Kutchko, Estimates of lithium mass yields from produced water sourced from the Devonian-aged Marcellus Shale. Sci. Rep. 14, 8813 (2024).
  • 14.B. McDevitt, T. L. Tasker, R. Coyte, M. S. Blondes, B. W. Stewart, R. C. Capo, J. A. Hakala, A. Vengosh, W. D. Burgos, N. R. Warner, Utica/Point Pleasant brine isotopic compositions (δ7Li, δ11B, δ138Ba) elucidate mechanisms of lithium enrichment in the Appalachian Basin. Sci. Total Environ. 947, 174588 (2024).
  • 15.R. Darvari, J.-P. Nicot, B. R. Scanlon, J. R. Kyle, B. A. Elliott, K. Uhlman, Controls on lithium content of oilfield waters in Texas and neighboring states (USA). J. Geochem. Explor. 257, 107363 (2024).
  • 16.G. L. Macpherson, Lithium in fluids from Paleozoic-aged reservoirs, Appalachian Plateau region, USA. Appl. Geochem. 60, 72–77 (2015).
  • 17.E. D. Attanasi, T. C. Coburn, P. A. Freeman, Machine learning approaches to identify lithium concentration in petroleum produced waters. Miner. Econ. 10.1007/s13563-023-00409-8 (2024).
  • 18.M. A. Lombard, E. E. Brown, D. M. Saftner, M. M. Arienzo, E. Fuller-Thomson, C. J. Brown, J. D. Ayotte, Estimating lithium concentrations in groundwater used as drinking water for the conterminous United States. Environ. Sci. Technol. 58, 1255–1264 (2024).
  • 19.M. Kuhn, K. Johnson, Applied Predictive Modeling (Springer, 2016); 10.1007/978-1-4614-6849-3.
  • 20.K. J. Knierim, J. A. Kingsbury, C. J. Haugh, K. M. Ransom, Using boosted regression tree models to predict salinity in Mississippi Embayment Aquifers, Central United States. J. Am. Water Resour. Assoc. 56, 1010–1029 (2020).
  • 21.E. Farahbakhsh, J. Maughan, R. D. Müller, Prospectivity modelling of critical mineral deposits using a generative adversarial network with oversampling and positive-unlabelled bagging. Ore Geol. Rev. 162, 105665 (2023).
  • 22.J. Podgorski, M. Berg, Global threat of arsenic in groundwater. Science 368, 845–850 (2020).
  • 23.D. R. Eccles, J. Touw, W. Novak, R. M. McGowen, “Maiden inferred bromine- and lithium-brine resource estimations for Tetra Technologies, Inc.’s Tetra Property in Arkansas, United States” (S-K 1300 Technical Report, TETRA Technologies, 2022).
  • 24.NORAM Engineering Constructors Ltd., “Standard Lithium LTD. Preliminary economic assessment of SW Arkansas Lithium Project, NI 43-101 Technical Report, amended” (Document No. E3580-RP-0200, NORAM Engineering Constructors Ltd., 2021); www.sec.gov/Archives/edgar/data/1537137/000117184321008188/exh_991.htm.
  • 25.P. Daitch, “Lithium extraction from oilfield brine,” thesis, University of Texas, Austin, TX (2018).
  • 26.G. Zhang, Y. Lu, Bias-corrected random forests in regression. J. Appl. Stat. 39, 151–160 (2012).
  • 27.R. H. Akin, R. W. Graves, Reynolds oolite of Southern Arkansas. Am. Assoc. Pet. Geol. 53, 1909–1922 (1969).
  • 28.P. Li, “Core plug analysis of the upper Smackover Formation in Lafayette County, southwestern Arkansas” (Open-File Report 2023–02, Office of the State Geologist, 2023); www.geology.arkansas.gov/publication/open_file_reports/OFR-2023-02-open-file-report.html.
  • 29.Nehring Associates, The significant oil and gas fields of the United States database [data current as of December 2017] (Colorado Springs, 2019).
  • 30.C. H. Moore, Y. Druckman, Burial diagenesis and porosity evolution, Upper Jurassic Smackover, Arkansas and Louisiana. Am. Assoc. Pet. Geol., Bull. 65, 5569412 (1981).
  • 31.S&P Global, Enerdeq US Well History and Production; database available from S&P Global Commodity Insight (2023); www.spglobal.com.
  • 32.K. J. Knierim, M. S. Blondes, P. A. Freeman, A. L. Masterson, B. McDevitt, A. S. Herzberg, C. Doolan, J. M. Chenault, A. Jubb, M. R. Croke, Lithium observations, machine-learning predictions, and mass estimates from the Smackover Formation brines in southern Arkansas (data release, US Geological Survey, 2024); 10.5066/P9QPRYZN.
  • 33.T. Hengl, M. Nussbaum, M. N. Wright, G. B. M. Heuvelink, B. Gräler, Random forest as a generic framework for predictive modeling of spatial and spatio-temporal variables. PeerJ 6, e5518 (2018).
  • 34.US Geological Survey, Mineral Commodity Summaries 2023 (US Geological Survey, 2023).
  • 35.A. Salvador, “Chapter 8: Triassic-Jurassic” in The Gulf of Mexico Basin, A. Salvador, Ed., The Geology of North America (Geological Society of America, 1991), pp. 131–180.
  • 36.K. R. Scott, W. E. Hayes, R. P. Fietz, Geology of the Eagle Mills Formation. Gulf Coast Assoc. Geol. Soc. Trans. 11, 1–14 (1961).
  • 37.J. W. Snedden, W. E. Galloway, The Gulf of Mexico Sedimentary Basin: Depositional Evolution and Petroleum Applications (Cambridge Univ. Press, ed. 1, 2019); www.cambridge.org/core/product/identifier/9781108292795/type/book.
  • 38.R. W. Imlay, Lower Cretaceous and Jurassic Formations of Southern Arkansas and Their Oil and Gas Possibilities, information circular 12 (Arkansas Resources and Development Commission, Division of Geology, 1940); www.geology.arkansas.gov/docs/pdf/publication/information-circulars/IC-12.pdf.
  • 39.W. B. Weeks, South Arkansas stratigraphy with emphasis on the older coastal plain beds. AAPG Bull. 22, 953–983 (1938).
  • 40.E. A. Mancini, P. Li, D. A. Goddard, R. K. Zimmerman, Petroleum source rocks of the onshore interior salt basins North Central and Northeastern Gulf of Mexico. Gulf Coast Assoc. Geol. Soc. Trans. 55, 486–504 (2005).
  • 41.Arkansas Oil and Gas Commission, Annual report of production 2016 (2016); www.aogc.state.ar.us/annual/default.aspx.
  • 42.J. H. Vestal, Petroleum Geology of the Smackover Formation of Southern Arkansas, information circular 14 (Arkansas Geological Survey, 1950); www.geology.arkansas.gov/publication/information-circulars/IC-14-information-circular.html.
  • 43.R. B. Stroud, R. H. Arndt, F. B. Fulkerson, W. G. Diamond, Mineral Resources and Industries of Arkansas, bulletin 645 (US Bureau of Mines, 1969); https://digital.library.unt.edu/ark:/67531/metadc12795/m2/1/high_res_d/Bulletin0645.pdf.
  • 44.J. L. Roberts, B. E. Lock, The Rodessa Formation in Bossier Parish Louisiana: Lithofacies analysis of a hydrocarbon productive shallow water clastic-carbonate sequence. Gulf Coast Assoc. Geol. Soc. Trans. 38, 103–111 (1988).
  • 45.S. H. Ogier, Stratigraphy of the Upper Cretaceous Tokio Formation, Caddo Parish, Louisiana (Shreveport Geological Society, 1963).
  • 46.T. Cullom, W. Granata, S. Gayer, R. Heffner, S. Pike, L. Hermann, C. Meyertons, G. Sigler, The basin frontiers and limits for exploration in the Cretaceous System of central Louisiana. Gulf Coast Assoc. Geol. Soc. Trans. 12, 97–115 (1962).
  • 47.E. P. Moldovanyi, L. M. Walter, Regional trends in water chemistry, Smackover Formation, Southwest Arkansas: Geochemical and physical controls. AAPG Bull. 76, 864–894 (1992).
  • 48.A. M. Stueber, P. Pushkar, E. A. Hetherington, A strontium isotopic study of Smackover brines and associated solids, southern Arkansas. Geochim. Cosmochim. Acta 48, 1637–1649 (1984).
  • 49.A. G. Collins, Geochemistry of Liquids, Gases, and Rocks From the Smackover Formation, report of investigations 7897 (US Bureau of Mines, 1974).
  • 50.M. L. Trout, “Origin of bromide-rich brines in southern Arkansas,” thesis, University of Missouri, Columbia, MO (1974).
  • 51.M. S. Blondes, J. L. Shelton, M. A. Engle, J. P. Trembly, C. A. Doolan, A. M. Jubb, J. C. Chenault, E. L. Rowan, R. J. Haefner, B. E. Mailot, Utica shale play oil and gas brines: Geochemistry and factors influencing wastewater management. Environ. Sci. Technol. 54, 13917–13925 (2020).
  • 52.M. Kuhn, H. Wickham, Tidymodels: A collection of packages for modeling and machine learning using tidyverse principles, version 1.1 (2020); www.tidymodels.org.
  • 53.R Core Team, R: A language and environment for statistical computing, version 4.3, R Foundation for Statistical Computing (2023); www.r-project.org/.
  • 54.T. S. Dyman, S. M. Condon, Estimated Thickness of the Travis Peak-Hosston Formations to the Top of the Cotton Valley Group, Western Gulf and East Texas Basin and Louisiana-Mississippi Salt Basins Provinces (047, 048 and 049) (US Geological Survey, 2005).
  • 55.T. S. Dyman, S. M. Condon, Structure Contour of the Top of the Travis Peak-Hosston Formations, Western Gulf and East Texas Basin and Louisiana-Mississippi Salt Basins Provinces (047, 048 and 049) (US Geological Survey, 2005).
  • 56.T. S. Dyman, S. M. Condon, “Chapter 5: Assessment of undiscovered conventional oil and gas resources—Lower Cretaceous Travis Peak and Hosston Formations, Jurassic Smackover interior salt basins total petroleum system, in the East Texas Basin and Louisiana-Mississippi Salt Basins provinces” in Petroleum Systems and Geologic Assessment of Undiscovered Oil and Gas, Cotton Valley Group and Travis Peak-Hosston Formations, East Texas Basin and Louisiana-Mississippi Salt Basins Provinces of the Northern Gulf Coast Region (Data Series 69-E), Digital Data Series (US Geological Survey, 2006), p. 43; 10.3133/ds69E_chapter5.
  • 57.T. S. Dyman, S. M. Condon, “Chapter 2: Assessment of undiscovered conventional oil and gas resources—Upper Jurassic–Lower Cretaceous Cotton Valley Group, Jurassic Smackover interior salt basins total petroleum system, in the East Texas Basin and Louisiana-Mississippi Salt Basins provinces” in Petroleum Systems and Geologic Assessment of Undiscovered Oil and Gas, Cotton Valley Group and Travis Peak–Hosston Formations, East Texas Basin and Louisiana-Mississippi Salt Basins Provinces of the Northern Gulf Coast Region, Digital Data Series (US Geological Survey, 2006), p. 52; 10.3133/ds69E_chapter2.
  • 58.T. S. Dyman, S. M. Condon, “Estimated Thickness of the Cotton Valley Group to the Top of the Smackover Formation, Western Gulf and East Texas Basin and Louisiana-Mississippi Salt Basins Provinces (047, 048 and 049)” (US Geological Survey, 2005).
  • 59.T. S. Dyman, S. M. Condon, Structure Contour of the Top of the Cotton Valley Group, Western Gulf and East Texas Basin and Louisiana-Mississippi Salt Basins Provinces (047, 048 and 049) (US Geological Survey, 2005).
  • 60.S. T. Brennan, J. L. Rivera, B. Varela, A. J. Park, L. A. Agyepong, Natural Gas Compositional Analyses Dataset of Gases from United States Wells (US Geological Survey, 2021).
  • 61.Python Software Foundation, Python language reference, v 3.9, Python.org (2023); www.python.org.
  • 62.S. Gillies, Rasterio: Geospatial raster I/O for Python programmers, version 1.2, Mapbox (2023); https://github.com/rasterio/rasterio.
  • 63.L. Breiman, Random forests. Mach. Learn. 45, 5–32 (2001).
  • 64.K. J. Knierim, J. A. Kingsbury, K. Belitz, P. E. Stackelberg, B. J. Minsley, J. R. Rigby, Mapped predictions of manganese and arsenic in an alluvial aquifer using boosted regression trees. Groundwater 60, 362–376 (2022).
  • 65.B. T. Nolan, M. N. Fienen, D. L. Lorenz, A statistical learning framework for groundwater nitrate models of the Central Valley, California, USA. J. Hydrol. 531 (Pt. 3), 902–911 (2015).
  • 66.B. M. Greenwell, fastshap: Fast approximate Shapley values, version 0.1 (2023); https://github.com/bgreenwell/fastshap.
  • 67.M. Mayer, A. Stando, shapviz: SHAP visualizations, version 0.9 (2023); https://cran.r-project.org/web/packages/shapviz/index.html.
  • 68.S. M. Lundberg, S.-I. Lee, “A unified approach to interpreting model predictions” in 31st Conference on Neural Information Processing Systems (NIPS 2017) (Advances in Neural Information Processing Systems 30, 2017); https://proceedings.neurips.cc/paper_files/paper/2017/file/8a20a8621978632d76c43dfd28b67767-Paper.pdf.
  • 69.S. M. Lundberg, G. G. Erion, S.-I. Lee, Consistent individualized feature attribution for tree ensembles. arXiv:1802.03888 [cs.LG] (7 March 2019).
  • 70.C. D. Killian, K. J. Knierim, “Machine-learning predictions of groundwater specific conductance in the Mississippi Alluvial Plain, South-Central United States, with evaluation of regional geophysical aerial electromagnetic data as explanatory variables” (Scientific Investigations Report 2023–5099, US Geological Survey, 2023).
  • 71.M. Marza, G. Ferguson, J. Thorson, I. Barton, J.-H. Kim, L. Ma, J. McIntosh, Geological controls on lithium production from basinal brines across North America. J. Geochem. Explor. 257, 107383 (2024).
  • 72.Esri, ArcGIS Pro, version 3.1 (2023); www.esri.com/en-us/arcgis/products/arcgis-pro/overview.
  • 73.American Association of Petroleum Geologists, Reservoirs and Petroleum Systems of the Gulf Coast (2023); www.datapages.com/gis-map-publishing-program/gis-open-files/geographic/reservoirs-and-petroleum-systems-of-the-gulf-coast.
  • 74.L. A. Burke, O. N. Pearson, S. A. Kinney, “New method for correcting bottomhole temperatures acquired from wireline logging measurements and calibrated for the Onshore Gulf of Mexico Basin, U.S.A.” (Open-File Report 2019–1143, US Geological Survey, 2019).
  • 75.J. J. Carroll, A. E. Mather, The solubility of hydrogen sulphide in water from 0 to 90°C and pressures to 1 MPa. Geochim. Cosmochim. Acta 53, 1163–1170 (1989).
  • 76.L. Lee, NADA package, version 1.6 (2020); https://CRAN.R-project.org/package=NADA.
  • 77.O. Mersmann, H. Trautmann, D. Steuer, B. Bornkamp, truncnorm: Truncated normal distribution, version 1.0 (2023); https://cran.r-project.org/web/packages/truncnorm/index.html.


Articles from Science Advances are provided here courtesy of American Association for the Advancement of Science
