Abstract
Accurately capturing the complex interaction between CO2 and water in porous media at the pore scale is essential for various geoscience applications, including carbon capture and storage (CCS). We introduce a comprehensive dataset generated from high-fidelity numerical simulations of this interaction. The dataset consists of 624 2D samples, each of size 512 × 512 with a resolution of 35 μm, covering 100 time steps under a constant CO2 injection rate. It includes various levels of heterogeneity, represented by different grain sizes with random variation in spacing, offering a robust testbed for developing predictive models. This dataset provides high-resolution temporal and spatial information crucial for benchmarking machine learning models.
Subject terms: Geochemistry, Geophysics
Background & Summary
CO2 transport through porous media plays a critical role in both natural and engineered processes, including subsurface carbon sequestration1,2, enhanced oil recovery3, and groundwater management4. The challenge lies in accurately characterizing the movement and saturation of CO2, which is influenced by the complex interactions between fluid phases and the geological heterogeneity of the porous structure5. As CO2 is injected into underground formations, its movement through the pore spaces of geological materials, such as sandstone or basaltic reservoirs, dictates how efficiently it can be stored over long periods. This transport process is influenced by various factors, including capillary forces and chemical interactions between CO2, brine, and the mineral matrix.
Various approaches are used to understand and predict CO2 transport in porous media. Laboratory techniques, such as core flooding experiments6, yield effective bulk properties like permeability and residual saturation. Advanced imaging methods, like X-ray micro-tomography7, allow visualization of pore-scale phenomena but have limitations, especially for dynamic processes. Numerical simulations, including lattice Boltzmann8, pore-network modeling9, and direct numerical simulation10, offer more precise estimates of the fluid properties, but at a significant computational cost.
Machine learning (ML) models are emerging as valuable tools for predicting CO2 behavior in porous media, serving as efficient surrogates for computationally expensive simulations. Recent advancements highlight ML’s potential to estimate properties, like pressure build-up and saturation levels, with impressive speed and accuracy11–16. The principle of these models is to learn the relationship between inputs—such as physical properties of porous media and engineering parameters—and outputs, like spatial and temporal fluid changes. Once trained on a set of representative samples, these models can generalize to predict unseen patterns, such as new permeability fields or different injection scenarios, with considerable efficiency.
However, challenges remain in obtaining sufficiently large and diverse datasets for training robust models that generalize well across various scenarios. For example, current datasets often remain constrained to relatively small scales, such as maximum mesh sizes of 256 × 256 (refs. 17–22), which limits the ability of these models to capture the fine-grained patterns necessary for accurate predictions in complex formations. Another key limitation is that most datasets designed for machine learning models focus on predicting the final state (e.g., after the injection duration) rather than capturing intermediate states18–20. This restricts the ability of models to capture the dynamic evolution of processes over time, which is crucial for understanding transient CO2 behavior in real-world geological scenarios.
In this paper, we introduce a high-resolution dataset designed for benchmarking machine learning models in predicting CO2 behavior during multiphase flow in porous media. The dataset comprises 624 two-dimensional samples, each of size 512 × 512 pixels with a spatial resolution of 35 μm, capturing the intricate interplay between CO2 and water over 100 equally spaced temporal snapshots under a constant CO2 injection rate. A distinctive feature of this dataset is its incorporation of varying levels of heterogeneity, represented through different grain sizes, which simulate realistic geological variability. This comprehensive dataset offers critical temporal and spatial granularity, serving as a resource for developing and benchmarking machine learning models.
Methods
Geometry Preprocessing
The pore structures are generated with the open-source notebook DrawMicromodels.ipynb (https://github.com/hannahmenke/DrawMicromodels, commit 5e0f947), which perturbs a regular triangular lattice of mean grain radius R0 using three heterogeneity amplitudes. For the n-th grain, each perturbation term δ ∈ [−a, a] is sampled from a uniform distribution whose half-width a is the level-dependent deviation listed in Table 1. Five levels are defined, ranging from well-sorted media (Level 1) to highly heterogeneous media (Level 5).
Table 1.
Quantitative definition of the five heterogeneity levels (dimensionless amplitudes).
| Level | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| | 0.05 | 0.10 | 0.15 | 0.20 | 0.25 |
| | 0.02 | 0.04 | 0.06 | 0.08 | 0.10 |
| | 0.02 | 0.04 | 0.06 | 0.08 | 0.10 |
Physical motivation
The radius variation mimics sedimentary sorting, while positional jitter reproduces local compaction and packing irregularities observed in outcrop sandstones (CV ≈ 0.05-0.25). Increasing these amplitudes therefore widens the pore–throat distribution and the capillary contrast, both of which are known to control CO2-water displacement dynamics.
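The perturbation of a regular triangular lattice described above can be sketched as follows. This is a minimal illustration and not the DrawMicromodels implementation: the function name, the lattice spacing, the multiplicative form of the radius perturbation, the scaling of the positional jitter by the lattice spacing, and the assignment of the three amplitudes to radius, x-position and y-position are all assumptions made for this example.

```python
import numpy as np

def jittered_triangular_lattice(n_rows, n_cols, spacing, r0, a_r, a_x, a_y, seed=0):
    """Return grain centres (x, y) and radii for a perturbed triangular lattice.

    Hypothetical perturbation model (an assumption, not taken from the notebook):
      radius = r0 * (1 + d_r),  x = x0 + spacing * d_x,  y = y0 + spacing * d_y,
    where each d_* is drawn uniformly from [-a_*, a_*].
    """
    rng = np.random.default_rng(seed)
    rows, cols = np.meshgrid(np.arange(n_rows), np.arange(n_cols), indexing="ij")
    # Regular triangular lattice: odd rows are shifted by half a spacing.
    x0 = spacing * (cols + 0.5 * (rows % 2))
    y0 = spacing * (np.sqrt(3.0) / 2.0) * rows
    # Independent uniform perturbations for radius and position.
    d_r = rng.uniform(-a_r, a_r, size=x0.shape)
    d_x = rng.uniform(-a_x, a_x, size=x0.shape)
    d_y = rng.uniform(-a_y, a_y, size=x0.shape)
    return x0 + spacing * d_x, y0 + spacing * d_y, r0 * (1.0 + d_r)

# Example with the Level 3 amplitudes of Table 1 (row-to-parameter mapping assumed)
# and a hypothetical lattice spacing of 200.
x, y, radius = jittered_triangular_lattice(4, 5, 200.0, 80.0, 0.15, 0.06, 0.06)
```

Larger amplitudes widen the spread of radii and centre positions, which is the mechanism by which the higher levels broaden the pore-throat distribution.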
Parametric sweep and augmentation
For each level we perform a deterministic sweep over R0 ∈ {70, 80, 90} and target porosities ϕ ∈ {0.20, 0.25, 0.30, 0.35, 0.40, 0.45}, producing 5 × 3 × 6 = 90 base images. Twelve images that displayed percolation shortcuts were discarded after visual inspection, leaving 78 accepted bases. Each 1024 × 1024 image is subsequently cropped into four non-overlapping quadrants (512 × 512), and each quadrant is kept both in its original orientation and mirrored vertically. This yields the final ensemble of 78 × 4 × 2 = 624 geometries used in this study, as shown in Fig. 1. Exposing ML models to a range of grain size distributions and spatial configurations enhances their ability to generalize to unseen porous media: the inter-sample sweep forces machine-learning surrogates to learn scale-invariant descriptors, while the intra-sample jitter trains them to handle local anomalies. Both are crucial for keeping predictions accurate across different geological formations. Each geometry is of size 512 × 512, with a physical resolution of 35 μm per pixel. All samples are available in HDF5 format along with the simulations.
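The sample count and the crop-and-mirror augmentation can be checked with a short sketch. The function names are hypothetical, and the vertical mirror is assumed here to be an up-down flip (`np.flipud`), which keeps the left-to-right injection direction unchanged; the actual preprocessing scripts may differ.

```python
import numpy as np

def count_geometries(n_levels=5, n_radii=3, n_porosities=6, n_discarded=12):
    """Reproduce the sample-count arithmetic: sweep, discard, crop, mirror."""
    n_base = n_levels * n_radii * n_porosities   # 5 x 3 x 6 = 90 base images
    n_accepted = n_base - n_discarded            # 90 - 12 = 78 accepted bases
    return n_accepted * 4 * 2                    # 4 quadrants x 2 orientations

def quadrants_with_flips(image):
    """Split a square image into four non-overlapping quadrants and
    return each quadrant followed by its up-down mirrored copy."""
    half_h, half_w = image.shape[0] // 2, image.shape[1] // 2
    geometries = []
    for r in (0, half_h):
        for c in (0, half_w):
            quadrant = image[r:r + half_h, c:c + half_w]
            geometries.append(quadrant)
            geometries.append(np.flipud(quadrant))  # assumed mirror axis
    return geometries

# A synthetic 1024 x 1024 base image yields eight 512 x 512 geometries.
base = np.arange(1024 * 1024).reshape(1024, 1024)
geometries = quadrants_with_flips(base)
total = count_geometries()
```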
Fig. 1.
Some examples of domain geometries corresponding to different patterns of heterogeneity. The heterogeneity level increases from left to right.
Multi-phase flow at the pore-scale
Understanding CO2 injection into water-filled porous media at the pore scale is critical for designing effective carbon storage strategies, especially in tight reservoirs where pore structures are highly heterogeneous and capillary forces dominate. At this scale, the interplay between fluid properties, pore geometry, and interfacial dynamics significantly influences the distribution and transport of CO2. These micro-scale interactions can lead to complex displacement patterns, including snap-off, coalescence, and ganglion migration, that are difficult or impossible to capture with conventional Darcy-scale constitutive functions such as saturation-dependent capillary pressure and relative permeabilities. Robust Darcy-scale models, however, are key to predicting CO2 migration and storage efficiency.
The two-phase flow simulations in this study were conducted using GeoChemFoam10, an advanced open-source numerical simulator developed at the Institute of GeoEnergy Engineering at Heriot-Watt University. GeoChemFoam is based on the OpenFOAM framework and is specifically designed to investigate pore-scale processes critical to energy transition and carbon storage.
GeoChemFoam uses the algebraic Volume-of-Fluid method23 to solve multiphase flow. The velocity u and the pressure p solve the single-field Navier-Stokes Equations (NSE):
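$$\nabla \cdot \mathbf{u} = 0, \tag{1}$$

$$\frac{\partial \rho \mathbf{u}}{\partial t} + \nabla \cdot (\rho \mathbf{u} \mathbf{u}) = -\nabla p + \nabla \cdot \mathbf{S} + \mathbf{f}_{st}, \tag{2}$$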
where:
ρ = α1ρ1 + α2ρ2 is the fluid density,
u is the velocity,
S = μ(∇u + ∇uᵀ) is the viscous stress,
μ = α1μ1 + α2μ2 is the fluid viscosity,
p is the pressure,
fst is the surface tension force,
αi is the phase volume fraction, and
i = 1, 2 refers to the phase index.
The surface tension force is approximated using the Continuous Surface Force (CSF) model23:
$$\mathbf{f}_{st} = \sigma \kappa \nabla \alpha_1, \tag{3}$$

where:
σ is the interfacial tension, and
κ is the interface curvature.
The phase indicator function α1 solves the phase transport equation:
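$$\frac{\partial \alpha_1}{\partial t} + \nabla \cdot (\alpha_1 \mathbf{u}) + \nabla \cdot (\alpha_1 \alpha_2 \mathbf{u}_r) = 0, \tag{4}$$

where ur is the relative velocity between the two phases. To reduce interface smearing, an artificial compression term is introduced by replacing ur with a compressive velocity23.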
Each geometry is a domain of 512 × 512 voxels with a resolution and depth of 35 μm. We perform a two-phase flow simulation in which CO2 is injected into a fully water-filled model from the left boundary, as shown in Fig. 2, at a flow rate of 1 × 10−8 m3/s, corresponding to a capillary number of approximately 5 × 10−6. The CO2 properties are set to be and . The water properties are ρwater = 1 × 103 kg/m3 and kinematic viscosity νwater = 1 × 10−6 m2/s, with an interfacial tension between the phases of 0.03 N/m and a contact angle θ = 45°. Each simulation was run to a total time of 1 s with a write interval of 0.01 s and a convergence tolerance of 1 × 10−8.
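A few quantities implied by this setup can be derived directly from the stated numbers. The sketch below assumes that the inlet spans the full height of the left boundary, so the resulting velocity is a mean superficial value over that cross-section; the variable names are illustrative only.

```python
# Values stated in the text.
pixel_size = 35e-6        # m, in-plane resolution and model depth
n_pixels = 512            # pixels per side
flow_rate = 1e-8          # m^3/s, constant CO2 injection rate
total_time = 1.0          # s
write_interval = 0.01     # s

# Physical side length of the square domain: 512 x 35 um = 17.92 mm.
domain_length = n_pixels * pixel_size

# Inlet cross-section = domain height x depth (assumes the inlet covers
# the whole left boundary, grains included).
inlet_area = domain_length * pixel_size

# Mean superficial inlet velocity, roughly 1.6e-2 m/s under this assumption.
mean_inlet_velocity = flow_rate / inlet_area

# Number of stored snapshots per simulation: 1 s / 0.01 s = 100.
n_snapshots = round(total_time / write_interval)
```

The snapshot count matches the 100 time steps stored per sample in the dataset.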
Fig. 2.

Visualization of CO2 injection in porous media initially saturated with water. The CO2 is injected from the left boundary, displacing the water phase as it migrates through the pore space.
In Fig. 3, we show the CO2 migration pattern for different heterogeneity levels as it displaces water at different time steps. Over time, the CO2 saturation front expands, displaying distinct channelized patterns and regions of accumulation. These patterns demonstrate the interaction between capillary forces, viscous forces, and the underlying geological features. The time-lapse progression also reveals the impact of grain size and pore structure on flow dynamics, emphasizing the importance of micro-scale processes in controlling large-scale behavior. We also show the pressure, capillary pressure, and vertical velocity fields for different geometries in Figs. 4, 5, and 6, respectively.
Fig. 3.
CO2 (yellow) displacing water in a porous medium during the simulation time. Each row shows an example of the 5 heterogeneity levels in the dataset.
Fig. 4.
Pressure field at different injection durations. Each row shows an example of the 5 heterogeneity levels in the dataset.
Fig. 5.
Capillary pressure field at different injection durations. Each row shows an example of the 5 heterogeneity levels in the dataset.
Fig. 6.
Vertical velocity field at different injection durations. Each row shows an example of the 5 heterogeneity levels in the dataset.
Data Records
The dataset is available on Dryad24 (https://doi.org/10.5061/dryad.jm63xsjn5) and is organized into 10 folders: each of the 5 heterogeneity levels has one folder for its original geometries and one for their vertically flipped versions (2 × 5 = 10). The simulation samples are provided in HDF5 format, with each file including water saturation (αwater), pressure (p), capillary pressure (pc), horizontal velocity (Ux), vertical velocity (Uy), and a binary image of the physical domain (where pores are denoted by 1 and grains by 0), as detailed in Table 2, which also lists the keys required to access the data. The water saturation αwater is in the range [0, 1]; hence, the CO2 saturation field can be computed using the relation αCO2 = (1 − αwater) × img, where img denotes the binary physical domain. Additionally, CSV files containing values for porosity, permeability, and relative permeability are provided, with details presented in Table 3.
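Reading one sample and deriving the CO2 saturation might look like the sketch below. It uses the keys listed in Table 2 and assumes that they sit at the root of each HDF5 file, that time-dependent fields are stored as (time, height, width) arrays, and that the CO2 saturation is (1 − αwater) × img; the function names and any file path passed to them are hypothetical.

```python
import numpy as np

KEYS = ("Ux", "Uy", "alpha_water", "img", "p", "pc")  # keys from Table 2

def load_sample(path):
    """Load all fields of one simulation file into NumPy arrays.
    Assumes the keys are stored at the root of the HDF5 file."""
    import h5py  # third-party dependency, imported only when a file is read
    with h5py.File(path, "r") as f:
        return {key: f[key][()] for key in KEYS}

def co2_saturation(alpha_water, img):
    """CO2 saturation field, assuming alpha_CO2 = (1 - alpha_water) * img.
    The 2D mask img broadcasts over the leading time axis."""
    return (1.0 - np.asarray(alpha_water)) * np.asarray(img)

def mean_co2_saturation(alpha_water, img):
    """Pore-averaged CO2 saturation at each stored time step."""
    alpha_co2 = co2_saturation(alpha_water, img)
    return alpha_co2.sum(axis=(-2, -1)) / np.asarray(img).sum()
```

Multiplying by img masks out the grains, so only pore pixels contribute to the saturation.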
Table 2.
Overview of the dataset files, including flow velocity components, pressure fields, and physical domain representations with corresponding sizes and descriptions.
| File Name | Key | Size | Description |
|---|---|---|---|
| *.hdf5 | Ux | 100 × 512² | x-component of flow velocity |
| | Uy | 100 × 512² | y-component of flow velocity |
| | alpha_water | 100 × 512² | water saturation field over time |
| | img | 512² | physical domain |
| | p | 100 × 512² | pressure field |
| | pc | 100 × 512² | capillary pressure field |
Keys are provided for accessing hdf5 files.
Table 3.
List of files describing porosity and relative permeability values.
| File Name | Description |
|---|---|
| poroPerm.csv | Time, porosity, permeability (m2), the characteristic pore length L, the Reynolds number Re, and the Darcy velocity UD at the beginning of the simulation before any CO2 is injected into the model. |
| relperm.csv | Porosity, permeability (m2), and the capillary number of each phase (Ca1 for water and Ca2 for CO2) at the beginning of the simulation. The saturation of water Sw, the relative permeability of water krw, and the relative permeability of CO2 kwo are shown for each output timestep. |
Technical Validation
The GeoChemFoam solver used for flow simulation has been validated against experimental data in ref. 25. For accurate approximation, a convergence tolerance of 1 × 10−8 was used for all samples.
To assess the dataset’s utility for improving model generalization, three models of a U-Net architecture26 were trained on datasets of varying levels of heterogeneity. Each model was trained to predict future CO2 saturation by mapping a sequence of four consecutive saturation maps to the subsequent four timesteps. During evaluation, these models were applied in an autoregressive fashion to generate long-term predictions up to 60 timesteps. Model A was trained on the full dataset (5-Levels), model B was trained on a subset containing four of the five levels (4-Levels), and model C was trained on a subset with only the first level (1-Level). All models were then evaluated on samples from the fifth level, unseen by models B and C. For this analysis, all input samples were resized to 256 × 256 pixels, and predictions were made for the first 60 timesteps.
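The autoregressive evaluation described above can be sketched as follows. The rollout function is a generic illustration, not the authors' evaluation code: `model` stands for any callable that maps four saturation maps to the next four, and the persistence model used here is a hypothetical stand-in for a trained U-Net.

```python
import numpy as np

def autoregressive_rollout(model, initial_frames, n_steps):
    """Predict n_steps future frames by repeatedly feeding the model
    its own most recent predictions."""
    window = np.asarray(initial_frames, dtype=float)
    n_in = window.shape[0]            # number of input frames (4 in the text)
    blocks, n_done = [], 0
    while n_done < n_steps:
        block = np.asarray(model(window))   # next block of predicted frames
        blocks.append(block)
        n_done += block.shape[0]
        # Slide the input window forward over the newest frames.
        window = np.concatenate([window, block], axis=0)[-n_in:]
    return np.concatenate(blocks, axis=0)[:n_steps]

def persistence_model(window):
    """Hypothetical stand-in for a trained network: repeat the last
    input frame four times."""
    return np.repeat(window[-1:], 4, axis=0)

# Four initial maps on a small synthetic grid, rolled out to 60 steps.
frames = np.random.default_rng(0).random((4, 8, 8))
prediction = autoregressive_rollout(persistence_model, frames, 60)
```

Because each block of predictions becomes the next input, errors can accumulate over the rollout, which is why long-horizon accuracy is a demanding test.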
The results, summarized in Table 4, indicate a clear benefit to training on a more diverse dataset. The 4-Levels model achieved a lower Mean Squared Error (MSE) on average (0.0254) across the test samples compared to the 1-Level model (0.0320). This demonstrates superior average performance and generalization. The 5-Levels model, having been trained on the test data, served as a benchmark and predictably achieved the lowest average MSE (0.0145). A direct visual comparison of the predicted simulations against the ground truth, as seen in Fig. 7, corroborates these quantitative findings. Furthermore, the qualitative error maps in Fig. 8 visualize this trend, showing progressively lower absolute error from the 1-Level to the 5-Levels model. However, the per-sample MSE plots in Fig. 9 reveal that this improvement was not uniform across all samples; in some cases, the 4-Levels model performed similarly to, or slightly worse than, the 1-Level model. This suggests that while training on more varied data helps the model learn more general rules, it can also introduce biases that hinder performance on specific out-of-distribution samples. The primary conclusion is that increased training data diversity leads to better average generalization, though not necessarily universal improvement on every individual sample.
Table 4.
Summary statistics for model performance on the unseen fifth level.
| Model Name | Mean MSE | Final Step MSE | Std Dev |
|---|---|---|---|
| 5-Levels | 0.014484 | 0.009853 | 0.004364 |
| 4-Levels | 0.025410 | 0.023486 | 0.007635 |
| 1-Level | 0.032036 | 0.037971 | 0.008166 |
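Summary statistics of this kind can be computed from per-timestep error curves, as in the sketch below. The exact definitions behind Table 4 are not given in the text, so the choices here are assumptions: the mean MSE averages over samples and timesteps, the final-step MSE averages the last timestep over samples, and the standard deviation is taken across the per-sample mean MSE values.

```python
import numpy as np

def mse_per_timestep(prediction, target):
    """MSE of each frame, averaged over the two spatial axes."""
    diff = np.asarray(prediction) - np.asarray(target)
    return (diff ** 2).mean(axis=(-2, -1))

def summarize_mse(mse_curves):
    """Summarize an array of shape (n_samples, n_timesteps)."""
    mse_curves = np.asarray(mse_curves, dtype=float)
    per_sample_mean = mse_curves.mean(axis=1)
    return {
        "mean_mse": float(per_sample_mean.mean()),
        "final_step_mse": float(mse_curves[:, -1].mean()),
        "std_dev": float(per_sample_mean.std()),  # assumed: spread across samples
    }
```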
Fig. 7.
Qualitative comparison of model predictions against the target simulation for a sample from the test set.
Fig. 8.
Prediction error maps for each model at different timesteps.
Fig. 9.
Mean Squared Error (MSE) over simulation timesteps for various samples of level 5.
Acknowledgements
This work is funded by the Engineering and Physical Sciences Research Council’s ECO-AI Project grant (reference number EP/Y006143/1), with additional financial support from the PETRONAS Centre of Excellence in Subsurface Engineering and Energy Transition (PACESET).
Author contributions
Conceptualization and methodology, H.P.M., J.M., A.A., A.H.E.; visualization and writing, A.A., H.P.M.; formal analysis, A.A., H.P.M., A.H.E.; funding acquisition, A.H.E., F.D., H.P.M.; supervision, A.H.E., F.D., H.P.M. All authors have read and agreed to the published version of the manuscript.
Code availability
The input files used to simulate CO2 flow are built using GeoChemFoam10 and are available at https://github.com/ai4netzero/generating_co2_flow. The code is written in Python 3.11.9, and the list of requirements is given in the readme file. GeoChemFoam can be downloaded from https://github.com/GeoChemFoam/GeoChemFoam-5.1 and has been validated against experimental data in ref. 25.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Alhasan Abdellatif, Hannah P. Menke.
Contributor Information
Alhasan Abdellatif, Email: alhasanabdellatif@gmail.com.
Hannah P. Menke, Email: h.menke@hw.ac.uk
References
- 1. Xiao, Y., Xu, T. & Pruess, K. The effects of gas-fluid-rock interactions on CO2 injection and storage: insights from reactive transport modeling. Energy Procedia 1(1), 1783–1790 (2009).
- 2. Guiltinan, E. J., Santos, J. E., Cardenas, M. B., Espinoza, D. N. & Kang, Q. Two-phase fluid flow properties of rough fractures with heterogeneous wettability: Analysis with lattice Boltzmann simulations. Water Resources Research 57(1), e2020WR027943 (2021).
- 3. Xu, R., Prodanović, M. & Landry, C. Pore-scale study of water adsorption and subsequent methane transport in clay in the presence of wettability heterogeneity. Water Resources Research 56(10), e2020WR027568 (2020).
- 4. Cassiraga, E. F., Fernández-Garcia, D. & Gómez-Hernández, J. J. Performance assessment of solute transport upscaling methods in the context of nuclear waste disposal. International Journal of Rock Mechanics and Mining Sciences 42(5-6), 756–764 (2005).
- 5. Dentz, M., Le Borgne, T., Englert, A. & Bijeljic, B. Mixing, spreading and reaction in heterogeneous media: A brief review. Journal of Contaminant Hydrology 120, 1–17 (2011).
- 6. Mohammed, N. et al. Investigating the flow behaviour of CO2 and N2 in porous medium using core flooding experiment. Journal of Petroleum Science and Engineering 208, 109753 (2022).
- 7. Huang, R., Herring, A. L. & Sheppard, A. Investigation of supercritical CO2 mass transfer in porous media using X-ray micro-computed tomography. Advances in Water Resources 171, 104338 (2023).
- 8. Gao, J. et al. Reactive transport in porous media for CO2 sequestration: Pore scale modeling using the lattice Boltzmann method. Computers & Geosciences 98, 9–20 (2017).
- 9. Xiong, Q., Baychev, T. G. & Jivkov, A. P. Review of pore network modelling of porous media: Experimental characterisations, network constructions and applications to reactive transport. Journal of Contaminant Hydrology 192, 101–117 (2016).
- 10. Maes, J. & Menke, H. P. GeoChemFoam: Direct modelling of flow and heat transfer in micro-CT images of porous media. Heat and Mass Transfer 58(11), 1937–1947 (2022).
- 11. Zhu, Y. & Zabaras, N. Bayesian deep convolutional encoder–decoder networks for surrogate modeling and uncertainty quantification. Journal of Computational Physics 366, 415–447 (2018).
- 12. Zhong, Z., Sun, A. Y. & Jeong, H. Predicting CO2 plume migration in heterogeneous formations using conditional deep convolutional generative adversarial network. Water Resources Research 55(7), 5830–5851 (2019).
- 13. Wang, K. et al. A physics-informed and hierarchically regularized data-driven model for predicting fluid flow through porous media. Journal of Computational Physics 443, 110526 (2021).
- 14. Wen, G., Hay, C. & Benson, S. M. CCSNet: a deep learning modeling suite for CO2 storage. Advances in Water Resources 155, 104009 (2021).
- 15. Wen, G., Li, Z., Azizzadenesheli, K., Anandkumar, A. & Benson, S. M. U-FNO—An enhanced Fourier neural operator-based deep-learning model for multiphase flow. Advances in Water Resources 163, 104180 (2022).
- 16. Wen, G., Li, Z., Azizzadenesheli, K., Anandkumar, A. & Benson, S. M. Real-time high-resolution CO2 geological storage prediction using nested Fourier neural operators. Energy & Environmental Science 16(4), 1732–1741 (2023).
- 17. Wang, Y. D., Chung, T., Armstrong, R. T. & Mostaghimi, P. ML-LBM: predicting and accelerating steady state flow simulation in porous media with convolutional neural networks. Transport in Porous Media 138(1), 49–75 (2021).
- 18. Feng, W. & Huang, H. Fast prediction of immiscible two-phase displacements in heterogeneous porous media with convolutional neural network. Advances in Applied Mathematics and Mechanics 13(1), 140–162 (2021).
- 19. Wang, Z. et al. Pore-scale modeling of multiphase flow in porous media using a conditional generative adversarial network (cGAN). Physics of Fluids 34(12) (2022).
- 20. Ko, D. D., Ji, H. & Ju, Y. S. Prediction of pore-scale flow in heterogeneous porous media from periodic structures using deep learning. AIP Advances 13(4) (2023).
- 21. Meng, Y., Jiang, J., Wu, J. & Wang, D. Transformer-based deep learning models for predicting permeability of porous media. Advances in Water Resources 179, 104520 (2023).
- 22. Poels, Y., Minartz, K., Bansal, H. & Menkovski, V. Accelerating Simulation of Two-Phase Flows with Neural PDE Surrogates. arXiv preprint arXiv:2405.17260 (2024).
- 23. Rusche, H. Computational fluid dynamics of dispersed two-phase flow at high phase fractions. Ph.D. thesis, University of London (2002).
- 24. Abdellatif, A., Menke, H. P., Maes, J., Elsheikh, A. H. & Doster, F. Benchmark dataset for pore-scale CO2-water interaction [Dataset]. Dryad. 10.5061/dryad.jm63xsjn5 (2025).
- 25. Zhao, B. et al. Comprehensive comparison of pore-scale models for multiphase flow in porous media. Proceedings of the National Academy of Sciences 116(28), 13799–13806 (2019).
- 26. Ronneberger, O., Fischer, P. & Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015, 234–241 (Springer, Cham, 2015).