Merge pull request #13 from slimgroup/abhinav
arxiv version
apgahlot authored Mar 28, 2024
2 parents aaef8da + a626f78 commit 363a4d7
Showing 2 changed files with 44 additions and 4 deletions.
38 changes: 37 additions & 1 deletion GahlotLi2024SEG/paper.bib
@CONFERENCE{yu2023IMAGEmsc
url = {https://slimgroup.github.io/IMAGE2023/SequentialBayes/abstract.html},
presentation = {https://slim.gatech.edu/Publications/Public/Conferences/SEG/2023/yu2023IMAGEmsc},
note = {(IMAGE, Houston)}
}

@article{doi:10.1073/pnas.1912789117,
author = {Kyle Cranmer and Johann Brehmer and Gilles Louppe},
title = {The frontier of simulation-based inference},
journal = {Proceedings of the National Academy of Sciences},
volume = {117},
number = {48},
pages = {30055--30062},
year = {2020},
doi = {10.1073/pnas.1912789117},
url = {https://www.pnas.org/doi/abs/10.1073/pnas.1912789117},
eprint = {https://www.pnas.org/doi/pdf/10.1073/pnas.1912789117},
abstract = {Many domains of science have developed complex simulations to describe phenomena of interest. While these simulations provide high-fidelity models, they are poorly suited for inference and lead to challenging inverse problems. We review the rapidly developing field of simulation-based inference and identify the forces giving additional momentum to the field. Finally, we describe how the frontier is expanding so that a broad audience can appreciate the profound influence these developments may have on science.}
}

@article{papamakarios2019sequential,
title={Sequential Neural Likelihood: Fast Likelihood-free Inference with Autoregressive Flows},
author={George Papamakarios and David C. Sterratt and Iain Murray},
year={2019},
eprint={1805.07226},
archivePrefix={arXiv},
primaryClass={stat.ML}
}

@inproceedings{kruse2021hint,
title={HINT: Hierarchical invertible neural transport for density estimation and Bayesian inference},
author={Kruse, Jakob and Detommaso, Gianluca and K{\"o}the, Ullrich and Scheichl, Robert},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
volume={35},
number={9},
pages={8191--8199},
year={2021}
}
10 changes: 7 additions & 3 deletions GahlotLi2024SEG/paper.qmd
author:
bibliography: paper.bib
---

## Abstract {.unnumbered}

*We present an uncertainty-aware Digital Twin for geologic carbon storage (GCS), capable of handling multimodal time-lapse data and controlling CO~2~ injectivity to mitigate reservoir fracturing risks. In GCS, Digital Twins represent virtual replicas of subsurface systems that incorporate real-time data and advanced generative Artificial Intelligence (genAI) techniques, including neural posterior density estimation via simulation-based inference and sequential Bayesian inference. These methods enable the effective monitoring and control of CO~2~ storage projects, addressing challenges such as subsurface complexity, operational optimization, and risk mitigation. By integrating diverse monitoring data, e.g., geophysical well observations and imaged seismic, a Digital Twin (DT) can bridge the gaps between seemingly distinct fields like geophysics and reservoir engineering. In addition, recent advancements in genAI equip the DT with principled uncertainty quantification. Through recursive training and inference, the DT utilizes simulated current-state samples, e.g., CO~2~ saturation, paired with corresponding geophysical field observations to train its neural networks and enable posterior sampling upon receiving new field data. However, such a system lacks the decision-making and control capabilities that are necessary for full DT functionality. This study aims to demonstrate how the DT can inform decision-making processes to prevent risks such as cap-rock fracturing during CO~2~ storage operations.*

## Introduction

Digital Twins refer to dynamic virtual replicas of subsurface systems, integrating real-time data and employing advanced generative Artificial Intelligence (genAI) methodologies, such as neural posterior density estimation via simulation-based inference [@doi:10.1073/pnas.1912789117] and sequential Bayesian inference [@papamakarios2019sequential;@kruse2021hint]. Thanks to the combination of these advanced Bayesian inference techniques, our approach is capable of addressing the challenges of monitoring and controlling CO~2~ storage projects. These challenges include dealing with the subsurface's complexity and heterogeneity (seismic and fluid-flow properties), operations optimization, and risk mitigation, e.g., via injection-rate control. Because our Digital Twin is capable of handling diverse monitoring data, consisting of time-lapse seismic and data collected at (monitoring) wells, it serves as a platform to integrate seemingly disparate and siloed fields, e.g., geophysics and reservoir engineering. In addition, recent breakthroughs in genAI allow Digital Twins to capture uncertainty in a principled way [@yu2023IMAGEmsc;@herrmann2023president;@gahlot2023NIPSWSifp]. By employing training and inference recursively, the Digital Twin trains its neural networks on samples of the simulated current state---i.e., the CO~2~ saturation/pressure---paired with simulated imaged seismic and/or data collected at (monitoring) wells. These training pairs of the simulated state and simulated observations are obtained by sampling the posterior distribution, $\mathbf{x}_{k-1}\sim p(\mathbf{x}_{k-1}\vert \mathbf{y}^\mathrm{o}_{1:k-1})$, at the previous timestep, $k-1$, conditioned on field data, $\mathbf{y}^\mathrm{o}_{1:k-1}$, collected over all previous timesteps, $1:k-1$, advancing the state to the current timestep, and then simulating (seismic/well) observations associated with that state.
Given these simulated state-observation pairs, the Digital Twin's networks are trained, so that they are up to date and ready to produce samples of the posterior when new field data come in---i.e., $\mathbf{x}_{k}\sim p(\mathbf{x}_{k}\vert \mathbf{y}^\mathrm{o}_{1:k})$. While this new neural approach to data assimilation for CO~2~ storage projects provides what is called an uncertainty-informed *Digital Shadow*, it lacks decision-making and control, which would make it a Digital Twin [@thelen2022comprehensivea] capable of optimizing storage operations while mitigating risks, including the risk of fracturing the cap rock by exceeding the fracture pressure. The latter risk is illustrated in Figure 1, where the first row contains simulated samples of the difference between the reservoir pressure and the hydraulic pressure at timestep $k=4$, without control. Red areas denote where these simulated state samples exceed the fracture pressure. This manuscript will demonstrate how the Digital Twin can make informed decisions to avoid exceeding the fracture pressure.
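One cycle of the recursive train-then-infer loop described above can be sketched as follows. This is a minimal toy sketch, not the actual implementation: `advance_state`, `simulate_observation`, and the linear least-squares "density estimator" are hypothetical NumPy stand-ins for the fluid-flow simulator, the seismic/well forward model, and the conditional neural network, respectively.

```python
import numpy as np

rng = np.random.default_rng(0)

def advance_state(x):
    # Hypothetical stand-in for the reservoir simulator advancing
    # CO2 saturation/pressure from timestep k-1 to k.
    return x + 0.1 * rng.standard_normal(x.shape)

def simulate_observation(x):
    # Hypothetical stand-in for seismic/well forward modelling.
    return 2.0 * x + 0.05 * rng.standard_normal(x.shape)

def train_posterior_sampler(states, observations):
    # Stand-in for training a conditional density estimator (e.g. a
    # normalizing flow): here a linear least-squares fit obs -> state.
    A = np.vstack([observations, np.ones_like(observations)]).T
    coef, *_ = np.linalg.lstsq(A, states, rcond=None)
    def sampler(y_obs, n):
        mean = coef[0] * y_obs + coef[1]
        return mean + 0.05 * rng.standard_normal(n)
    return sampler

# One assimilation cycle at timestep k:
x_prev = rng.standard_normal(128)      # samples x_{k-1} ~ p(x_{k-1} | y_{1:k-1})
x_curr = advance_state(x_prev)         # advance the state to timestep k
y_sim = simulate_observation(x_curr)   # simulated observations for training
sampler = train_posterior_sampler(x_curr, y_sim)

# When a new field datum arrives, sample x_k ~ p(x_k | y_{1:k}).
y_field = simulate_observation(advance_state(rng.standard_normal(1)))
posterior_samples = sampler(y_field[0], 128)
```

In the actual workflow, the same three ingredients appear, but the sampler is a trained network conditioned on the full observation record rather than a scalar fit.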


## Methodology

To calculate injection rates that mitigate the risk of exceeding the fracture pressure, we proceed as follows. First, because the optimized injection rates are close to the fracture pressure, we treat these optimized rates as approximations to the injection rates at which the fracture pressure is exceeded. Next, Kernel Density Estimation (KDE) is applied to produce the smooth red curve in Figure 2(a). This smoothed probability density function is used to calculate the Cumulative Distribution Function (CDF), plotted in Figure 2(b). Using the fact that non-fracture/fracture occurrence follows a Bernoulli distribution, confidence intervals can be calculated as $\pm Z_{\frac{\alpha}{2}} \sqrt{\frac{\hat{p}(1 - \hat{p})}{128}}$, where $\hat{p}$ represents the CDF (blue line) and $Z_{\frac{\alpha}{2}}=1.96$ with $\alpha=0.05$. From the CDF and confidence intervals (denoted by the grey areas), the following conclusions can be drawn. First, if the initial injection rate of $q_3 = 0.0500\,\mathrm{m^3/s}$ is kept, the fracture probability lies between 24.47--40.71% (vertical dashed line), with a maximum likelihood of 32.59%, which is unacceptably high. Second, if we want to limit the fracture occurrence rate to 1% (red dashed line), we need to lower the injection rate to $q_3=0.0387\,\mathrm{m^3/s}$. To ensure the low fracture occurrence rate of 1%, the reduced injection rate is chosen as the smallest injection rate within the confidence interval. As can be observed from Figure 1 (third row) and Figure 2(b), lowering the injection rate avoids exceeding the fracture pressure at the expense of injecting less CO~2~. Out of 128 samples, 43 fracture with the initial injection rate, while only one fractures with the controlled injection rate.
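The KDE/CDF/confidence-interval procedure above can be sketched numerically as follows. The 128 critical injection rates here are synthetic stand-ins (drawn from an assumed Gaussian), not the values from the actual optimization runs, and the bandwidth choice (Silverman's rule) and the use of the upper Wald bound are one conservative reading of the procedure.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical stand-in for the 128 optimized injection rates (m^3/s)
# at which the fracture pressure is first exceeded.
critical_rates = rng.normal(loc=0.045, scale=0.004, size=128)

# Gaussian KDE with Silverman's bandwidth, evaluated on a rate grid.
n = critical_rates.size
h = 1.06 * critical_rates.std(ddof=1) * n ** (-1 / 5)
grid = np.linspace(0.03, 0.06, 601)
kde = np.exp(-0.5 * ((grid[:, None] - critical_rates[None, :]) / h) ** 2).sum(axis=1)
kde /= n * h * np.sqrt(2 * np.pi)

# CDF of the smoothed density: fracture probability at each injection rate.
cdf = np.cumsum(kde) * (grid[1] - grid[0])

# 95% Bernoulli (Wald) confidence band around the CDF,
# +/- Z_{alpha/2} * sqrt(p_hat (1 - p_hat) / 128) with Z = 1.96.
z = 1.96
half_width = z * np.sqrt(cdf * (1 - cdf) / n)
upper = np.clip(cdf + half_width, 0.0, 1.0)

# Largest injection rate whose upper confidence bound keeps the
# fracture probability below the 1% target.
q_safe = grid[upper <= 0.01].max()
print(f"reduced injection rate: {q_safe:.4f} m^3/s")
```

With the real critical-rate samples in place of the synthetic ones, this is the computation that yields the reduced rate reported above.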

## Conclusion and discussion

We have illustrated how Digital Twins can be used to mitigate risks associated with CO~2~ storage projects. Specifically, we used the Digital Twin's capability to produce samples of its state (pressure), conditioned on observed seismic and/or well data. Using these samples, in conjunction with samples from the permeability distribution, we captured statistics on the fracture occurrence frequency as a function of the injection rate. Given these statistics, we set a target fracture frequency and chose the corresponding injection rate as a function of the confidence interval. By following this procedure, exceeding the fracture pressure was avoided by lowering the injection rate. The decision to lower the injection rate, and by which amount, was informed by the Digital Twin, which uses seismic and/or well data to capture the reservoir's state, including its uncertainty.

::: {#fig-flow}
![](./figs/Figure1.png){width="90%"}
