Better stopping through cross validation in an iterative ensemble smoother: A perspective from supervised machine learning
Abstract
Iterative ensemble smoothers (IES) are among the popular reservoir data assimilation (RDA) algorithms for reservoir characterization. Deploying an IES algorithm in practice requires stopping criteria, which are normally adopted for runtime control (e.g., stopping the IES when it reaches the maximum number of iterations) and/or for safeguarding the RDA performance (e.g., preventing the simulated data from overfitting the actual observations). In practice, for various reasons, existing stopping criteria often struggle to achieve both purposes simultaneously. One noticeable issue, as illustrated in this work, is that in many situations the quality of the estimated reservoir models may already start to deteriorate before a conventional stopping criterion activates to terminate the iteration process. This observation raises a practically important question: Is it possible to further improve the efficacy of the IES algorithm by designing a different stopping criterion, so that the IES can stop earlier, saving computational costs while achieving better RDA performance?
As one of the few attempts in the community, this work investigates a new IES stopping criterion that has the potential to provide an affirmative answer to the above question. Our main idea is based on the concept of cross validation (CV), routinely adopted in supervised machine learning (SML) problems for early stopping to prevent SML models from overfitting the training data. Despite the similarities between RDA and SML problems, some fundamental differences exist, so that a vanilla CV procedure extended directly from SML to RDA does not work well. To tackle this challenge, we design an efficient CV procedure tailored for RDA problems, and inspect the performance of an IES algorithm equipped with this CV procedure (IES-CV) in both synthetic and real field case studies. Our numerical investigation indicates that the IES-CV algorithm achieves promising RDA performance in all case studies, confirming that, with the aid of a proper stopping criterion, an IES algorithm can terminate at an appropriate iteration step with near-optimal RDA performance. Beyond these numerical findings, we also hope that the current work may help improve the best practices of applying IES to RDA problems by taking advantage of the effective, CV-based stopping criterion.
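To make the early-stopping idea concrete, the following is a minimal sketch of the generic, SML-style rule: given a data mismatch evaluated on held-out (validation) data at each iteration, keep the iteration with the lowest validation mismatch and stop once it has failed to improve for a fixed number of steps. This is not the tailored CV procedure of the IES-CV algorithm, whose details are not given here; the function name `early_stopping_step`, the `patience` parameter, and the example mismatch values are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def early_stopping_step(validation_mismatch: Sequence[float], patience: int = 2) -> int:
    """Return the index of the iteration to keep under a patience-based rule.

    validation_mismatch[i] is the (hypothetical) mismatch on held-out data
    after iteration i; index 0 corresponds to the initial ensemble.
    """
    if len(validation_mismatch) == 0:
        raise ValueError("validation_mismatch must contain at least one value")
    if patience < 1:
        raise ValueError("patience must be at least 1")

    best_step = 0
    best_value = validation_mismatch[0]
    waited = 0  # consecutive iterations without improvement

    for step in range(1, len(validation_mismatch)):
        value = validation_mismatch[step]
        if value < best_value:
            # Validation mismatch improved: remember this iteration.
            best_step, best_value, waited = step, value, 0
        else:
            # No improvement: stop once patience is exhausted.
            waited += 1
            if waited >= patience:
                break
    return best_step


# Illustrative values only: the mismatch falls, then rises again.
history = [10.0, 6.0, 4.0, 4.5, 5.0, 3.0]
chosen = early_stopping_step(history, patience=2)
```

With these illustrative values the rule stops after two consecutive non-improving iterations and keeps iteration 2, never reaching the later, lower value at iteration 5. A larger `patience` trades extra iterations (in RDA, extra forward simulations) for robustness against such temporary increases.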