by Victor Eduardo Martinez Abaunza and Anderson de Rezende Rocha, presented at the First EAGE Conference on Machine Learning in Americas, September 2020
The oil & gas (O&G) industry is one of the most important activities supporting the world economy. Reservoir production management is a multifaceted challenge, and one facet of great interest is the difficult task of forecasting O&G production in a reservoir. Most models in the prior art are physics-based and try to predict reservoir behavior through fluid-dynamics simulation. The problem with these models is their high computational cost, as an accurate simulation can take hours, days, or even weeks to run (Zhang, 2018). Machine Learning (ML) algorithms, on the other hand, have been applied in many domains and have led to breakthroughs in several areas. Recently, they have been used in time-series forecasting, in which a variable is predicted from its historical records. This is very promising for O&G management, as ML methods could be exploited for production forecasting. One advantage is the smaller computational footprint of ML algorithms, since most of their tasks can be deployed on GPUs and thus highly parallelized.

ML-based reservoir models have been classified into two major classes: first, Surrogate Reservoir Models (SRM), in which the simulation is based on synthetic numerical models that aim to reproduce accurate replicas of traditional reservoir simulations; and second, Top-Down Models, in which an ML model is built from actual field data such as historical production, seismic attributes, well production, etc. (Mohaghegh, 2011). The main challenge in all these models is to find the optimal input set that enables accurate production forecasting. It is often difficult to find relationships, or correlations, between the input and the production variables. Common production-forecasting methods are based on regression algorithms such as Support Vector Machines (Noshi et al., 2019), Random Forests (Maucec and Garni, 2019), or Radial Basis Function models (Memon et al., 2014).
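To make the regression-based setup concrete, the sketch below builds a supervised dataset from lagged values of a production series and fits a regressor to predict the next value. It uses ordinary least squares as a simple stand-in for the cited SVM, Random Forest, or RBF regressors, and a synthetic decaying series in place of real field data; the series, the lag count, and all parameter choices are illustrative assumptions, not values from this work.

```python
import numpy as np

# Illustrative monthly production series (arbitrary units); real inputs
# would be historical field data such as well rates or pressures.
rng = np.random.default_rng(0)
t = np.arange(120)
series = 100.0 * np.exp(-0.01 * t) + rng.normal(0.0, 1.0, t.size)

LAGS = 6  # predict the next value from the previous 6 observations

# Build a supervised dataset: each row holds LAGS past values,
# and the target is the next value in the series.
X = np.stack([series[i:i + LAGS] for i in range(series.size - LAGS)])
y = series[LAGS:]

# Chronological train/test split (no shuffling for time series).
split = int(0.8 * len(y))
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Fit ordinary least squares with an intercept term
# (a stand-in for any of the cited regression algorithms).
A = np.hstack([X_train, np.ones((len(X_train), 1))])
coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)

# Evaluate one-step-ahead predictions on the held-out tail.
preds = np.hstack([X_test, np.ones((len(X_test), 1))]) @ coef
rmse = float(np.sqrt(np.mean((preds - y_test) ** 2)))
print(f"test RMSE: {rmse:.3f}")
```

The chronological split matters: shuffling before splitting would leak future information into training, inflating the apparent forecasting accuracy.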
All these ML models share a similar methodology: first, they seek correlations between the input variables (e.g., injection wells, bottom-hole pressure, well logs); second, they define a training set from the input and production variables; third, they train an ML model; and, finally, they validate the model on a testing set. If the regression model's outputs are close to the values in the testing set, we can say the ML model has properly learned the mapping between the input variables and the target variable, and the trained model can be used for forecasting.

In this work, we present a forecasting model for O&G production based on a data-driven approach and supported by ML algorithms. Our model takes advantage of a long short-term memory (LSTM) architecture capable of finding correlations not only among the recent data points of a time series but also more subtle ones spanning longer time intervals. The results show promising perspectives for short-term forecasting of oil, gas, water, and liquid production on a synthetic, but realistic, benchmark.
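The ability of an LSTM to retain information over long intervals comes from its gated cell state. The sketch below implements a single LSTM time step from scratch in NumPy and runs a toy sequence through it; the weights are random placeholders, not trained values, and the dimensions are arbitrary assumptions for illustration (this is the standard LSTM update, not the specific network used in this work).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step for a single layer of n cells.

    x: input vector; h_prev, c_prev: previous hidden and cell states;
    W, U, b: stacked parameters for the four gates
    (input, forget, output, candidate), each of hidden size n.
    """
    n = h_prev.size
    z = W @ x + U @ h_prev + b           # pre-activations, shape (4n,)
    i = sigmoid(z[0 * n:1 * n])          # input gate
    f = sigmoid(z[1 * n:2 * n])          # forget gate: preserves long-term info
    o = sigmoid(z[2 * n:3 * n])          # output gate
    g = np.tanh(z[3 * n:4 * n])          # candidate cell update
    c = f * c_prev + i * g               # cell state carries long-range memory
    h = o * np.tanh(c)                   # hidden state / output
    return h, c

# Run a toy 10-step sequence through the cell (random, untrained weights).
rng = np.random.default_rng(1)
n_in, n_hid = 3, 4
W = rng.normal(0.0, 0.1, (4 * n_hid, n_in))
U = rng.normal(0.0, 0.1, (4 * n_hid, n_hid))
b = np.zeros(4 * n_hid)

h = np.zeros(n_hid)
c = np.zeros(n_hid)
for x in rng.normal(0.0, 1.0, (10, n_in)):
    h, c = lstm_step(x, h, c, W, U, b)
print("final hidden state:", h)
```

Because the forget gate multiplies the previous cell state rather than squashing it through an activation at every step, gradients can flow across many time steps, which is what lets the model pick up correlations over long intervals as well as recent ones.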