The Golem: A general data-driven model for oil & gas forecasting based on recurrent neural networks
Abstract
Oil and gas production forecasting is one of the most critical tasks in reservoir management. Physics-based simulations are the most common models used for production forecasts in oilfields. Previous Machine Learning (ML) studies developed models that used the oil rate as the only target variable, forecast a single day ahead, and addressed only one class of reservoir (synthetic or actual). This work introduces a general data-driven model based on Recurrent Neural Networks that forecasts an adaptive sequence of time steps for the complete set of production rates (oil, gas, and water) and also includes the well-bore pressure as a target variable, for both synthetic and actual reservoirs. The first dataset was obtained from the synthetic benchmark UNISIM-II-H, which simulates a carbonate reservoir in the Brazilian pre-salt; the second dataset comes from an actual reservoir, the Volve oilfield, a decommissioned field in the Norwegian North Sea. The forecast is computed from an input sequence of daily values taken from the historical record of the production rates and the pressure; the output is a sequence of values for the following days for one selected production variable (oil, gas, water, or pressure). The sizes of the input and output sequences are adaptive, and their adjustment depends on the dataset size and the production time. We built the model and compared its Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) implementations. We tuned the architectural parameters of the model, the input size of historical records, and the number of output forecasting days. We ran the training/testing procedures with several training-set sizes drawn from the target well-bore and tested on the remaining data to evaluate the model stability.
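The input/output arrangement described above (a window of daily multivariate records mapped to a sequence of future values for one selected variable) can be illustrated with a minimal sketch. The function name `make_windows`, the column order (oil, gas, water, pressure), and the toy data are assumptions made for illustration only; the abstract does not specify how the windows were actually built.

```python
import numpy as np

def make_windows(series, n_in, n_out, target_col):
    """Slice a (days, variables) array into supervised (input, output) pairs.

    Hypothetical helper: each input is n_in consecutive days of all variables,
    each output is the next n_out days of the single selected target variable.
    """
    series = np.asarray(series, dtype=float)
    n_samples = series.shape[0] - n_in - n_out + 1
    if n_samples <= 0:
        raise ValueError("series too short for the requested window sizes")
    # Inputs: shape (n_samples, n_in, n_variables)
    X = np.stack([series[i:i + n_in] for i in range(n_samples)])
    # Outputs: shape (n_samples, n_out), target variable only
    y = np.stack([series[i + n_in:i + n_in + n_out, target_col]
                  for i in range(n_samples)])
    return X, y

# Toy data: 10 days x 4 variables (assumed order: oil, gas, water, pressure)
data = np.arange(40, dtype=float).reshape(10, 4)
X, y = make_windows(data, n_in=5, n_out=2, target_col=0)
print(X.shape, y.shape)
```

Changing `n_in` and `n_out` is one simple way to make the window sizes adjustable to the dataset length; arrays of this shape are the usual input layout for LSTM and GRU layers.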
We adopted the Symmetric Mean Absolute Percentage Error (SMAPE) and the coefficient of determination (R2) to compare our forecasts with the observed production rates and pressure. Most results for both the synthetic and the actual oilfields showed that the model follows the trend of the production rates and the pressure accurately and that the output values approximate the expected ones. In most of the experiments, the forecasts were close to the expected well-bore data; some cases reached a SMAPE lower than 2 and an R2 of up to 0.99. The model can learn the behavior of each production variable in the training stage, and the forecasting output can be adapted to a sequence of several time steps.
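For reference, the two metrics can be sketched as follows. SMAPE has several variants in the literature, and the abstract does not state which one was used; the version below assumes the common percentage form with the mean of the absolute actual and forecast values in the denominator (range 0 to 200). The function names are illustrative.

```python
import numpy as np

def smape(actual, forecast):
    """SMAPE in percent (assumed variant, range 0 to 200)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    diff = np.abs(forecast - actual)
    denom = (np.abs(actual) + np.abs(forecast)) / 2.0
    # Points where both values are zero contribute zero error
    ratio = np.divide(diff, denom, out=np.zeros_like(diff), where=denom > 0)
    return 100.0 * ratio.mean()

def r2(actual, forecast):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Undefined when the actual series is constant (SS_tot = 0).
    """
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    ss_res = np.sum((actual - forecast) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Toy values, not results from the paper
smape_value = smape([100.0, 100.0], [110.0, 90.0])
r2_value = r2([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
print(smape_value, r2_value)
```

Under this definition a SMAPE below 2 means the forecasts deviate from the observations by less than about 2 percent on average, and an R2 near 1 means the forecasts explain almost all of the variance in the observed series.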