Supervised machine learning pipeline for subsurface reservoir characterisation using wireline well logs from the FORCE 2020 benchmark dataset. Two tasks are solved from the same data: classifying rock type (lithology) across 11 classes, and predicting continuous porosity values from indirect log curves.
Dataset: FORCE 2020 Lithofacies Prediction Benchmark — 98 wells, Norwegian North Sea
Models: Random Forest and XGBoost (both tasks)
Evaluation: Split by well — 79 training wells, 19 held-out test wells
| Model | Accuracy | Macro F1 |
|---|---|---|
| Random Forest | 75.0% | 0.468 |
| XGBoost | 67.9% | 0.429 |
Per-class F1 on held-out test wells:
| Lithology | RF F1 | XGB F1 | Notes |
|---|---|---|---|
| Shale | 0.861 | 0.802 | Dominant class — well separated by GR |
| Halite | 0.864 | 0.778 | Near-zero GR — highly distinctive |
| Sandstone | 0.711 | 0.723 | XGBoost slightly better |
| Limestone | 0.610 | 0.520 | RF advantage |
| Coal | 0.587 | 0.413 | Low density signature |
| Marl | 0.373 | 0.352 | Transitional — overlaps with Shale |
| Sandstone/Shale | 0.340 | 0.398 | Mixed facies — hard by definition |
| Chalk | 0.028 | 0.123 | XGBoost better on rare classes |
| Dolomite | 0.000 | 0.015 | Insufficient samples |
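Key finding: Random Forest outperforms XGBoost on most lithologies, including the well-sampled Shale, Halite, and Limestone classes. XGBoost scores higher on Sandstone (marginally), Sandstone/Shale, Chalk, and Dolomite. Its edge on the rare and mixed classes is consistent with boosting's focus on hard examples, although both models remain weak there (F1 below 0.4).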
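The gap between accuracy and macro F1 in the tables above comes from how the two metrics weight classes: accuracy counts samples, while macro F1 averages per-class F1 so a rare class counts as much as Shale. The sketch below illustrates this with made-up toy labels (not project data). The function name is hypothetical, and it averages over the classes present in `y_true` only.

```python
import numpy as np

def accuracy_and_macro_f1(y_true, y_pred):
    """Return (accuracy, macro F1), averaging F1 over classes present in y_true."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    accuracy = float(np.mean(y_true == y_pred))

    f1_scores = []
    for cls in np.unique(y_true):
        tp = np.sum((y_pred == cls) & (y_true == cls))
        fp = np.sum((y_pred == cls) & (y_true != cls))
        fn = np.sum((y_pred != cls) & (y_true == cls))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return accuracy, float(np.mean(f1_scores))

# Toy, made-up labels: 8 Shale samples and 2 Chalk samples.
y_true = ["Shale"] * 8 + ["Chalk"] * 2
y_pred = ["Shale"] * 10  # the rare class is never predicted

acc, macro_f1 = accuracy_and_macro_f1(y_true, y_pred)
print(f"accuracy={acc:.3f}, macro F1={macro_f1:.3f}")  # accuracy=0.800, macro F1=0.444
```

Missing the rare class entirely costs only 20 points of accuracy here but halves the macro F1, which is the same pattern as the Chalk and Dolomite rows in the per-class table.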
| Model | RMSE | MAE | R² |
|---|---|---|---|
| Random Forest | 0.0489 | 0.0375 | 0.866 |
| XGBoost | 0.0523 | 0.0388 | 0.847 |
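For reference, the three regression metrics in the table can be computed as in the sketch below. The function name and the porosity values are made up for illustration and are not taken from the project notebooks.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, and R² for two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    ss_res = float(np.sum(residuals ** 2))                    # unexplained variation
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))     # total variation
    return {
        "rmse": float(np.sqrt(np.mean(residuals ** 2))),
        "mae": float(np.mean(np.abs(residuals))),
        "r2": 1.0 - ss_res / ss_tot,
    }

# Made-up porosity fractions (v/v), for illustration only.
metrics = regression_metrics([0.10, 0.20, 0.30], [0.12, 0.18, 0.33])
print(metrics)
```

RMSE and MAE are in the same units as the target, so the reported values of roughly 0.04 to 0.05 correspond to about 4 to 5 porosity units.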
Target: density porosity (PHIT) derived from RHOB using the standard petrophysical formula. RHOB, NPHI, and their derivatives were excluded from the feature set to prevent data leakage — the model predicts porosity from independent logs only (DTC, GR, resistivity, SP, caliper).
Key finding: DTC (sonic travel time) is the dominant predictor (importance 0.36 for RF, 0.55 for XGBoost). This is consistent with the Wyllie time-average equation, which relates sonic travel time directly to porosity: the models recover a long-established petrophysical relationship from the data alone.
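For context, the Wyllie time-average equation treats the measured travel time as a volume-weighted average of the travel times through pore fluid and rock matrix. The symbols below are the conventional ones and do not appear in the project code: $\Delta t$ is the measured DTC, $\Delta t_{ma}$ the matrix travel time, $\Delta t_{f}$ the fluid travel time, and $\phi$ the porosity.

```latex
\Delta t = \phi \, \Delta t_{f} + (1 - \phi) \, \Delta t_{ma}
\quad \Longrightarrow \quad
\phi = \frac{\Delta t - \Delta t_{ma}}{\Delta t_{f} - \Delta t_{ma}}
```

Porosity is linear in DTC under this model, which is why a tree ensemble given DTC can approximate it well even without the density log.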
Five-track plot showing GR, resistivity, expert labels, RF predictions, and XGBoost predictions for a held-out well. The model correctly identifies major stratigraphic boundaries and reservoir sand zones.
Bormann, P., Aursand, P., Dilib, F., Dischington, P., & Garland, C. (2020). FORCE Machine Learning Competition. https://github.com/bolgebrygg/Force-2020-Machine-Learning-competition
Available at: https://drive.google.com/drive/folders/0B7brcf-eGK8CRUhfRW9rSG91bW8
The dataset contains wireline well logs and expert lithofacies labels for 98 wells in the Norwegian North Sea, released for the FORCE 2020 machine learning competition. Labels were hand-crafted by skilled geoscientists.
This project was originally scoped around the Volve field dataset released by Equinor. Volve is a landmark open dataset but its download is unreliable and the raw files require substantial format engineering before ML can be applied. The FORCE 2020 dataset is a peer-reviewed benchmark specifically designed for ML — it provides clean, labelled CSV data directly comparable to published baselines. Using it is standard practice in the well log ML community.
| Log | Measures | Missing % |
|---|---|---|
| GR | Natural radioactivity — primary shale indicator | 0.0% |
| RDEP | Deep resistivity — hydrocarbon indicator | 0.9% |
| RMED | Medium resistivity | 3.3% |
| DTC | Compressional sonic travel time | 6.9% |
| CALI | Borehole diameter (quality control) | 7.5% |
| RHOB | Bulk density | 13.8% |
| SP | Spontaneous potential | 26.2% |
| NPHI | Neutron porosity | 34.6% |
| RSHA | Shallow resistivity | 46.1% |
Seven logs with >70% missingness were dropped (SGR, DTS, RMIC, ROPA, DCAL, MUDWEIGHT, RXO). BS was dropped as redundant with CALI (correlation = −0.90).
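A missingness filter of this kind might look like the sketch below. The 70% threshold comes from the text; the function name and the toy values are assumptions for illustration.

```python
import numpy as np
import pandas as pd

def drop_sparse_columns(df, threshold=0.70):
    """Drop columns whose fraction of missing values exceeds `threshold`."""
    missing_fraction = df.isna().mean()
    sparse = missing_fraction[missing_fraction > threshold].index.tolist()
    return df.drop(columns=sparse), sparse

# Toy frame with made-up values; column names follow the log mnemonics above.
logs = pd.DataFrame({
    "GR":   [55.0, 80.0, 120.0, 95.0],
    "RHOB": [2.30, np.nan, 2.55, 2.45],       # 25% missing, kept
    "DTS":  [np.nan, np.nan, np.nan, 210.0],  # 75% missing, dropped
})
kept, dropped = drop_sparse_columns(logs)
print(dropped)  # ['DTS']
```

The redundancy check for BS could be done the same way with `logs["BS"].corr(logs["CALI"])` once both columns are loaded.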
| Class | Geological character | O&G significance |
|---|---|---|
| Sandstone | High porosity clastic | Primary reservoir |
| Shale | Clay-rich, radioactive | Seal rock |
| Sandstone/Shale | Interbedded | Reduced reservoir quality |
| Limestone | Carbonate | Carbonate reservoir |
| Marl | Carbonate-clay mix | Variable |
| Chalk | Soft carbonate | North Sea specific |
| Halite | Evaporite salt | Perfect seal |
| Anhydrite | Evaporite | Seal rock |
| Tuff | Volcanic ash | Marker horizon |
| Coal | Organic-rich | Source rock indicator |
| Dolomite | Dolomitised carbonate | Fractured reservoir |
Splitting by well is the most important methodological decision in the project. A random row split would allow the model to train on 18,000 rows from a well and test on 2,000 rows from the same well. The model would have already learned that well's depth trends, GR baseline, and porosity regime, making the test set trivially easy to predict.
Splitting by entire wells forces the model to generalise to wells it has never seen, which is the real-world task. A petrophysicist trains on offset wells and then predicts lithology in a newly logged well that has no expert labels yet.
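A well-level split can be sketched as below. This is a minimal illustration rather than the notebook code: the `WELL` column name, the function name, and the seed are assumptions, and the toy frame uses five made-up wells instead of 98.

```python
import numpy as np
import pandas as pd

def split_by_well(df, n_test_wells, well_col="WELL", seed=42):
    """Assign whole wells to either the training or the test set."""
    wells = np.asarray(df[well_col].unique(), dtype=object)
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(wells)
    test_wells = list(shuffled[:n_test_wells])
    is_test = df[well_col].isin(test_wells)
    # Every row of a well lands on the same side of the split.
    return df[~is_test], df[is_test]

# Toy data: 5 made-up wells with 3 rows each.
logs = pd.DataFrame({
    "WELL": np.repeat(["W1", "W2", "W3", "W4", "W5"], 3),
    "GR": np.linspace(20.0, 160.0, 15),
})
train, test = split_by_well(logs, n_test_wells=1)
print(sorted(train["WELL"].unique()), sorted(test["WELL"].unique()))
```

scikit-learn's `GroupShuffleSplit` and `GroupKFold` provide the same guarantee when the well name is passed as the `groups` argument, and are a natural choice if cross-validation is needed.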
- Sentinel values (−999.25, −9999) replaced with NaN before imputation
- Physical impossibilities clipped to instrument range (e.g. GR < 0, RHOB < 1.0)
- Basement merged into Shale (103 samples, insufficient for classification)
- Per-well median imputation — not global median — to preserve local geological character
- Class weights applied (Shale = 61.6% of data)
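The sentinel replacement, clipping, and per-well imputation steps above can be sketched as follows. The function name and the `WELL` column name are assumptions, and only the two lower bounds given as examples in the list (GR < 0, RHOB < 1.0) are applied.

```python
import numpy as np
import pandas as pd

SENTINELS = [-999.25, -9999.0]  # null markers named in the list above

def clean_logs(df, log_cols, well_col="WELL"):
    """Sentinels -> NaN, clip impossible values, then impute per-well medians."""
    out = df.copy()
    out[log_cols] = out[log_cols].replace(SENTINELS, np.nan)
    # Lower bounds taken from the two examples in the text.
    if "GR" in log_cols:
        out["GR"] = out["GR"].clip(lower=0.0)
    if "RHOB" in log_cols:
        out["RHOB"] = out["RHOB"].clip(lower=1.0)
    # Median of each log within each well, broadcast back to every row.
    well_medians = out.groupby(well_col)[log_cols].transform("median")
    out[log_cols] = out[log_cols].fillna(well_medians)
    return out

# Toy data with made-up values: two wells, three samples each.
logs = pd.DataFrame({
    "WELL": ["A", "A", "A", "B", "B", "B"],
    "GR":   [50.0, -999.25, 70.0, -5.0, 100.0, 120.0],
    "RHOB": [2.4, 0.5, 2.6, -9999.0, 2.2, 2.3],
})
cleaned = clean_logs(logs, ["GR", "RHOB"])
print(cleaned)
```

Sentinels must be replaced before the medians are computed, otherwise values such as −999.25 would drag the per-well median down. A log that is entirely missing in a well stays NaN under this scheme and needs a separate fallback.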
Seven domain-informed features derived from standard petrophysical relationships:
| Feature | Formula / method | Geological meaning |
|---|---|---|
| DPHI | (2.65 − RHOB) / 1.65 | Density porosity |
| NPHI_RHOB_SEP | NPHI − DPHI | Gas crossover indicator |
| RES_RATIO | log10(RDEP / RSHA) | Invasion ratio — hydrocarbon flag |
| VSHALE | (GR − GR_clean) / (GR_shale − GR_clean) | Volume of shale |
| GR_NORM | Per-well z-score of GR | Removes well-to-well baseline shifts |
| GR_ROLLING_MEAN | 5-sample window mean | Bed thickness context |
| GR_ROLLING_STD | 5-sample window std | Boundary detection |
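The table translates into pandas roughly as below. Several details are not stated in the source and are assumptions here: GR_clean and GR_shale are taken as the per-well 5th and 95th GR percentiles, VSHALE is clipped to [0, 1], the rolling windows are centred, and the `WELL` column and function name are hypothetical.

```python
import numpy as np
import pandas as pd

def add_features(df, well_col="WELL", rho_matrix=2.65, rho_fluid=1.0, window=5):
    """Add the seven engineered features to a copy of `df`."""
    out = df.copy()
    # (2.65 - RHOB) / 1.65 with quartz matrix and water-filled pores.
    out["DPHI"] = (rho_matrix - out["RHOB"]) / (rho_matrix - rho_fluid)
    out["NPHI_RHOB_SEP"] = out["NPHI"] - out["DPHI"]
    out["RES_RATIO"] = np.log10(out["RDEP"] / out["RSHA"])

    gr = out.groupby(well_col)["GR"]
    # ASSUMPTION: clean-sand and shale baselines from per-well percentiles.
    gr_clean = gr.transform(lambda s: s.quantile(0.05))
    gr_shale = gr.transform(lambda s: s.quantile(0.95))
    out["VSHALE"] = ((out["GR"] - gr_clean) / (gr_shale - gr_clean)).clip(0.0, 1.0)
    out["GR_NORM"] = (out["GR"] - gr.transform("mean")) / gr.transform("std")

    # ASSUMPTION: centred windows; computed per well so beds in one well
    # never leak into the statistics of the next.
    out["GR_ROLLING_MEAN"] = gr.transform(
        lambda s: s.rolling(window, center=True, min_periods=1).mean())
    out["GR_ROLLING_STD"] = gr.transform(
        lambda s: s.rolling(window, center=True, min_periods=1).std())
    return out

# Toy single-well frame with made-up values.
logs = pd.DataFrame({
    "WELL": ["W1"] * 6,
    "GR": [20.0, 40.0, 60.0, 80.0, 100.0, 120.0],
    "RHOB": [2.32] * 6,
    "NPHI": [0.25] * 6,
    "RDEP": [10.0] * 6,
    "RSHA": [1.0] * 6,
})
features = add_features(logs)
print(features[["DPHI", "VSHALE", "GR_NORM", "GR_ROLLING_MEAN"]].round(3))
```

With RHOB = 2.32 g/cm³ the density porosity is (2.65 − 2.32) / 1.65 = 0.20, which is a quick sanity check on the formula.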
Random Forest: 300 trees, max depth 25 (classification) / 20 (regression), balanced class weights, sqrt features per split, all CPU cores.
XGBoost: 300 rounds, max depth 6, learning rate 0.1, 80% row and column subsampling, sample weights for class imbalance.
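Expressed as scikit-learn and XGBoost keyword arguments, the settings above might look like the sketch below. The mapping of "80% row and column subsampling" to `subsample` and `colsample_bytree` is an assumption, as are the variable and function names. The libraries are imported lazily inside `build_classifiers` so the hyperparameters and the weighting logic can be inspected without them.

```python
import numpy as np

RF_CLASSIFIER_PARAMS = dict(
    n_estimators=300, max_depth=25, class_weight="balanced",
    max_features="sqrt", n_jobs=-1,
)
RF_REGRESSOR_PARAMS = dict(
    n_estimators=300, max_depth=20, max_features="sqrt", n_jobs=-1,
)
XGB_PARAMS = dict(
    n_estimators=300, max_depth=6, learning_rate=0.1,
    subsample=0.8, colsample_bytree=0.8,  # ASSUMPTION: per-tree column sampling
)

def balanced_sample_weights(y):
    """Weight each sample by n_samples / (n_classes * count of its class)."""
    y = np.asarray(y)
    classes, inverse, counts = np.unique(y, return_inverse=True, return_counts=True)
    class_weights = len(y) / (len(classes) * counts)
    return class_weights[inverse]

def build_classifiers():
    """Instantiate both classifiers; requires scikit-learn and xgboost."""
    from sklearn.ensemble import RandomForestClassifier
    from xgboost import XGBClassifier
    return RandomForestClassifier(**RF_CLASSIFIER_PARAMS), XGBClassifier(**XGB_PARAMS)

# Toy labels: the majority class is down-weighted, the rare class up-weighted.
weights = balanced_sample_weights(["Shale", "Shale", "Shale", "Chalk"])
print(weights.round(3))  # [0.667 0.667 0.667 2.   ]
```

The weights would be passed to XGBoost as `fit(X, y, sample_weight=weights)`, whereas Random Forest handles the imbalance internally through `class_weight="balanced"`, which uses the same formula.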
Density porosity (PHIT) is derived from RHOB. Including RHOB as a feature would allow the model to trivially invert the formula, producing R² ≈ 1.000. RHOB, NPHI, DPHI, NPHI_RHOB_SEP, and DRHO are all excluded from the regression feature set. The model predicts porosity from DTC, resistivity logs, GR, and engineered features only — the geologically honest approach.
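One way to enforce this exclusion is to build the regression feature list from an explicit deny-list, as sketched below. The leaky column names come from the paragraph above; the function name and the `WELL` and `DEPTH_MD` identifier columns are assumptions for illustration.

```python
# Columns that are the target's source or are derived from it.
LEAKY_COLUMNS = {"RHOB", "NPHI", "DPHI", "NPHI_RHOB_SEP", "DRHO"}

def regression_features(columns, target="PHIT", non_features=("WELL", "DEPTH_MD")):
    """Return the columns that are safe to use as porosity predictors."""
    excluded = LEAKY_COLUMNS | {target} | set(non_features)
    return [c for c in columns if c not in excluded]

# Illustrative column list, not the exact notebook schema.
columns = ["WELL", "DEPTH_MD", "GR", "RDEP", "RMED", "DTC", "SP", "CALI",
           "RHOB", "NPHI", "DPHI", "VSHALE", "PHIT"]
features = regression_features(columns)
print(features)  # ['GR', 'RDEP', 'RMED', 'DTC', 'SP', 'CALI', 'VSHALE']

# Guard that fails loudly if a leaky column ever slips back in.
assert LEAKY_COLUMNS.isdisjoint(features)
```

Keeping the deny-list in one place makes the leakage rule auditable, and the final assertion turns an accidental R² ≈ 1.000 into an immediate error.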
Well-Log-Analysis/
├── data/
│ ├── raw/ ← CSV_train.csv, CSV_test.csv (not tracked)
│ └── processed/ ← numpy arrays and dataframes (not tracked)
├── notebooks/
│ ├── 01_eda.ipynb ← data loading, visualisation, QC
│ ├── 02_preprocessing.ipynb ← cleaning, imputation, feature engineering
│ ├── 03_lithology_classification.ipynb ← RF + XGBoost classification
│ ├── 04_porosity_regression.ipynb ← RF + XGBoost regression
│ └── 05_evaluation.ipynb ← comparison figures and summary
├── outputs/
│ ├── figures/ ← all saved plots
│ └── models/ ← saved model files (not tracked)
├── requirements.txt
├── .gitignore
└── README.md
Download CSV_train.csv and CSV_test.csv from the FORCE 2020 Google Drive folder:
https://drive.google.com/drive/folders/0B7brcf-eGK8CRUhfRW9rSG91bW8
Place both files in data/raw/.
conda create -n well-log-analysis python=3.10 -y
conda activate well-log-analysis
pip install -r requirements.txt
python -m ipykernel install --user --name well-log-analysis --display-name "Well Log Analysis"
01_eda.ipynb ← explore the data (run first)
02_preprocessing.ipynb ← clean and engineer features
03_lithology_classification.ipynb ← train and evaluate classifiers
04_porosity_regression.ipynb ← train and evaluate regressors
05_evaluation.ipynb ← generate all portfolio figures
| Figure | Description |
|---|---|
| outputs/figures/lith_pred_*.png | Predicted lithology log vs expert labels |
| outputs/figures/final_per_class_f1.png | Per-class F1 comparison RF vs XGBoost |
| outputs/figures/final_summary_bars.png | Cross-task performance summary |
| outputs/figures/final_rf_confusion.png | RF confusion matrix |
| outputs/figures/porosity_scatter_clean.png | Porosity predicted vs actual |
| outputs/figures/porosity_pred_*.png | Porosity prediction log track |
| outputs/figures/log_correlation.png | Log curve correlation matrix |
| outputs/figures/porosity_by_lithology.png | Median porosity per rock type |
Bormann, P., Aursand, P., Dilib, F., Dischington, P., & Garland, C. (2020). FORCE Machine Learning Competition — Lithofacies prediction from well logs. Norwegian Petroleum Directorate. https://github.com/bolgebrygg/Force-2020-Machine-Learning-competition
Author: Olasunkanmi
Tasks: Lithology classification (11 classes) + porosity regression
Dataset: FORCE 2020 Lithofacies Benchmark — 98 Norwegian North Sea wells