dbestdataguy/seismic-facies-classification

Seismic Facies Classification: CNN vs Random Forest

Supervised machine learning pipeline for classifying subsurface geological units from 3D seismic reflection data. Trains a CNN and a Random Forest on the same dataset, then compares them on held-out test sections.

Dataset: F3 Netherlands Seismic Benchmark — Alaudah et al. (2019)
Task: 5-class seismic facies classification
Best result: CNN macro F1 = 0.955 (validation), 0.819 (test)


Results

| Metric | CNN | Random Forest |
| --- | --- | --- |
| Accuracy (val) | 96.4% | 91.1% |
| Macro F1 (val) | 0.955 | 0.883 |
| Macro F1 (test) | 0.819 | 0.631 |

Per-class F1 on held-out test section:

| Facies class | CNN | RF | Gap |
| --- | --- | --- | --- |
| Upper North Sea | 0.90 | 0.76 | +0.14 |
| Middle North Sea | 0.91 | 0.84 | +0.07 |
| Lower North Sea | 0.74 | 0.57 | +0.16 |
| Rijnland / Chalk | 0.73 | 0.40 | +0.33 |
| Scruff | 0.65 | 0.39 | +0.25 |

The CNN's advantage is largest for structurally defined facies — Rijnland/Chalk (+0.33) and Scruff (+0.25) — where spatial pattern recognition provides information that aggregate statistics cannot.
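For readers who want to reproduce the metrics, per-class and macro F1 can be computed directly from label arrays. The sketch below is a minimal NumPy version, not the notebook's own code; the function names `per_class_f1` and `macro_f1` and the toy labels are illustrative assumptions.

```python
import numpy as np

def per_class_f1(y_true, y_pred, classes):
    """Return {class: F1}, computed one-vs-rest for each class."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    scores = {}
    for c in classes:
        tp = np.sum((y_true == c) & (y_pred == c))
        fp = np.sum((y_true != c) & (y_pred == c))
        fn = np.sum((y_true == c) & (y_pred != c))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); set to 0 when the class never occurs.
        scores[c] = float(2 * tp / denom) if denom else 0.0
    return scores

def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of per-class F1, so rare facies count as much as common ones."""
    return float(np.mean(list(per_class_f1(y_true, y_pred, classes).values())))

# Toy labels (hypothetical), not benchmark data.
y_true = [1, 1, 2, 2, 3]
y_pred = [1, 2, 2, 2, 3]
print(per_class_f1(y_true, y_pred, classes=(1, 2, 3)))
print(round(macro_f1(y_true, y_pred, classes=(1, 2, 3)), 3))
```

Because macro F1 weights every class equally, it is a stricter summary than accuracy when one class dominates the volume.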


Facies map comparison

Facies map

Left to right: raw seismic amplitude, expert ground truth, CNN prediction, Random Forest prediction. The CNN produces spatially coherent facies maps close to the ground truth. The RF breaks down in structurally complex zones, producing fragmented predictions where the geology is most challenging.


Dataset

This project uses the F3 Netherlands Seismic Benchmark prepared by:

Alaudah, Y., Michałowicz, P., Alfarraj, M., & Alregib, G. (2019). A machine-learning benchmark for facies classification. Interpretation, 7(3), SE175–SE187.

Available at: https://github.com/yalaudah/facies_classification_benchmark

The benchmark provides seismic amplitudes and facies labels as clean NumPy arrays, split into training and test sections. The F3 Netherlands survey covers the Dutch North Sea sector and uses the same six-class labelling scheme as the Parihaka benchmark, making results comparable to published work.

Why not raw Parihaka SEGY? Parsing the raw SEG-Y binary format — extracting trace headers, reconstructing 3D geometry, aligning with label files — is substantial domain-specific engineering. This benchmark provides verified, peer-reviewed data in a format that lets the project focus on the ML pipeline rather than format parsing. Using a citable benchmark also makes results directly comparable to published baselines.

Facies classes

| Class | Name | Geological character |
| --- | --- | --- |
| 1 | Upper North Sea | Young marine sediments; smooth, continuous reflectors |
| 2 | Middle North Sea | Intermediate marine clays/sands; moderate amplitude |
| 3 | Lower North Sea | Deeper sands; high amplitude, parallel layering |
| 4 | Rijnland / Chalk | Carbonate group; bright top reflection, dim below |
| 5 | Scruff | Unconformity zone; chaotic, disrupted reflectors |

Class 0 (Unknown) and Class 6 (Zechstein) are excluded from training — class 0 is unlabelled boundary voxels, class 6 is absent from the training volume.
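The exclusion can be expressed as a boolean mask over the label array. The following is a minimal sketch under the class numbering given above; `select_training_voxels` is a hypothetical helper, and the remapping of classes 1–5 to 0–4 is an assumption made so the targets suit a 5-way softmax.

```python
import numpy as np

TRAIN_CLASSES = (1, 2, 3, 4, 5)  # classes 0 (Unknown) and 6 (Zechstein) are dropped

def select_training_voxels(labels):
    """Return (coords, targets) for voxels whose label is one of the 5 kept classes.

    coords  : (n, labels.ndim) integer indices of the kept voxels
    targets : (n,) labels shifted from 1..5 to 0..4 (assumed convention)
    """
    labels = np.asarray(labels)
    mask = np.isin(labels, TRAIN_CLASSES)
    coords = np.argwhere(mask)   # row-major order
    targets = labels[mask] - 1   # same row-major order as coords
    return coords, targets

# Toy 2D label slice (hypothetical values).
toy = np.array([[0, 1, 2],
                [6, 5, 3]])
coords, targets = select_training_voxels(toy)
print(coords.tolist(), targets.tolist())
```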


Project structure

seismic-facies-classification/
├── data/
│   ├── raw/
│   │   └── facies_classification_benchmark/   ← benchmark data (not tracked)
│   └── processed/                             ← extracted patches and features (not tracked)
├── notebooks/
│   ├── 01_eda.ipynb                           ← data loading and visualisation
│   ├── 02_preprocessing.ipynb                 ← patch extraction and class balancing
│   ├── 03_cnn.ipynb                           ← CNN training and evaluation
│   ├── 04_random_forest.ipynb                 ← Random Forest training and evaluation
│   └── 05_comparison.ipynb                    ← side-by-side results and facies maps
├── outputs/
│   ├── figures/                               ← all saved plots
│   └── models/                               ← saved model weights (not tracked)
├── requirements.txt
└── README.md

Methodology

Preprocessing

Each labelled voxel in the training volume becomes one training sample. A 33×33 patch is extracted from the inline slice centred on that voxel, giving the model 16 samples of spatial context in every direction. The time dimension is padded with reflection padding to handle voxels near the volume boundary.
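A minimal sketch of the patch extraction described above, assuming each inline slice is a 2D NumPy array. `extract_patch` is a hypothetical name, and this version reflection-pads both axes for simplicity, whereas the text above only states that the time axis is padded.

```python
import numpy as np

def extract_patch(section, row, col, size=33):
    """Return the size x size patch of `section` centred on (row, col)."""
    half = size // 2  # 16 samples of context on each side for size=33
    # Reflection padding keeps patches full-size near the slice boundary.
    padded = np.pad(section, half, mode="reflect")
    # After padding, original index (row, col) sits at (row + half, col + half),
    # so the patch starts at (row, col) in padded coordinates.
    return padded[row:row + size, col:col + size]

# Toy slice (hypothetical shape), not benchmark data.
section = np.arange(100 * 80, dtype=np.float32).reshape(100, 80)
centre_patch = extract_patch(section, 50, 40)
corner_patch = extract_patch(section, 0, 0)
print(centre_patch.shape, corner_patch.shape)
```

Padding once per slice and then slicing many patches from the padded array is cheaper than padding per patch; the sketch pads inside the function only to stay short.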

Training data is sampled with fixed per-class targets to address the severe class imbalance in the raw volume (Middle North Sea = 48.6%, Scruff = 1.5%). Final training set: 45,000 patches across 5 classes. Class weights are applied during CNN training to further penalise misclassification of rare classes.
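The two balancing steps can be sketched as follows. The actual per-class targets and the weight formula are not stated here, so both are assumptions: `sample_per_class` caps each class at a caller-supplied target, and `balanced_class_weights` uses the common `n_samples / (n_classes * count)` heuristic.

```python
import numpy as np

def sample_per_class(labels, targets, seed=0):
    """Pick up to targets[cls] random indices for each class, without replacement."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    picked = []
    for cls, target in targets.items():
        idx = np.flatnonzero(labels == cls)
        n = min(target, idx.size)  # rare classes may fall short of the target
        picked.append(rng.choice(idx, size=n, replace=False))
    return np.concatenate(picked)

def balanced_class_weights(labels):
    """Weight each class inversely to its frequency (assumed heuristic)."""
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    weights = labels.size / (classes.size * counts)
    return dict(zip(classes.tolist(), weights.tolist()))

# Toy imbalanced labels (hypothetical), not the benchmark distribution.
labels = np.array([1] * 100 + [2] * 10)
idx = sample_per_class(labels, {1: 20, 2: 20})
print(np.bincount(labels[idx])[1:], balanced_class_weights(labels[idx]))
```

A dictionary of this form can be passed to Keras `model.fit` through its `class_weight` argument, provided the keys match the integer labels used for training.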

CNN

Two convolutional blocks (32 and 64 filters, kernel size 3×3), each followed by batch normalisation, ReLU activation, 2×2 max pooling, and dropout (0.25). A dense head with 128 units and 0.4 dropout feeds a 5-class softmax output. Trained with Adam (lr=0.001), sparse categorical crossentropy loss, and early stopping on validation loss. Best weights restored from epoch 14 of 19.
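The architecture above can be sketched in Keras roughly as follows. Several details are assumptions rather than facts from the notebook: one convolution per block, `padding="same"`, a single-channel 33×33 input, labels remapped to 0–4, and an early-stopping patience of 5 (which would be consistent with best weights at epoch 14 of 19, but is not stated). `build_cnn` is a hypothetical name.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_cnn(patch_size=33, n_classes=5):
    """Two conv blocks followed by a dense head and a softmax over the facies classes."""
    model = models.Sequential([
        layers.Input(shape=(patch_size, patch_size, 1)),
        # Block 1: 32 filters
        layers.Conv2D(32, (3, 3), padding="same"),
        layers.BatchNormalization(),
        layers.Activation("relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),
        # Block 2: 64 filters
        layers.Conv2D(64, (3, 3), padding="same"),
        layers.BatchNormalization(),
        layers.Activation("relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),
        # Dense head
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.4),
        layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
        loss="sparse_categorical_crossentropy",  # expects integer labels 0..n_classes-1
        metrics=["accuracy"],
    )
    return model

# Stop when validation loss stalls and roll back to the best epoch (patience assumed).
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True
)

model = build_cnn()
model.summary()
# Training would then look like (arrays and weights supplied by the caller):
# model.fit(X_train, y_train, validation_data=(X_val, y_val),
#           class_weight=class_weights, callbacks=[early_stop])
```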

Random Forest

200 decision trees with max depth 20, trained on 18 handcrafted features per patch: amplitude statistics (mean, std, min, max, range, median), RMS energy, skewness, kurtosis, zero-crossing rate, horizontal/vertical gradient energy, quadrant means, and centre-surround contrast. Features are standardised before training. Class weights set to balanced.
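A hedged sketch of the feature pipeline with NumPy and scikit-learn follows. The list above enumerates 17 statistics, so this sketch returns 17 values; the 18th feature is not specified and is not guessed here. The exact definitions (axis 0 as time, excess kurtosis, central third as the "centre") and the names `patch_features` and `build_rf` are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def patch_features(patch):
    """Summarise one 2D patch as a vector of 17 statistics (definitions assumed)."""
    x = np.asarray(patch, dtype=np.float64)
    mean, std = x.mean(), x.std()
    if std > 0:
        z = (x - mean) / std
        skew, kurt = np.mean(z ** 3), np.mean(z ** 4) - 3.0  # excess kurtosis
    else:
        skew, kurt = 0.0, 0.0  # constant patch: moments set to 0 by convention
    signs = np.signbit(x)
    # Fraction of adjacent-sample sign changes along axis 0 (assumed to be time).
    zcr = np.mean(signs[1:] != signs[:-1])
    grad_v, grad_h = np.gradient(x)  # derivatives along axis 0 and axis 1
    h, w = x.shape[0] // 2, x.shape[1] // 2
    quadrants = [x[:h, :w], x[:h, w:], x[h:, :w], x[h:, w:]]
    # Centre-surround contrast: central third of the patch versus everything else.
    r, c = x.shape[0] // 3, x.shape[1] // 3
    centre = x[r:-r, c:-c]
    surround_mean = (x.sum() - centre.sum()) / (x.size - centre.size)
    return np.array([
        mean, std, x.min(), x.max(), x.max() - x.min(), np.median(x),
        np.sqrt(np.mean(x ** 2)),                       # RMS energy
        skew, kurt, zcr,
        np.mean(grad_h ** 2), np.mean(grad_v ** 2),     # gradient energies
        *[q.mean() for q in quadrants],
        centre.mean() - surround_mean,
    ])

def build_rf(seed=0):
    """Standardise features, then fit 200 depth-limited trees with balanced weights."""
    return make_pipeline(
        StandardScaler(),
        RandomForestClassifier(n_estimators=200, max_depth=20,
                               class_weight="balanced",
                               random_state=seed, n_jobs=-1),
    )

# Random patches and labels (hypothetical), only to show the shapes involved.
rng = np.random.default_rng(0)
patches = rng.normal(size=(40, 33, 33))
labels = np.repeat([0, 1], 20)
X = np.stack([patch_features(p) for p in patches])
rf = build_rf().fit(X, labels)
print(X.shape, rf.predict(X[:3]).shape)
```

Standardisation does not change what a tree-based model can learn, since splits are invariant to monotonic rescaling; it is kept here only because the text above says features are standardised.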

Why these two models?

The comparison is the scientific contribution. The CNN sees raw spatial patterns: it learns what a Rijnland/Chalk reflector looks like geometrically. The RF sees 18 summary statistics: it learns that Rijnland/Chalk has moderate mean amplitude and high RMS energy. The performance gap quantifies how much spatial pattern recognition matters for each facies class. For Upper and Middle North Sea the F1 gap is small (0.07–0.14); for Rijnland/Chalk and Scruff it is large (0.25–0.33), because those classes are defined by spatial geometry rather than bulk amplitude properties.


How to run

1. Clone the benchmark data

cd data/raw
git clone https://github.com/yalaudah/facies_classification_benchmark

2. Set up the environment

conda create -n seismic-facies python=3.10 -y
conda activate seismic-facies
pip install -r requirements.txt
python -m ipykernel install --user --name seismic-facies --display-name "Seismic Facies"

3. Run notebooks in order

01_eda.ipynb              ← explore and visualise the data
02_preprocessing.ipynb    ← extract patches (takes ~10 minutes)
03_cnn.ipynb              ← train CNN (takes ~30–60 minutes on CPU)
04_random_forest.ipynb    ← train RF (takes ~5 minutes)
05_comparison.ipynb       ← generate facies maps and comparison figures

Hardware note

All training was done on CPU. CNN training took approximately 2–3 minutes per epoch (19 epochs total). If you have a CUDA-enabled GPU, replace tensorflow in requirements.txt with tensorflow[and-cuda] and expect a 5–10× speedup.


Key figures

| Figure | Description |
| --- | --- |
| outputs/figures/facies_map_test1.png | Main comparison: seismic, ground truth, CNN, RF |
| outputs/figures/cnn_vs_rf_per_class_f1.png | Per-class F1 bar chart |
| outputs/figures/cnn_training_history.png | Loss and accuracy curves |
| outputs/figures/cnn_confusion_matrix.png | CNN normalised confusion matrix |
| outputs/figures/rf_confusion_matrix.png | RF normalised confusion matrix |
| outputs/figures/rf_feature_importance.png | Which features matter most for the RF |

Reference

Alaudah, Y., Michałowicz, P., Alfarraj, M., & Alregib, G. (2019). A machine-learning benchmark for facies classification. Interpretation, 7(3), SE175–SE187. https://doi.org/10.1190/INT-2018-0249.1


Author: Olasunkanmi
Task: Seismic facies classification — CNN vs Random Forest
Dataset: F3 Netherlands Seismic Benchmark (Alaudah et al., 2019)
