Seismic Facies Classification: CNN vs Random Forest

Supervised machine learning pipeline for classifying subsurface geological units from 3D seismic reflection data. Trains a CNN and a Random Forest on the same dataset, then compares them on held-out test sections.

Dataset: F3 Netherlands Seismic Benchmark — Alaudah et al. (2019)
Task: 5-class seismic facies classification
Best result: CNN macro F1 = 0.955 (validation), 0.819 (test)

Results

Metric	CNN	Random Forest
Accuracy (val)	96.4%	91.1%
Macro F1 (val)	0.955	0.883
Macro F1 (test)	0.819	0.631

Per-class F1 on held-out test section:

Facies class	CNN	RF	Gap (CNN − RF)
Upper North Sea	0.90	0.76	+0.14
Middle North Sea	0.91	0.84	+0.07
Lower North Sea	0.74	0.57	+0.16
Rijnland / Chalk	0.73	0.40	+0.33
Scruff	0.65	0.39	+0.25

The CNN's advantage is largest for structurally defined facies — Rijnland/Chalk (+0.33) and Scruff (+0.25) — where spatial pattern recognition provides information that aggregate statistics cannot.
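For readers unfamiliar with the metric, macro F1 is the unweighted mean of the per-class F1 scores, so a rare class such as Scruff counts as much as Middle North Sea. The sketch below computes it from label arrays with NumPy; it should give the same value as scikit-learn's `f1_score(..., average="macro")`. The function names and the toy labels are illustrative assumptions, not code or data from the notebooks.

```python
import numpy as np

def per_class_f1(y_true, y_pred, classes):
    """F1 for each class, computed one-vs-rest from label arrays."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    scores = []
    for c in classes:
        tp = np.sum((y_true == c) & (y_pred == c))
        fp = np.sum((y_true != c) & (y_pred == c))
        fn = np.sum((y_true == c) & (y_pred != c))
        denom = 2 * tp + fp + fn
        # A class that never appears in truth or prediction scores 0 here
        scores.append(2 * tp / denom if denom else 0.0)
    return np.array(scores)

def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of per-class F1."""
    return float(per_class_f1(y_true, y_pred, classes).mean())

# Toy labels for illustration only (not project data)
y_true = [1, 1, 1, 1, 2, 2, 5, 5]
y_pred = [1, 1, 1, 2, 2, 2, 5, 1]
print(per_class_f1(y_true, y_pred, classes=[1, 2, 5]))
print(macro_f1(y_true, y_pred, classes=[1, 2, 5]))
```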

Facies map comparison

Left to right: raw seismic amplitude, expert ground truth, CNN prediction, Random Forest prediction. The CNN produces spatially coherent facies maps close to the ground truth. The RF breaks down in structurally complex zones, producing fragmented predictions where the geology is most challenging.

Dataset

This project uses the F3 Netherlands Seismic Benchmark prepared by:

Alaudah, Y., Michałowicz, P., Alfarraj, M., & Alregib, G. (2019). A machine-learning benchmark for facies classification. Interpretation, 7(3), SE175–SE187.

Available at: https://github.com/yalaudah/facies_classification_benchmark

The benchmark provides seismic amplitudes and facies labels as NumPy arrays, split into training and test sections. The F3 survey lies in the Dutch sector of the North Sea, and the benchmark's six facies classes and fixed splits make results comparable to published work.

Why not raw Parihaka SEG-Y? Parsing the raw SEG-Y binary format (extracting trace headers, reconstructing 3D geometry, aligning with label files) is substantial domain-specific engineering. This benchmark provides verified, peer-reviewed data in a format that lets the project focus on the ML pipeline rather than format parsing. Using a citable benchmark also makes results directly comparable to published baselines.

Facies classes

Class	Name	Geological character
1	Upper North Sea	Young marine sediments; smooth, continuous reflectors
2	Middle North Sea	Intermediate marine clays/sands; moderate amplitude
3	Lower North Sea	Deeper sands; high amplitude, parallel layering
4	Rijnland / Chalk	Carbonate group; bright top reflection, dim below
5	Scruff	Unconformity zone; chaotic, disrupted reflectors

Class 0 (Unknown) and Class 6 (Zechstein) are excluded from training: class 0 marks unlabelled boundary voxels, and class 6 does not occur in the training volume.
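A minimal sketch of that exclusion step, assuming the labels are an integer NumPy array using the class numbers from the table above. Remapping classes 1–5 to 0–4 is an assumption made here because a 5-way softmax trained with sparse categorical crossentropy expects labels in that range; the notebooks may handle this differently, and `select_training_voxels` is a hypothetical helper name.

```python
import numpy as np

KEEP = np.array([1, 2, 3, 4, 5])  # facies classes used for training

def select_training_voxels(labels):
    """Return (flat indices of kept voxels, their labels remapped to 0..4)."""
    flat = np.asarray(labels).ravel()
    idx = np.flatnonzero(np.isin(flat, KEEP))  # drops class 0 and class 6
    return idx, flat[idx] - 1

# Toy label slice for illustration only
toy = np.array([[0, 1, 2],
                [6, 5, 3]])
idx, y = select_training_voxels(toy)
print(idx, y)
```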

Project structure

seismic-facies-classification/
├── data/
│   ├── raw/
│   │   └── facies_classification_benchmark/   ← benchmark data (not tracked)
│   └── processed/                             ← extracted patches and features (not tracked)
├── notebooks/
│   ├── 01_eda.ipynb                           ← data loading and visualisation
│   ├── 02_preprocessing.ipynb                 ← patch extraction and class balancing
│   ├── 03_cnn.ipynb                           ← CNN training and evaluation
│   ├── 04_random_forest.ipynb                 ← Random Forest training and evaluation
│   └── 05_comparison.ipynb                    ← side-by-side results and facies maps
├── outputs/
│   ├── figures/                               ← all saved plots
│   └── models/                               ← saved model weights (not tracked)
├── requirements.txt
└── README.md

Methodology

Preprocessing

Each labelled voxel in the training volume becomes one training sample. A 33×33 patch is extracted from the inline slice centred on that voxel, giving the model 16 samples of spatial context in every direction. The time axis is reflection-padded so that voxels near the volume boundary still yield a full patch.
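The extraction described above can be sketched with `np.pad(..., mode="reflect")`. This assumes an inline slice stored as a 2D array shaped (crosslines, time samples) and pads only the time axis, as the text states; restricting the crossline index to at least 16 samples from either edge is an assumption of this sketch, and `extract_patch` is a hypothetical name.

```python
import numpy as np

PATCH = 33
HALF = PATCH // 2  # 16 samples of context on each side

def extract_patch(inline_slice, xline, t):
    """Return the PATCH x PATCH window centred on (xline, t).

    Requires HALF <= xline < n_xlines - HALF; the time axis is
    reflection-padded, so any valid t works.
    """
    padded = np.pad(inline_slice, ((0, 0), (HALF, HALF)), mode="reflect")
    # After padding, original time index t sits at t + HALF,
    # so the window [t, t + PATCH) is centred on it.
    return padded[xline - HALF : xline + HALF + 1, t : t + PATCH]

# Toy slice for illustration only
toy_slice = np.arange(40 * 50, dtype=float).reshape(40, 50)
patch = extract_patch(toy_slice, xline=20, t=0)
print(patch.shape)
```

In practice it would be cheaper to pad each slice once and cut many patches from it, rather than padding per call as this sketch does.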

Training data is sampled with fixed per-class targets to address the severe class imbalance in the raw volume (Middle North Sea = 48.6%, Scruff = 1.5%). Final training set: 45,000 patches across 5 classes. Class weights are applied during CNN training to further penalise misclassification of rare classes.
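The two balancing steps above might look like the following sketch. The per-class targets are left as a parameter because the README does not state them, sampling with replacement for undersized classes is an assumption, and the weight formula shown is scikit-learn's "balanced" heuristic, which may not be the exact weighting used for the CNN. Both function names are hypothetical.

```python
import numpy as np

def sample_per_class(labels, targets, seed=0):
    """Draw targets[c] voxel indices for each class c.

    Samples with replacement only when a class has fewer voxels
    than its target.
    """
    rng = np.random.default_rng(seed)
    chosen = []
    for c, n in targets.items():
        pool = np.flatnonzero(labels == c)
        chosen.append(rng.choice(pool, size=n, replace=len(pool) < n))
    return np.concatenate(chosen)

def balanced_class_weights(labels):
    """n_samples / (n_classes * count_c) for each class c."""
    classes, counts = np.unique(labels, return_counts=True)
    weights = len(labels) / (len(classes) * counts)
    return dict(zip(classes.tolist(), weights.tolist()))

# Toy imbalanced labels for illustration only
toy = np.array([0] * 90 + [1] * 10)
idx = sample_per_class(toy, {0: 20, 1: 20})
print(np.bincount(toy[idx]))
print(balanced_class_weights(toy))
```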

CNN

Two convolutional blocks (32 and 64 filters, kernel size 3×3), each followed by batch normalisation, ReLU activation, 2×2 max pooling, and dropout (0.25). A dense head with 128 units and 0.4 dropout feeds a 5-class softmax output. Trained with Adam (lr=0.001), sparse categorical crossentropy loss, and early stopping on validation loss. Best weights restored from epoch 14 of 19.
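To make the architecture concrete, the sketch below traces feature-map sizes through the two blocks for a 33×33 input and counts the parameters of the 128-unit dense layer. It assumes one stride-1 convolution per block and a flatten before the dense head; the convolution padding mode is not stated above, so both common options are shown. None of this is taken from the notebook.

```python
def conv_out(size, kernel=3, padding="same"):
    """Spatial size after a stride-1 convolution."""
    return size if padding == "same" else size - kernel + 1

def pool_out(size, pool=2):
    """Spatial size after non-overlapping max pooling (floor division)."""
    return size // pool

def trace_shapes(size=33, filters=(32, 64), padding="same"):
    """Return per-block (height, width, channels) and the flattened length."""
    shapes = []
    for f in filters:
        size = pool_out(conv_out(size, padding=padding))
        shapes.append((size, size, f))
    flat = size * size * filters[-1]
    return shapes, flat

for padding in ("same", "valid"):
    shapes, flat = trace_shapes(padding=padding)
    dense_params = flat * 128 + 128  # weights plus biases of the dense layer
    print(padding, shapes, flat, dense_params)
```

Under either assumption the dense layer holds most of the parameters, which is one reason the 0.4 dropout sits on the dense head.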

Random Forest

200 decision trees with max depth 20, trained on 18 handcrafted features per patch: amplitude statistics (mean, std, min, max, range, median), RMS energy, skewness, kurtosis, zero-crossing rate, horizontal/vertical gradient energy, quadrant means, and centre-surround contrast. Features are standardised before training. Class weights set to balanced.
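A sketch of how the feature families listed above could be computed with NumPy. The exact definitions (window sizes, axis conventions, how zero crossings are counted) are assumptions, and this version yields 17 values, so it does not reproduce the notebook's 18-feature vector exactly; `patch_features` is a hypothetical name.

```python
import numpy as np

def patch_features(patch, eps=1e-12):
    """Summary statistics for one 2D patch (axis 0 = crossline, axis 1 = time, assumed)."""
    p = np.asarray(patch, dtype=float)
    x = p.ravel()
    mean, std = x.mean(), x.std()
    z = (x - mean) / (std + eps)                # standardised samples for moments
    h, w = p.shape[0] // 2, p.shape[1] // 2
    signs = np.sign(p)
    zero_cross = np.mean(signs[:, 1:] * signs[:, :-1] < 0)  # sign changes along time
    centre = p[h - 4 : h + 5, w - 4 : w + 5]    # 9x9 centre window (arbitrary choice)
    surround_mean = (p.sum() - centre.sum()) / (p.size - centre.size)
    return np.array([
        mean, std, x.min(), x.max(), x.max() - x.min(), np.median(x),
        np.sqrt(np.mean(x ** 2)),               # RMS energy
        np.mean(z ** 3),                        # skewness
        np.mean(z ** 4) - 3.0,                  # excess kurtosis
        zero_cross,
        np.mean(np.diff(p, axis=0) ** 2),       # gradient energy across crosslines
        np.mean(np.diff(p, axis=1) ** 2),       # gradient energy along time
        p[:h, :w].mean(), p[:h, w:].mean(),     # quadrant means
        p[h:, :w].mean(), p[h:, w:].mean(),
        centre.mean() - surround_mean,          # centre-surround contrast
    ])

# Random patch for illustration only
rng = np.random.default_rng(0)
print(patch_features(rng.normal(size=(33, 33))).shape)
```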

Why these two models?

The comparison is the scientific contribution. The CNN sees raw spatial patterns, so it learns what a Rijnland/Chalk reflector looks like geometrically. The RF sees 18 summary statistics, so it can only learn bulk properties, for example that Rijnland/Chalk has moderate mean amplitude and high RMS energy. The performance gap quantifies how much spatial pattern recognition matters for each facies class. For Upper and Middle North Sea the gap is small (0.07–0.14 F1); for Rijnland/Chalk and Scruff it is large (0.25–0.33 F1), because those classes are defined by spatial geometry rather than bulk amplitude properties.

How to run

1. Clone the benchmark data

cd data/raw
git clone https://github.com/yalaudah/facies_classification_benchmark

2. Set up the environment

conda create -n seismic-facies python=3.10 -y
conda activate seismic-facies
pip install -r requirements.txt
python -m ipykernel install --user --name seismic-facies --display-name "Seismic Facies"

3. Run notebooks in order

01_eda.ipynb              ← explore and visualise the data
02_preprocessing.ipynb    ← extract patches (takes ~10 minutes)
03_cnn.ipynb              ← train CNN (takes ~30–60 minutes on CPU)
04_random_forest.ipynb    ← train RF (takes ~5 minutes)
05_comparison.ipynb       ← generate facies maps and comparison figures

Hardware note

All training was done on CPU. CNN training took approximately 2–3 minutes per epoch (19 epochs total). If you have a CUDA-enabled GPU, replace tensorflow in requirements.txt with tensorflow[and-cuda]; expect a speedup of roughly 5–10×.

Key figures

Figure	Description
`outputs/figures/facies_map_test1.png`	Main comparison: seismic, ground truth, CNN, RF
`outputs/figures/cnn_vs_rf_per_class_f1.png`	Per-class F1 bar chart
`outputs/figures/cnn_training_history.png`	Loss and accuracy curves
`outputs/figures/cnn_confusion_matrix.png`	CNN normalised confusion matrix
`outputs/figures/rf_confusion_matrix.png`	RF normalised confusion matrix
`outputs/figures/rf_feature_importance.png`	Which features matter most for the RF

Reference

Alaudah, Y., Michałowicz, P., Alfarraj, M., & Alregib, G. (2019). A machine-learning benchmark for facies classification. Interpretation, 7(3), SE175–SE187. https://doi.org/10.1190/INT-2018-0249.1

Author: Olasunkanmi
Task: Seismic facies classification — CNN vs Random Forest
Dataset: F3 Netherlands Seismic Benchmark (Alaudah et al., 2019)
