Train a lightweight ResNet18 multi-label classifier to detect marine debris (and cloud) from images. The repo includes:
- Data prep: parse Pascal VOC-style XML annotations into a CSV
- Training: train and save a model checkpoint + loss metrics
- Inference: run predictions on a folder of images
- Plotting: generate a training-loss curve
python -m venv .venv
. .venv/bin/activate # macOS/Linux
# OR on Windows PowerShell:
# .\.venv\Scripts\Activate.ps1
pip install -r requirements.txt

Expected layout:
data/
  train/              # training images + XML annotations
    *.png|*.jpg|...   # images (any PIL-supported format)
    *.xml             # Pascal VOC-style annotation files
  test/               # images to run inference on
    *.png             # inference.py currently scans for .png files
Generate the labels CSV from data/train/*.xml:
python clean_csv.py

This creates data/labels_parsed.csv with columns:
- filename: image filename relative to data/train
- debris: 0/1
- cloud: 0/1
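
The XML-to-CSV step can be sketched roughly as follows. This is a minimal illustration, not the contents of clean_csv.py: the function names `parse_voc_xml` and `build_csv` are hypothetical, and it assumes each annotation stores the image name in `<filename>` and the class in `<object><name>` as `debris` or `cloud` (case-insensitive).

```python
import csv
import xml.etree.ElementTree as ET
from pathlib import Path

# Assumed class names inside <object><name>; adjust to match your annotations.
LABELS = ("debris", "cloud")


def parse_voc_xml(xml_text: str) -> dict:
    """Turn one VOC-style annotation into a {filename, debris, cloud} row."""
    root = ET.fromstring(xml_text)
    filename = (root.findtext("filename") or "").strip()
    # Collect every object class name present in the annotation.
    present = {
        (obj.findtext("name") or "").strip().lower()
        for obj in root.iter("object")
    }
    row = {"filename": filename}
    for label in LABELS:
        row[label] = int(label in present)  # 1 if the class appears, else 0
    return row


def build_csv(train_dir: str, out_csv: str) -> int:
    """Parse every *.xml in train_dir, write the rows, return the row count."""
    rows = [
        parse_voc_xml(path.read_text(encoding="utf-8"))
        for path in sorted(Path(train_dir).glob("*.xml"))
    ]
    with open(out_csv, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["filename", *LABELS])
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Under those assumptions, `build_csv("data/train", "data/labels_parsed.csv")` would produce a file with the three columns listed above.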
python main.py

Outputs:
- model.pth: saved model weights
- training_metrics.csv: epoch/loss log
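
Training optimizes a binary cross-entropy loss on raw logits for the two labels (debris, cloud). As a rough illustration of what that loss computes, here is a pure-Python sketch of the numerically stable formula, averaged over elements as PyTorch's BCEWithLogitsLoss does by default. The function name `bce_with_logits` is hypothetical, and whether main.py keeps the default reduction is an assumption.

```python
import math


def bce_with_logits(logits, targets) -> float:
    """Mean binary cross-entropy computed directly from raw logits."""
    total = 0.0
    for x, y in zip(logits, targets):
        # Stable form of -[y*log(s(x)) + (1-y)*log(1-s(x))], with s = sigmoid.
        total += max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
    return total / len(logits)


# One image: logits for (debris, cloud) against labels debris=1, cloud=0.
loss = bce_with_logits([2.0, -1.0], [1, 0])
print(f"loss = {loss:.4f}")
```

Working on logits rather than probabilities avoids taking the log of a value that has already been rounded to 0 or 1, which is why sigmoid is only applied at inference time.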
Run predictions on data/test using the saved model.pth:
python inference.py

What you’ll see:
- Per-image scores for Debris and Cloud (sigmoid probabilities)
- An “ALERT” if the debris score is ≥ the configured threshold
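
The scoring and alert logic described above can be sketched without PyTorch. This is only an illustration under stated assumptions: `sigmoid` and `score_image` are hypothetical helper names, the logits are made-up numbers, and the output order (debris, cloud) is taken from the label order used elsewhere in this README.

```python
import math

THRESHOLD = 0.5  # default threshold listed in the configuration


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def score_image(logits, threshold: float = THRESHOLD) -> dict:
    """Map raw (debris, cloud) logits to probabilities and an alert flag."""
    debris_logit, cloud_logit = logits
    debris, cloud = sigmoid(debris_logit), sigmoid(cloud_logit)
    # The alert depends only on the debris score, not on the cloud score.
    return {"debris": debris, "cloud": cloud, "alert": debris >= threshold}


result = score_image((1.2, -0.7))  # illustrative logits, not real model output
print(f"Debris: {result['debris']:.2f}  Cloud: {result['cloud']:.2f}")
if result["alert"]:
    print("ALERT: debris score at or above threshold")
```

Because the two labels are scored independently, the probabilities do not need to sum to 1; an image can score high for both debris and cloud.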
After training:
python plot_metrics.py

Outputs:
- training_metrics.png: loss curve image (the plot is also shown interactively)
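
A plotting step of this kind might look like the sketch below. It is not the contents of plot_metrics.py: the helper names `load_losses` and `plot_losses` are hypothetical, and the column names `epoch` and `loss` are an assumption about training_metrics.csv, so check the actual header before reusing it.

```python
import csv


def load_losses(csv_path: str) -> list:
    """Read (epoch, loss) pairs; the column names are assumed, not confirmed."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return [(int(row["epoch"]), float(row["loss"])) for row in csv.DictReader(f)]


def plot_losses(points, out_png: str = "training_metrics.png") -> None:
    """Save the loss curve to out_png, then show it interactively."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    epochs, losses = zip(*points)
    plt.plot(epochs, losses, marker="o")
    plt.xlabel("Epoch")
    plt.ylabel("Loss")
    plt.title("Training loss")
    plt.savefig(out_png)
    plt.show()
```

Typical use would be `plot_losses(load_losses("training_metrics.csv"))` after a training run has written the CSV.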
These defaults live at the top of main.py / inference.py:
- Paths: data/train, data/test, data/labels_parsed.csv, model.pth
- Training: BATCH_SIZE=8, EPOCHS=12, lr=0.001
- Threshold: THRESHOLD=0.5
- Device: uses CUDA if available, otherwise CPU
main.py # train (and writes metrics/model)
inference.py # run predictions on data/test images
clean_csv.py # build data/labels_parsed.csv from VOC XMLs
utils.py # PyTorch Dataset (reads images + labels CSV)
plot_metrics.py # plot training_metrics.csv -> training_metrics.png
requirements.txt
- inference.py currently only considers files ending in .png. If your test images are .jpg, update the extension filter.
- Training uses BCEWithLogitsLoss with 2 outputs (debris, cloud) and applies sigmoid at inference time.
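
One way to widen the extension filter is sketched below. It is a suggestion, not the code currently in inference.py: `IMAGE_EXTENSIONS` and `list_images` are hypothetical names, and the set of extensions is an assumption to adapt to your data.

```python
from pathlib import Path

# Assumed set of extensions; adjust to whatever your test images use.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def list_images(folder: str, extensions=IMAGE_EXTENSIONS) -> list:
    """Return image paths in folder whose suffix matches, case-insensitively."""
    return sorted(
        path
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in extensions
    )
```

Calling `list_images("data/test")` would then pick up files such as `scene.JPG` as well as `.png` images, since the suffix is lowercased before the comparison.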