We propose a Semantic Constrained Attention ReFinement (SCARF) network. Our model can efficiently capture the long-range contextual information with semantic constraint layer-by-layer, which enhances the semantic information and structure reasoning of the model. We achieve superior performance on three challenging scene segmentation datasets, i.e., PASCAL VOC 2012, PASCAL Context and Cityscapes datasets.
-
Install pytorch
Our code is conducted on python 3.5 and torch 1.4.0.
-
Install environment
clone the repository and open the folder, run
python setup.pyor
pip install -e . -
Dataset
PASCAL VOC 2012: Download the PASCAL VOC 2012 dataset and augmentation data, then convert the dataset into trainaug, trainval, and test sets for training, fine-tuning and testing, respectively.
Cityscapes: Download the Cityscapes dataset and convert the dataset to 19 categories.
PASCAL Context: run script
scripts/prepare_pcontext.py. -
Training on PASCAL VOC 2012 dataset
cd ./experiments/segmentation/Training on the trainaug set:
CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset pascal_aug2 --backbone resnet101s --model scarf --checkname myname --ft --DS --dilatedFine-tuning on the trainval set:
CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset pascal_voc --backbone resnet101s --model scarf --checkname myname --resume "pretrained_model_path" --ft --DS --dilatedwhere
pretrained_model_pathis the path to the model trained on trainaug set. -
Evaluation on PASCAL VOC 2012 dataset
cd ./experiments/segmentation/Single scale testing on val set for model (trained on trainaug set):
CUDA_VISIBLE_DEVICES=0,1,2,3 python test.py --dataset pascal_voc --backbone resnet101s --model scarf --resume "pretrained_model_path" --eval --dilatedOne can download pretrained SCARF model with password 92b9 (trained on trainaug set) for easy testing. The expected score will show (mIoU/pAcc): 81.58/95.86.