Adapted SAST to support Multimodal input of RGB and Event data

This is a modified copy of SAST. It includes several model architectures that build on a modified SAST architecture.

Conda Installation

./setup_env.sh
conda activate sast

Detectron2 is not strictly required but speeds up the evaluation.

Used datasets for training and evaluation

You may also pre-process the dataset yourself by following the instructions.

Pre-trained Checkpoints

We used the following checkpoints from the respective model repositories

YOLOX	SAST	LWDETR
YOLOX-s	1 Mpx	LWDETR_tiny_30e_objects365

Training

Set DATA_DIR in set_envs.sh.
Set the other parameters as well, for example GPUS=0 or GPUS=[0,1].
Then run:

source set_envs.sh

NeRDD and F-UAV-D

To run models with both event and RGB data, select dataset=eventrgb and point DATA_DIR to the preprocessed data. Check the config folder to set the pretrained checkpoints.

python train.py model=rnndet dataset=eventrgb dataset.path=${DATA_DIR} wandb.project_name=SAST \
wandb.group_name=1mpx hardware.num_workers.train=2 batch_size.train=${BATCH_SIZE_PER_GPU} \
hardware.num_workers.eval=2 batch_size.eval=${BATCH_SIZE_PER_GPU} \
hardware.gpus=[${GPUS}] +experiment/gen4="base.yaml" \
training.learning_rate=${lr} validation.val_check_interval=10000
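
When launching several runs, it can help to assemble the override list programmatically instead of editing a long shell line. The following is a minimal sketch, not part of this repository: `format_gpus` and `build_train_command` are hypothetical helpers, the data path and learning rate are placeholders, and only overrides that appear in the command above are used.

```python
import shlex


def format_gpus(gpus):
    """Render GPU ids as a bracketed list such as [0] or [0,1]."""
    if isinstance(gpus, int):
        gpus = [gpus]
    return "[" + ",".join(str(g) for g in gpus) + "]"


def build_train_command(model, data_dir, gpus, batch_size, lr,
                        dataset="eventrgb", experiment="gen4"):
    """Return an argv list for train.py (hypothetical helper).

    Only overrides shown in this README are emitted; nothing else
    about train.py is assumed.
    """
    return [
        "python", "train.py",
        f"model={model}",
        f"dataset={dataset}",
        f"dataset.path={data_dir}",
        f"batch_size.train={batch_size}",
        f"batch_size.eval={batch_size}",
        f"hardware.gpus={format_gpus(gpus)}",
        f"+experiment/{experiment}=base.yaml",
        f"training.learning_rate={lr}",
    ]


if __name__ == "__main__":
    # Placeholder values; replace them with your own settings.
    cmd = build_train_command("rnndet", "/path/to/data", [0, 1], 4, 0.0002)
    # shlex.join quotes the brackets so the line is safe to paste into a shell.
    print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting of the brackets altogether.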

Models that can be selected

The model= parameter accepts one of the following:

rnndet
lwdetr
rgb
eventrgb

rnndet is the base SAST model and rgb is a base YOLOX model. rnndet expects only event data, while rgb expects only RGB images. Both models work with dataset=eventrgb.

lwdetr architecture

Uses SAST as a model to create sparsified masks.

eventrgb

Needs both modalities: RGB and event data.
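
The input requirements stated above (rnndet: event data only, rgb: RGB images only, eventrgb: both) can be captured in a small lookup for sanity checks before a run. This is an illustrative sketch; `REQUIRED_MODALITIES` and `missing_modalities` are hypothetical names, and lwdetr is left out because this README does not state its inputs explicitly.

```python
# Modalities each model= value expects, as described in this README.
# lwdetr is omitted on purpose: its inputs are not spelled out here.
REQUIRED_MODALITIES = {
    "rnndet": {"event"},
    "rgb": {"rgb"},
    "eventrgb": {"event", "rgb"},
}


def missing_modalities(model, available):
    """Return the modalities the model needs that are not available."""
    return REQUIRED_MODALITIES[model] - set(available)


if __name__ == "__main__":
    # An RGB-only dataset is not enough for the eventrgb model.
    print(missing_modalities("eventrgb", ["rgb"]))
```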

Model performance

StStephan is a subset of the F-UAV-D dataset

The table names map to the model parameter like this:

SAST = rnndet
YOLOX-RGB = rgb
SAST+RGB = eventrgb
SAST+LWDETR = lwdetr
SAST-Pretrained+LWDETR = lwdetr (checkpoint from training SAST on StStephan)
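
For scripts that reproduce individual table rows, the mapping above can be written as a dictionary. This is a minimal sketch with hypothetical names (`TABLE_NAME_TO_MODEL`, `model_override`). Note that the two LWDETR rows share the same model= value and differ only in the checkpoint used.

```python
# Table name -> value of the model= parameter, copied from the list above.
TABLE_NAME_TO_MODEL = {
    "SAST": "rnndet",
    "YOLOX-RGB": "rgb",
    "SAST+RGB": "eventrgb",
    "SAST+LWDETR": "lwdetr",
    # Same model; the checkpoint comes from training SAST on StStephan.
    "SAST-Pretrained+LWDETR": "lwdetr",
}


def model_override(table_name):
    """Return the model=... override for a given table row name."""
    return f"model={TABLE_NAME_TO_MODEL[table_name]}"


if __name__ == "__main__":
    print(model_override("SAST+RGB"))
```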

Example

python train.py model=eventrgb dataset=eventrgb dataset.path="${DATA_DIR_NERD}" wandb.project_name=NERD_NEW wandb.group_name=rgb batch_size.train=4 batch_size.eval=4 hardware.gpus=\[${GPUS}\] +experiment/arma="base.yaml" training.learning_rate=${lr} dataset.train.use_fraction=1 dataset.validation.use_fraction=1

python train.py model=rgb dataset=eventrgb dataset.path="${DATA_DIR_NERD}" wandb.project_name=NERD_NEW wandb.group_name=rgb batch_size.train=4 batch_size.eval=4 hardware.gpus=\[${GPUS}\] +experiment/arma="base.yaml" training.learning_rate=${lr} dataset.train.use_fraction=1 dataset.validation.use_fraction=1 fpn.ckpt=yolox_s.pth

Code Acknowledgments

This project has used code from the following projects:

SAST for the SAST architecture
RVT for the RVT architecture implementation in PyTorch
timm for the original MaxViT layer implementation in PyTorch
YOLOX for the detection PAFPN/head
