Data preprocessing

0. Dataset preparation

Here we use the ped2 dataset as an example.

Download the video anomaly detection dataset and place it in the data directory of this project. To evaluate the frame-level AUC, we provide the frame labels of each test video in data/ped2/ground_truth_demo/gt_label.json.

The file structure should be similar to the following:

./data
└── ped2
    ├── ground_truth_demo
    │   └── gt_label.json
    ├── testing
    │   └── frames
    │       ├── Test001
    │       ├── Test001_gt
    │       ├── Test002
    │       ├── Test002_gt
    │       ├── Test003
    │       ├── Test003_gt
    │       ├── Test004
    │       ├── Test004_gt
    │       ├── Test005
    │       ├── Test005_gt
    │       ├── Test006
    │       ├── Test006_gt
    │       ├── Test007
    │       ├── Test007_gt
    │       ├── Test008
    │       ├── Test008_gt
    │       ├── Test009
    │       ├── Test009_gt
    │       ├── Test010
    │       ├── Test010_gt
    │       ├── Test011
    │       ├── Test011_gt
    │       ├── Test012
    │       └── Test012_gt
    └── training
        └── frames
            ├── Train001
            ├── Train002
            ├── Train003
            ├── Train004
            ├── Train005
            ├── Train006
            ├── Train007
            ├── Train008
            ├── Train009
            ├── Train010
            ├── Train011
            ├── Train012
            ├── Train013
            ├── Train014
            ├── Train015
            └── Train016
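For reference, the frame labels in gt_label.json can be used to compute the frame-level AUC. The sketch below assumes the JSON maps each test video name to a list of per-frame 0/1 anomaly labels (the exact schema may differ in practice) and evaluates a rank-based ROC AUC with no external dependencies; the scores here are made-up placeholders for a model's per-frame anomaly scores.

```python
import json

def frame_level_auc(scores, labels):
    """Rank-based ROC AUC (equivalent to the Mann-Whitney U statistic)."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    rank_of = [0.0] * n
    i = 0
    while i < n:  # assign average ranks to tied scores
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        for k in range(i, j):
            rank_of[k] = avg
        i = j
    n_pos = sum(l for _, l in pairs)
    n_neg = n - n_pos
    rank_sum = sum(r for r, (_, l) in zip(rank_of, pairs) if l == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)

# Hypothetical gt_label.json content: video name -> per-frame labels.
gt = json.loads('{"Test001": [0, 0, 1, 1], "Test002": [0, 1, 0, 0]}')
labels = [l for v in sorted(gt) for l in gt[v]]
scores = [0.1, 0.2, 0.9, 0.8, 0.3, 0.7, 0.2, 0.1]  # placeholder model scores
auc = frame_level_auc(scores, labels)
```

Since every anomalous frame here happens to score higher than every normal one, the AUC comes out to 1.0.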

1. Object detection

Please install mmcv and mmdetection accordingly, then download the pretrained Cascade R-CNN weights (we use cascade_rcnn_r101_fpn_1x_coco_20200317-0b6a2fbf.pth) and place them in the pre_process/assets folder.

Run the following command to detect all the foreground objects:

$ python extract_bboxes.py [--proj_root] [--dataset_name] [--mode] 

E.g., to extract objects from all training data:

$ python extract_bboxes.py --proj_root=<path/to/project_root> --dataset_name=ped2 --mode=train

To extract objects from all test data:

$ python extract_bboxes.py --proj_root=<path/to/project_root> --dataset_name=ped2 --mode=test

After this, the results are saved to ./data/ped2/ped2_bboxes_train.npy by default (and ./data/ped2/ped2_bboxes_test.npy for the test split), where each item contains all the bounding boxes in a single video frame.
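The saved .npy file can be inspected as sketched below. This assumes each item is a per-frame array of [x1, y1, x2, y2] box coordinates stored in an object array (an assumption about the exact layout); here we build a small synthetic file rather than reading the real one.

```python
import os
import tempfile

import numpy as np

# Two frames with different numbers of detected boxes; the outer container
# must therefore be an object array, one item per frame.
frame0 = np.array([[10, 20, 50, 80], [60, 30, 90, 70]], dtype=np.float32)
frame1 = np.array([[12, 22, 52, 82]], dtype=np.float32)
bboxes = np.empty(2, dtype=object)
bboxes[0], bboxes[1] = frame0, frame1

path = os.path.join(tempfile.mkdtemp(), "ped2_bboxes_train.npy")
np.save(path, bboxes)  # object arrays are pickled inside the .npy

# allow_pickle=True is required to load object arrays back.
loaded = np.load(path, allow_pickle=True)
num_boxes_per_frame = [len(item) for item in loaded]
```

Note that `allow_pickle=True` is needed because object arrays are serialized via pickle.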

2. Extracting optical flows

We extract optical flows from the videos using FlowNet2.0.

  1. Download the pre-trained FlowNet2 weights (i.e., FlowNet2_checkpoint.pth.tar) from here and place them in pre_process/assets.
  2. Build the custom layers by executing install_custome_layers.sh.
  3. Run the following command to estimate all the optical flows:
$ python extract_flows.py [--proj_root] [--dataset_name] [--mode] 

E.g., to extract flows from all training data:

$ python extract_flows.py --proj_root=<path/to/project_root> --dataset_name=ped2 --mode=train

To extract flows from all test data:

$ python extract_flows.py --proj_root=<path/to/project_root> --dataset_name=ped2 --mode=test

After this, the estimated flows are saved to ./data/ped2/training/flows by default. The final data structure should be similar to the following:

./data
└── ped2
    ├── ground_truth_demo
    │   └── gt_label.json
    ├── ped2_bboxes_test.npy
    ├── ped2_bboxes_train.npy
    ├── testing
    │   ├── flows
    │   │   ├── Test001
    │   │   ├── Test002
    │   │   ├── Test003
    │   │   ├── Test004
    │   │   ├── Test005
    │   │   ├── Test006
    │   │   ├── Test007
    │   │   ├── Test008
    │   │   ├── Test009
    │   │   ├── Test010
    │   │   ├── Test011
    │   │   └── Test012
    │   └── frames
    │       ├── Test001
    │       ├── Test001_gt
    │       ├── Test002
    │       ├── Test002_gt
    │       ├── Test003
    │       ├── Test003_gt
    │       ├── Test004
    │       ├── Test004_gt
    │       ├── Test005
    │       ├── Test005_gt
    │       ├── Test006
    │       ├── Test006_gt
    │       ├── Test007
    │       ├── Test007_gt
    │       ├── Test008
    │       ├── Test008_gt
    │       ├── Test009
    │       ├── Test009_gt
    │       ├── Test010
    │       ├── Test010_gt
    │       ├── Test011
    │       ├── Test011_gt
    │       ├── Test012
    │       └── Test012_gt
    └── training
        ├── flows
        │   ├── Train001
        │   ├── Train002
        │   ├── Train003
        │   ├── Train004
        │   ├── Train005
        │   ├── Train006
        │   ├── Train007
        │   ├── Train008
        │   ├── Train009
        │   ├── Train010
        │   ├── Train011
        │   ├── Train012
        │   ├── Train013
        │   ├── Train014
        │   ├── Train015
        │   └── Train016
        └── frames
            ├── Train001
            ├── Train002
            ├── Train003
            ├── Train004
            ├── Train005
            ├── Train006
            ├── Train007
            ├── Train008
            ├── Train009
            ├── Train010
            ├── Train011
            ├── Train012
            ├── Train013
            ├── Train014
            ├── Train015
            └── Train016
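A quick sanity check on the estimated flows can look like the following. It assumes each flow is a per-frame array of shape [H, W, 2] holding the horizontal and vertical displacement of every pixel (an assumption about the storage format); a synthetic flow field stands in for a real one.

```python
import numpy as np

# A toy 4x4 flow field: every pixel moves 3 px right and 4 px down.
flow = np.zeros((4, 4, 2), dtype=np.float32)
flow[..., 0] = 3.0  # horizontal displacement (dx)
flow[..., 1] = 4.0  # vertical displacement (dy)

# Per-pixel motion magnitude: sqrt(dx^2 + dy^2).
magnitude = np.linalg.norm(flow, axis=-1)
max_mag = float(magnitude.max())
```

A magnitude map like this is a convenient way to verify that the flows look plausible before moving on to cube extraction.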

3. Prefetching spatial-temporal cubes

For every object extracted above, we can construct a spatial-temporal cube (STC). For example, suppose we extract a single bbox in the $i$-th frame; we can then crop the same region from frames $(i-4), (i-3), (i-2), (i-1), i$ using the coordinates of that bbox, resulting in an STC of shape [5,3,H,W]. The same applies to the optical flows.
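The cropping step can be sketched as follows; the helper name and the frame layout [T, H, W, 3] are illustrative choices, not the project's actual implementation.

```python
import numpy as np

def build_stc(frames, bbox):
    """Crop the same bbox from a stack of consecutive frames.

    frames: array of shape [T, H, W, 3]; bbox: (x1, y1, x2, y2).
    Returns a cube of shape [T, 3, h, w].
    """
    x1, y1, x2, y2 = bbox
    crops = frames[:, y1:y2, x1:x2, :]        # [T, h, w, 3]
    return np.transpose(crops, (0, 3, 1, 2))  # [T, 3, h, w]

# 5 consecutive frames (i-4 .. i) and one box detected in frame i.
frames = np.random.rand(5, 240, 360, 3).astype(np.float32)
stc = build_stc(frames, (100, 50, 132, 82))  # a 32x32 region
```

In practice the crops are typically also resized to a fixed resolution so that cubes from differently sized boxes can be batched together.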

To extract all the STCs in the dataset, run the following command:

$ python extract_samples.py [--proj_root] [--dataset_name] [--mode] 

E.g., to extract samples from all training data:

$ python extract_samples.py --proj_root=<path/to/project_root> --dataset_name=ped2 --mode=train

To extract samples from all test data:

$ python extract_samples.py --proj_root=<path/to/project_root> --dataset_name=ped2 --mode=test

Note that the number of extracted samples will be very large for the Avenue and ShanghaiTech datasets, so we save the samples as chunked files. The maximum number of samples in a single chunked file is 100K by default; feel free to modify that at #Line11 here, depending on the available memory and disk space of your machine.
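The chunked-saving scheme can be sketched like this; the function name and the use of plain pickle files are illustrative assumptions, chosen to mirror the chunked_samples_XX.pkl naming seen in the directory tree below.

```python
import os
import pickle
import tempfile

def save_in_chunks(samples, out_dir, chunk_size):
    """Write samples to chunked_samples_XX.pkl files, chunk_size per file."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for idx in range(0, len(samples), chunk_size):
        path = os.path.join(
            out_dir, f"chunked_samples_{idx // chunk_size:02d}.pkl"
        )
        with open(path, "wb") as f:
            pickle.dump(samples[idx:idx + chunk_size], f)
        paths.append(path)
    return paths

# 250 toy samples with a chunk size of 100 -> 3 files (100 + 100 + 50).
out_dir = tempfile.mkdtemp()
paths = save_in_chunks(list(range(250)), out_dir, chunk_size=100)
```

Capping the per-file sample count keeps each chunk small enough to load fully into memory during training.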

Given the first 4 frames and the corresponding flows as input, the model is encouraged to predict the final frame.


After finishing the steps above, your dataset file structure should be similar to the following:

./data
└── ped2
    ├── ground_truth_demo
    │   └── gt_label.json
    ├── ped2_bboxes_test.npy
    ├── ped2_bboxes_train.npy
    ├── testing
    │   ├── chunked_samples
    │   │   └── chunked_samples_00.pkl
    │   ├── flows
    │   │   ├── Test001
    │   │   ├── Test002
    │   │   ├── Test003
    │   │   ├── Test004
    │   │   ├── Test005
    │   │   ├── Test006
    │   │   ├── Test007
    │   │   ├── Test008
    │   │   ├── Test009
    │   │   ├── Test010
    │   │   ├── Test011
    │   │   └── Test012
    │   └── frames
    │       ├── Test001
    │       ├── Test001_gt
    │       ├── Test002
    │       ├── Test002_gt
    │       ├── Test003
    │       ├── Test003_gt
    │       ├── Test004
    │       ├── Test004_gt
    │       ├── Test005
    │       ├── Test005_gt
    │       ├── Test006
    │       ├── Test006_gt
    │       ├── Test007
    │       ├── Test007_gt
    │       ├── Test008
    │       ├── Test008_gt
    │       ├── Test009
    │       ├── Test009_gt
    │       ├── Test010
    │       ├── Test010_gt
    │       ├── Test011
    │       ├── Test011_gt
    │       ├── Test012
    │       └── Test012_gt
    └── training
        ├── chunked_samples
        │   └── chunked_samples_00.pkl
        ├── flows
        │   ├── Train001
        │   ├── Train002
        │   ├── Train003
        │   ├── Train004
        │   ├── Train005
        │   ├── Train006
        │   ├── Train007
        │   ├── Train008
        │   ├── Train009
        │   ├── Train010
        │   ├── Train011
        │   ├── Train012
        │   ├── Train013
        │   ├── Train014
        │   ├── Train015
        │   └── Train016
        └── frames
            ├── Train001
            ├── Train002
            ├── Train003
            ├── Train004
            ├── Train005
            ├── Train006
            ├── Train007
            ├── Train008
            ├── Train009
            ├── Train010
            ├── Train011
            ├── Train012
            ├── Train013
            ├── Train014
            ├── Train015
            └── Train016

The steps above also support the Avenue and ShanghaiTech datasets (note that they may consume a large amount of disk space).