This project is part of the coursework for CSCE 689 - Advanced Deep Learning. It uses deep learning techniques to detect a target action, walking, in a given video.
Use the package manager pip or conda to install the necessary packages.
The data is stored here: HMDB, NIXMAS. The drive contains the training videos, the frames extracted from them, the test samples, the landmarks generated with OpenPose, and JSON files.
- If using Google Colab, mount the drive and use the files from there. Use Colab with a GPU runtime.
- If running locally or on a cluster, download the data and update the data locations in the code accordingly. Use a GPU to train the models.
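Because the data location differs between Colab and a local or cluster run, it can help to resolve it in one place. The sketch below is an assumption, not code from this repository: `resolve_data_root` and the `DATA_ROOT` environment variable are hypothetical names, and `/content/drive/MyDrive` is the usual Colab mount point for a personal drive.

```python
import os
import sys
from pathlib import Path


def resolve_data_root(local_default="./data"):
    """Return the folder that holds the project data (hypothetical helper)."""
    # Colab notebooks have google.colab loaded before any user code runs.
    if "google.colab" in sys.modules:
        from google.colab import drive

        drive.mount("/content/drive")
        return Path("/content/drive/MyDrive")
    # Local or cluster run: read an assumed DATA_ROOT variable, else a default.
    return Path(os.environ.get("DATA_ROOT", local_default))
```

The returned path can then be passed to whatever loading code needs it, so the data location is edited in one place only.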
- data_loader.py - contains functions to load the training and test data and to preprocess the images.
- model.py - the models used in the project: 3DCNN, CRNN, and ResNet + RNN.
- videoToFrame.py - converts the training videos to frames. The training videos are in the drive folder.

      python ./videoToFrame.py
- splitVideo.sh - splits the test videos into multiple segments for prediction.

      ./splitVideo.sh --<video_file>
- openpose.ipynb - generates landmarks using OpenPose. Enter the correct location of the video files, then run the cells in the notebook.
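To illustrate what splitting a test video into segments involves, here is a minimal sketch that computes fixed-length windows and builds one `ffmpeg` command per window. It is not the content of splitVideo.sh: the function names, the `segments` output folder, and the window length are assumptions, and it presumes `ffmpeg` is installed and the video duration is already known.

```python
from pathlib import Path


def segment_windows(total_seconds, window_seconds):
    """Return (start, duration) pairs that cover the whole video."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    windows = []
    start = 0.0
    while start < total_seconds:
        # The last window is shortened so it does not run past the end.
        duration = float(min(window_seconds, total_seconds - start))
        windows.append((start, duration))
        start += window_seconds
    return windows


def ffmpeg_split_commands(video_path, total_seconds, window_seconds,
                          out_dir="segments"):
    """Build one ffmpeg argument list per segment (nothing is executed)."""
    video = Path(video_path)
    commands = []
    windows = segment_windows(total_seconds, window_seconds)
    for i, (start, duration) in enumerate(windows):
        out_file = Path(out_dir) / f"{video.stem}_{i:03d}{video.suffix}"
        # With "-c copy" the cut points snap to keyframes, so segment
        # boundaries are approximate.
        commands.append([
            "ffmpeg", "-ss", str(start), "-i", str(video),
            "-t", str(duration), "-c", "copy", str(out_file),
        ])
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once the output folder exists.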
To train a model, run its program:

    python ./<model_name>.py

The trained models are included in the drive link.
- Load the test files in Google Drive.
- Split the test videos into multiple samples using splitVideo.sh.
- Convert the videos to frames.
- Open prediction_code.ipynb in utility/. Enter the location of the test video frames and run the code. The prediction results are stored in .json format.
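As a rough illustration of the last step, the sketch below writes per-segment scores to a JSON file. The schema `{label: [[start_seconds, probability], ...]}` and the name `save_predictions` are assumptions made for this example; the notebook's actual output layout may differ.

```python
import json
from pathlib import Path


def save_predictions(segment_scores, out_path, label="walking"):
    """Write (start_seconds, probability) pairs to a JSON file.

    The layout {label: [[start, prob], ...]} is an assumed schema.
    """
    payload = {label: [[float(t), float(p)] for t, p in segment_scores]}
    out_path = Path(out_path)
    # Create the output folder if it does not exist yet.
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w") as f:
        json.dump(payload, f, indent=2)
    return payload
```

For example, `save_predictions([(0, 0.1), (4, 0.9)], "results/test_video.json")` would record one score for each of two segments.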