This project uses the pre-trained PIFu as the base model: it takes one or more images as input and outputs a shape represented by a signed-distance function (SDF). One can sample the field on a voxel grid and run the marching cubes algorithm to generate a mesh.
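The voxel-sampling-plus-marching-cubes step can be sketched as follows. A sphere SDF stands in for the network output, and `scikit-image` supplies the marching cubes implementation; everything except the `measure.marching_cubes` call is an illustrative assumption, not code from this repo.

```python
import numpy as np
from skimage import measure  # provides a marching cubes implementation

def sdf_sphere(pts, radius=0.4):
    # Stand-in for the network's predicted SDF: a sphere at the origin,
    # negative inside, positive outside.
    return np.linalg.norm(pts, axis=-1) - radius

# Sample the implicit field on a regular voxel grid covering [-0.5, 0.5]^3.
n = 64
axis = np.linspace(-0.5, 0.5, n)
grid = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)
sdf = sdf_sphere(grid)  # shape (n, n, n)

# Extract the zero level set as a triangle mesh.
verts, faces, normals, _ = measure.marching_cubes(sdf, level=0.0)
print(verts.shape, faces.shape)
```

The returned vertices are in voxel-index coordinates, so a real pipeline would rescale them back into the grid's world-space bounds before writing an OBJ file.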
The video2mesh tool adds the following features on top of PIFu:
- Automatically extract video frames as PNG files whose alpha (4th) channel carries the mask (0: background, 255: foreground)
- Use MobileNetV3 to segment the human from the background (outperforms the traditional KNN and MOG2 methods)
- Apply a 3x3 sharpening kernel to video frames
- Use DeFMO for motion deblurring on the test video (failed)
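The 3x3 sharpening step can be sketched as a plain convolution. This numpy version mirrors what `cv2.filter2D` would compute with the classic sharpen kernel; the kernel choice is an assumption, not necessarily the one the tool ships with.

```python
import numpy as np

# Classic 3x3 sharpening kernel (assumed; the repo's kernel may differ).
KERNEL = np.array([[ 0, -1,  0],
                   [-1,  5, -1],
                   [ 0, -1,  0]], dtype=np.float32)

def sharpen(gray):
    """Convolve a single-channel uint8 image with the 3x3 kernel (edge padding)."""
    padded = np.pad(gray.astype(np.float32), 1, mode="edge")
    out = np.zeros(gray.shape, dtype=np.float32)
    h, w = gray.shape
    for dy in range(3):
        for dx in range(3):
            out += KERNEL[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return np.clip(out, 0, 255).astype(np.uint8)
```

Because the kernel sums to 1, flat regions pass through unchanged while edges get amplified, which is why sharpening before segmentation can help the mask hug the silhouette.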
Dependencies:
- OpenCV for image sharpening, KNN and MOG2 background segmentation, and video encoding/decoding
- MediaPipe for MobileNetV3 background segmentation
- natsort for image file sorting
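natsort matters here because frame files like `frame_2.png` must sort before `frame_10.png`, which plain lexicographic sorting gets wrong. A minimal stdlib equivalent of natsort's `natsorted` looks like this (a sketch of the idea, not the library's actual implementation):

```python
import re

def natural_key(name):
    # Split the name into digit and non-digit runs; compare digit runs numerically.
    return [int(tok) if tok.isdigit() else tok.lower()
            for tok in re.split(r"(\d+)", name)]

frames = ["frame_10.png", "frame_2.png", "frame_1.png"]
print(sorted(frames, key=natural_key))
# -> ['frame_1.png', 'frame_2.png', 'frame_10.png']
```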
Usage:
- Put videos into the `sample_videos` folder (more than one is accepted)
- Run `./scripts/make_test_set.sh` to generate test PNGs (one mask and one cropped image per frame) in `sample_images`
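The per-frame output format (mask stored in the alpha channel, 0 = background, 255 = foreground) can be sketched like this; the helper name is hypothetical, and `cv2.imwrite` on the resulting 4-channel array would produce the PNG:

```python
import numpy as np

def pack_rgba(frame_bgr, mask):
    """Attach a binary mask as the 4th (alpha) channel of a frame.

    frame_bgr: (H, W, 3) uint8 image
    mask:      (H, W) bool or uint8, nonzero = foreground
    """
    alpha = np.where(mask.astype(bool), 255, 0).astype(np.uint8)
    return np.dstack([frame_bgr, alpha])  # (H, W, 4)

frame = np.zeros((4, 4, 3), dtype=np.uint8)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True  # a small foreground square
rgba = pack_rgba(frame, mask)
print(rgba.shape)  # -> (4, 4, 4)
```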
Feel free to modify the script:

```
python ./apps/video_to_png.py \
    --input_folder_path [path to folder containing videos] \ # default="./sample_videos"
    --output_folder_path [path to output folder] \           # default="./sample_images"
    --algo ["MobileNetV3", "KNN", "MOG2"] \                  # default="MobileNetV3"
    --sharpening \                                           # enable sharpening, default=False
    --play_video                                             # play frames while processing, default=False
```
- Run `./scripts/test_video.sh` to generate an OBJ file frame by frame.
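Since `./scripts/test_video.sh` leaves one OBJ mesh per frame, a tiny stdlib parser like this (a hypothetical helper, not part of the repo) is handy for sanity-checking the outputs:

```python
def read_obj(text):
    """Parse the vertex and face lines of a Wavefront OBJ string."""
    verts, faces = [], []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "v":
            verts.append(tuple(float(x) for x in parts[1:4]))
        elif parts[0] == "f":
            # OBJ faces are 1-indexed; keep only the vertex index of each corner.
            faces.append(tuple(int(p.split("/")[0]) - 1 for p in parts[1:]))
    return verts, faces

sample = "v 0 0 0\nv 1 0 0\nv 0 1 0\nf 1 2 3\n"
verts, faces = read_obj(sample)
print(len(verts), len(faces))  # -> 3 1
```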
Others:
- Run `./scripts/make_video.sh` to convert masked PNGs into an MP4 video
- Run `./scripts/display.sh` to generate a video of a rotating OBJ file
Please find the comparison and discussion here.
Please find all predicted mesh files here.


