This repository covers environment setup and inference for the paper "SV3D: Novel Multi-view Synthesis and 3D Generation from a Single Image using Latent Video Diffusion."
The default installation method uses Conda for package and environment management:
conda create -n sv3d python=3.10
conda activate sv3d
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118 # cu118 targets CUDA 11.8; use the index URL that matches your CUDA version
pip3 install -r requirements/pt2.txt
pip3 install .
pip3 install -e git+https://github.com/Stability-AI/datapipelines.git@main#egg=sdata
To run the sv3d_u version, download its checkpoint and sample from a single input image:
pip install huggingface_hub
huggingface-cli login
huggingface-cli download stabilityai/sv3d sv3d_u.safetensors --local-dir checkpoints
python scripts/sampling/simple_video_sample.py --input_path <path/to/image.png> --version sv3d_u
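For scripted setups, the checkpoint download could also be done from Python. The following is a minimal sketch, assuming `huggingface_hub` is installed and that you have already authenticated with `huggingface-cli login`. The helper names `checkpoint_spec` and `download_checkpoint` are hypothetical and not part of this repository; only `hf_hub_download` comes from the `huggingface_hub` library.

```python
# Hypothetical helpers for illustration; not part of this repository.
CHECKPOINTS = {
    "sv3d_u": "sv3d_u.safetensors",
    "sv3d_p": "sv3d_p.safetensors",
}


def checkpoint_spec(version, local_dir="checkpoints"):
    """Build the download arguments that mirror the CLI command."""
    if version not in CHECKPOINTS:
        raise ValueError(f"unknown version: {version!r}")
    return {
        "repo_id": "stabilityai/sv3d",
        "filename": CHECKPOINTS[version],
        "local_dir": local_dir,
    }


def download_checkpoint(version, local_dir="checkpoints"):
    """Download one checkpoint and return the path of the local file."""
    # Imported lazily so this module loads even if huggingface_hub is missing.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(**checkpoint_spec(version, local_dir))
```

Calling `download_checkpoint("sv3d_u")` would then place `sv3d_u.safetensors` under `checkpoints`, as the CLI command does.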
To run the sv3d_p version, download its checkpoint:
pip install huggingface_hub
huggingface-cli login
huggingface-cli download stabilityai/sv3d sv3d_p.safetensors --local-dir checkpoints
Generate a static orbit at a specified elevation, e.g. 10.0:
python scripts/sampling/simple_video_sample.py \
--input_path <path/to/image.png> \
--version sv3d_p \
--elevations_deg 10.0
Generate a dynamic orbit at specified elevations and azimuths: pass a sequence of 21 elevations (in degrees, each in [-90, 90]) to elevations_deg, and a sequence of 21 azimuths (in degrees, each in [0, 360], sorted in ascending order from 0 to 360) to azimuths_deg.
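The two 21-value sequences for a dynamic orbit can be generated programmatically rather than typed by hand. The following is a minimal sketch, not code from this repository: the function names `dynamic_orbit` and `as_cli_list`, the sinusoidal elevation profile, and the default values are all illustrative assumptions. The azimuths start at 0 and ascend in equal steps, which is one way to satisfy the sorted-order requirement.

```python
import math

NUM_FRAMES = 21  # number of elevations and azimuths required per orbit


def dynamic_orbit(num_frames=NUM_FRAMES, base_elev=10.0, amplitude=5.0):
    """Return (elevations, azimuths) in degrees for an illustrative orbit."""
    step = 360.0 / num_frames
    # Evenly spaced azimuths starting at 0, ascending, all within [0, 360].
    azimuths = [round(step * i, 2) for i in range(num_frames)]
    # Assumed profile: elevation oscillates around base_elev over one turn.
    elevations = [
        round(base_elev + amplitude * math.sin(math.radians(a)), 2)
        for a in azimuths
    ]
    if not all(-90.0 <= e <= 90.0 for e in elevations):
        raise ValueError("elevations must lie in [-90, 90] degrees")
    return elevations, azimuths


def as_cli_list(values):
    """Format values as a bracketed, comma-separated list without spaces."""
    return "[" + ",".join(str(v) for v in values) + "]"


elevations, azimuths = dynamic_orbit()
print("--elevations_deg", as_cli_list(elevations))
print("--azimuths_deg", as_cli_list(azimuths))
```

The printed strings are meant to fill the two list placeholders in the sampling command. Depending on your shell, the bracketed lists may need to be quoted.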
python scripts/sampling/simple_video_sample.py --input_path <path/to/image.png> --version sv3d_p --elevations_deg [<list of 21 elevations in degrees>] --azimuths_deg [<list of 21 azimuths in degrees>]
@misc{voleti2024sv3dnovelmultiviewsynthesis,
title = {SV3D: Novel Multi-view Synthesis and 3D Generation from a Single Image using Latent Video Diffusion},
author = {Vikram Voleti and Chun-Han Yao and Mark Boss and Adam Letts and David Pankratz and Dmitry Tochilkin and Christian Laforte and Robin Rombach and Varun Jampani},
year = {2024},
eprint = {2403.12008},
archivePrefix= {arXiv},
primaryClass = {cs.CV},
url = {https://arxiv.org/abs/2403.12008}
}