cvrs-ys801/ApDepth-G

ApDepth: Aiming for Precise Monocular Depth Estimation Based on Diffusion Models

This repository is based on Marigold (CVPR 2024 Oral, Best Paper Award Candidate): Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation


Jiawei Wang, Shuai Yuan, Mingbo Lei


Note

This project follows the same training methodology as ApDepth and serves as an extension of its content. It is provided for reference only.

🛠️ Setup

The model was trained on:

  • Ubuntu 22.04 LTS, Python 3.12.9, CUDA 11.8, NVIDIA RTX 6000 Ada Generation

The inference code was tested on:

  • Ubuntu 22.04 LTS, Python 3.12.9, CUDA 11.8, GeForce RTX 4090

🪧 A Note for Windows users

We recommend running the code in WSL2:

  1. Install WSL following the installation guide.
  2. Install CUDA support for WSL following the installation guide.
  3. Find your drives in /mnt/<drive letter>/; check WSL FAQ for more details. Navigate to the working directory of choice.

📦 Repository

Clone the repository (requires git):

git clone https://github.com/cvrs-ys801/ApDepth-G.git
cd ApDepth-G

💻 Dependencies

Using Conda, create an environment and install the dependencies into it:

conda create -n apdepth python==3.12.9
conda activate apdepth
pip install -r requirements.txt

Keep the environment activated before running the inference script. Activate the environment again after restarting the terminal session.

🏃 Testing on your images

📷 Prepare images

  • Use the example images provided under input, or place your own images in a subfolder, for example input/test-image, and pass that folder to the inference command below.

🎮 Run inference with paper setting

This setting corresponds to our paper. For academic comparison, please run with this setting.

python run.py \
    --checkpoint prs-eth/marigold-v1-0 \
    --ensemble_size 1 \
    --input_rgb_dir input/in-the-wild_example \
    --output_dir output/in-the-wild_example

You can find all results in output/in-the-wild_example. Enjoy!
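
To process several image folders with the same settings, the command above can be wrapped in a small helper. The following is a minimal sketch, not part of the repository: `build_command` is a hypothetical name, and it uses only the flags shown above. It assumes it is run from the repository root, where run.py lives.

```python
import sys
from pathlib import Path


def build_command(input_dir: Path, output_root: Path,
                  checkpoint: str = "prs-eth/marigold-v1-0",
                  ensemble_size: int = 1) -> list[str]:
    """Assemble the run.py call shown above for one input folder (hypothetical helper)."""
    return [
        sys.executable, "run.py",
        "--checkpoint", checkpoint,
        "--ensemble_size", str(ensemble_size),
        "--input_rgb_dir", str(input_dir),
        # Results for each input folder go to a folder of the same name under output_root.
        "--output_dir", str(output_root / input_dir.name),
    ]


if __name__ == "__main__":
    # Print one command per sub-folder of input/ (nothing is printed if input/ is missing).
    for folder in sorted(p for p in Path("input").glob("*") if p.is_dir()):
        print(" ".join(build_command(folder, Path("output"))))
```

Each printed command can be pasted into the shell, or the list returned by `build_command` can be passed to `subprocess.run(cmd, check=True)` to run inference directly.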

⚙️ Inference settings

The default settings are optimized for the best result. However, the behavior of the code can be customized:

  • Trade-off between accuracy and speed (larger values give better accuracy at the cost of slower inference):

    • --ensemble_size: Number of inference passes in the ensemble. For LCM, ensemble_size is more important than denoise_steps. Default: 10.
  • By default, the inference script resizes input images to the processing resolution, and then resizes the prediction back to the original resolution. This gives the best quality, as Stable Diffusion, from which Marigold is derived, performs best at 768x768 resolution.

    • --processing_res: the processing resolution; set to 0 to process at the input resolution directly. When unassigned (None), the default is read from the model config. Default: None.
    • --output_processing_res: produce output at the processing resolution instead of upsampling it to the input resolution. Default: False.
    • --resample_method: the resampling method used to resize images and depth predictions. This can be one of bilinear, bicubic, or nearest. Default: bilinear.
  • --half_precision or --fp16: Run in half precision (16-bit float) for faster inference and lower VRAM usage, possibly at the cost of slightly worse results.

  • --seed: Random seed can be set to ensure additional reproducibility. Default: None (unseeded). Note: forcing --batch_size 1 helps to increase reproducibility. To ensure full reproducibility, deterministic mode needs to be used.

  • --batch_size: Batch size of repeated inference. Default: 0 (best value determined automatically).

  • --color_map: Colormap used to colorize the depth prediction. Default: Spectral. Set to None to skip colored depth map generation.

  • --apple_silicon: Use Apple Silicon MPS acceleration.
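
To make the effect of --processing_res concrete, here is a minimal sketch of the size computation. It assumes that the longer image edge is scaled to the processing resolution with the aspect ratio preserved, and that 0 keeps the input size. `processing_size` is a hypothetical helper and the rounding is a choice made for this sketch; the repository's actual resizing code may differ in detail.

```python
def processing_size(width: int, height: int, processing_res: int) -> tuple[int, int]:
    """Return the (width, height) an image would be resized to before inference.

    Assumption: the longer edge is scaled to `processing_res` and the aspect
    ratio is preserved; 0 means "keep the input resolution".
    """
    if processing_res < 0:
        raise ValueError("processing_res must be non-negative")
    if processing_res == 0:
        return width, height
    scale = processing_res / max(width, height)
    # Rounding is a choice made for this sketch; never return a zero-sized edge.
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    original = (1920, 1080)
    resized = processing_size(*original, processing_res=768)
    print(f"input {original} -> processed at {resized} -> prediction resized back to {original}")
```

With --output_processing_res, the final resize back to the input size would be skipped and the output would stay at the processed size.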

⬇ Checkpoint cache

By default, the checkpoint is stored in the Hugging Face cache. The HF_HOME environment variable defines its location and can be overridden, e.g.:

export HF_HOME=$(pwd)/cache
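
To check where the checkpoint will land before downloading, the lookup can be approximated in Python. This is a simplified sketch: `hf_hub_cache` is a hypothetical helper that considers only HF_HOME, with the usual default of ~/.cache/huggingface and the hub cache in its hub subfolder. Other variables honored by huggingface_hub, such as HF_HUB_CACHE or XDG_CACHE_HOME, are ignored here.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def hf_hub_cache(env: Optional[Mapping[str, str]] = None) -> Path:
    """Approximate the folder where downloaded checkpoints are cached.

    Simplification: only HF_HOME is considered (an assumption of this sketch).
    """
    env = os.environ if env is None else env
    default_home = Path.home() / ".cache" / "huggingface"
    hf_home = Path(env.get("HF_HOME", default_home))
    return hf_home / "hub"


if __name__ == "__main__":
    print(f"Checkpoints are expected under: {hf_hub_cache()}")
```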

Alternatively, use the following script to download the checkpoint weights locally:

bash script/download_weights.sh apdepth-G

At inference, specify the checkpoint path:

python run.py \
    --checkpoint checkpoint/marigold-v1-0 \
    --ensemble_size 10 \
    --input_rgb_dir input/in-the-wild_example \
    --output_dir output/in-the-wild_example

🦿 Evaluation on test datasets

Install additional dependencies:

pip install -r requirements+.txt -r requirements.txt

Set data directory variable (also needed in evaluation scripts) and download evaluation datasets into corresponding subfolders:

export BASE_DATA_DIR=<YOUR_DATA_DIR>  # Set target data directory

wget -r -np -nH --cut-dirs=4 -R "index.html*" -P ${BASE_DATA_DIR} https://share.phys.ethz.ch/~pf/bingkedata/marigold/evaluation_dataset/

Run inference and evaluation scripts, for example:

# Run inference
bash script/eval/11_infer_nyu.sh

# Evaluate predictions
bash script/eval/12_eval_nyu.sh

Note: although the seed has been set, the results might still be slightly different on different hardware.

🏋️ Training

Based on the previously created environment, install extended requirements:

pip install -r requirements++.txt -r requirements+.txt -r requirements.txt

Set environment parameters for the data directory:

export BASE_DATA_DIR=YOUR_DATA_DIR  # directory of training data
export BASE_CKPT_DIR=YOUR_CHECKPOINT_DIR  # directory of pretrained checkpoint

Download the Stable Diffusion v2 checkpoint into ${BASE_CKPT_DIR}.

Prepare the Hypersim and Virtual KITTI 2 datasets and save them into ${BASE_DATA_DIR}. Please refer to this README for Hypersim preprocessing.

Run the training script:

python train.py --config config/train_marigold.yaml --no_wandb

Resume from a checkpoint, e.g.:

python train.py --resume_run output/train_marigold/checkpoint/latest --no_wandb

Evaluating results

Only the U-Net is updated and saved during training. To use the inference pipeline with your training result, replace the unet folder in the Marigold checkpoint with the one from the training checkpoint output folder. Then follow the evaluation section above.
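
The folder swap can be scripted so that the original checkpoint stays untouched. The following is a minimal sketch: `swap_unet` is a hypothetical helper, and it assumes that both the base checkpoint and the training output contain a folder named unet. Adjust the paths to your own layout.

```python
import shutil
from pathlib import Path


def swap_unet(base_ckpt: Path, trained_ckpt: Path, out_dir: Path) -> Path:
    """Copy `base_ckpt` to `out_dir`, then replace its `unet` folder with the trained one."""
    trained_unet = trained_ckpt / "unet"
    if not trained_unet.is_dir():
        raise FileNotFoundError(f"no unet folder in {trained_ckpt}")
    # Work on a copy so the original checkpoint is not modified.
    # copytree raises FileExistsError if out_dir already exists.
    shutil.copytree(base_ckpt, out_dir)
    shutil.rmtree(out_dir / "unet", ignore_errors=True)
    shutil.copytree(trained_unet, out_dir / "unet")
    return out_dir


# Example call with paths taken from this README (the out_dir name is made up):
# swap_unet(Path("checkpoint/marigold-v1-0"),
#           Path("output/train_marigold/checkpoint/latest"),
#           Path("checkpoint/my-finetuned"))
```

The resulting folder can then be passed to run.py via --checkpoint.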

Important

Although random seeds have been set, the training result might be slightly different on different hardware. It's recommended to train without interruption.

✏️ Contributing

Please refer to the contributing instructions.

🤔 Troubleshooting

| Problem | Solution |
| (Windows) Invalid DOS bash script on WSL | Run dos2unix <script_name> to convert the script format |
| (Windows) Error on WSL: Could not load library libcudnn_cnn_infer.so.8. Error: libcuda.so: cannot open shared object file: No such file or directory | Run export LD_LIBRARY_PATH=/usr/lib/wsl/lib:$LD_LIBRARY_PATH |
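
If dos2unix is not installed, the same line-ending conversion can be sketched in Python. `to_unix_line_endings` is a hypothetical helper that rewrites a script in place, replacing Windows CRLF line endings with LF.

```python
from pathlib import Path


def to_unix_line_endings(script: Path) -> bool:
    """Rewrite `script` with LF line endings; return True if the file changed."""
    data = script.read_bytes()
    converted = data.replace(b"\r\n", b"\n")
    if converted == data:
        return False
    script.write_bytes(converted)
    return True


# Example: to_unix_line_endings(Path("script/eval/11_infer_nyu.sh"))
```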

🎫 License

This work is licensed under the Apache License, Version 2.0 (as defined in the LICENSE).

By downloading and using the code and model you agree to the terms in the LICENSE.

About

Based on ApDepth, we present ApDepth-G. It adopts the multi-step denoising inference approach of diffusion models while simultaneously resolving the pseudo-texture issue in Marigold.
