GitHub - Py-Expo/SYNTAX-SQUAD

Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

News

2024-02-27: Depth Anything is accepted by CVPR 2024.
2024-02-05: Depth Anything Gallery is released. Thanks to all the users!
2024-02-02: Depth Anything serves as the default depth processor for InstantID and InvokeAI.
2024-01-25: Support video depth visualization. An online demo for video is also available.
2024-01-23: The new ControlNet based on Depth Anything is integrated into ControlNet WebUI and ComfyUI's ControlNet.
2024-01-23: Depth Anything ONNX and TensorRT versions are supported.
2024-01-22: Paper, project page, code, models, and demo (HuggingFace, OpenXLab) are released.

Features of Depth Anything

If you need other features, please first check the existing community support.

Relative depth estimation

Our foundation models listed here provide robust relative depth estimation for any given image. Please refer here for details.

Metric depth estimation

We fine-tune our Depth Anything model with metric depth information from NYUv2 or KITTI. It offers strong in-domain and zero-shot metric depth estimation. Please refer here for details.

Better depth-conditioned ControlNet

We re-train a better depth-conditioned ControlNet based on Depth Anything. It offers more precise synthesis than the previous MiDaS-based ControlNet. Please refer here for details. You can also use our new ControlNet based on Depth Anything in ControlNet WebUI or ComfyUI's ControlNet.

You can easily load our pre-trained models by:

from depth_anything.dpt import DepthAnything

encoder = 'vits' # can also be 'vitb' or 'vitl'
depth_anything = DepthAnything.from_pretrained('LiheYoung/depth_anything_{:}14'.format(encoder))

Depth Anything is also supported in transformers. You can use it for depth prediction within 3 lines of code (credit to @niels).

No network connection, cannot load these models?

First, manually download the three checkpoints: depth-anything-large, depth-anything-base, and depth-anything-small.
Second, upload the folder containing the checkpoints to your remote server.
Lastly, load the model locally:

import torch

from depth_anything.dpt import DepthAnything

# Architecture settings for each encoder size
model_configs = {
    'vitl': {'encoder': 'vitl', 'features': 256, 'out_channels': [256, 512, 1024, 1024]},
    'vitb': {'encoder': 'vitb', 'features': 128, 'out_channels': [96, 192, 384, 768]},
    'vits': {'encoder': 'vits', 'features': 64, 'out_channels': [48, 96, 192, 384]}
}

encoder = 'vitl' # or 'vitb', 'vits'
depth_anything = DepthAnything(model_configs[encoder])
# Load the weights from the manually downloaded checkpoint
depth_anything.load_state_dict(torch.load(f'./checkpoints/depth_anything_{encoder}14.pth'))

Note that when loading the model locally like this, you do not need to install the huggingface_hub package. In that case, feel free to delete the line that imports from huggingface_hub and the PyTorchModelHubMixin reference in the model definition.
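When working offline, it can help to check that the expected checkpoint file exists before building the model. The following is a minimal sketch, assuming the `./checkpoints/depth_anything_{encoder}14.pth` naming used above; `resolve_checkpoint` and `KNOWN_ENCODERS` are hypothetical names and are not part of the Depth Anything code base.

```python
from pathlib import Path

# Encoder names taken from the model_configs snippet above.
KNOWN_ENCODERS = ("vits", "vitb", "vitl")


def resolve_checkpoint(encoder: str, checkpoint_dir: str = "./checkpoints") -> Path:
    """Return the local checkpoint path for an encoder, or raise a clear error.

    Hypothetical helper for illustration only.
    """
    if encoder not in KNOWN_ENCODERS:
        raise ValueError(f"unknown encoder {encoder!r}; expected one of {KNOWN_ENCODERS}")
    path = Path(checkpoint_dir) / f"depth_anything_{encoder}14.pth"
    if not path.is_file():
        raise FileNotFoundError(f"checkpoint not found: {path}")
    return path
```

The returned path can then be passed to `torch.load`, for example with `map_location='cpu'` on a machine without a GPU.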

Usage

Installation

git clone https://github.com/LiheYoung/Depth-Anything
cd Depth-Anything
pip install -r requirements.txt

Running

python run.py --encoder <vits | vitb | vitl> --img-path <img-directory | single-img | txt-file> --outdir <outdir> [--pred-only] [--grayscale]

Arguments:

--img-path: can point to 1) a directory containing the images of interest, 2) a single image, or 3) a text file listing image paths.
--pred-only: save only the predicted depth map. By default, the image and its depth map are visualized side by side.
--grayscale: save the depth map in grayscale. By default, a color palette is applied to the depth map.
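The three input forms accepted by --img-path can be resolved into a flat list of image paths roughly as follows. This is a sketch of the idea rather than the exact logic in run.py; `collect_image_paths` is a hypothetical name, and skipping hidden files and sorting the directory listing are assumptions made here.

```python
import os
from typing import List


def collect_image_paths(img_path: str) -> List[str]:
    """Expand a directory, a single image, or a .txt listing into image paths."""
    if os.path.isfile(img_path):
        if img_path.endswith(".txt"):
            # Text file: one image path per line, blank lines ignored.
            with open(img_path) as f:
                return [line.strip() for line in f if line.strip()]
        # Any other file is treated as a single image.
        return [img_path]
    if os.path.isdir(img_path):
        # Directory: take every visible entry, in a stable order.
        names = sorted(n for n in os.listdir(img_path) if not n.startswith("."))
        return [os.path.join(img_path, n) for n in names]
    raise FileNotFoundError(f"--img-path does not exist: {img_path}")
```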

For example:

python run.py --encoder vitl --img-path assets/examples --outdir depth_vis
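What the --pred-only and --grayscale flags change in the saved output can be sketched as below. This is a simplified illustration, not the code in run.py: it assumes the depth map is already an H x W array at the image's resolution and that the image is a uint8 H x W x 3 array, it omits any margin or captions between the two panels, and the specific palette (`cv2.COLORMAP_INFERNO`) is an assumption. `depth_to_uint8` and `render` are hypothetical names.

```python
import numpy as np


def depth_to_uint8(depth) -> np.ndarray:
    """Min-max normalize a relative depth map of shape (H, W) to 0..255."""
    depth = np.asarray(depth, dtype=np.float32)
    lo, hi = float(depth.min()), float(depth.max())
    if hi == lo:  # constant map: avoid division by zero
        return np.zeros(depth.shape, dtype=np.uint8)
    return ((depth - lo) / (hi - lo) * 255.0).astype(np.uint8)


def render(image_bgr, depth, pred_only=False, grayscale=False) -> np.ndarray:
    """Build the array that would be written to the output directory."""
    depth_u8 = depth_to_uint8(depth)
    if grayscale:
        # Repeat the single channel so the result is still H x W x 3.
        depth_vis = np.repeat(depth_u8[..., np.newaxis], 3, axis=-1)
    else:
        import cv2  # only needed for the color palette
        depth_vis = cv2.applyColorMap(depth_u8, cv2.COLORMAP_INFERNO)
    if pred_only:
        return depth_vis
    # Default: image and depth map side by side.
    return np.hstack([image_bgr, depth_vis])
```

Because the output is relative depth, the min-max normalization is done per image; pixel values are therefore not comparable across images.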

If you want to use Depth Anything on videos:

python run_video.py --encoder vitl --video-path assets/examples_video --outdir video_depth_vis

Gradio demo

To use our gradio demo locally:

python app.py

You can also try our online demo.

Import Depth Anything to your project

If you want to use Depth Anything in your own project, you can simply follow run.py to load our models and define data pre-processing.

Code snippet (note the difference between our data pre-processing and that of MiDaS)

from depth_anything.dpt import DepthAnything
from depth_anything.util.transform import Resize, NormalizeImage, PrepareForNet

import cv2
import torch
from torchvision.transforms import Compose

encoder = 'vits' # can also be 'vitb' or 'vitl'
depth_anything = DepthAnything.from_pretrained('LiheYoung/depth_anything_{:}14'.format(encoder)).eval()

transform = Compose([
    Resize(
        width=518,
        height=518,
        resize_target=False,
        keep_aspect_ratio=True,
        ensure_multiple_of=14,
        resize_method='lower_bound',
        image_interpolation_method=cv2.INTER_CUBIC,
    ),
    NormalizeImage(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    PrepareForNet(),
])

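# Read the image as RGB and scale pixel values to [0, 1]
image = cv2.cvtColor(cv2.imread('your image path'), cv2.COLOR_BGR2RGB) / 255.0
image = transform({'image': image})['image']
image = torch.from_numpy(image).unsqueeze(0)

# depth shape: 1xHxW
with torch.no_grad():  # inference only, no gradients needed
    depth = depth_anything(image)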
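The Resize settings in the snippet above (`keep_aspect_ratio=True`, `resize_method='lower_bound'`, `ensure_multiple_of=14`) mean that the image is scaled so that both sides are at least 518 pixels, the aspect ratio is preserved, and each side is then snapped to a multiple of 14, presumably to match the patch size of the ViT encoders. The arithmetic can be sketched in plain Python as follows; this is an approximation written for illustration, and `constrain_to_multiple` and `lower_bound_size` are hypothetical names rather than functions from depth_anything.util.transform.

```python
import math
from typing import Tuple


def constrain_to_multiple(x: float, multiple: int, min_val: int) -> int:
    """Round x to the nearest multiple, rounding up if that falls below min_val."""
    y = round(x / multiple) * multiple
    if y < min_val:
        y = math.ceil(x / multiple) * multiple
    return int(y)


def lower_bound_size(width: int, height: int,
                     target: int = 518, multiple: int = 14) -> Tuple[int, int]:
    """Return the (width, height) the network would see for a given input size."""
    # Use the larger of the two scale factors so neither side ends up below target.
    scale = max(target / width, target / height)
    new_w = constrain_to_multiple(scale * width, multiple, target)
    new_h = constrain_to_multiple(scale * height, multiple, target)
    return new_w, new_h


print(lower_bound_size(640, 480))  # (686, 518)
```

Since the network input size generally differs from the original image size, the predicted depth usually needs to be resized back to the original resolution before it is overlaid on or compared with the image.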

Do not want to define image pre-processing or download model definition files?

Easily use Depth Anything through transformers within 3 lines of code! Please refer to these instructions (credit to @niels).

Note: If you encounter KeyError: 'depth_anything', please install the latest transformers from source:

pip install git+https://github.com/huggingface/transformers.git

A brief demo:

from transformers import pipeline
from PIL import Image

image = Image.open('Your-image-path')
pipe = pipeline(task="depth-estimation", model="LiheYoung/depth-anything-small-hf")
depth = pipe(image)["depth"]

Community Support

We sincerely appreciate all the extensions the community has built on Depth Anything. Thank you very much!

Here we list the extensions we have found:

Depth Anything TensorRT:
Depth Anything ONNX: https://github.com/fabio-sim/Depth-Anything-ONNX
Depth Anything in Transformers.js (3D visualization): https://huggingface.co/spaces/Xenova/depth-anything-web
Depth Anything for video (online demo): https://huggingface.co/spaces/JohanDL/Depth-Anything-Video
Depth Anything in ControlNet WebUI: https://github.com/Mikubill/sd-webui-controlnet
Depth Anything in ComfyUI's ControlNet: https://github.com/Fannovel16/comfyui_controlnet_aux
Depth Anything in X-AnyLabeling: https://github.com/CVHub520/X-AnyLabeling
Depth Anything in OpenXLab: https://openxlab.org.cn/apps/detail/yyfan/depth_anything
Depth Anything in OpenVINO: https://github.com/openvinotoolkit/openvino_notebooks/tree/main/notebooks/280-depth-anything
Depth Anything ROS:
- https://github.com/scepter914/DepthAnything-ROS
- https://github.com/polatztrk/depth_anything_ros
Depth Anything Android:
- https://github.com/FeiGeChuanShu/ncnn-android-depth_anything
- https://github.com/shubham0204/Depth-Anything-Android
Depth Anything in TouchDesigner: https://github.com/olegchomp/TDDepthAnything
LearnOpenCV research article on Depth Anything: https://learnopencv.com/depth-anything
