---
library_name: transformers
license: mit
pipeline_tag: depth-estimation
arxiv: <2502.19204>
tags:
- distill-any-depth
- vision
---

# Distill Any Depth Large - Transformers Version

## Introduction
We present Distill-Any-Depth, a new state-of-the-art monocular depth estimation model trained with our proposed knowledge distillation algorithms. It was introduced in the paper [Distill Any Depth: Distillation Creates a Stronger Monocular Depth Estimator](http://arxiv.org/abs/2502.19204).

This checkpoint is compatible with the `transformers` library.

An [online demo](https://huggingface.co/spaces/xingyang1/Distill-Any-Depth) is also available.

### How to use

Here is how to use this model to perform zero-shot depth estimation:

```python
from transformers import pipeline
from PIL import Image
import requests

# load the depth-estimation pipeline
pipe = pipeline(task="depth-estimation", model="xingyang1/Distill-Any-Depth-Large-hf")

# load an example image
url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
image = Image.open(requests.get(url, stream=True).raw)

# run inference; "depth" holds the prediction as a PIL image
depth = pipe(image)["depth"]
```
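
To check a prediction visually, it can help to place the input and its depth map next to each other. The sketch below is one way to do that with Pillow alone; `side_by_side` is a hypothetical helper name, not part of `transformers`, and the solid-colour stand-in images are assumptions so the example runs without downloading the model.

```python
from PIL import Image


def side_by_side(image: Image.Image, depth: Image.Image) -> Image.Image:
    """Paste an RGB image and its depth map next to each other on one canvas."""
    # Match the depth map to the input resolution in case the two differ.
    if depth.size != image.size:
        depth = depth.resize(image.size)
    canvas = Image.new("RGB", (image.width * 2, image.height))
    canvas.paste(image.convert("RGB"), (0, 0))
    # Convert the single-channel depth map to RGB so it can share the canvas.
    canvas.paste(depth.convert("RGB"), (image.width, 0))
    return canvas


# Stand-in inputs (assumed for illustration); in practice, pass the
# `image` and `depth` objects produced by the pipeline snippet above.
demo_image = Image.new("RGB", (64, 48), color=(200, 30, 30))
demo_depth = Image.new("L", (32, 24), color=128)
comparison = side_by_side(demo_image, demo_depth)
```

The result is an ordinary PIL image, so `comparison.save("comparison.png")` or `comparison.show()` should work as usual.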

Alternatively, you can use the model and processor classes:

```python
from transformers import AutoImageProcessor, AutoModelForDepthEstimation
import torch
from PIL import Image
import requests

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

image_processor = AutoImageProcessor.from_pretrained("xingyang1/Distill-Any-Depth-Large-hf")
model = AutoModelForDepthEstimation.from_pretrained("xingyang1/Distill-Any-Depth-Large-hf")

# prepare image for the model
inputs = image_processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# interpolate to original size and visualize the prediction
post_processed_output = image_processor.post_process_depth_estimation(
    outputs,
    target_sizes=[(image.height, image.width)],
)

predicted_depth = post_processed_output[0]["predicted_depth"]
# min-max normalize to [0, 1], then scale to 8-bit for visualization
depth = (predicted_depth - predicted_depth.min()) / (predicted_depth.max() - predicted_depth.min())
depth = depth.detach().cpu().numpy() * 255
depth = Image.fromarray(depth.astype("uint8"))
```
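
The min-max normalization in the last step divides by `max - min`, which is zero for a completely flat prediction. The sketch below shows one guarded variant in NumPy; `depth_to_uint8` and its `eps` parameter are hypothetical names introduced here, and returning zeros for a flat map is an assumed convention rather than anything the model or library prescribes. It also rounds to the nearest integer, whereas the snippet above truncates.

```python
import numpy as np


def depth_to_uint8(depth: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Min-max scale a depth map to the 0-255 range as uint8."""
    depth = np.asarray(depth, dtype=np.float32)
    span = float(depth.max() - depth.min())
    if span < eps:
        # A flat map has no contrast; return zeros instead of dividing by zero.
        return np.zeros(depth.shape, dtype=np.uint8)
    scaled = (depth - depth.min()) / span
    return np.round(scaled * 255).astype(np.uint8)


# Small synthetic arrays (assumed for illustration) instead of a real prediction.
demo = depth_to_uint8(np.array([[0.5, 1.0], [1.5, 2.5]]))
flat = depth_to_uint8(np.full((2, 2), 3.0))
```

With a real prediction, something like `depth_to_uint8(predicted_depth.detach().cpu().numpy())` could replace the manual normalization before calling `Image.fromarray`.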

If you find this project useful, please consider citing:

```bibtex
@article{he2025distill,
  title   = {Distill Any Depth: Distillation Creates a Stronger Monocular Depth Estimator},
  author  = {Xiankang He and Dongyan Guo and Hongji Li and Ruibo Li and Ying Cui and Chi Zhang},
  year    = {2025},
  journal = {arXiv preprint arXiv: 2502.19204}
}
```

## Model Card Author
[Parteek Kamboj](https://huggingface.co/keetrap)