---
library_name: transformers
license: mit
pipeline_tag: depth-estimation
arxiv: <2502.19204>
tags:
- distill-any-depth
- vision
---

# Distill Any Depth Large - Transformers Version

## Introduction
We present Distill-Any-Depth, a new state-of-the-art (SOTA) monocular depth estimation model trained with our proposed knowledge distillation algorithms. It was introduced in the paper [Distill Any Depth: Distillation Creates a Stronger Monocular Depth Estimator](http://arxiv.org/abs/2502.19204).

This model checkpoint is compatible with the Transformers library.

[Online demo](https://huggingface.co/spaces/xingyang1/Distill-Any-Depth).

### How to use

Here is how to use this model to perform zero-shot depth estimation:

```python
from transformers import pipeline
from PIL import Image
import requests

# load pipe
pipe = pipeline(task="depth-estimation", model="xingyang1/Distill-Any-Depth-Large-hf")

# load image
url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
image = Image.open(requests.get(url, stream=True).raw)

# inference
depth = pipe(image)["depth"]
```

Alternatively, you can use the model and processor classes:

```python
from transformers import AutoImageProcessor, AutoModelForDepthEstimation
import torch
from PIL import Image
import requests

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

image_processor = AutoImageProcessor.from_pretrained("xingyang1/Distill-Any-Depth-Large-hf")
model = AutoModelForDepthEstimation.from_pretrained("xingyang1/Distill-Any-Depth-Large-hf")

# prepare image for the model
inputs = image_processor(images=image, return_tensors="pt")

# run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)

# interpolate to original size and visualize the prediction
post_processed_output = image_processor.post_process_depth_estimation(
    outputs,
    target_sizes=[(image.height, image.width)],
)

predicted_depth = post_processed_output[0]["predicted_depth"]
# min-max normalize to [0, 1], then scale to 8-bit for display
depth = (predicted_depth - predicted_depth.min()) / (predicted_depth.max() - predicted_depth.min())
depth = depth.detach().cpu().numpy() * 255
depth = Image.fromarray(depth.astype("uint8"))
```
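
The min-max step in the snippet above divides by `max - min`, which is zero when the predicted depth map is constant. A minimal sketch of a guarded version follows, assuming the depth has already been converted to a NumPy array; `depth_to_uint8` is a hypothetical helper name used for illustration and is not part of the Transformers library.

```python
import numpy as np


def depth_to_uint8(depth: np.ndarray) -> np.ndarray:
    """Min-max scale a depth map to the 0-255 range and return it as uint8.

    Hypothetical helper for illustration; not part of transformers.
    """
    depth = np.asarray(depth, dtype=np.float32)
    d_min, d_max = float(depth.min()), float(depth.max())
    span = d_max - d_min
    if span <= 0.0:
        # Constant map: return zeros instead of dividing by zero.
        return np.zeros(depth.shape, dtype=np.uint8)
    scaled = (depth - d_min) / span  # values now lie in [0, 1]
    return (scaled * 255.0).astype(np.uint8)
```

With the variables from the example above, `Image.fromarray(depth_to_uint8(predicted_depth.detach().cpu().numpy()))` should produce the same kind of 8-bit grayscale image.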

If you find this project useful, please consider citing:

```bibtex
@article{he2025distill,
  title   = {Distill Any Depth: Distillation Creates a Stronger Monocular Depth Estimator},
  author  = {Xiankang He and Dongyan Guo and Hongji Li and Ruibo Li and Ying Cui and Chi Zhang},
  year    = {2025},
  journal = {arXiv preprint arXiv: 2502.19204}
}
```

## Model Card Author
[Parteek Kamboj](https://huggingface.co/keetrap)