---
license: apache-2.0
tags:
- vision
pipeline_tag: depth-estimation
widget:
- inference: false
---

# Depth Anything (large-sized model, Transformers version)

The Depth Anything model was introduced in the paper [Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data](https://arxiv.org/abs/2401.10891) by Lihe Yang et al. and first released in [this repository](https://github.com/LiheYoung/Depth-Anything).

An [online demo](https://huggingface.co/spaces/LiheYoung/Depth-Anything) is also available.

Disclaimer: The team releasing Depth Anything did not write a model card for this model, so this model card has been written by the Hugging Face team.

## Model description

Depth Anything leverages the [DPT](https://huggingface.co/docs/transformers/model_doc/dpt) architecture with a [DINOv2](https://huggingface.co/docs/transformers/model_doc/dinov2) backbone.

The model is trained on ~62 million images, obtaining state-of-the-art results for both relative and absolute depth estimation.

<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/model_doc/depth_anything_overview.jpg"
alt="drawing" width="600"/>

<small> Depth Anything overview. Taken from the <a href="https://arxiv.org/abs/2401.10891">original paper</a>.</small>

## Intended uses & limitations

You can use the raw model for tasks like zero-shot depth estimation. See the [model hub](https://huggingface.co/models?search=depth-anything) for other versions of the model.

### How to use

Here is how to use this model to perform zero-shot depth estimation:

```python
from transformers import pipeline
from PIL import Image
import requests

# load pipe
pipe = pipeline(task="depth-estimation", model="LiheYoung/depth-anything-large-hf")

# load image
url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
image = Image.open(requests.get(url, stream=True).raw)

# inference
depth = pipe(image)["depth"]
```
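
The `"depth"` entry returned by the pipeline is a PIL image, so one quick way to inspect a result is to paste it next to the input. The sketch below is only an illustration: `side_by_side` is a hypothetical helper, not part of Transformers, and the two solid-color images are synthetic stand-ins for `image` and `pipe(image)["depth"]` so that the snippet runs without downloading the model.

```python
from PIL import Image


def side_by_side(image: Image.Image, depth: Image.Image) -> Image.Image:
    """Return a new RGB canvas with `image` on the left and `depth` on the right."""
    left = image.convert("RGB")
    # Resize defensively in case the depth map does not match the input size.
    right = depth.convert("RGB").resize(left.size)
    canvas = Image.new("RGB", (left.width * 2, left.height))
    canvas.paste(left, (0, 0))
    canvas.paste(right, (left.width, 0))
    return canvas


# Synthetic stand-ins for `image` and `pipe(image)["depth"]` (assumption for the demo).
demo_image = Image.new("RGB", (64, 48), color=(200, 30, 30))
demo_depth = Image.new("L", (64, 48), color=128)
preview = side_by_side(demo_image, demo_depth)
```

With real data, `side_by_side(image, depth).save("preview.png")` would write the comparison to disk; the file name is arbitrary.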

Alternatively, you can use the model and image processor classes directly:

```python
from transformers import AutoImageProcessor, AutoModelForDepthEstimation
import torch
from PIL import Image
import requests

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

image_processor = AutoImageProcessor.from_pretrained("LiheYoung/depth-anything-large-hf")
model = AutoModelForDepthEstimation.from_pretrained("LiheYoung/depth-anything-large-hf")

# prepare image for the model
inputs = image_processor(images=image, return_tensors="pt")

# run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
    predicted_depth = outputs.predicted_depth

# interpolate to original size
# (PIL reports size as (width, height); interpolate expects (height, width))
prediction = torch.nn.functional.interpolate(
    predicted_depth.unsqueeze(1),
    size=image.size[::-1],
    mode="bicubic",
    align_corners=False,
)
```
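
The interpolated `prediction` is a float tensor, so it usually needs rescaling before it can be viewed or saved as an image. A minimal sketch of one way to do that follows, assuming min-max scaling to 8-bit grayscale is acceptable for visualization; `depth_to_image` is a hypothetical helper, and the small array is a synthetic stand-in for `prediction.squeeze().cpu().numpy()`.

```python
import numpy as np
from PIL import Image


def depth_to_image(depth: np.ndarray) -> Image.Image:
    """Min-max scale a 2-D depth array to 0..255 and return an 8-bit grayscale image."""
    depth = np.asarray(depth, dtype=np.float32)
    lo, hi = float(depth.min()), float(depth.max())
    if hi > lo:
        scaled = (depth - lo) / (hi - lo)
    else:
        # A constant map has no contrast to stretch; render it as black.
        scaled = np.zeros_like(depth)
    return Image.fromarray((scaled * 255.0).round().astype(np.uint8))


# Synthetic stand-in for the model output (assumption); shape is (height, width).
fake_depth = np.linspace(0.0, 10.0, num=12, dtype=np.float32).reshape(3, 4)
depth_image = depth_to_image(fake_depth)
```

With the real output, the call would be `depth_to_image(prediction.squeeze().cpu().numpy())`. Because the scaling is relative to each map's own minimum and maximum, the resulting gray levels are not comparable across images.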

For more code examples, see the [documentation](https://huggingface.co/transformers/main/model_doc/depth_anything.html#).

### BibTeX entry and citation info

```bibtex
@misc{yang2024depth,
  title={Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data},
  author={Lihe Yang and Bingyi Kang and Zilong Huang and Xiaogang Xu and Jiashi Feng and Hengshuang Zhao},
  year={2024},
  eprint={2401.10891},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```