---
license: mit
tags:
- vision
pipeline_tag: depth-estimation
---

# ZoeDepth (fine-tuned on NYU and KITTI)

ZoeDepth model fine-tuned on the NYU and KITTI datasets. It was introduced in the paper [ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth](https://arxiv.org/abs/2302.12288) by Bhat et al. and first released in [this repository](https://github.com/isl-org/ZoeDepth).

ZoeDepth extends the [DPT](https://huggingface.co/docs/transformers/en/model_doc/dpt) framework for metric (also called absolute) depth estimation, obtaining state-of-the-art results.

Disclaimer: The team releasing ZoeDepth did not write a model card for this model, so this model card has been written by the Hugging Face team.

## Model description

ZoeDepth adapts [DPT](https://huggingface.co/docs/transformers/en/model_doc/dpt), a model for relative depth estimation, for so-called metric (also called absolute) depth estimation.

This means that the model is able to estimate depth in actual metric values.
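
The distinction matters in practice: a relative depth map is only defined up to an unknown scale (and shift), whereas a metric map can be read off in metres directly. A small illustrative sketch with made-up values (not actual model output):

```python
import numpy as np

# Hypothetical depth maps for a 2x2 image. A relative map only encodes
# the ordering of distances; a metric map commits to real-world units.
relative = np.array([[0.2, 0.4], [0.6, 0.8]])  # unitless
metric = np.array([[1.3, 2.6], [3.9, 5.2]])    # metres

# Rescaling a relative map by any positive constant preserves the ordering,
# so it carries no absolute scale information...
assert np.array_equal(np.argsort(relative, axis=None),
                      np.argsort(10.0 * relative, axis=None))

# ...while only the metric map lets us read off an actual distance.
print(f"farthest point is {metric.max():.1f} m away")
```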

<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/zoedepth_architecture_bis.png"
alt="drawing" width="600"/>

<small> ZoeDepth architecture. Taken from the <a href="https://arxiv.org/abs/2302.12288">original paper</a>. </small>

## Intended uses & limitations

You can use the raw model for tasks like zero-shot monocular depth estimation. See the [model hub](https://huggingface.co/models?search=Intel/zoedepth) for other versions that may interest you.

### How to use

The easiest way to use the model is with the pipeline API, which abstracts away the complexity for the user:

```python
from transformers import pipeline
from PIL import Image
import requests

# load pipeline
depth_estimator = pipeline(task="depth-estimation", model="Intel/zoedepth-nyu-kitti")

# load image
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# inference: the pipeline returns a dict containing the raw "predicted_depth"
# tensor and a "depth" PIL image for visualization
outputs = depth_estimator(image)
depth = outputs["depth"]
```
For more code examples, we refer to the [documentation](https://huggingface.co/docs/transformers/main/model_doc/zoedepth).

### BibTeX entry and citation info

```bibtex
@misc{bhat2023zoedepth,
      title={ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth},
      author={Shariq Farooq Bhat and Reiner Birkl and Diana Wofk and Peter Wonka and Matthias Müller},
      year={2023},
      eprint={2302.12288},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```