---
tags:
- image-classification
- timm
- transformers
library_name: timm
license: mit
datasets:
- imagenet-1k
---
# Model card for repvgg_a0

A RepVGG image classification model. Trained on ImageNet-1k by the paper authors.

This model architecture is implemented using `timm`'s flexible [BYOBNet (Bring-Your-Own-Blocks Network)](https://github.com/huggingface/pytorch-image-models/blob/main/timm/models/byobnet.py).

BYOBNet allows configuration of:
* block / stage layout
* stem layout
* output stride (dilation)
* activation and norm layers
* channel and spatial / self-attention layers

...and also includes `timm` features common to many other architectures, including:
* stochastic depth
* gradient checkpointing
* layer-wise LR decay
* per-stage feature extraction

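To make one of the listed features concrete, here is a minimal sketch of the idea behind layer-wise LR decay: parameter groups closer to the input get a smaller learning rate than groups closer to the head. This is a generic plain-Python illustration, not `timm`'s implementation; the function name, the number of groups, the decay factor, and the base learning rate are all hypothetical.

```python
def layerwise_lr_scales(num_groups, decay):
    """Return one LR multiplier per layer group.

    Group 0 is closest to the input; the last group (closest to the head)
    keeps a multiplier of 1.0, and each earlier group is scaled down by
    another factor of `decay`.
    """
    return [decay ** (num_groups - 1 - i) for i in range(num_groups)]


# Hypothetical values, chosen only for illustration.
base_lr = 0.1
scales = layerwise_lr_scales(num_groups=5, decay=0.5)
group_lrs = [base_lr * s for s in scales]

for i, (scale, lr) in enumerate(zip(scales, group_lrs)):
    print(f"group {i}: scale={scale:.4f} lr={lr:.5f}")
```

In a real training setup these multipliers would be attached to optimizer parameter groups; how layers are assigned to groups depends on the library and the model.
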
## Model Details
- **Model Type:** Image classification / feature backbone
- **Model Stats:**
  - Params (M): 9.1
  - GMACs: 1.5
  - Activations (M): 3.6
  - Image size: 224 x 224
- **Papers:**
  - RepVGG: Making VGG-style ConvNets Great Again: https://arxiv.org/abs/2101.03697
- **Dataset:** ImageNet-1k
- **Original:** https://github.com/DingXiaoH/RepVGG

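As a rough back-of-the-envelope reading of the stats above, the sketch below converts the parameter count into an approximate weight size and the GMACs into approximate GFLOPs. It assumes float32 weights (4 bytes per parameter) and the common convention that one multiply-accumulate counts as two floating-point operations; neither assumption comes from the model card itself.

```python
# Values copied from the Model Stats list above.
params_m = 9.1   # parameters, in millions
gmacs = 1.5      # multiply-accumulates per forward pass, in billions

# Assumptions (not stated in the model card):
BYTES_PER_PARAM = 4   # float32 weights
FLOPS_PER_MAC = 2     # one multiply plus one add

# Weight size in megabytes (10^6 bytes).
weight_mb = params_m * 1e6 * BYTES_PER_PARAM / 1e6
# Approximate floating-point operations per 224 x 224 image, in billions.
gflops = gmacs * FLOPS_PER_MAC

print(f"approx. fp32 weight size: {weight_mb:.1f} MB")
print(f"approx. compute per image: {gflops:.1f} GFLOPs")
```

Actual memory use at inference time is higher than the weight size alone, since activations and framework overhead are not included in this estimate.
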
## Model Usage
### Image Classification
```python
from urllib.request import urlopen
from PIL import Image
import timm
import torch  # needed for torch.topk below

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model('repvgg_a0', pretrained=True)
model = model.eval()

# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # unsqueeze single image into batch of 1

# convert logits to percentages and keep the 5 most likely classes
top5_probabilities, top5_class_indices = torch.topk(output.softmax(dim=1) * 100, k=5)
```

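The last line of the example turns logits into percentages and keeps the highest-scoring classes. The plain-Python sketch below mirrors that computation on a made-up list of logits so the steps are easy to follow; the helper names and the logit values are hypothetical, and real code would keep using `torch.topk` on the model output.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a flat list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def topk_percent(logits, k=5):
    """Return (percentages, class indices) of the k most likely classes."""
    percents = [p * 100 for p in softmax(logits)]
    ranked = sorted(range(len(percents)), key=lambda i: percents[i], reverse=True)[:k]
    return [percents[i] for i in ranked], ranked


# Hypothetical logits for a 6-class problem (the real model emits 1000).
logits = [0.1, 2.0, -1.0, 3.5, 0.7, 1.2]
top_probs, top_idx = topk_percent(logits, k=3)

for p, i in zip(top_probs, top_idx):
    print(f"class {i}: {p:.1f}%")
```

The percentages across all classes sum to 100, so the top-k values indicate how concentrated the prediction is.
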
### Feature Map Extraction
```python
from urllib.request import urlopen
from PIL import Image
import timm

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model(
    'repvgg_a0',
    pretrained=True,
    features_only=True,
)
model = model.eval()

# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # unsqueeze single image into batch of 1

for o in output:
    # print shape of each feature map in output
    # e.g.:
    #  torch.Size([1, 48, 112, 112])
    #  torch.Size([1, 48, 56, 56])
    #  torch.Size([1, 96, 28, 28])
    #  torch.Size([1, 192, 14, 14])
    #  torch.Size([1, 1280, 7, 7])

    print(o.shape)
```

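The example shapes above imply a fixed downsampling factor per stage. The short sketch below derives those factors and the channel counts from the shapes listed in the comment, assuming the 224 x 224 input size given under Model Details; it uses plain tuples rather than real tensors, so it runs without the model.

```python
# Input resolution from the Model Details section (assumed square).
input_size = 224

# Shapes copied from the example comment above: (batch, channels, height, width).
feature_shapes = [
    (1, 48, 112, 112),
    (1, 48, 56, 56),
    (1, 96, 28, 28),
    (1, 192, 14, 14),
    (1, 1280, 7, 7),
]

# Downsampling factor of each feature map relative to the input.
reductions = [input_size // h for (_, _, h, _) in feature_shapes]
# Number of channels produced by each stage.
channels = [c for (_, c, _, _) in feature_shapes]

for r, c in zip(reductions, channels):
    print(f"stride {r:>2}: {c} channels")
```

These per-stage strides and channel counts are what a downstream head, such as a detection or segmentation decoder, typically needs to know about a backbone.
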
### Image Embeddings
```python
from urllib.request import urlopen
from PIL import Image
import timm

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model(
    'repvgg_a0',
    pretrained=True,
    num_classes=0,  # remove classifier nn.Linear
)
model = model.eval()

# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # output is (batch_size, num_features) shaped tensor

# or equivalently (without needing to set num_classes=0)

output = model.forward_features(transforms(img).unsqueeze(0))
# output is unpooled, a (1, 1280, 7, 7) shaped tensor

output = model.forward_head(output, pre_logits=True)
# output is a (1, num_features) shaped tensor
```

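The step from the unpooled `(1, 1280, 7, 7)` tensor to a `(1, num_features)` embedding can be pictured as pooling each channel's spatial grid down to a single number. The sketch below shows this with global average pooling on a tiny nested list. It assumes the head pools by averaging, which is an assumption about the configuration rather than something stated above; the helper name and toy values are hypothetical.

```python
def global_avg_pool(feature_map):
    """Average every channel's H x W grid down to one value.

    feature_map: list of channels, each channel a list of rows.
    Returns a flat list with one entry per channel.
    """
    pooled = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


# Toy feature map with 2 channels of size 2 x 2 (the real one is 1280 x 7 x 7).
fmap = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [2.0, 2.0]],
]

embedding = global_avg_pool(fmap)
print(embedding)  # one value per channel
```

The embedding length equals the channel count of the final feature map, which is why the pooled output above has `num_features` entries per image.
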
## Model Comparison
Explore the dataset and runtime metrics of this model in timm [model results](https://github.com/huggingface/pytorch-image-models/tree/main/results).

## Citation
```bibtex
@misc{rw2019timm,
  author = {Ross Wightman},
  title = {PyTorch Image Models},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  doi = {10.5281/zenodo.4414861},
  howpublished = {\url{https://github.com/huggingface/pytorch-image-models}}
}
```
```bibtex
@inproceedings{ding2021repvgg,
  title={Repvgg: Making vgg-style convnets great again},
  author={Ding, Xiaohan and Zhang, Xiangyu and Ma, Ningning and Han, Jungong and Ding, Guiguang and Sun, Jian},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={13733--13742},
  year={2021}
}
```