---
tags:
- image-classification
- timm
- transformers
library_name: timm
license: apache-2.0
datasets:
- imagenet-1k
---
# Model card for mobilenetv3_small_100.lamb_in1k

A MobileNet-v3 image classification model, trained on ImageNet-1k in `timm` using the recipe template described below.

Recipe details:
 * A LAMB optimizer based recipe that is similar to [ResNet Strikes Back](https://arxiv.org/abs/2110.00476) `A2` but 50% longer, with EMA weight averaging and no CutMix
 * Step (exponential decay w/ staircase) LR schedule with warmup

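The two recipe ingredients named above, a staircase exponential-decay schedule with warmup and EMA weight averaging, can be sketched in plain Python. This is only an illustration of the general techniques, not the training code used for this checkpoint: the function names, every default value, and the choice to start counting decay steps at the end of warmup are assumptions made for the example.

```python
def lr_at_epoch(epoch, base_lr=0.1, warmup_epochs=3, warmup_lr=1e-5,
                decay_epochs=2, decay_rate=0.9):
    """Learning rate for a 0-based epoch (all defaults are hypothetical)."""
    if epoch < warmup_epochs:
        # Linear warmup from warmup_lr towards base_lr.
        return warmup_lr + (base_lr - warmup_lr) * epoch / warmup_epochs
    # Staircase: the exponent is an integer that only increases once every
    # `decay_epochs` epochs, so the LR is piecewise constant between drops.
    steps = (epoch - warmup_epochs) // decay_epochs
    return base_lr * decay_rate ** steps


def ema_update(ema_weights, model_weights, decay=0.999):
    """One EMA step over flat lists of weights (decay value is hypothetical)."""
    return [decay * e + (1.0 - decay) * w
            for e, w in zip(ema_weights, model_weights)]


if __name__ == "__main__":
    for epoch in range(8):
        print(epoch, lr_at_epoch(epoch))
```

In a real training loop the EMA copy is updated after each optimizer step and used for evaluation, while the schedule sets the optimizer's learning rate once per epoch.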

## Model Details
- **Model Type:** Image classification / feature backbone
- **Model Stats:**
  - Params (M): 2.5
  - GMACs: 0.1
  - Activations (M): 1.4
  - Image size: 224 x 224
- **Papers:**
  - Searching for MobileNetV3: https://arxiv.org/abs/1905.02244
- **Dataset:** ImageNet-1k
- **Original:** https://github.com/huggingface/pytorch-image-models

## Model Usage
### Image Classification
```python
from urllib.request import urlopen
from PIL import Image
import timm
import torch  # needed for torch.topk below

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model('mobilenetv3_small_100.lamb_in1k', pretrained=True)
model = model.eval()

# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # unsqueeze single image into batch of 1

# convert logits to percentages and keep the five most likely classes
top5_probabilities, top5_class_indices = torch.topk(output.softmax(dim=1) * 100, k=5)
```
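
The last line of the snippet above converts logits to percentages and keeps the five largest. As a dependency-free sketch of that arithmetic for a single row of logits, the hypothetical helper below (not part of `timm` or `torch`) applies a softmax, scales to percent, and returns the top-k values with their class indices.

```python
import math


def top_k_percentages(logits, k=5):
    """Softmax over one row of logits, scaled to percent, then top-k."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    percents = [100.0 * e / total for e in exps]
    # Indices of the k largest percentages, highest first.
    ranked = sorted(range(len(percents)), key=lambda i: percents[i], reverse=True)[:k]
    return [percents[i] for i in ranked], ranked


if __name__ == "__main__":
    values, indices = top_k_percentages([0.0, 2.0, 1.0], k=2)
    print(indices, values)
```

The returned indices are positions in the 1000-way ImageNet-1k output; mapping them to human-readable labels requires a separate class-name list.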

### Feature Map Extraction
```python
from urllib.request import urlopen
from PIL import Image
import timm

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model(
    'mobilenetv3_small_100.lamb_in1k',
    pretrained=True,
    features_only=True,
)
model = model.eval()

# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # unsqueeze single image into batch of 1

for o in output:
    # print shape of each feature map in output
    # e.g.:
    #  torch.Size([1, 16, 112, 112])
    #  torch.Size([1, 16, 56, 56])
    #  torch.Size([1, 24, 28, 28])
    #  torch.Size([1, 48, 14, 14])
    #  torch.Size([1, 576, 7, 7])

    print(o.shape)
```
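
The example shapes in the comments above imply a spatial reduction (stride) and channel count for each feature level. The short sketch below works that arithmetic out, assuming the 224 x 224 input size listed under Model Details; `feature_shapes` is copied from the comments, and `reductions` is a hypothetical helper written for this illustration.

```python
input_size = 224  # assumed from "Image size: 224 x 224" in Model Details

# Example shapes copied from the feature-map snippet: (batch, channels, height, width)
feature_shapes = [
    (1, 16, 112, 112),
    (1, 16, 56, 56),
    (1, 24, 28, 28),
    (1, 48, 14, 14),
    (1, 576, 7, 7),
]


def reductions(shapes, size):
    """Input size divided by feature-map height, per level."""
    return [size // h for (_, _, h, _) in shapes]


channels = [c for (_, c, _, _) in feature_shapes]

print(reductions(feature_shapes, input_size))  # [2, 4, 8, 16, 32]
print(channels)                                # [16, 16, 24, 48, 576]
```

Each level halves the resolution of the previous one, which is the layout that detection or segmentation heads consuming a multi-scale backbone typically expect.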

### Image Embeddings
```python
from urllib.request import urlopen
from PIL import Image
import timm

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model(
    'mobilenetv3_small_100.lamb_in1k',
    pretrained=True,
    num_classes=0,  # remove classifier nn.Linear
)
model = model.eval()

# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # output is (batch_size, num_features) shaped tensor

# or equivalently (without needing to set num_classes=0)

output = model.forward_features(transforms(img).unsqueeze(0))
# output is unpooled, a (1, 576, 7, 7) shaped tensor

output = model.forward_head(output, pre_logits=True)
# output is a (1, num_features) shaped tensor
```

## Model Comparison
Explore the dataset and runtime metrics of this model in timm [model results](https://github.com/huggingface/pytorch-image-models/tree/main/results).

## Citation
```bibtex
@misc{rw2019timm,
  author = {Ross Wightman},
  title = {PyTorch Image Models},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  doi = {10.5281/zenodo.4414861},
  howpublished = {\url{https://github.com/huggingface/pytorch-image-models}}
}
```
```bibtex
@inproceedings{howard2019searching,
  title={Searching for mobilenetv3},
  author={Howard, Andrew and Sandler, Mark and Chu, Grace and Chen, Liang-Chieh and Chen, Bo and Tan, Mingxing and Wang, Weijun and Zhu, Yukun and Pang, Ruoming and Vasudevan, Vijay and others},
  booktitle={Proceedings of the IEEE/CVF international conference on computer vision},
  pages={1314--1324},
  year={2019}
}
```