---
license: apache-2.0
language:
- en
base_model:
- answerdotai/ModernBERT-base
base_model_relation: finetune
pipeline_tag: text-ranking
library_name: transformers
tags:
- sentence-transformers
- transformers.js
- text-embeddings-inference
---

# gte-reranker-modernbert-base

We are excited to introduce the `gte-modernbert` series of models, built upon the latest ModernBERT pre-trained encoder-only foundation models. The series includes both text embedding models and reranking models.

The `gte-modernbert` models demonstrate competitive performance on several text embedding and text retrieval benchmarks, including **MTEB**, **BEIR**, **LoCo**, and **CoIR**, when compared to similar-scale models from the open-source community.

## Model Overview

- Developed by: Tongyi Lab, Alibaba Group
- Model Type: Text reranker
- Primary Language: English
- Model Size: 149M
- Max Input Length: 8192 tokens

### Model list

| Models | Language | Model Type | Model Size | Max Seq. Length | Dimension | MTEB-en | BEIR | LoCo | CoIR |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| [`gte-modernbert-base`](https://huggingface.co/Alibaba-NLP/gte-modernbert-base) | English | text embedding | 149M | 8192 | 768 | 64.38 | 55.33 | 87.57 | 79.31 |
| [`gte-reranker-modernbert-base`](https://huggingface.co/Alibaba-NLP/gte-reranker-modernbert-base) | English | text reranker | 149M | 8192 | - | - | 56.19 | 90.68 | 79.99 |

## Usage

> [!TIP]
> With `transformers` and `sentence-transformers`, the efficient Flash Attention 2 implementation is used automatically if your GPU supports it and the `flash_attn` package is installed. Installing it is optional.
>
> ```bash
> pip install flash_attn
> ```

Use with `transformers`:
```python
# Requires transformers>=4.48.0
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name_or_path = "Alibaba-NLP/gte-reranker-modernbert-base"
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name_or_path,
    torch_dtype=torch.float16,
)
model.eval()

pairs = [
    ["what is the capital of China?", "Beijing"],
    ["how to implement quick sort in python?", "Introduction of quick sort"],
    ["how to implement quick sort in python?", "The weather is nice today"],
]

with torch.no_grad():
    inputs = tokenizer(pairs, padding=True, truncation=True, return_tensors="pt", max_length=512)
    scores = model(**inputs, return_dict=True).logits.view(-1).float()

print(scores)
# tensor([ 2.1387,  2.4609, -1.6729])
```
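
In a retrieval pipeline you typically score one query against many candidate documents and sort by score. Below is a minimal sketch reusing the `model` and `tokenizer` loaded above; the query and candidate texts are illustrative:

```python
# Rerank a list of candidates for a single query (illustrative data).
query = "how to implement quick sort in python?"
candidates = [
    "Introduction of quick sort",
    "The weather is nice today",
    "Quicksort is a divide-and-conquer sorting algorithm",
]

pairs = [[query, doc] for doc in candidates]
with torch.no_grad():
    inputs = tokenizer(pairs, padding=True, truncation=True, return_tensors="pt", max_length=512)
    scores = model(**inputs, return_dict=True).logits.view(-1).float()

# Higher logits mean higher relevance; sort candidates accordingly.
ranked = sorted(zip(candidates, scores.tolist()), key=lambda x: x[1], reverse=True)
for doc, score in ranked:
    print(f"{score:+.4f}  {doc}")
```
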
Use with `sentence-transformers`:

Before you start, install the sentence-transformers library:
```bash
pip install sentence-transformers
```

```python
# Requires transformers>=4.48.0
from sentence_transformers import CrossEncoder

model = CrossEncoder(
    "Alibaba-NLP/gte-reranker-modernbert-base",
    automodel_args={"torch_dtype": "auto"},
)

pairs = [
    ["what is the capital of China?", "Beijing"],
    ["how to implement quick sort in python?", "Introduction of quick sort"],
    ["how to implement quick sort in python?", "The weather is nice today"],
]

scores = model.predict(pairs)
print(scores)
# [0.8945664 0.9213594 0.15742092]
# NOTE: Sentence Transformers applies a Sigmoid to the logits by default, hence the scores are in the [0, 1] range.
```
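
Recent versions of `sentence-transformers` also provide a `CrossEncoder.rank` convenience method that scores and sorts a document list for a query in one call. A minimal sketch reusing the `model` above (the query and documents are illustrative):

```python
# Score and sort a document list for one query in a single call.
query = "how to implement quick sort in python?"
documents = [
    "The weather is nice today",
    "Introduction of quick sort",
]

# Returns one dict per document, sorted by descending score;
# return_documents=True includes the document text in each result.
results = model.rank(query, documents, return_documents=True)
for res in results:
    print(f"{res['score']:.4f}  {res['text']}")
```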

Use with `transformers.js`:
```js
import {
  AutoTokenizer,
  AutoModelForSequenceClassification,
} from "@huggingface/transformers";

const model_id = "Alibaba-NLP/gte-reranker-modernbert-base";
const model = await AutoModelForSequenceClassification.from_pretrained(
  model_id,
  { dtype: "fp32" }, // Supported options: "fp32", "fp16", "q8", "q4", "q4f16"
);
const tokenizer = await AutoTokenizer.from_pretrained(model_id);

const pairs = [
  ["what is the capital of China?", "Beijing"],
  ["how to implement quick sort in python?", "Introduction of quick sort"],
  ["how to implement quick sort in python?", "The weather is nice today"],
];
const inputs = tokenizer(
  pairs.map((x) => x[0]),
  {
    text_pair: pairs.map((x) => x[1]),
    padding: true,
    truncation: true,
  },
);
const { logits } = await model(inputs);
console.log(logits.tolist()); // [[2.138258218765259], [2.4609625339508057], [-1.6775450706481934]]
```

Additionally, you can deploy `Alibaba-NLP/gte-reranker-modernbert-base` with [Text Embeddings Inference (TEI)](https://github.com/huggingface/text-embeddings-inference) as follows:

- CPU

```bash
docker run --platform linux/amd64 \
  -p 8080:80 \
  -v $PWD/data:/data \
  --pull always \
  ghcr.io/huggingface/text-embeddings-inference:cpu-1.7 \
  --model-id Alibaba-NLP/gte-reranker-modernbert-base
```

- GPU

```bash
docker run --gpus all \
  -p 8080:80 \
  -v $PWD/data:/data \
  --pull always \
  ghcr.io/huggingface/text-embeddings-inference:1.7 \
  --model-id Alibaba-NLP/gte-reranker-modernbert-base
```

Then you can send requests to the deployed API via the `/rerank` route (see the [Text Embeddings Inference OpenAPI Specification](https://huggingface.github.io/text-embeddings-inference/) for more details):

```bash
curl http://0.0.0.0:8080/rerank \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the capital of China?",
    "raw_scores": false,
    "return_text": false,
    "texts": [ "Beijing" ],
    "truncate": true,
    "truncation_direction": "right"
  }'
```
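
The same request can be issued from Python. Here is a minimal sketch using the third-party `requests` package, assuming the container above is reachable on `localhost:8080`:

```python
# Minimal TEI /rerank client (assumes the server above runs on localhost:8080).
import requests

payload = {
    "query": "What is the capital of China?",
    "texts": ["Beijing", "The weather is nice today"],
    "raw_scores": False,
}
response = requests.post("http://localhost:8080/rerank", json=payload)
response.raise_for_status()

# TEI returns one {"index": ..., "score": ...} entry per input text,
# sorted by descending score.
for item in response.json():
    print(item["index"], item["score"])
```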

## Training Details

The `gte-modernbert` series of models follows the training scheme of the previous [GTE models](https://huggingface.co/collections/Alibaba-NLP/gte-models-6680f0b13f885cb431e6d469), with the only difference being that the pre-trained base model has been switched from [GTE-MLM](https://huggingface.co/Alibaba-NLP/gte-en-mlm-base) to [ModernBERT](https://huggingface.co/answerdotai/ModernBERT-base). For more training details, please refer to our paper: [mGTE: Generalized Long-Context Text Representation and Reranking Models for Multilingual Text Retrieval](https://aclanthology.org/2024.emnlp-industry.103/).

## Evaluation

### MTEB

The results of other models are retrieved from the [MTEB leaderboard](https://huggingface.co/spaces/mteb/leaderboard). Since all models in the `gte-modernbert` series have fewer than 1B parameters, we focus exclusively on leaderboard results for models under 1B parameters.

| Model Name | Param Size (M) | Dimension | Sequence Length | Average (56) | Class. (12) | Clust. (11) | Pair Class. (3) | Reran. (4) | Retr. (15) | STS (10) | Summ. (1) |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| [mxbai-embed-large-v1](https://huggingface.co/mixedbread-ai/mxbai-embed-large-v1) | 335 | 1024 | 512 | 64.68 | 75.64 | 46.71 | 87.2 | 60.11 | 54.39 | 85 | 32.71 |
| [multilingual-e5-large-instruct](https://huggingface.co/intfloat/multilingual-e5-large-instruct) | 560 | 1024 | 514 | 64.41 | 77.56 | 47.1 | 86.19 | 58.58 | 52.47 | 84.78 | 30.39 |
| [bge-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5) | 335 | 1024 | 512 | 64.23 | 75.97 | 46.08 | 87.12 | 60.03 | 54.29 | 83.11 | 31.61 |
| [gte-base-en-v1.5](https://huggingface.co/Alibaba-NLP/gte-base-en-v1.5) | 137 | 768 | 8192 | 64.11 | 77.17 | 46.82 | 85.33 | 57.66 | 54.09 | 81.97 | 31.17 |
| [bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) | 109 | 768 | 512 | 63.55 | 75.53 | 45.77 | 86.55 | 58.86 | 53.25 | 82.4 | 31.07 |
| [gte-large-en-v1.5](https://huggingface.co/Alibaba-NLP/gte-large-en-v1.5) | 409 | 1024 | 8192 | 65.39 | 77.75 | 47.95 | 84.63 | 58.50 | 57.91 | 81.43 | 30.91 |
| [modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base) | 149 | 768 | 8192 | 62.62 | 74.31 | 44.98 | 83.96 | 56.42 | 52.89 | 81.78 | 31.39 |
| [nomic-embed-text-v1.5](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5) | | 768 | 8192 | 62.28 | 73.55 | 43.93 | 84.61 | 55.78 | 53.01 | 81.94 | 30.4 |
| [gte-multilingual-base](https://huggingface.co/Alibaba-NLP/gte-multilingual-base) | 305 | 768 | 8192 | 61.4 | 70.89 | 44.31 | 84.24 | 57.47 | 51.08 | 82.11 | 30.58 |
| [jina-embeddings-v3](https://huggingface.co/jinaai/jina-embeddings-v3) | 572 | 1024 | 8192 | 65.51 | 82.58 | 45.21 | 84.01 | 58.13 | 53.88 | 85.81 | 29.71 |
| [**gte-modernbert-base**](https://huggingface.co/Alibaba-NLP/gte-modernbert-base) | 149 | 768 | 8192 | **64.38** | **76.99** | **46.47** | **85.93** | **59.24** | **55.33** | **81.57** | **30.68** |

### LoCo (Long Document Retrieval)

| Model Name | Dimension | Sequence Length | Average (5) | QMSumRetrieval | SummScreenRetrieval | QasperAbstractRetrieval | QasperTitleRetrieval | GovReportRetrieval |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| [gte-qwen1.5-7b](https://huggingface.co/Alibaba-NLP/gte-qwen1.5-7b) | 4096 | 32768 | 87.57 | 49.37 | 93.10 | 99.67 | 97.54 | 98.21 |
| [gte-large-v1.5](https://huggingface.co/Alibaba-NLP/gte-large-v1.5) | 1024 | 8192 | 86.71 | 44.55 | 92.61 | 99.82 | 97.81 | 98.74 |
| [gte-base-v1.5](https://huggingface.co/Alibaba-NLP/gte-base-v1.5) | 768 | 8192 | 87.44 | 49.91 | 91.78 | 99.82 | 97.13 | 98.58 |
| [gte-modernbert-base](https://huggingface.co/Alibaba-NLP/gte-modernbert-base) | 768 | 8192 | 88.88 | 54.45 | 93.00 | 99.82 | 98.03 | 98.70 |
| [gte-reranker-modernbert-base](https://huggingface.co/Alibaba-NLP/gte-reranker-modernbert-base) | - | 8192 | 90.68 | 70.86 | 94.06 | 99.73 | 99.11 | 89.67 |

### CoIR (Code Retrieval Task)

| Model Name | Dimension | Sequence Length | Average (20) | CodeSearchNet-ccr-go | CodeSearchNet-ccr-java | CodeSearchNet-ccr-javascript | CodeSearchNet-ccr-php | CodeSearchNet-ccr-python | CodeSearchNet-ccr-ruby | CodeSearchNet-go | CodeSearchNet-java | CodeSearchNet-javascript | CodeSearchNet-php | CodeSearchNet-python | CodeSearchNet-ruby | apps | codefeedback-mt | codefeedback-st | codetrans-contest | codetrans-dl | cosqa | stackoverflow-qa | synthetic-text2sql |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| [gte-modernbert-base](https://huggingface.co/Alibaba-NLP/gte-modernbert-base) | 768 | 8192 | 79.31 | 94.15 | 93.57 | 94.27 | 91.51 | 93.93 | 90.63 | 88.32 | 83.27 | 76.05 | 85.12 | 88.16 | 77.59 | 57.54 | 82.34 | 85.95 | 71.89 | 35.46 | 43.47 | 91.2 | 61.87 |
| [gte-reranker-modernbert-base](https://huggingface.co/Alibaba-NLP/gte-reranker-modernbert-base) | - | 8192 | 79.99 | 96.43 | 96.88 | 98.32 | 91.81 | 97.7 | 91.96 | 88.81 | 79.71 | 76.27 | 89.39 | 98.37 | 84.11 | 47.57 | 83.37 | 88.91 | 49.66 | 36.36 | 44.37 | 89.58 | 64.21 |

### BEIR

| Model Name | Dimension | Sequence Length | Average (15) | ArguAna | ClimateFEVER | CQADupstackAndroidRetrieval | DBPedia | FEVER | FiQA2018 | HotpotQA | MSMARCO | NFCorpus | NQ | QuoraRetrieval | SCIDOCS | SciFact | Touche2020 | TRECCOVID |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| [gte-modernbert-base](https://huggingface.co/Alibaba-NLP/gte-modernbert-base) | 768 | 8192 | 55.33 | 72.68 | 37.74 | 42.63 | 41.79 | 91.03 | 48.81 | 69.47 | 40.9 | 36.44 | 57.62 | 88.55 | 21.29 | 77.4 | 21.68 | 81.95 |
| [gte-reranker-modernbert-base](https://huggingface.co/Alibaba-NLP/gte-reranker-modernbert-base) | - | 8192 | 56.73 | 69.03 | 37.79 | 44.68 | 47.23 | 94.54 | 49.81 | 78.16 | 45.38 | 30.69 | 64.57 | 87.77 | 20.60 | 73.57 | 27.36 | 79.89 |

## Hiring

We have open positions for **Research Interns** and **Full-Time Researchers** to join our team at Tongyi Lab.
We are seeking passionate individuals with expertise in representation learning, LLM-driven information retrieval, Retrieval-Augmented Generation (RAG), and agent-based systems.
Our team is located in the vibrant cities of **Beijing** and **Hangzhou**.
If you are driven by curiosity and eager to make a meaningful impact through your work, we would love to hear from you. Please submit your resume along with a brief introduction to <a href="mailto:dingkun.ldk@alibaba-inc.com">dingkun.ldk@alibaba-inc.com</a>.

## Citation

If you find our paper or models helpful, feel free to cite our work:

```bibtex
@inproceedings{zhang2024mgte,
  title={mGTE: Generalized Long-Context Text Representation and Reranking Models for Multilingual Text Retrieval},
  author={Zhang, Xin and Zhang, Yanzhao and Long, Dingkun and Xie, Wen and Dai, Ziqi and Tang, Jialong and Lin, Huan and Yang, Baosong and Xie, Pengjun and Huang, Fei and others},
  booktitle={Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track},
  pages={1393--1412},
  year={2024}
}

@article{li2023towards,
  title={Towards general text embeddings with multi-stage contrastive learning},
  author={Li, Zehan and Zhang, Xin and Zhang, Yanzhao and Long, Dingkun and Xie, Pengjun and Zhang, Meishan},
  journal={arXiv preprint arXiv:2308.03281},
  year={2023}
}
```