---
license: apache-2.0
language:
- en
base_model:
- answerdotai/ModernBERT-base
base_model_relation: finetune
pipeline_tag: text-ranking
library_name: transformers
tags:
- sentence-transformers
- transformers.js
- text-embeddings-inference
---

# gte-reranker-modernbert-base

We are excited to introduce the `gte-modernbert` series of models, built upon the latest ModernBERT pre-trained encoder-only foundation models. The series includes both text embedding models and reranking models.

The `gte-modernbert` models demonstrate competitive performance on several text embedding and text retrieval benchmarks, including **MTEB**, **BEIR**, **LoCo**, and **CoIR**, when compared to similar-scale models from the open-source community.

## Model Overview

- Developed by: Tongyi Lab, Alibaba Group
- Model Type: Text reranker
- Primary Language: English
- Model Size: 149M
- Max Input Length: 8192 tokens

### Model list

| Models | Language | Model Type | Model Size | Max Seq. Length | Dimension | MTEB-en | BEIR | LoCo | CoIR |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| [`gte-modernbert-base`](https://huggingface.co/Alibaba-NLP/gte-modernbert-base) | English | text embedding | 149M | 8192 | 768 | 64.38 | 55.33 | 87.57 | 79.31 |
| [`gte-reranker-modernbert-base`](https://huggingface.co/Alibaba-NLP/gte-reranker-modernbert-base) | English | text reranker | 149M | 8192 | - | - | 56.19 | 90.68 | 79.99 |

## Usage

> [!TIP]
> With `transformers` and `sentence-transformers`, the efficient Flash Attention 2 implementation is used automatically if your GPU supports it and the `flash_attn` package is installed. Installing it is optional.
>
> ```bash
> pip install flash_attn
> ```

Use with `transformers`:
```python
# Requires transformers>=4.48.0
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name_or_path = "Alibaba-NLP/gte-reranker-modernbert-base"
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name_or_path,
    torch_dtype=torch.float16,
)
model.eval()

pairs = [
    ["what is the capital of China?", "Beijing"],
    ["how to implement quick sort in python?", "Introduction of quick sort"],
    ["how to implement quick sort in python?", "The weather is nice today"],
]

with torch.no_grad():
    inputs = tokenizer(pairs, padding=True, truncation=True, return_tensors="pt", max_length=512)
    scores = model(**inputs, return_dict=True).logits.view(-1).float()

print(scores)
# tensor([ 2.1387,  2.4609, -1.6729])
```
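
In a retrieval pipeline you typically score one query against many candidate documents and sort by score. Below is a minimal sketch reusing the `model` and `tokenizer` loaded above; the query and candidate texts are illustrative:

```python
# Rerank a list of candidates for a single query (illustrative data).
query = "how to implement quick sort in python?"
candidates = [
    "Introduction of quick sort",
    "The weather is nice today",
    "Quicksort is a divide-and-conquer sorting algorithm",
]

pairs = [[query, doc] for doc in candidates]
with torch.no_grad():
    inputs = tokenizer(pairs, padding=True, truncation=True, return_tensors="pt", max_length=512)
    scores = model(**inputs, return_dict=True).logits.view(-1).float()

# Higher logits mean higher relevance; sort candidates accordingly.
ranked = sorted(zip(candidates, scores.tolist()), key=lambda x: x[1], reverse=True)
for doc, score in ranked:
    print(f"{score:+.4f}  {doc}")
```
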
Use with `sentence-transformers`:

Before you start, install the sentence-transformers library:
```bash
pip install sentence-transformers
```

```python
# Requires transformers>=4.48.0
from sentence_transformers import CrossEncoder

model = CrossEncoder(
    "Alibaba-NLP/gte-reranker-modernbert-base",
    automodel_args={"torch_dtype": "auto"},
)

pairs = [
    ["what is the capital of China?", "Beijing"],
    ["how to implement quick sort in python?", "Introduction of quick sort"],
    ["how to implement quick sort in python?", "The weather is nice today"],
]

scores = model.predict(pairs)
print(scores)
# [0.8945664 0.9213594 0.15742092]
# NOTE: Sentence Transformers applies a Sigmoid to the logits by default, hence the scores are in the [0, 1] range.
```
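
Recent versions of `sentence-transformers` also provide a `CrossEncoder.rank` convenience method that scores and sorts a document list for a query in one call. A minimal sketch reusing the `model` above (the query and documents are illustrative):

```python
# Score and sort a document list for one query in a single call.
query = "how to implement quick sort in python?"
documents = [
    "The weather is nice today",
    "Introduction of quick sort",
]

# Returns one dict per document, sorted by descending score;
# return_documents=True includes the document text in each result.
results = model.rank(query, documents, return_documents=True)
for res in results:
    print(f"{res['score']:.4f}  {res['text']}")
```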

Use with `transformers.js`:
```js
import {
  AutoTokenizer,
  AutoModelForSequenceClassification,
} from "@huggingface/transformers";

const model_id = "Alibaba-NLP/gte-reranker-modernbert-base";
const model = await AutoModelForSequenceClassification.from_pretrained(
  model_id,
  { dtype: "fp32" }, // Supported options: "fp32", "fp16", "q8", "q4", "q4f16"
);
const tokenizer = await AutoTokenizer.from_pretrained(model_id);

const pairs = [
  ["what is the capital of China?", "Beijing"],
  ["how to implement quick sort in python?", "Introduction of quick sort"],
  ["how to implement quick sort in python?", "The weather is nice today"],
];
const inputs = tokenizer(
  pairs.map((x) => x[0]),
  {
    text_pair: pairs.map((x) => x[1]),
    padding: true,
    truncation: true,
  },
);
const { logits } = await model(inputs);
console.log(logits.tolist()); // [[2.138258218765259], [2.4609625339508057], [-1.6775450706481934]]
```

Additionally, you can deploy `Alibaba-NLP/gte-reranker-modernbert-base` with [Text Embeddings Inference (TEI)](https://github.com/huggingface/text-embeddings-inference) as follows:

- CPU

```bash
docker run --platform linux/amd64 \
  -p 8080:80 \
  -v $PWD/data:/data \
  --pull always \
  ghcr.io/huggingface/text-embeddings-inference:cpu-1.7 \
  --model-id Alibaba-NLP/gte-reranker-modernbert-base
```

- GPU

```bash
docker run --gpus all \
  -p 8080:80 \
  -v $PWD/data:/data \
  --pull always \
  ghcr.io/huggingface/text-embeddings-inference:1.7 \
  --model-id Alibaba-NLP/gte-reranker-modernbert-base
```

Then you can send requests to the deployed API via the `/rerank` route (see the [Text Embeddings Inference OpenAPI Specification](https://huggingface.github.io/text-embeddings-inference/) for more details):

```bash
curl http://0.0.0.0:8080/rerank \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the capital of China?",
    "raw_scores": false,
    "return_text": false,
    "texts": [ "Beijing" ],
    "truncate": true,
    "truncation_direction": "right"
  }'
```
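
The same request can be issued from Python. Here is a minimal sketch using the third-party `requests` package, assuming the container above is reachable on `localhost:8080`:

```python
# Minimal TEI /rerank client (assumes the server above runs on localhost:8080).
import requests

payload = {
    "query": "What is the capital of China?",
    "texts": ["Beijing", "The weather is nice today"],
    "raw_scores": False,
}
response = requests.post("http://localhost:8080/rerank", json=payload)
response.raise_for_status()

# TEI returns one {"index": ..., "score": ...} entry per input text,
# sorted by descending score.
for item in response.json():
    print(item["index"], item["score"])
```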

## Training Details

The `gte-modernbert` series of models follows the training scheme of the previous [GTE models](https://huggingface.co/collections/Alibaba-NLP/gte-models-6680f0b13f885cb431e6d469), with the only difference being that the pre-trained base model has been switched from [GTE-MLM](https://huggingface.co/Alibaba-NLP/gte-en-mlm-base) to [ModernBERT](https://huggingface.co/answerdotai/ModernBERT-base). For more training details, please refer to our paper: [mGTE: Generalized Long-Context Text Representation and Reranking Models for Multilingual Text Retrieval](https://aclanthology.org/2024.emnlp-industry.103/).

## Evaluation

### MTEB

The results of other models are retrieved from the [MTEB leaderboard](https://huggingface.co/spaces/mteb/leaderboard). Since all models in the `gte-modernbert` series have fewer than 1B parameters, we focus exclusively on leaderboard results for models under 1B parameters.

| Model Name | Param Size (M) | Dimension | Sequence Length | Average (56) | Class. (12) | Clust. (11) | Pair Class. (3) | Reran. (4) | Retr. (15) | STS (10) | Summ. (1) |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| [mxbai-embed-large-v1](https://huggingface.co/mixedbread-ai/mxbai-embed-large-v1) | 335 | 1024 | 512 | 64.68 | 75.64 | 46.71 | 87.2 | 60.11 | 54.39 | 85 | 32.71 |
| [multilingual-e5-large-instruct](https://huggingface.co/intfloat/multilingual-e5-large-instruct) | 560 | 1024 | 514 | 64.41 | 77.56 | 47.1 | 86.19 | 58.58 | 52.47 | 84.78 | 30.39 |
| [bge-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5) | 335 | 1024 | 512 | 64.23 | 75.97 | 46.08 | 87.12 | 60.03 | 54.29 | 83.11 | 31.61 |
| [gte-base-en-v1.5](https://huggingface.co/Alibaba-NLP/gte-base-en-v1.5) | 137 | 768 | 8192 | 64.11 | 77.17 | 46.82 | 85.33 | 57.66 | 54.09 | 81.97 | 31.17 |
| [bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) | 109 | 768 | 512 | 63.55 | 75.53 | 45.77 | 86.55 | 58.86 | 53.25 | 82.4 | 31.07 |
| [gte-large-en-v1.5](https://huggingface.co/Alibaba-NLP/gte-large-en-v1.5) | 409 | 1024 | 8192 | 65.39 | 77.75 | 47.95 | 84.63 | 58.50 | 57.91 | 81.43 | 30.91 |
| [modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base) | 149 | 768 | 8192 | 62.62 | 74.31 | 44.98 | 83.96 | 56.42 | 52.89 | 81.78 | 31.39 |
| [nomic-embed-text-v1.5](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5) | | 768 | 8192 | 62.28 | 73.55 | 43.93 | 84.61 | 55.78 | 53.01 | 81.94 | 30.4 |
| [gte-multilingual-base](https://huggingface.co/Alibaba-NLP/gte-multilingual-base) | 305 | 768 | 8192 | 61.4 | 70.89 | 44.31 | 84.24 | 57.47 | 51.08 | 82.11 | 30.58 |
| [jina-embeddings-v3](https://huggingface.co/jinaai/jina-embeddings-v3) | 572 | 1024 | 8192 | 65.51 | 82.58 | 45.21 | 84.01 | 58.13 | 53.88 | 85.81 | 29.71 |
| [**gte-modernbert-base**](https://huggingface.co/Alibaba-NLP/gte-modernbert-base) | 149 | 768 | 8192 | **64.38** | **76.99** | **46.47** | **85.93** | **59.24** | **55.33** | **81.57** | **30.68** |

### LoCo (Long Document Retrieval)

| Model Name | Dimension | Sequence Length | Average (5) | QMSumRetrieval | SummScreenRetrieval | QasperAbstractRetrieval | QasperTitleRetrieval | GovReportRetrieval |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| [gte-qwen1.5-7b](https://huggingface.co/Alibaba-NLP/gte-qwen1.5-7b) | 4096 | 32768 | 87.57 | 49.37 | 93.10 | 99.67 | 97.54 | 98.21 |
| [gte-large-v1.5](https://huggingface.co/Alibaba-NLP/gte-large-v1.5) | 1024 | 8192 | 86.71 | 44.55 | 92.61 | 99.82 | 97.81 | 98.74 |
| [gte-base-v1.5](https://huggingface.co/Alibaba-NLP/gte-base-v1.5) | 768 | 8192 | 87.44 | 49.91 | 91.78 | 99.82 | 97.13 | 98.58 |
| [gte-modernbert-base](https://huggingface.co/Alibaba-NLP/gte-modernbert-base) | 768 | 8192 | 88.88 | 54.45 | 93.00 | 99.82 | 98.03 | 98.70 |
| [gte-reranker-modernbert-base](https://huggingface.co/Alibaba-NLP/gte-reranker-modernbert-base) | - | 8192 | 90.68 | 70.86 | 94.06 | 99.73 | 99.11 | 89.67 |

### CoIR (Code Retrieval Task)

| Model Name | Dimension | Sequence Length | Average (20) | CodeSearchNet-ccr-go | CodeSearchNet-ccr-java | CodeSearchNet-ccr-javascript | CodeSearchNet-ccr-php | CodeSearchNet-ccr-python | CodeSearchNet-ccr-ruby | CodeSearchNet-go | CodeSearchNet-java | CodeSearchNet-javascript | CodeSearchNet-php | CodeSearchNet-python | CodeSearchNet-ruby | apps | codefeedback-mt | codefeedback-st | codetrans-contest | codetrans-dl | cosqa | stackoverflow-qa | synthetic-text2sql |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| [gte-modernbert-base](https://huggingface.co/Alibaba-NLP/gte-modernbert-base) | 768 | 8192 | 79.31 | 94.15 | 93.57 | 94.27 | 91.51 | 93.93 | 90.63 | 88.32 | 83.27 | 76.05 | 85.12 | 88.16 | 77.59 | 57.54 | 82.34 | 85.95 | 71.89 | 35.46 | 43.47 | 91.2 | 61.87 |
| [gte-reranker-modernbert-base](https://huggingface.co/Alibaba-NLP/gte-reranker-modernbert-base) | - | 8192 | 79.99 | 96.43 | 96.88 | 98.32 | 91.81 | 97.7 | 91.96 | 88.81 | 79.71 | 76.27 | 89.39 | 98.37 | 84.11 | 47.57 | 83.37 | 88.91 | 49.66 | 36.36 | 44.37 | 89.58 | 64.21 |

### BEIR

| Model Name | Dimension | Sequence Length | Average (15) | ArguAna | ClimateFEVER | CQADupstackAndroidRetrieval | DBPedia | FEVER | FiQA2018 | HotpotQA | MSMARCO | NFCorpus | NQ | QuoraRetrieval | SCIDOCS | SciFact | Touche2020 | TRECCOVID |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| [gte-modernbert-base](https://huggingface.co/Alibaba-NLP/gte-modernbert-base) | 768 | 8192 | 55.33 | 72.68 | 37.74 | 42.63 | 41.79 | 91.03 | 48.81 | 69.47 | 40.9 | 36.44 | 57.62 | 88.55 | 21.29 | 77.4 | 21.68 | 81.95 |
| [gte-reranker-modernbert-base](https://huggingface.co/Alibaba-NLP/gte-reranker-modernbert-base) | - | 8192 | 56.73 | 69.03 | 37.79 | 44.68 | 47.23 | 94.54 | 49.81 | 78.16 | 45.38 | 30.69 | 64.57 | 87.77 | 20.60 | 73.57 | 27.36 | 79.89 |

## Hiring

We have open positions for **Research Interns** and **Full-Time Researchers** to join our team at Tongyi Lab.
We are seeking passionate individuals with expertise in representation learning, LLM-driven information retrieval, Retrieval-Augmented Generation (RAG), and agent-based systems.
Our team is located in the vibrant cities of **Beijing** and **Hangzhou**.
If you are driven by curiosity and eager to make a meaningful impact through your work, we would love to hear from you. Please submit your resume along with a brief introduction to <a href="mailto:dingkun.ldk@alibaba-inc.com">dingkun.ldk@alibaba-inc.com</a>.

## Citation

If you find our paper or models helpful, feel free to cite our work:

```bibtex
@inproceedings{zhang2024mgte,
  title={mGTE: Generalized Long-Context Text Representation and Reranking Models for Multilingual Text Retrieval},
  author={Zhang, Xin and Zhang, Yanzhao and Long, Dingkun and Xie, Wen and Dai, Ziqi and Tang, Jialong and Lin, Huan and Yang, Baosong and Xie, Pengjun and Huang, Fei and others},
  booktitle={Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track},
  pages={1393--1412},
  year={2024}
}

@article{li2023towards,
  title={Towards general text embeddings with multi-stage contrastive learning},
  author={Li, Zehan and Zhang, Xin and Zhang, Yanzhao and Long, Dingkun and Xie, Pengjun and Zhang, Meishan},
  journal={arXiv preprint arXiv:2308.03281},
  year={2023}
}
```