---
tags:
- sentence-transformers
- feature-extraction
- sentence-similarity
- transformers
- mteb
model-index:
- name: bge-large-en-v1.5
  results:
  - task:
      type: Classification
    dataset:
      type: mteb/amazon_counterfactual
      name: MTEB AmazonCounterfactualClassification (en)
      config: en
      split: test
      revision: e8379541af4e31359cca9fbcf4b00f2671dba205
    metrics:
    - type: accuracy
      value: 75.8507462686567
    - type: ap
      value: 38.566457320228245
    - type: f1
      value: 69.69386648043475
  - task:
      type: Classification
    dataset:
      type: mteb/amazon_polarity
      name: MTEB AmazonPolarityClassification
      config: default
      split: test
      revision: e2d317d38cd51312af73b3d32a06d1a08b442046
    metrics:
    - type: accuracy
      value: 92.416675
    - type: ap
      value: 89.1928861155922
    - type: f1
      value: 92.39477019574215
  - task:
      type: Classification
    dataset:
      type: mteb/amazon_reviews_multi
      name: MTEB AmazonReviewsClassification (en)
      config: en
      split: test
      revision: 1399c76144fd37290681b995c656ef9b2e06e26d
    metrics:
    - type: accuracy
      value: 48.175999999999995
    - type: f1
      value: 47.80712792870253
  - task:
      type: Retrieval
    dataset:
      type: arguana
      name: MTEB ArguAna
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 40.184999999999995
    - type: map_at_10
      value: 55.654
    - type: map_at_100
      value: 56.25
    - type: map_at_1000
      value: 56.255
    - type: map_at_3
      value: 51.742999999999995
    - type: map_at_5
      value: 54.129000000000005
    - type: mrr_at_1
      value: 40.967
    - type: mrr_at_10
      value: 55.96
    - type: mrr_at_100
      value: 56.54900000000001
    - type: mrr_at_1000
      value: 56.554
    - type: mrr_at_3
      value: 51.980000000000004
    - type: mrr_at_5
      value: 54.44
    - type: ndcg_at_1
      value: 40.184999999999995
    - type: ndcg_at_10
      value: 63.542
    - type: ndcg_at_100
      value: 65.96499999999999
    - type: ndcg_at_1000
      value: 66.08699999999999
    - type: ndcg_at_3
      value: 55.582
    - type: ndcg_at_5
      value: 59.855000000000004
    - type: precision_at_1
      value: 40.184999999999995
    - type: precision_at_10
      value: 8.841000000000001
    - type: precision_at_100
      value: 0.987
    - type: precision_at_1000
      value: 0.1
    - type: precision_at_3
      value: 22.238
    - type: precision_at_5
      value: 15.405
    - type: recall_at_1
      value: 40.184999999999995
    - type: recall_at_10
      value: 88.407
    - type: recall_at_100
      value: 98.72
    - type: recall_at_1000
      value: 99.644
    - type: recall_at_3
      value: 66.714
    - type: recall_at_5
      value: 77.027
  - task:
      type: Clustering
    dataset:
      type: mteb/arxiv-clustering-p2p
      name: MTEB ArxivClusteringP2P
      config: default
      split: test
      revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d
    metrics:
    - type: v_measure
      value: 48.567077926750066
  - task:
      type: Clustering
    dataset:
      type: mteb/arxiv-clustering-s2s
      name: MTEB ArxivClusteringS2S
      config: default
      split: test
      revision: f910caf1a6075f7329cdf8c1a6135696f37dbd53
    metrics:
    - type: v_measure
      value: 43.19453389182364
  - task:
      type: Reranking
    dataset:
      type: mteb/askubuntudupquestions-reranking
      name: MTEB AskUbuntuDupQuestions
      config: default
      split: test
      revision: 2000358ca161889fa9c082cb41daa8dcfb161a54
    metrics:
    - type: map
      value: 64.46555939623092
    - type: mrr
      value: 77.82361605768807
  - task:
      type: STS
    dataset:
      type: mteb/biosses-sts
      name: MTEB BIOSSES
      config: default
      split: test
      revision: d3fb88f8f02e40887cd149695127462bbcf29b4a
    metrics:
    - type: cos_sim_pearson
      value: 84.9554128814735
    - type: cos_sim_spearman
      value: 84.65373612172036
    - type: euclidean_pearson
      value: 83.2905059954138
    - type: euclidean_spearman
      value: 84.52240782811128
    - type: manhattan_pearson
      value: 82.99533802997436
    - type: manhattan_spearman
      value: 84.20673798475734
  - task:
      type: Classification
    dataset:
      type: mteb/banking77
      name: MTEB Banking77Classification
      config: default
      split: test
      revision: 0fd18e25b25c072e09e0d92ab615fda904d66300
    metrics:
    - type: accuracy
      value: 87.78896103896103
    - type: f1
      value: 87.77189310964883
  - task:
      type: Clustering
    dataset:
      type: mteb/biorxiv-clustering-p2p
      name: MTEB BiorxivClusteringP2P
      config: default
      split: test
      revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40
    metrics:
    - type: v_measure
      value: 39.714538337650495
  - task:
      type: Clustering
    dataset:
      type: mteb/biorxiv-clustering-s2s
      name: MTEB BiorxivClusteringS2S
      config: default
      split: test
      revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908
    metrics:
    - type: v_measure
      value: 36.90108349284447
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackAndroidRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 32.795
    - type: map_at_10
      value: 43.669000000000004
    - type: map_at_100
      value: 45.151
    - type: map_at_1000
      value: 45.278
    - type: map_at_3
      value: 40.006
    - type: map_at_5
      value: 42.059999999999995
    - type: mrr_at_1
      value: 39.771
    - type: mrr_at_10
      value: 49.826
    - type: mrr_at_100
      value: 50.504000000000005
    - type: mrr_at_1000
      value: 50.549
    - type: mrr_at_3
      value: 47.115
    - type: mrr_at_5
      value: 48.832
    - type: ndcg_at_1
      value: 39.771
    - type: ndcg_at_10
      value: 50.217999999999996
    - type: ndcg_at_100
      value: 55.454
    - type: ndcg_at_1000
      value: 57.37
    - type: ndcg_at_3
      value: 44.885000000000005
    - type: ndcg_at_5
      value: 47.419
    - type: precision_at_1
      value: 39.771
    - type: precision_at_10
      value: 9.642000000000001
    - type: precision_at_100
      value: 1.538
    - type: precision_at_1000
      value: 0.198
    - type: precision_at_3
      value: 21.268
    - type: precision_at_5
      value: 15.536
    - type: recall_at_1
      value: 32.795
    - type: recall_at_10
      value: 62.580999999999996
    - type: recall_at_100
      value: 84.438
    - type: recall_at_1000
      value: 96.492
    - type: recall_at_3
      value: 47.071000000000005
    - type: recall_at_5
      value: 54.079
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackEnglishRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 32.671
    - type: map_at_10
      value: 43.334
    - type: map_at_100
      value: 44.566
    - type: map_at_1000
      value: 44.702999999999996
    - type: map_at_3
      value: 40.343
    - type: map_at_5
      value: 41.983
    - type: mrr_at_1
      value: 40.764
    - type: mrr_at_10
      value: 49.382
    - type: mrr_at_100
      value: 49.988
    - type: mrr_at_1000
      value: 50.03300000000001
    - type: mrr_at_3
      value: 47.293
    - type: mrr_at_5
      value: 48.51
    - type: ndcg_at_1
      value: 40.764
    - type: ndcg_at_10
      value: 49.039
    - type: ndcg_at_100
      value: 53.259
    - type: ndcg_at_1000
      value: 55.253
    - type: ndcg_at_3
      value: 45.091
    - type: ndcg_at_5
      value: 46.839999999999996
    - type: precision_at_1
      value: 40.764
    - type: precision_at_10
      value: 9.191
    - type: precision_at_100
      value: 1.476
    - type: precision_at_1000
      value: 0.19499999999999998
    - type: precision_at_3
      value: 21.72
    - type: precision_at_5
      value: 15.299
    - type: recall_at_1
      value: 32.671
    - type: recall_at_10
      value: 58.816
    - type: recall_at_100
      value: 76.654
    - type: recall_at_1000
      value: 89.05999999999999
    - type: recall_at_3
      value: 46.743
    - type: recall_at_5
      value: 51.783
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackGamingRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 40.328
    - type: map_at_10
      value: 53.32599999999999
    - type: map_at_100
      value: 54.37499999999999
    - type: map_at_1000
      value: 54.429
    - type: map_at_3
      value: 49.902
    - type: map_at_5
      value: 52.002
    - type: mrr_at_1
      value: 46.332
    - type: mrr_at_10
      value: 56.858
    - type: mrr_at_100
      value: 57.522
    - type: mrr_at_1000
      value: 57.54899999999999
    - type: mrr_at_3
      value: 54.472
    - type: mrr_at_5
      value: 55.996
    - type: ndcg_at_1
      value: 46.332
    - type: ndcg_at_10
      value: 59.313
    - type: ndcg_at_100
      value: 63.266999999999996
    - type: ndcg_at_1000
      value: 64.36
    - type: ndcg_at_3
      value: 53.815000000000005
    - type: ndcg_at_5
      value: 56.814
    - type: precision_at_1
      value: 46.332
    - type: precision_at_10
      value: 9.53
    - type: precision_at_100
      value: 1.238
    - type: precision_at_1000
      value: 0.13699999999999998
    - type: precision_at_3
      value: 24.054000000000002
    - type: precision_at_5
      value: 16.589000000000002
    - type: recall_at_1
      value: 40.328
    - type: recall_at_10
      value: 73.421
    - type: recall_at_100
      value: 90.059
    - type: recall_at_1000
      value: 97.81
    - type: recall_at_3
      value: 59.009
    - type: recall_at_5
      value: 66.352
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackGisRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 27.424
    - type: map_at_10
      value: 36.332
    - type: map_at_100
      value: 37.347
    - type: map_at_1000
      value: 37.422
    - type: map_at_3
      value: 33.743
    - type: map_at_5
      value: 35.176
    - type: mrr_at_1
      value: 29.153000000000002
    - type: mrr_at_10
      value: 38.233
    - type: mrr_at_100
      value: 39.109
    - type: mrr_at_1000
      value: 39.164
    - type: mrr_at_3
      value: 35.876000000000005
    - type: mrr_at_5
      value: 37.169000000000004
    - type: ndcg_at_1
      value: 29.153000000000002
    - type: ndcg_at_10
      value: 41.439
    - type: ndcg_at_100
      value: 46.42
    - type: ndcg_at_1000
      value: 48.242000000000004
    - type: ndcg_at_3
      value: 36.362
    - type: ndcg_at_5
      value: 38.743
    - type: precision_at_1
      value: 29.153000000000002
    - type: precision_at_10
      value: 6.315999999999999
    - type: precision_at_100
      value: 0.927
    - type: precision_at_1000
      value: 0.11199999999999999
    - type: precision_at_3
      value: 15.443000000000001
    - type: precision_at_5
      value: 10.644
    - type: recall_at_1
      value: 27.424
    - type: recall_at_10
      value: 55.364000000000004
    - type: recall_at_100
      value: 78.211
    - type: recall_at_1000
      value: 91.74600000000001
    - type: recall_at_3
      value: 41.379
    - type: recall_at_5
      value: 47.14
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackMathematicaRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 19.601
    - type: map_at_10
      value: 27.826
    - type: map_at_100
      value: 29.017
    - type: map_at_1000
      value: 29.137
    - type: map_at_3
      value: 25.125999999999998
    - type: map_at_5
      value: 26.765
    - type: mrr_at_1
      value: 24.005000000000003
    - type: mrr_at_10
      value: 32.716
    - type: mrr_at_100
      value: 33.631
    - type: mrr_at_1000
      value: 33.694
    - type: mrr_at_3
      value: 29.934
    - type: mrr_at_5
      value: 31.630999999999997
    - type: ndcg_at_1
      value: 24.005000000000003
    - type: ndcg_at_10
      value: 33.158
    - type: ndcg_at_100
      value: 38.739000000000004
    - type: ndcg_at_1000
      value: 41.495
    - type: ndcg_at_3
      value: 28.185
    - type: ndcg_at_5
      value: 30.796
    - type: precision_at_1
      value: 24.005000000000003
    - type: precision_at_10
      value: 5.908
    - type: precision_at_100
      value: 1.005
    - type: precision_at_1000
      value: 0.13899999999999998
    - type: precision_at_3
      value: 13.391
    - type: precision_at_5
      value: 9.876
    - type: recall_at_1
      value: 19.601
    - type: recall_at_10
      value: 44.746
    - type: recall_at_100
      value: 68.82300000000001
    - type: recall_at_1000
      value: 88.215
    - type: recall_at_3
      value: 31.239
    - type: recall_at_5
      value: 37.695
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackPhysicsRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 30.130000000000003
    - type: map_at_10
      value: 40.96
    - type: map_at_100
      value: 42.282
    - type: map_at_1000
      value: 42.392
    - type: map_at_3
      value: 37.889
    - type: map_at_5
      value: 39.661
    - type: mrr_at_1
      value: 36.958999999999996
    - type: mrr_at_10
      value: 46.835
    - type: mrr_at_100
      value: 47.644
    - type: mrr_at_1000
      value: 47.688
    - type: mrr_at_3
      value: 44.562000000000005
    - type: mrr_at_5
      value: 45.938
    - type: ndcg_at_1
      value: 36.958999999999996
    - type: ndcg_at_10
      value: 47.06
    - type: ndcg_at_100
      value: 52.345
    - type: ndcg_at_1000
      value: 54.35
    - type: ndcg_at_3
      value: 42.301
    - type: ndcg_at_5
      value: 44.635999999999996
    - type: precision_at_1
      value: 36.958999999999996
    - type: precision_at_10
      value: 8.479000000000001
    - type: precision_at_100
      value: 1.284
    - type: precision_at_1000
      value: 0.163
    - type: precision_at_3
      value: 20.244
    - type: precision_at_5
      value: 14.224999999999998
    - type: recall_at_1
      value: 30.130000000000003
    - type: recall_at_10
      value: 59.27
    - type: recall_at_100
      value: 81.195
    - type: recall_at_1000
      value: 94.21199999999999
    - type: recall_at_3
      value: 45.885
    - type: recall_at_5
      value: 52.016
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackProgrammersRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 26.169999999999998
    - type: map_at_10
      value: 36.451
    - type: map_at_100
      value: 37.791000000000004
    - type: map_at_1000
      value: 37.897
    - type: map_at_3
      value: 33.109
    - type: map_at_5
      value: 34.937000000000005
    - type: mrr_at_1
      value: 32.877
    - type: mrr_at_10
      value: 42.368
    - type: mrr_at_100
      value: 43.201
    - type: mrr_at_1000
      value: 43.259
    - type: mrr_at_3
      value: 39.763999999999996
    - type: mrr_at_5
      value: 41.260000000000005
    - type: ndcg_at_1
      value: 32.877
    - type: ndcg_at_10
      value: 42.659000000000006
    - type: ndcg_at_100
      value: 48.161
    - type: ndcg_at_1000
      value: 50.345
    - type: ndcg_at_3
      value: 37.302
    - type: ndcg_at_5
      value: 39.722
    - type: precision_at_1
      value: 32.877
    - type: precision_at_10
      value: 7.9
    - type: precision_at_100
      value: 1.236
    - type: precision_at_1000
      value: 0.158
    - type: precision_at_3
      value: 17.846
    - type: precision_at_5
      value: 12.9
    - type: recall_at_1
      value: 26.169999999999998
    - type: recall_at_10
      value: 55.35
    - type: recall_at_100
      value: 78.755
    - type: recall_at_1000
      value: 93.518
    - type: recall_at_3
      value: 40.176
    - type: recall_at_5
      value: 46.589000000000006
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 27.15516666666667
    - type: map_at_10
      value: 36.65741666666667
    - type: map_at_100
      value: 37.84991666666666
    - type: map_at_1000
      value: 37.96316666666667
    - type: map_at_3
      value: 33.74974999999999
    - type: map_at_5
      value: 35.3765
    - type: mrr_at_1
      value: 32.08233333333334
    - type: mrr_at_10
      value: 41.033833333333334
    - type: mrr_at_100
      value: 41.84524999999999
    - type: mrr_at_1000
      value: 41.89983333333333
    - type: mrr_at_3
      value: 38.62008333333333
    - type: mrr_at_5
      value: 40.03441666666666
    - type: ndcg_at_1
      value: 32.08233333333334
    - type: ndcg_at_10
      value: 42.229
    - type: ndcg_at_100
      value: 47.26716666666667
    - type: ndcg_at_1000
      value: 49.43466666666667
    - type: ndcg_at_3
      value: 37.36408333333333
    - type: ndcg_at_5
      value: 39.6715
    - type: precision_at_1
      value: 32.08233333333334
    - type: precision_at_10
      value: 7.382583333333334
    - type: precision_at_100
      value: 1.16625
    - type: precision_at_1000
      value: 0.15408333333333332
    - type: precision_at_3
      value: 17.218
    - type: precision_at_5
      value: 12.21875
    - type: recall_at_1
      value: 27.15516666666667
    - type: recall_at_10
      value: 54.36683333333333
    - type: recall_at_100
      value: 76.37183333333333
    - type: recall_at_1000
      value: 91.26183333333333
    - type: recall_at_3
      value: 40.769916666666674
    - type: recall_at_5
      value: 46.702333333333335
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackStatsRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 25.749
    - type: map_at_10
      value: 33.001999999999995
    - type: map_at_100
      value: 33.891
    - type: map_at_1000
      value: 33.993
    - type: map_at_3
      value: 30.703999999999997
    - type: map_at_5
      value: 31.959
    - type: mrr_at_1
      value: 28.834
    - type: mrr_at_10
      value: 35.955
    - type: mrr_at_100
      value: 36.709
    - type: mrr_at_1000
      value: 36.779
    - type: mrr_at_3
      value: 33.947
    - type: mrr_at_5
      value: 35.089
    - type: ndcg_at_1
      value: 28.834
    - type: ndcg_at_10
      value: 37.329
    - type: ndcg_at_100
      value: 41.79
    - type: ndcg_at_1000
      value: 44.169000000000004
    - type: ndcg_at_3
      value: 33.184999999999995
    - type: ndcg_at_5
      value: 35.107
    - type: precision_at_1
      value: 28.834
    - type: precision_at_10
      value: 5.7669999999999995
    - type: precision_at_100
      value: 0.876
    - type: precision_at_1000
      value: 0.11399999999999999
    - type: precision_at_3
      value: 14.213000000000001
    - type: precision_at_5
      value: 9.754999999999999
    - type: recall_at_1
      value: 25.749
    - type: recall_at_10
      value: 47.791
    - type: recall_at_100
      value: 68.255
    - type: recall_at_1000
      value: 85.749
    - type: recall_at_3
      value: 36.199
    - type: recall_at_5
      value: 41.071999999999996
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackTexRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 17.777
    - type: map_at_10
      value: 25.201
    - type: map_at_100
      value: 26.423999999999996
    - type: map_at_1000
      value: 26.544
    - type: map_at_3
      value: 22.869
    - type: map_at_5
      value: 24.023
    - type: mrr_at_1
      value: 21.473
    - type: mrr_at_10
      value: 29.12
    - type: mrr_at_100
      value: 30.144
    - type: mrr_at_1000
      value: 30.215999999999998
    - type: mrr_at_3
      value: 26.933
    - type: mrr_at_5
      value: 28.051
    - type: ndcg_at_1
      value: 21.473
    - type: ndcg_at_10
      value: 30.003
    - type: ndcg_at_100
      value: 35.766
    - type: ndcg_at_1000
      value: 38.501000000000005
    - type: ndcg_at_3
      value: 25.773000000000003
    - type: ndcg_at_5
      value: 27.462999999999997
    - type: precision_at_1
      value: 21.473
    - type: precision_at_10
      value: 5.482
    - type: precision_at_100
      value: 0.975
    - type: precision_at_1000
      value: 0.13799999999999998
    - type: precision_at_3
      value: 12.205
    - type: precision_at_5
      value: 8.692
    - type: recall_at_1
      value: 17.777
    - type: recall_at_10
      value: 40.582
    - type: recall_at_100
      value: 66.305
    - type: recall_at_1000
      value: 85.636
    - type: recall_at_3
      value: 28.687
    - type: recall_at_5
      value: 33.089
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackUnixRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 26.677
    - type: map_at_10
      value: 36.309000000000005
    - type: map_at_100
      value: 37.403999999999996
    - type: map_at_1000
      value: 37.496
    - type: map_at_3
      value: 33.382
    - type: map_at_5
      value: 34.98
    - type: mrr_at_1
      value: 31.343
    - type: mrr_at_10
      value: 40.549
    - type: mrr_at_100
      value: 41.342
    - type: mrr_at_1000
      value: 41.397
    - type: mrr_at_3
      value: 38.029
    - type: mrr_at_5
      value: 39.451
    - type: ndcg_at_1
      value: 31.343
    - type: ndcg_at_10
      value: 42.1
    - type: ndcg_at_100
      value: 47.089999999999996
    - type: ndcg_at_1000
      value: 49.222
    - type: ndcg_at_3
      value: 36.836999999999996
    - type: ndcg_at_5
      value: 39.21
    - type: precision_at_1
      value: 31.343
    - type: precision_at_10
      value: 7.164
    - type: precision_at_100
      value: 1.0959999999999999
    - type: precision_at_1000
      value: 0.13899999999999998
    - type: precision_at_3
      value: 16.915
    - type: precision_at_5
      value: 11.940000000000001
    - type: recall_at_1
      value: 26.677
    - type: recall_at_10
      value: 55.54599999999999
    - type: recall_at_100
      value: 77.094
    - type: recall_at_1000
      value: 92.01
    - type: recall_at_3
      value: 41.191
    - type: recall_at_5
      value: 47.006
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackWebmastersRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 24.501
    - type: map_at_10
      value: 33.102
    - type: map_at_100
      value: 34.676
    - type: map_at_1000
      value: 34.888000000000005
    - type: map_at_3
      value: 29.944
    - type: map_at_5
      value: 31.613999999999997
    - type: mrr_at_1
      value: 29.447000000000003
    - type: mrr_at_10
      value: 37.996
    - type: mrr_at_100
      value: 38.946
    - type: mrr_at_1000
      value: 38.995000000000005
    - type: mrr_at_3
      value: 35.079
    - type: mrr_at_5
      value: 36.69
    - type: ndcg_at_1
      value: 29.447000000000003
    - type: ndcg_at_10
      value: 39.232
    - type: ndcg_at_100
      value: 45.247
    - type: ndcg_at_1000
      value: 47.613
    - type: ndcg_at_3
      value: 33.922999999999995
    - type: ndcg_at_5
      value: 36.284
    - type: precision_at_1
      value: 29.447000000000003
    - type: precision_at_10
      value: 7.648000000000001
    - type: precision_at_100
      value: 1.516
    - type: precision_at_1000
      value: 0.23900000000000002
    - type: precision_at_3
      value: 16.008
    - type: precision_at_5
      value: 11.779
    - type: recall_at_1
      value: 24.501
    - type: recall_at_10
      value: 51.18899999999999
    - type: recall_at_100
      value: 78.437
    - type: recall_at_1000
      value: 92.842
    - type: recall_at_3
      value: 35.808
    - type: recall_at_5
      value: 42.197
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackWordpressRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 22.039
    - type: map_at_10
      value: 30.377
    - type: map_at_100
      value: 31.275
    - type: map_at_1000
      value: 31.379
    - type: map_at_3
      value: 27.98
    - type: map_at_5
      value: 29.358
    - type: mrr_at_1
      value: 24.03
    - type: mrr_at_10
      value: 32.568000000000005
    - type: mrr_at_100
      value: 33.403
    - type: mrr_at_1000
      value: 33.475
    - type: mrr_at_3
      value: 30.436999999999998
    - type: mrr_at_5
      value: 31.796000000000003
    - type: ndcg_at_1
      value: 24.03
    - type: ndcg_at_10
      value: 35.198
    - type: ndcg_at_100
      value: 39.668
    - type: ndcg_at_1000
      value: 42.296
    - type: ndcg_at_3
      value: 30.709999999999997
    - type: ndcg_at_5
      value: 33.024
    - type: precision_at_1
      value: 24.03
    - type: precision_at_10
      value: 5.564
    - type: precision_at_100
      value: 0.828
    - type: precision_at_1000
      value: 0.117
    - type: precision_at_3
      value: 13.309000000000001
    - type: precision_at_5
      value: 9.39
    - type: recall_at_1
      value: 22.039
    - type: recall_at_10
      value: 47.746
    - type: recall_at_100
      value: 68.23599999999999
    - type: recall_at_1000
      value: 87.852
    - type: recall_at_3
      value: 35.852000000000004
    - type: recall_at_5
      value: 41.410000000000004
  - task:
      type: Retrieval
    dataset:
      type: climate-fever
      name: MTEB ClimateFEVER
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 15.692999999999998
    - type: map_at_10
      value: 26.903
    - type: map_at_100
      value: 28.987000000000002
    - type: map_at_1000
      value: 29.176999999999996
    - type: map_at_3
      value: 22.137
    - type: map_at_5
      value: 24.758
    - type: mrr_at_1
      value: 35.57
    - type: mrr_at_10
      value: 47.821999999999996
    - type: mrr_at_100
      value: 48.608000000000004
    - type: mrr_at_1000
      value: 48.638999999999996
    - type: mrr_at_3
      value: 44.452000000000005
    - type: mrr_at_5
      value: 46.546
    - type: ndcg_at_1
      value: 35.57
    - type: ndcg_at_10
      value: 36.567
    - type: ndcg_at_100
      value: 44.085
    - type: ndcg_at_1000
      value: 47.24
    - type: ndcg_at_3
      value: 29.964000000000002
    - type: ndcg_at_5
      value: 32.511
    - type: precision_at_1
      value: 35.57
    - type: precision_at_10
      value: 11.485
    - type: precision_at_100
      value: 1.9619999999999997
    - type: precision_at_1000
      value: 0.256
    - type: precision_at_3
      value: 22.237000000000002
    - type: precision_at_5
      value: 17.471999999999998
    - type: recall_at_1
      value: 15.692999999999998
    - type: recall_at_10
      value: 43.056
    - type: recall_at_100
      value: 68.628
    - type: recall_at_1000
      value: 86.075
    - type: recall_at_3
      value: 26.918999999999997
    - type: recall_at_5
      value: 34.14
  - task:
      type: Retrieval
    dataset:
      type: dbpedia-entity
      name: MTEB DBPedia
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 9.53
    - type: map_at_10
      value: 20.951
    - type: map_at_100
      value: 30.136000000000003
    - type: map_at_1000
      value: 31.801000000000002
    - type: map_at_3
      value: 15.021
    - type: map_at_5
      value: 17.471999999999998
    - type: mrr_at_1
      value: 71.0
    - type: mrr_at_10
      value: 79.176
    - type: mrr_at_100
      value: 79.418
    - type: mrr_at_1000
      value: 79.426
    - type: mrr_at_3
      value: 78.125
    - type: mrr_at_5
      value: 78.61200000000001
    - type: ndcg_at_1
      value: 58.5
    - type: ndcg_at_10
      value: 44.106
    - type: ndcg_at_100
      value: 49.268
    - type: ndcg_at_1000
      value: 56.711999999999996
    - type: ndcg_at_3
      value: 48.934
    - type: ndcg_at_5
      value: 45.826
    - type: precision_at_1
      value: 71.0
    - type: precision_at_10
      value: 35.0
    - type: precision_at_100
      value: 11.360000000000001
    - type: precision_at_1000
      value: 2.046
    - type: precision_at_3
      value: 52.833
    - type: precision_at_5
      value: 44.15
    - type: recall_at_1
      value: 9.53
    - type: recall_at_10
      value: 26.811
    - type: recall_at_100
      value: 55.916999999999994
    - type: recall_at_1000
      value: 79.973
    - type: recall_at_3
      value: 16.413
    - type: recall_at_5
      value: 19.980999999999998
  - task:
      type: Classification
    dataset:
      type: mteb/emotion
      name: MTEB EmotionClassification
      config: default
      split: test
      revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37
    metrics:
    - type: accuracy
      value: 51.519999999999996
    - type: f1
      value: 46.36601294761231
  - task:
      type: Retrieval
    dataset:
      type: fever
      name: MTEB FEVER
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 74.413
    - type: map_at_10
      value: 83.414
    - type: map_at_100
      value: 83.621
    - type: map_at_1000
      value: 83.635
    - type: map_at_3
      value: 82.337
    - type: map_at_5
      value: 83.039
    - type: mrr_at_1
      value: 80.19800000000001
    - type: mrr_at_10
      value: 87.715
    - type: mrr_at_100
      value: 87.778
    - type: mrr_at_1000
      value: 87.779
    - type: mrr_at_3
      value: 87.106
    - type: mrr_at_5
      value: 87.555
    - type: ndcg_at_1
      value: 80.19800000000001
    - type: ndcg_at_10
      value: 87.182
    - type: ndcg_at_100
      value: 87.90299999999999
    - type: ndcg_at_1000
      value: 88.143
    - type: ndcg_at_3
      value: 85.60600000000001
    - type: ndcg_at_5
      value: 86.541
    - type: precision_at_1
      value: 80.19800000000001
    - type: precision_at_10
      value: 10.531
    - type: precision_at_100
      value: 1.113
    - type: precision_at_1000
      value: 0.11499999999999999
    - type: precision_at_3
      value: 32.933
    - type: precision_at_5
      value: 20.429
    - type: recall_at_1
      value: 74.413
    - type: recall_at_10
      value: 94.363
    - type: recall_at_100
      value: 97.165
    - type: recall_at_1000
      value: 98.668
    - type: recall_at_3
      value: 90.108
    - type: recall_at_5
      value: 92.52
  - task:
      type: Retrieval
    dataset:
      type: fiqa
      name: MTEB FiQA2018
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 22.701
    - type: map_at_10
      value: 37.122
    - type: map_at_100
      value: 39.178000000000004
    - type: map_at_1000
      value: 39.326
    - type: map_at_3
      value: 32.971000000000004
    - type: map_at_5
      value: 35.332
    - type: mrr_at_1
      value: 44.753
    - type: mrr_at_10
      value: 53.452
    - type: mrr_at_100
      value: 54.198
    - type: mrr_at_1000
      value: 54.225
    - type: mrr_at_3
      value: 50.952
    - type: mrr_at_5
      value: 52.464
    - type: ndcg_at_1
      value: 44.753
    - type: ndcg_at_10
      value: 45.021
    - type: ndcg_at_100
      value: 52.028
    - type: ndcg_at_1000
      value: 54.596000000000004
    - type: ndcg_at_3
      value: 41.622
    - type: ndcg_at_5
      value: 42.736000000000004
    - type: precision_at_1
      value: 44.753
    - type: precision_at_10
      value: 12.284
    - type: precision_at_100
      value: 1.955
    - type: precision_at_1000
      value: 0.243
    - type: precision_at_3
      value: 27.828999999999997
    - type: precision_at_5
      value: 20.061999999999998
    - type: recall_at_1
      value: 22.701
    - type: recall_at_10
      value: 51.432
    - type: recall_at_100
      value: 77.009
    - type: recall_at_1000
      value: 92.511
    - type: recall_at_3
      value: 37.919000000000004
    - type: recall_at_5
      value: 44.131
  - task:
      type: Retrieval
    dataset:
      type: hotpotqa
      name: MTEB HotpotQA
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 40.189
    - type: map_at_10
      value: 66.24600000000001
    - type: map_at_100
      value: 67.098
    - type: map_at_1000
      value: 67.149
    - type: map_at_3
      value: 62.684
    - type: map_at_5
      value: 64.974
    - type: mrr_at_1
      value: 80.378
    - type: mrr_at_10
      value: 86.127
    - type: mrr_at_100
      value: 86.29299999999999
    - type: mrr_at_1000
      value: 86.297
    - type: mrr_at_3
      value: 85.31400000000001
    - type: mrr_at_5
      value: 85.858
    - type: ndcg_at_1
      value: 80.378
    - type: ndcg_at_10
      value: 74.101
    - type: ndcg_at_100
      value: 76.993
    - type: ndcg_at_1000
      value: 77.948
    - type: ndcg_at_3
      value: 69.232
    - type: ndcg_at_5
      value: 72.04599999999999
    - type: precision_at_1
      value: 80.378
    - type: precision_at_10
      value: 15.595999999999998
    - type: precision_at_100
      value: 1.7840000000000003
    - type: precision_at_1000
      value: 0.191
    - type: precision_at_3
      value: 44.884
    - type: precision_at_5
      value: 29.145
    - type: recall_at_1
      value: 40.189
    - type: recall_at_10
      value: 77.981
    - type: recall_at_100
      value: 89.21
    - type: recall_at_1000
      value: 95.48299999999999
    - type: recall_at_3
      value: 67.326
    - type: recall_at_5
      value: 72.863
1469 - task:
1470 type: Classification
1471 dataset:
1472 type: mteb/imdb
1473 name: MTEB ImdbClassification
1474 config: default
1475 split: test
1476 revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7
1477 metrics:
1478 - type: accuracy
1479 value: 92.84599999999999
1480 - type: ap
1481 value: 89.4710787567357
1482 - type: f1
1483 value: 92.83752676932258
1484 - task:
1485 type: Retrieval
1486 dataset:
1487 type: msmarco
1488 name: MTEB MSMARCO
1489 config: default
1490 split: dev
1491 revision: None
1492 metrics:
1493 - type: map_at_1
1494 value: 23.132
1495 - type: map_at_10
1496 value: 35.543
1497 - type: map_at_100
1498 value: 36.702
1499 - type: map_at_1000
1500 value: 36.748999999999995
1501 - type: map_at_3
1502 value: 31.737
1503 - type: map_at_5
1504 value: 33.927
1505 - type: mrr_at_1
1506 value: 23.782
1507 - type: mrr_at_10
1508 value: 36.204
1509 - type: mrr_at_100
1510 value: 37.29
1511 - type: mrr_at_1000
1512 value: 37.330999999999996
1513 - type: mrr_at_3
1514 value: 32.458999999999996
1515 - type: mrr_at_5
1516 value: 34.631
1517 - type: ndcg_at_1
1518 value: 23.782
1519 - type: ndcg_at_10
1520 value: 42.492999999999995
1521 - type: ndcg_at_100
1522 value: 47.985
1523 - type: ndcg_at_1000
1524 value: 49.141
1525 - type: ndcg_at_3
1526 value: 34.748000000000005
1527 - type: ndcg_at_5
1528 value: 38.651
1529 - type: precision_at_1
1530 value: 23.782
1531 - type: precision_at_10
1532 value: 6.665
1533 - type: precision_at_100
1534 value: 0.941
1535 - type: precision_at_1000
1536 value: 0.104
1537 - type: precision_at_3
1538 value: 14.776
1539 - type: precision_at_5
1540 value: 10.84
1541 - type: recall_at_1
1542 value: 23.132
1543 - type: recall_at_10
1544 value: 63.794
1545 - type: recall_at_100
1546 value: 89.027
1547 - type: recall_at_1000
1548 value: 97.807
1549 - type: recall_at_3
1550 value: 42.765
1551 - type: recall_at_5
1552 value: 52.11
1553 - task:
1554 type: Classification
1555 dataset:
1556 type: mteb/mtop_domain
1557 name: MTEB MTOPDomainClassification (en)
1558 config: en
1559 split: test
1560 revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf
1561 metrics:
1562 - type: accuracy
1563 value: 94.59188326493388
1564 - type: f1
1565 value: 94.3842594786827
1566 - task:
1567 type: Classification
1568 dataset:
1569 type: mteb/mtop_intent
1570 name: MTEB MTOPIntentClassification (en)
1571 config: en
1572 split: test
1573 revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
1574 metrics:
1575 - type: accuracy
1576 value: 79.49384404924761
1577 - type: f1
1578 value: 59.7580539534629
1579 - task:
1580 type: Classification
1581 dataset:
1582 type: mteb/amazon_massive_intent
1583 name: MTEB MassiveIntentClassification (en)
1584 config: en
1585 split: test
1586 revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
1587 metrics:
1588 - type: accuracy
1589 value: 77.56220578345663
1590 - type: f1
1591 value: 75.27228165561478
1592 - task:
1593 type: Classification
1594 dataset:
1595 type: mteb/amazon_massive_scenario
1596 name: MTEB MassiveScenarioClassification (en)
1597 config: en
1598 split: test
1599 revision: 7d571f92784cd94a019292a1f45445077d0ef634
1600 metrics:
1601 - type: accuracy
1602 value: 80.53463349024884
1603 - type: f1
1604 value: 80.4893958236536
1605 - task:
1606 type: Clustering
1607 dataset:
1608 type: mteb/medrxiv-clustering-p2p
1609 name: MTEB MedrxivClusteringP2P
1610 config: default
1611 split: test
1612 revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73
1613 metrics:
1614 - type: v_measure
1615 value: 32.56100273484962
1616 - task:
1617 type: Clustering
1618 dataset:
1619 type: mteb/medrxiv-clustering-s2s
1620 name: MTEB MedrxivClusteringS2S
1621 config: default
1622 split: test
1623 revision: 35191c8c0dca72d8ff3efcd72aa802307d469663
1624 metrics:
1625 - type: v_measure
1626 value: 31.470380028839607
1627 - task:
1628 type: Reranking
1629 dataset:
1630 type: mteb/mind_small
1631 name: MTEB MindSmallReranking
1632 config: default
1633 split: test
1634 revision: 3bdac13927fdc888b903db93b2ffdbd90b295a69
1635 metrics:
1636 - type: map
1637 value: 32.06102792457849
1638 - type: mrr
1639 value: 33.30709199672238
1640 - task:
1641 type: Retrieval
1642 dataset:
1643 type: nfcorpus
1644 name: MTEB NFCorpus
1645 config: default
1646 split: test
1647 revision: None
1648 metrics:
1649 - type: map_at_1
1650 value: 6.776999999999999
1651 - type: map_at_10
1652 value: 14.924000000000001
1653 - type: map_at_100
1654 value: 18.955
1655 - type: map_at_1000
1656 value: 20.538999999999998
1657 - type: map_at_3
1658 value: 10.982
1659 - type: map_at_5
1660 value: 12.679000000000002
1661 - type: mrr_at_1
1662 value: 47.988
1663 - type: mrr_at_10
1664 value: 57.232000000000006
1665 - type: mrr_at_100
1666 value: 57.818999999999996
1667 - type: mrr_at_1000
1668 value: 57.847
1669 - type: mrr_at_3
1670 value: 54.901999999999994
1671 - type: mrr_at_5
1672 value: 56.481
1673 - type: ndcg_at_1
1674 value: 46.594
1675 - type: ndcg_at_10
1676 value: 38.129000000000005
1677 - type: ndcg_at_100
1678 value: 35.54
1679 - type: ndcg_at_1000
1680 value: 44.172
1681 - type: ndcg_at_3
1682 value: 43.025999999999996
1683 - type: ndcg_at_5
1684 value: 41.052
1685 - type: precision_at_1
1686 value: 47.988
1687 - type: precision_at_10
1688 value: 28.111000000000004
1689 - type: precision_at_100
1690 value: 8.929
1691 - type: precision_at_1000
1692 value: 2.185
1693 - type: precision_at_3
1694 value: 40.144000000000005
1695 - type: precision_at_5
1696 value: 35.232
1697 - type: recall_at_1
1698 value: 6.776999999999999
1699 - type: recall_at_10
1700 value: 19.289
1701 - type: recall_at_100
1702 value: 36.359
1703 - type: recall_at_1000
1704 value: 67.54
1705 - type: recall_at_3
1706 value: 11.869
1707 - type: recall_at_5
1708 value: 14.999
1709 - task:
1710 type: Retrieval
1711 dataset:
1712 type: nq
1713 name: MTEB NQ
1714 config: default
1715 split: test
1716 revision: None
1717 metrics:
1718 - type: map_at_1
1719 value: 31.108000000000004
1720 - type: map_at_10
1721 value: 47.126000000000005
1722 - type: map_at_100
1723 value: 48.171
1724 - type: map_at_1000
1725 value: 48.199
1726 - type: map_at_3
1727 value: 42.734
1728 - type: map_at_5
1729 value: 45.362
1730 - type: mrr_at_1
1731 value: 34.936
1732 - type: mrr_at_10
1733 value: 49.571
1734 - type: mrr_at_100
1735 value: 50.345
1736 - type: mrr_at_1000
1737 value: 50.363
1738 - type: mrr_at_3
1739 value: 45.959
1740 - type: mrr_at_5
1741 value: 48.165
1742 - type: ndcg_at_1
1743 value: 34.936
1744 - type: ndcg_at_10
1745 value: 55.028999999999996
1746 - type: ndcg_at_100
1747 value: 59.244
1748 - type: ndcg_at_1000
1749 value: 59.861
1750 - type: ndcg_at_3
1751 value: 46.872
1752 - type: ndcg_at_5
1753 value: 51.217999999999996
1754 - type: precision_at_1
1755 value: 34.936
1756 - type: precision_at_10
1757 value: 9.099
1758 - type: precision_at_100
1759 value: 1.145
1760 - type: precision_at_1000
1761 value: 0.12
1762 - type: precision_at_3
1763 value: 21.456
1764 - type: precision_at_5
1765 value: 15.411
1766 - type: recall_at_1
1767 value: 31.108000000000004
1768 - type: recall_at_10
1769 value: 76.53999999999999
1770 - type: recall_at_100
1771 value: 94.39
1772 - type: recall_at_1000
1773 value: 98.947
1774 - type: recall_at_3
1775 value: 55.572
1776 - type: recall_at_5
1777 value: 65.525
1778 - task:
1779 type: Retrieval
1780 dataset:
1781 type: quora
1782 name: MTEB QuoraRetrieval
1783 config: default
1784 split: test
1785 revision: None
1786 metrics:
1787 - type: map_at_1
1788 value: 71.56400000000001
1789 - type: map_at_10
1790 value: 85.482
1791 - type: map_at_100
1792 value: 86.114
1793 - type: map_at_1000
1794 value: 86.13
1795 - type: map_at_3
1796 value: 82.607
1797 - type: map_at_5
1798 value: 84.405
1799 - type: mrr_at_1
1800 value: 82.42
1801 - type: mrr_at_10
1802 value: 88.304
1803 - type: mrr_at_100
1804 value: 88.399
1805 - type: mrr_at_1000
1806 value: 88.399
1807 - type: mrr_at_3
1808 value: 87.37
1809 - type: mrr_at_5
1810 value: 88.024
1811 - type: ndcg_at_1
1812 value: 82.45
1813 - type: ndcg_at_10
1814 value: 89.06500000000001
1815 - type: ndcg_at_100
1816 value: 90.232
1817 - type: ndcg_at_1000
1818 value: 90.305
1819 - type: ndcg_at_3
1820 value: 86.375
1821 - type: ndcg_at_5
1822 value: 87.85300000000001
1823 - type: precision_at_1
1824 value: 82.45
1825 - type: precision_at_10
1826 value: 13.486999999999998
1827 - type: precision_at_100
1828 value: 1.534
1829 - type: precision_at_1000
1830 value: 0.157
1831 - type: precision_at_3
1832 value: 37.813
1833 - type: precision_at_5
1834 value: 24.773999999999997
1835 - type: recall_at_1
1836 value: 71.56400000000001
1837 - type: recall_at_10
1838 value: 95.812
1839 - type: recall_at_100
1840 value: 99.7
1841 - type: recall_at_1000
1842 value: 99.979
1843 - type: recall_at_3
1844 value: 87.966
1845 - type: recall_at_5
1846 value: 92.268
1847 - task:
1848 type: Clustering
1849 dataset:
1850 type: mteb/reddit-clustering
1851 name: MTEB RedditClustering
1852 config: default
1853 split: test
1854 revision: 24640382cdbf8abc73003fb0fa6d111a705499eb
1855 metrics:
1856 - type: v_measure
1857 value: 57.241876648614145
1858 - task:
1859 type: Clustering
1860 dataset:
1861 type: mteb/reddit-clustering-p2p
1862 name: MTEB RedditClusteringP2P
1863 config: default
1864 split: test
1865 revision: 282350215ef01743dc01b456c7f5241fa8937f16
1866 metrics:
1867 - type: v_measure
1868 value: 64.66212576446223
1869 - task:
1870 type: Retrieval
1871 dataset:
1872 type: scidocs
1873 name: MTEB SCIDOCS
1874 config: default
1875 split: test
1876 revision: None
1877 metrics:
1878 - type: map_at_1
1879 value: 5.308
1880 - type: map_at_10
1881 value: 13.803
1882 - type: map_at_100
1883 value: 16.176
1884 - type: map_at_1000
1885 value: 16.561
1886 - type: map_at_3
1887 value: 9.761000000000001
1888 - type: map_at_5
1889 value: 11.802
1890 - type: mrr_at_1
1891 value: 26.200000000000003
1892 - type: mrr_at_10
1893 value: 37.621
1894 - type: mrr_at_100
1895 value: 38.767
1896 - type: mrr_at_1000
1897 value: 38.815
1898 - type: mrr_at_3
1899 value: 34.117
1900 - type: mrr_at_5
1901 value: 36.107
1902 - type: ndcg_at_1
1903 value: 26.200000000000003
1904 - type: ndcg_at_10
1905 value: 22.64
1906 - type: ndcg_at_100
1907 value: 31.567
1908 - type: ndcg_at_1000
1909 value: 37.623
1910 - type: ndcg_at_3
1911 value: 21.435000000000002
1912 - type: ndcg_at_5
1913 value: 18.87
1914 - type: precision_at_1
1915 value: 26.200000000000003
1916 - type: precision_at_10
1917 value: 11.74
1918 - type: precision_at_100
1919 value: 2.465
1920 - type: precision_at_1000
1921 value: 0.391
1922 - type: precision_at_3
1923 value: 20.033
1924 - type: precision_at_5
1925 value: 16.64
1926 - type: recall_at_1
1927 value: 5.308
1928 - type: recall_at_10
1929 value: 23.794999999999998
1930 - type: recall_at_100
1931 value: 50.015
1932 - type: recall_at_1000
1933 value: 79.283
1934 - type: recall_at_3
1935 value: 12.178
1936 - type: recall_at_5
1937 value: 16.882
1938 - task:
1939 type: STS
1940 dataset:
1941 type: mteb/sickr-sts
1942 name: MTEB SICK-R
1943 config: default
1944 split: test
1945 revision: a6ea5a8cab320b040a23452cc28066d9beae2cee
1946 metrics:
1947 - type: cos_sim_pearson
1948 value: 84.93231134675553
1949 - type: cos_sim_spearman
1950 value: 81.68319292603205
1951 - type: euclidean_pearson
1952 value: 81.8396814380367
1953 - type: euclidean_spearman
1954 value: 81.24641903349945
1955 - type: manhattan_pearson
1956 value: 81.84698799204274
1957 - type: manhattan_spearman
1958 value: 81.24269997904105
1959 - task:
1960 type: STS
1961 dataset:
1962 type: mteb/sts12-sts
1963 name: MTEB STS12
1964 config: default
1965 split: test
1966 revision: a0d554a64d88156834ff5ae9920b964011b16384
1967 metrics:
1968 - type: cos_sim_pearson
1969 value: 86.73241671587446
1970 - type: cos_sim_spearman
1971 value: 79.05091082971826
1972 - type: euclidean_pearson
1973 value: 83.91146869578044
1974 - type: euclidean_spearman
1975 value: 79.87978465370936
1976 - type: manhattan_pearson
1977 value: 83.90888338917678
1978 - type: manhattan_spearman
1979 value: 79.87482848584241
1980 - task:
1981 type: STS
1982 dataset:
1983 type: mteb/sts13-sts
1984 name: MTEB STS13
1985 config: default
1986 split: test
1987 revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca
1988 metrics:
1989 - type: cos_sim_pearson
1990 value: 85.14970731146177
1991 - type: cos_sim_spearman
1992 value: 86.37363490084627
1993 - type: euclidean_pearson
1994 value: 83.02154218530433
1995 - type: euclidean_spearman
1996 value: 83.80258761957367
1997 - type: manhattan_pearson
1998 value: 83.01664495119347
1999 - type: manhattan_spearman
2000 value: 83.77567458007952
2001 - task:
2002 type: STS
2003 dataset:
2004 type: mteb/sts14-sts
2005 name: MTEB STS14
2006 config: default
2007 split: test
2008 revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375
2009 metrics:
2010 - type: cos_sim_pearson
2011 value: 83.40474139886784
2012 - type: cos_sim_spearman
2013 value: 82.77768789165984
2014 - type: euclidean_pearson
2015 value: 80.7065877443695
2016 - type: euclidean_spearman
2017 value: 81.375940662505
2018 - type: manhattan_pearson
2019 value: 80.6507552270278
2020 - type: manhattan_spearman
2021 value: 81.32782179098741
2022 - task:
2023 type: STS
2024 dataset:
2025 type: mteb/sts15-sts
2026 name: MTEB STS15
2027 config: default
2028 split: test
2029 revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3
2030 metrics:
2031 - type: cos_sim_pearson
2032 value: 87.08585968722274
2033 - type: cos_sim_spearman
2034 value: 88.03110031451399
2035 - type: euclidean_pearson
2036 value: 85.74012019602384
2037 - type: euclidean_spearman
2038 value: 86.13592849438209
2039 - type: manhattan_pearson
2040 value: 85.74404842369206
2041 - type: manhattan_spearman
2042 value: 86.14492318960154
2043 - task:
2044 type: STS
2045 dataset:
2046 type: mteb/sts16-sts
2047 name: MTEB STS16
2048 config: default
2049 split: test
2050 revision: 4d8694f8f0e0100860b497b999b3dbed754a0513
2051 metrics:
2052 - type: cos_sim_pearson
2053 value: 84.95069052788875
2054 - type: cos_sim_spearman
2055 value: 86.4867991595147
2056 - type: euclidean_pearson
2057 value: 84.31013325754635
2058 - type: euclidean_spearman
2059 value: 85.01529258006482
2060 - type: manhattan_pearson
2061 value: 84.26995570085374
2062 - type: manhattan_spearman
2063 value: 84.96982104986162
2064 - task:
2065 type: STS
2066 dataset:
2067 type: mteb/sts17-crosslingual-sts
2068 name: MTEB STS17 (en-en)
2069 config: en-en
2070 split: test
2071 revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d
2072 metrics:
2073 - type: cos_sim_pearson
2074 value: 87.54617647971897
2075 - type: cos_sim_spearman
2076 value: 87.49834181751034
2077 - type: euclidean_pearson
2078 value: 86.01015322577122
2079 - type: euclidean_spearman
2080 value: 84.63362652063199
2081 - type: manhattan_pearson
2082 value: 86.13807574475706
2083 - type: manhattan_spearman
2084 value: 84.7772370721132
2085 - task:
2086 type: STS
2087 dataset:
2088 type: mteb/sts22-crosslingual-sts
2089 name: MTEB STS22 (en)
2090 config: en
2091 split: test
2092 revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
2093 metrics:
2094 - type: cos_sim_pearson
2095 value: 67.20047755786615
2096 - type: cos_sim_spearman
2097 value: 67.05324077987636
2098 - type: euclidean_pearson
2099 value: 66.91930642976601
2100 - type: euclidean_spearman
2101 value: 65.21491856099105
2102 - type: manhattan_pearson
2103 value: 66.78756851976624
2104 - type: manhattan_spearman
2105 value: 65.12356257740728
2106 - task:
2107 type: STS
2108 dataset:
2109 type: mteb/stsbenchmark-sts
2110 name: MTEB STSBenchmark
2111 config: default
2112 split: test
2113 revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831
2114 metrics:
2115 - type: cos_sim_pearson
2116 value: 86.19852871539686
2117 - type: cos_sim_spearman
2118 value: 87.5161895296395
2119 - type: euclidean_pearson
2120 value: 84.59848645207485
2121 - type: euclidean_spearman
2122 value: 85.26427328757919
2123 - type: manhattan_pearson
2124 value: 84.59747366996524
2125 - type: manhattan_spearman
2126 value: 85.24045855146915
2127 - task:
2128 type: Reranking
2129 dataset:
2130 type: mteb/scidocs-reranking
2131 name: MTEB SciDocsRR
2132 config: default
2133 split: test
2134 revision: d3c5e1fc0b855ab6097bf1cda04dd73947d7caab
2135 metrics:
2136 - type: map
2137 value: 87.63320317811032
2138 - type: mrr
2139 value: 96.26242947321379
2140 - task:
2141 type: Retrieval
2142 dataset:
2143 type: scifact
2144 name: MTEB SciFact
2145 config: default
2146 split: test
2147 revision: None
2148 metrics:
2149 - type: map_at_1
2150 value: 60.928000000000004
2151 - type: map_at_10
2152 value: 70.112
2153 - type: map_at_100
2154 value: 70.59299999999999
2155 - type: map_at_1000
2156 value: 70.623
2157 - type: map_at_3
2158 value: 66.846
2159 - type: map_at_5
2160 value: 68.447
2161 - type: mrr_at_1
2162 value: 64.0
2163 - type: mrr_at_10
2164 value: 71.212
2165 - type: mrr_at_100
2166 value: 71.616
2167 - type: mrr_at_1000
2168 value: 71.64500000000001
2169 - type: mrr_at_3
2170 value: 68.77799999999999
2171 - type: mrr_at_5
2172 value: 70.094
2173 - type: ndcg_at_1
2174 value: 64.0
2175 - type: ndcg_at_10
2176 value: 74.607
2177 - type: ndcg_at_100
2178 value: 76.416
2179 - type: ndcg_at_1000
2180 value: 77.102
2181 - type: ndcg_at_3
2182 value: 69.126
2183 - type: ndcg_at_5
2184 value: 71.41300000000001
2185 - type: precision_at_1
2186 value: 64.0
2187 - type: precision_at_10
2188 value: 9.933
2189 - type: precision_at_100
2190 value: 1.077
2191 - type: precision_at_1000
2192 value: 0.11299999999999999
2193 - type: precision_at_3
2194 value: 26.556
2195 - type: precision_at_5
2196 value: 17.467
2197 - type: recall_at_1
2198 value: 60.928000000000004
2199 - type: recall_at_10
2200 value: 87.322
2201 - type: recall_at_100
2202 value: 94.833
2203 - type: recall_at_1000
2204 value: 100.0
2205 - type: recall_at_3
2206 value: 72.628
2207 - type: recall_at_5
2208 value: 78.428
2209 - task:
2210 type: PairClassification
2211 dataset:
2212 type: mteb/sprintduplicatequestions-pairclassification
2213 name: MTEB SprintDuplicateQuestions
2214 config: default
2215 split: test
2216 revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46
2217 metrics:
2218 - type: cos_sim_accuracy
2219 value: 99.86237623762376
2220 - type: cos_sim_ap
2221 value: 96.72586477206649
2222 - type: cos_sim_f1
2223 value: 93.01858362631845
2224 - type: cos_sim_precision
2225 value: 93.4409687184662
2226 - type: cos_sim_recall
2227 value: 92.60000000000001
2228 - type: dot_accuracy
2229 value: 99.78019801980199
2230 - type: dot_ap
2231 value: 93.72748205246228
2232 - type: dot_f1
2233 value: 89.04109589041096
2234 - type: dot_precision
2235 value: 87.16475095785441
2236 - type: dot_recall
2237 value: 91.0
2238 - type: euclidean_accuracy
2239 value: 99.85445544554456
2240 - type: euclidean_ap
2241 value: 96.6661459876145
2242 - type: euclidean_f1
2243 value: 92.58337481333997
2244 - type: euclidean_precision
2245 value: 92.17046580773042
2246 - type: euclidean_recall
2247 value: 93.0
2248 - type: manhattan_accuracy
2249 value: 99.85445544554456
2250 - type: manhattan_ap
2251 value: 96.6883549244056
2252 - type: manhattan_f1
2253 value: 92.57598405580468
2254 - type: manhattan_precision
2255 value: 92.25422045680239
2256 - type: manhattan_recall
2257 value: 92.9
2258 - type: max_accuracy
2259 value: 99.86237623762376
2260 - type: max_ap
2261 value: 96.72586477206649
2262 - type: max_f1
2263 value: 93.01858362631845
2264 - task:
2265 type: Clustering
2266 dataset:
2267 type: mteb/stackexchange-clustering
2268 name: MTEB StackExchangeClustering
2269 config: default
2270 split: test
2271 revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259
2272 metrics:
2273 - type: v_measure
2274 value: 66.39930057069995
2275 - task:
2276 type: Clustering
2277 dataset:
2278 type: mteb/stackexchange-clustering-p2p
2279 name: MTEB StackExchangeClusteringP2P
2280 config: default
2281 split: test
2282 revision: 815ca46b2622cec33ccafc3735d572c266efdb44
2283 metrics:
2284 - type: v_measure
2285 value: 34.96398659903402
2286 - task:
2287 type: Reranking
2288 dataset:
2289 type: mteb/stackoverflowdupquestions-reranking
2290 name: MTEB StackOverflowDupQuestions
2291 config: default
2292 split: test
2293 revision: e185fbe320c72810689fc5848eb6114e1ef5ec69
2294 metrics:
2295 - type: map
2296 value: 55.946944700355395
2297 - type: mrr
2298 value: 56.97151398438164
2299 - task:
2300 type: Summarization
2301 dataset:
2302 type: mteb/summeval
2303 name: MTEB SummEval
2304 config: default
2305 split: test
2306 revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c
2307 metrics:
2308 - type: cos_sim_pearson
2309 value: 31.541657650692905
2310 - type: cos_sim_spearman
2311 value: 31.605804192286303
2312 - type: dot_pearson
2313 value: 28.26905996736398
2314 - type: dot_spearman
2315 value: 27.864801765851187
2316 - task:
2317 type: Retrieval
2318 dataset:
2319 type: trec-covid
2320 name: MTEB TRECCOVID
2321 config: default
2322 split: test
2323 revision: None
2324 metrics:
2325 - type: map_at_1
2326 value: 0.22599999999999998
2327 - type: map_at_10
2328 value: 1.8870000000000002
2329 - type: map_at_100
2330 value: 9.78
2331 - type: map_at_1000
2332 value: 22.514
2333 - type: map_at_3
2334 value: 0.6669999999999999
2335 - type: map_at_5
2336 value: 1.077
2337 - type: mrr_at_1
2338 value: 82.0
2339 - type: mrr_at_10
2340 value: 89.86699999999999
2341 - type: mrr_at_100
2342 value: 89.86699999999999
2343 - type: mrr_at_1000
2344 value: 89.86699999999999
2345 - type: mrr_at_3
2346 value: 89.667
2347 - type: mrr_at_5
2348 value: 89.667
2349 - type: ndcg_at_1
2350 value: 79.0
2351 - type: ndcg_at_10
2352 value: 74.818
2353 - type: ndcg_at_100
2354 value: 53.715999999999994
2355 - type: ndcg_at_1000
2356 value: 47.082
2357 - type: ndcg_at_3
2358 value: 82.134
2359 - type: ndcg_at_5
2360 value: 79.81899999999999
2361 - type: precision_at_1
2362 value: 82.0
2363 - type: precision_at_10
2364 value: 78.0
2365 - type: precision_at_100
2366 value: 54.48
2367 - type: precision_at_1000
2368 value: 20.518
2369 - type: precision_at_3
2370 value: 87.333
2371 - type: precision_at_5
2372 value: 85.2
2373 - type: recall_at_1
2374 value: 0.22599999999999998
2375 - type: recall_at_10
2376 value: 2.072
2377 - type: recall_at_100
2378 value: 13.013
2379 - type: recall_at_1000
2380 value: 43.462
2381 - type: recall_at_3
2382 value: 0.695
2383 - type: recall_at_5
2384 value: 1.139
2385 - task:
2386 type: Retrieval
2387 dataset:
2388 type: webis-touche2020
2389 name: MTEB Touche2020
2390 config: default
2391 split: test
2392 revision: None
2393 metrics:
2394 - type: map_at_1
2395 value: 2.328
2396 - type: map_at_10
2397 value: 9.795
2398 - type: map_at_100
2399 value: 15.801000000000002
2400 - type: map_at_1000
2401 value: 17.23
2402 - type: map_at_3
2403 value: 4.734
2404 - type: map_at_5
2405 value: 6.644
2406 - type: mrr_at_1
2407 value: 30.612000000000002
2408 - type: mrr_at_10
2409 value: 46.902
2410 - type: mrr_at_100
2411 value: 47.495
2412 - type: mrr_at_1000
2413 value: 47.495
2414 - type: mrr_at_3
2415 value: 41.156
2416 - type: mrr_at_5
2417 value: 44.218
2418 - type: ndcg_at_1
2419 value: 28.571
2420 - type: ndcg_at_10
2421 value: 24.806
2422 - type: ndcg_at_100
2423 value: 36.419000000000004
2424 - type: ndcg_at_1000
2425 value: 47.272999999999996
2426 - type: ndcg_at_3
2427 value: 25.666
2428 - type: ndcg_at_5
2429 value: 25.448999999999998
2430 - type: precision_at_1
2431 value: 30.612000000000002
2432 - type: precision_at_10
2433 value: 23.061
2434 - type: precision_at_100
2435 value: 7.714
2436 - type: precision_at_1000
2437 value: 1.484
2438 - type: precision_at_3
2439 value: 26.531
2440 - type: precision_at_5
2441 value: 26.122
2442 - type: recall_at_1
2443 value: 2.328
2444 - type: recall_at_10
2445 value: 16.524
2446 - type: recall_at_100
2447 value: 47.179
2448 - type: recall_at_1000
2449 value: 81.22200000000001
2450 - type: recall_at_3
2451 value: 5.745
2452 - type: recall_at_5
2453 value: 9.339
2454 - task:
2455 type: Classification
2456 dataset:
2457 type: mteb/toxic_conversations_50k
2458 name: MTEB ToxicConversationsClassification
2459 config: default
2460 split: test
2461 revision: d7c0de2777da35d6aae2200a62c6e0e5af397c4c
2462 metrics:
2463 - type: accuracy
2464 value: 70.9142
2465 - type: ap
2466 value: 14.335574772555415
2467 - type: f1
2468 value: 54.62839595194111
2469 - task:
2470 type: Classification
2471 dataset:
2472 type: mteb/tweet_sentiment_extraction
2473 name: MTEB TweetSentimentExtractionClassification
2474 config: default
2475 split: test
2476 revision: d604517c81ca91fe16a244d1248fc021f9ecee7a
2477 metrics:
2478 - type: accuracy
2479 value: 59.94340690435768
2480 - type: f1
2481 value: 60.286487936731916
2482 - task:
2483 type: Clustering
2484 dataset:
2485 type: mteb/twentynewsgroups-clustering
2486 name: MTEB TwentyNewsgroupsClustering
2487 config: default
2488 split: test
2489 revision: 6125ec4e24fa026cec8a478383ee943acfbd5449
2490 metrics:
2491 - type: v_measure
2492 value: 51.26597708987974
2493 - task:
2494 type: PairClassification
2495 dataset:
2496 type: mteb/twittersemeval2015-pairclassification
2497 name: MTEB TwitterSemEval2015
2498 config: default
2499 split: test
2500 revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1
2501 metrics:
2502 - type: cos_sim_accuracy
2503 value: 87.48882398521786
2504 - type: cos_sim_ap
2505 value: 79.04326607602204
2506 - type: cos_sim_f1
2507 value: 71.64566826860633
2508 - type: cos_sim_precision
2509 value: 70.55512918905092
2510 - type: cos_sim_recall
2511 value: 72.77044854881267
2512 - type: dot_accuracy
2513 value: 84.19264469213805
2514 - type: dot_ap
2515 value: 67.96360043562528
2516 - type: dot_f1
2517 value: 64.06418393006827
2518 - type: dot_precision
2519 value: 58.64941898706424
2520 - type: dot_recall
2521 value: 70.58047493403694
2522 - type: euclidean_accuracy
2523 value: 87.45902127913214
2524 - type: euclidean_ap
2525 value: 78.9742237648272
2526 - type: euclidean_f1
2527 value: 71.5553235908142
2528 - type: euclidean_precision
2529 value: 70.77955601445535
2530 - type: euclidean_recall
2531 value: 72.34828496042216
2532 - type: manhattan_accuracy
2533 value: 87.41729749061214
2534 - type: manhattan_ap
2535 value: 78.90073137580596
2536 - type: manhattan_f1
2537 value: 71.3942611553533
2538 - type: manhattan_precision
2539 value: 68.52705653967483
2540 - type: manhattan_recall
2541 value: 74.51187335092348
2542 - type: max_accuracy
2543 value: 87.48882398521786
2544 - type: max_ap
2545 value: 79.04326607602204
2546 - type: max_f1
2547 value: 71.64566826860633
2548 - task:
2549 type: PairClassification
2550 dataset:
2551 type: mteb/twitterurlcorpus-pairclassification
2552 name: MTEB TwitterURLCorpus
2553 config: default
2554 split: test
2555 revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf
2556 metrics:
2557 - type: cos_sim_accuracy
2558 value: 88.68125897465751
2559 - type: cos_sim_ap
2560 value: 85.6003454431979
2561 - type: cos_sim_f1
2562 value: 77.6957163958641
2563 - type: cos_sim_precision
2564 value: 73.0110366307807
2565 - type: cos_sim_recall
2566 value: 83.02279026793964
2567 - type: dot_accuracy
2568 value: 87.7672992587418
2569 - type: dot_ap
2570 value: 82.4971301112899
2571 - type: dot_f1
2572 value: 75.90528233151184
2573 - type: dot_precision
2574 value: 72.0370626469368
2575 - type: dot_recall
2576 value: 80.21250384970742
2577 - type: euclidean_accuracy
2578 value: 88.4503434625684
2579 - type: euclidean_ap
2580 value: 84.91949884748384
2581 - type: euclidean_f1
2582 value: 76.92365018444684
2583 - type: euclidean_precision
2584 value: 74.53245721712759
2585 - type: euclidean_recall
2586 value: 79.47336002463813
2587 - type: manhattan_accuracy
2588 value: 88.47556952691427
2589 - type: manhattan_ap
2590 value: 84.8963689101517
2591 - type: manhattan_f1
2592 value: 76.85901249256395
2593 - type: manhattan_precision
2594 value: 74.31693989071039
2595 - type: manhattan_recall
2596 value: 79.58115183246073
2597 - type: max_accuracy
2598 value: 88.68125897465751
2599 - type: max_ap
2600 value: 85.6003454431979
2601 - type: max_f1
2602 value: 77.6957163958641
2603 license: mit
2604 language:
2605 - en
2606 ---
<h1 align="center">FlagEmbedding</h1>


<h4 align="center">
    <p>
        <a href="#model-list">Model List</a> |
        <a href="#frequently-asked-questions">FAQ</a> |
        <a href="#usage">Usage</a> |
        <a href="#evaluation">Evaluation</a> |
        <a href="#train">Train</a> |
        <a href="#contact">Contact</a> |
        <a href="#citation">Citation</a> |
        <a href="#license">License</a>
    </p>
</h4>

For more details, please refer to our GitHub repository: [FlagEmbedding](https://github.com/FlagOpen/FlagEmbedding).

If you are looking for a model that supports more languages, longer texts, and other retrieval methods, try [bge-m3](https://huggingface.co/BAAI/bge-m3).


[English](README.md) | [中文](https://github.com/FlagOpen/FlagEmbedding/blob/master/README_zh.md)

FlagEmbedding focuses on retrieval-augmented LLMs and currently consists of the following projects:

- **Long-Context LLM**: [Activation Beacon](https://github.com/FlagOpen/FlagEmbedding/tree/master/Long_LLM/activation_beacon)
- **Fine-tuning of LM**: [LM-Cocktail](https://github.com/FlagOpen/FlagEmbedding/tree/master/LM_Cocktail)
- **Dense Retrieval**: [BGE-M3](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/BGE_M3), [LLM Embedder](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/llm_embedder), [BGE Embedding](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/baai_general_embedding)
- **Reranker Model**: [BGE Reranker](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/reranker)
- **Benchmark**: [C-MTEB](https://github.com/FlagOpen/FlagEmbedding/tree/master/C_MTEB)

## News
- 1/30/2024: Release **BGE-M3**, a new member of the BGE model series! M3 stands for **M**ulti-linguality (100+ languages), **M**ulti-granularities (input length up to 8192), and **M**ulti-Functionality (unification of dense, lexical, and multi-vec/ColBERT retrieval).
  It is the first embedding model that supports all three retrieval methods, achieving new SOTA on multi-lingual (MIRACL) and cross-lingual (MKQA) benchmarks.
  [Technical Report](https://github.com/FlagOpen/FlagEmbedding/blob/master/FlagEmbedding/BGE_M3/BGE_M3.pdf) and [Code](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/BGE_M3). :fire:
- 1/9/2024: Release [Activation-Beacon](https://github.com/FlagOpen/FlagEmbedding/tree/master/Long_LLM/activation_beacon), an effective, efficient, compatible, and low-cost (in training) method to extend the context length of LLMs. [Technical Report](https://arxiv.org/abs/2401.03462) :fire:
- 12/24/2023: Release **LLaRA**, a LLaMA-7B-based dense retriever that achieves state-of-the-art performance on MS MARCO and BEIR. The model and code will be open-sourced; please stay tuned. [Technical Report](https://arxiv.org/abs/2312.15503) :fire:
- 11/23/2023: Release [LM-Cocktail](https://github.com/FlagOpen/FlagEmbedding/tree/master/LM_Cocktail), a method to maintain general capabilities during fine-tuning by merging multiple language models. [Technical Report](https://arxiv.org/abs/2311.13534) :fire:
- 10/12/2023: Release [LLM-Embedder](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/llm_embedder), a unified embedding model to support diverse retrieval augmentation needs for LLMs. [Technical Report](https://arxiv.org/pdf/2310.07554.pdf)
- 09/15/2023: The [technical report](https://arxiv.org/pdf/2309.07597.pdf) and [massive training data](https://data.baai.ac.cn/details/BAAI-MTP) of BGE have been released.
- 09/12/2023: New models:
    - **New reranker models**: release the cross-encoder models `BAAI/bge-reranker-base` and `BAAI/bge-reranker-large`, which are more powerful than embedding models. We recommend using or fine-tuning them to re-rank the top-k documents returned by embedding models.
    - **Updated embedding models**: release the `bge-*-v1.5` embedding models to alleviate the similarity distribution issue and enhance retrieval ability without instruction.


<details>
<summary>More</summary>
<!-- ### More -->

- 09/07/2023: Update [fine-tune code](https://github.com/FlagOpen/FlagEmbedding/blob/master/FlagEmbedding/baai_general_embedding/README.md): add a script to mine hard negatives and support adding an instruction during fine-tuning.
- 08/09/2023: BGE models are integrated into **Langchain**; you can use them like [this](#using-langchain). The C-MTEB **leaderboard** is [available](https://huggingface.co/spaces/mteb/leaderboard).
- 08/05/2023: Release base-scale and small-scale models, **best performance among the models of the same size 🤗**
- 08/02/2023: Release `bge-large-*` (short for BAAI General Embedding) models, **rank 1st on MTEB and C-MTEB benchmark!** :tada: :tada:
- 08/01/2023: We release the [Chinese Massive Text Embedding Benchmark](https://github.com/FlagOpen/FlagEmbedding/blob/master/C_MTEB) (**C-MTEB**), consisting of 31 test datasets.

</details>


## Model List

`bge` is short for `BAAI general embedding`.

| Model | Language | Inference / Fine-tune | Description | Query instruction for retrieval [1] |
|:-------------------------------|:--------:| :--------:| :--------:|:--------:|
| [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3) | Multilingual | [Inference](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/BGE_M3#usage) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/BGE_M3) | Multi-Functionality (dense retrieval, sparse retrieval, multi-vector (colbert)), Multi-Linguality, and Multi-Granularity (8192 tokens) | |
| [BAAI/llm-embedder](https://huggingface.co/BAAI/llm-embedder) | English | [Inference](./FlagEmbedding/llm_embedder/README.md) [Fine-tune](./FlagEmbedding/llm_embedder/README.md) | a unified embedding model to support diverse retrieval augmentation needs for LLMs | See [README](./FlagEmbedding/llm_embedder/README.md) |
| [BAAI/bge-reranker-large](https://huggingface.co/BAAI/bge-reranker-large) | Chinese and English | [Inference](#usage-for-reranker) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/reranker) | a cross-encoder model which is more accurate but less efficient [2] | |
| [BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base) | Chinese and English | [Inference](#usage-for-reranker) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/reranker) | a cross-encoder model which is more accurate but less efficient [2] | |
| [BAAI/bge-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5) | English | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | version 1.5 with more reasonable similarity distribution | `Represent this sentence for searching relevant passages: ` |
| [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) | English | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | version 1.5 with more reasonable similarity distribution | `Represent this sentence for searching relevant passages: ` |
| [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) | English | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | version 1.5 with more reasonable similarity distribution | `Represent this sentence for searching relevant passages: ` |
| [BAAI/bge-large-zh-v1.5](https://huggingface.co/BAAI/bge-large-zh-v1.5) | Chinese | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | version 1.5 with more reasonable similarity distribution | `为这个句子生成表示以用于检索相关文章:` |
| [BAAI/bge-base-zh-v1.5](https://huggingface.co/BAAI/bge-base-zh-v1.5) | Chinese | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | version 1.5 with more reasonable similarity distribution | `为这个句子生成表示以用于检索相关文章:` |
| [BAAI/bge-small-zh-v1.5](https://huggingface.co/BAAI/bge-small-zh-v1.5) | Chinese | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | version 1.5 with more reasonable similarity distribution | `为这个句子生成表示以用于检索相关文章:` |
| [BAAI/bge-large-en](https://huggingface.co/BAAI/bge-large-en) | English | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | :trophy: rank **1st** in [MTEB](https://huggingface.co/spaces/mteb/leaderboard) leaderboard | `Represent this sentence for searching relevant passages: ` |
| [BAAI/bge-base-en](https://huggingface.co/BAAI/bge-base-en) | English | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | a base-scale model but with similar ability to `bge-large-en` | `Represent this sentence for searching relevant passages: ` |
| [BAAI/bge-small-en](https://huggingface.co/BAAI/bge-small-en) | English | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | a small-scale model but with competitive performance | `Represent this sentence for searching relevant passages: ` |
| [BAAI/bge-large-zh](https://huggingface.co/BAAI/bge-large-zh) | Chinese | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | :trophy: rank **1st** in [C-MTEB](https://github.com/FlagOpen/FlagEmbedding/tree/master/C_MTEB) benchmark | `为这个句子生成表示以用于检索相关文章:` |
| [BAAI/bge-base-zh](https://huggingface.co/BAAI/bge-base-zh) | Chinese | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | a base-scale model but with similar ability to `bge-large-zh` | `为这个句子生成表示以用于检索相关文章:` |
| [BAAI/bge-small-zh](https://huggingface.co/BAAI/bge-small-zh) | Chinese | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | a small-scale model but with competitive performance | `为这个句子生成表示以用于检索相关文章:` |

[1\]: If you need to retrieve passages relevant to a query, we suggest adding the instruction to the query; in other cases, no instruction is needed and you can use the original query directly. In all cases, **no instruction** needs to be added to passages.

[2\]: Unlike an embedding model, a reranker takes a question and a document as input and directly outputs a similarity score instead of an embedding. To balance accuracy and time cost, a cross-encoder is widely used to re-rank the top-k documents retrieved by simpler models.
For example, use a bge embedding model to retrieve the top 100 relevant documents, then use a bge reranker to re-rank those 100 documents and get the final top-3 results.

All models have been uploaded to the Huggingface Hub, and you can see them at https://huggingface.co/BAAI.
If you cannot open the Huggingface Hub, you can also download the models at https://model.baai.ac.cn/models.


## Frequently asked questions

<details>
<summary>1. How do I fine-tune a bge embedding model?</summary>

<!-- ### How to fine-tune bge embedding model? -->
Follow this [example](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) to prepare data and fine-tune your model.
Some suggestions:
- Mine hard negatives following this [example](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune#hard-negatives), which can improve retrieval performance.
- If you pre-train bge on your data, the pre-trained model cannot be used directly to calculate similarity; it must first be fine-tuned with contrastive learning.
- If the accuracy of the fine-tuned model is still not high enough, we recommend using or fine-tuning the cross-encoder model (bge-reranker) to re-rank the top-k results. Hard negatives are also needed to fine-tune the reranker.
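
As a rough illustration of the hard-negative suggestion above, the sketch below picks negatives for one query from the passages an embedding model ranks highly but that are not labeled positive. It is not the repository's mining script: the function name `mine_hard_negatives`, the rank window, and the toy scores are assumptions made for illustration.

```python
import random

def mine_hard_negatives(scores, positive_ids, rank_window=(2, 200), num_negatives=2, seed=0):
    """Sample hard negatives for one query (illustrative helper, not a library API).

    scores: similarity of the query to every passage (list of floats).
    positive_ids: set of passage indices labeled relevant.
    rank_window: (start, end) slice of the ranking to sample from; skipping
        the very top ranks can reduce the chance of picking unlabeled positives.
    """
    # Rank passage indices from most to least similar.
    ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    start, end = rank_window
    # Keep highly ranked passages that are not known positives.
    candidates = [i for i in ranking[start:end] if i not in positive_ids]
    rng = random.Random(seed)
    return rng.sample(candidates, min(num_negatives, len(candidates)))

# Toy example: six passages, passage 0 is the labeled positive.
toy_scores = [0.92, 0.88, 0.85, 0.61, 0.84, 0.30]
negatives = mine_hard_negatives(toy_scores, positive_ids={0}, rank_window=(1, 4), num_negatives=2)
print(negatives)
```

In practice the scores would come from the embedding model being fine-tuned, and the linked example remains the reference for the actual procedure.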


</details>

<details>
<summary>2. The similarity score between two dissimilar sentences is higher than 0.5</summary>

<!-- ### The similarity score between two dissimilar sentences is higher than 0.5 -->
**We suggest using bge v1.5, which alleviates the issue of the similarity distribution.**

Since we fine-tune the models by contrastive learning with a temperature of 0.01,
the similarity scores of the current BGE model fall roughly in the interval \[0.6, 1\].
So a similarity score greater than 0.5 does not indicate that the two sentences are similar.

For downstream tasks, such as passage retrieval or semantic similarity,
**what matters is the relative order of the scores, not the absolute value.**
If you need to filter similar sentences based on a similarity threshold,
please select an appropriate threshold based on the similarity distribution of your data (such as 0.8, 0.85, or even 0.9).
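
One way to pick such a threshold is to look at the scores your own data produces. The sketch below assumes you already have similarity scores for a small set of pairs labeled similar or dissimilar; `best_threshold` and the toy numbers are hypothetical. It simply tries each observed score as a cut-off and keeps the one with the highest F1.

```python
def best_threshold(scores, labels):
    """Return (threshold, f1) maximizing F1 for the rule 'score >= threshold means similar'."""
    best = (None, -1.0)
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best[1]:
            best = (t, f1)
    return best

# Toy scores in the compressed range described above; 1 = similar, 0 = dissimilar.
toy_scores = [0.93, 0.90, 0.87, 0.82, 0.78, 0.71, 0.66]
toy_labels = [1, 1, 1, 0, 0, 0, 0]
threshold, f1 = best_threshold(toy_scores, toy_labels)
print(threshold, f1)
```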

</details>

<details>
<summary>3. When does the query instruction need to be used</summary>

<!-- ### When does the query instruction need to be used -->

For `bge-*-v1.5`, we improved retrieval ability when no instruction is used.
Omitting the instruction causes only a slight degradation in retrieval performance compared with using it.
So you can generate embeddings without an instruction in all cases for convenience.

For a retrieval task that uses short queries to find long related documents,
it is recommended to add instructions to these short queries.
**The best way to decide whether to add instructions to queries is to choose the setting that achieves better performance on your task.**
In all cases, no instruction needs to be added to the documents/passages.
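
A small harness along the following lines can make that comparison concrete. It is a sketch under assumptions: `encode` stands for whatever function turns a list of strings into vectors (for example one of the encoders shown in the Usage section), `recall_at_k` and `compare_instruction` are hypothetical helper names, and the toy encoder exists only so the snippet runs on its own.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def recall_at_k(encode, queries, passages, relevant, k):
    """Fraction of queries whose relevant passage index appears in the top-k."""
    q_vecs, p_vecs = encode(queries), encode(passages)
    hits = 0
    for q_vec, rel in zip(q_vecs, relevant):
        ranked = sorted(range(len(p_vecs)), key=lambda i: dot(q_vec, p_vecs[i]), reverse=True)
        if rel in ranked[:k]:
            hits += 1
    return hits / len(queries)

def compare_instruction(encode, queries, passages, relevant, instruction, k=1):
    """Evaluate the same data with and without the query instruction."""
    with_instruction = [instruction + q for q in queries]  # queries only, never passages
    return {
        "without_instruction": recall_at_k(encode, queries, passages, relevant, k),
        "with_instruction": recall_at_k(encode, with_instruction, passages, relevant, k),
    }

def toy_encode(texts):
    # Stand-in encoder (letter counts), NOT a real embedding model.
    return [[t.lower().count(c) for c in "abcde"] for t in texts]

result = compare_instruction(
    toy_encode,
    queries=["aaa", "bbb"],
    passages=["a a a d", "b b e"],
    relevant=[0, 1],  # index of the relevant passage for each query
    instruction="Represent this sentence for searching relevant passages: ",
)
print(result)
```

With a real model, replace `toy_encode` with the model's encoding function and use a held-out set of queries from your own task.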

</details>


## Usage

### Usage for Embedding Model

Here are some examples of using `bge` models with
[FlagEmbedding](#using-flagembedding), [Sentence-Transformers](#using-sentence-transformers), [Langchain](#using-langchain), or [Huggingface Transformers](#using-huggingface-transformers).

#### Using FlagEmbedding
```
pip install -U FlagEmbedding
```
If it doesn't work for you, see [FlagEmbedding](https://github.com/FlagOpen/FlagEmbedding/blob/master/FlagEmbedding/baai_general_embedding/README.md) for other ways to install FlagEmbedding.

```python
from FlagEmbedding import FlagModel

# Placeholder texts ("sample data-1", ...); replace them with your own sentences.
sentences_1 = ["样例数据-1", "样例数据-2"]
sentences_2 = ["样例数据-3", "样例数据-4"]
model = FlagModel('BAAI/bge-large-zh-v1.5',
                  query_instruction_for_retrieval="为这个句子生成表示以用于检索相关文章:",
                  use_fp16=True)  # use_fp16=True speeds up computation with a slight performance degradation
embeddings_1 = model.encode(sentences_1)
embeddings_2 = model.encode(sentences_2)
similarity = embeddings_1 @ embeddings_2.T
print(similarity)

# For an s2p (short query to long passage) retrieval task, use encode_queries(),
# which automatically adds the instruction to each query.
# The corpus can still use encode() or encode_corpus(), since passages don't need the instruction.
queries = ['query_1', 'query_2']
passages = ["样例文档-1", "样例文档-2"]  # placeholder passages ("sample document-1", ...)
q_embeddings = model.encode_queries(queries)
p_embeddings = model.encode(passages)
scores = q_embeddings @ p_embeddings.T
```
For the value of the argument `query_instruction_for_retrieval`, see [Model List](https://github.com/FlagOpen/FlagEmbedding/tree/master#model-list).

By default, FlagModel uses all available GPUs when encoding. Set `os.environ["CUDA_VISIBLE_DEVICES"]` to select specific GPUs.
You can also set `os.environ["CUDA_VISIBLE_DEVICES"]=""` to make all GPUs unavailable.
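
A minimal sketch of the GPU selection described above. The device index `"0"` is only an example, and the variable should be set before any CUDA-using library initializes, which in practice usually means before importing `FlagEmbedding` or `torch`. The model call is left as a comment so the snippet does not download anything.

```python
import os

# Expose only the first GPU to this process; use "0,1" for two GPUs
# or "" to hide every GPU and run on CPU.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# Import and build the model only after the variable is set, e.g.:
# from FlagEmbedding import FlagModel
# model = FlagModel('BAAI/bge-large-zh-v1.5', use_fp16=True)

print(os.environ["CUDA_VISIBLE_DEVICES"])
```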


#### Using Sentence-Transformers

You can also use the `bge` models with [sentence-transformers](https://www.SBERT.net):

```
pip install -U sentence-transformers
```
```python
from sentence_transformers import SentenceTransformer

# Placeholder texts ("sample data-1", ...); replace them with your own sentences.
sentences_1 = ["样例数据-1", "样例数据-2"]
sentences_2 = ["样例数据-3", "样例数据-4"]
model = SentenceTransformer('BAAI/bge-large-zh-v1.5')
embeddings_1 = model.encode(sentences_1, normalize_embeddings=True)
embeddings_2 = model.encode(sentences_2, normalize_embeddings=True)
similarity = embeddings_1 @ embeddings_2.T
print(similarity)
```
For an s2p (short query to long passage) retrieval task,
each short query should start with an instruction (see the [Model List](https://github.com/FlagOpen/FlagEmbedding/tree/master#model-list) for the instructions).
The instruction is not needed for passages.
```python
from sentence_transformers import SentenceTransformer

queries = ['query_1', 'query_2']
passages = ["样例文档-1", "样例文档-2"]  # placeholder passages ("sample document-1", ...)
instruction = "为这个句子生成表示以用于检索相关文章:"

model = SentenceTransformer('BAAI/bge-large-zh-v1.5')
# Prepend the instruction to queries only, not to passages.
q_embeddings = model.encode([instruction + q for q in queries], normalize_embeddings=True)
p_embeddings = model.encode(passages, normalize_embeddings=True)
scores = q_embeddings @ p_embeddings.T
```

#### Using Langchain

You can use `bge` in langchain like this:
```python
# Newer LangChain releases expose this class from langchain_community.embeddings.
from langchain.embeddings import HuggingFaceBgeEmbeddings

model_name = "BAAI/bge-large-en-v1.5"
model_kwargs = {'device': 'cuda'}
encode_kwargs = {'normalize_embeddings': True}  # set True to compute cosine similarity
model = HuggingFaceBgeEmbeddings(
    model_name=model_name,
    model_kwargs=model_kwargs,
    encode_kwargs=encode_kwargs,
    # English instruction from the Model List, matching the English model above
    query_instruction="Represent this sentence for searching relevant passages: "
)
```


#### Using HuggingFace Transformers

With the transformers package, you can use the model like this: first, pass your input through the transformer model; then select the last hidden state of the first token (i.e., [CLS]) as the sentence embedding.

```python
from transformers import AutoTokenizer, AutoModel
import torch

# Sentences we want sentence embeddings for (placeholders: "sample data-1", ...)
sentences = ["样例数据-1", "样例数据-2"]

# Load model from HuggingFace Hub
tokenizer = AutoTokenizer.from_pretrained('BAAI/bge-large-zh-v1.5')
model = AutoModel.from_pretrained('BAAI/bge-large-zh-v1.5')
model.eval()

# Tokenize sentences
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
# For an s2p (short query to long passage) retrieval task, add an instruction to each query (but not to passages):
# encoded_input = tokenizer([instruction + q for q in queries], padding=True, truncation=True, return_tensors='pt')

# Compute token embeddings
with torch.no_grad():
    model_output = model(**encoded_input)
    # Perform pooling. In this case, CLS pooling.
    sentence_embeddings = model_output[0][:, 0]
# Normalize embeddings
sentence_embeddings = torch.nn.functional.normalize(sentence_embeddings, p=2, dim=1)
print("Sentence embeddings:", sentence_embeddings)
```
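
The normalization step is what lets the earlier examples score pairs with a plain matrix product: for L2-normalized vectors, the dot product equals the cosine similarity. A small self-contained check with made-up 3-dimensional vectors (not real model outputs):

```python
import math

def l2_normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

a, b = [1.0, 2.0, 2.0], [2.0, 0.0, 1.0]
# Dot product of the normalized vectors ...
dot_of_normalized = sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))
# ... matches the cosine similarity of the raw vectors (up to rounding).
print(dot_of_normalized, cosine(a, b))
```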

#### Usage of the ONNX files

```python
from optimum.onnxruntime import ORTModelForFeatureExtraction  # type: ignore

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('BAAI/bge-large-en-v1.5')
model = AutoModel.from_pretrained('BAAI/bge-large-en-v1.5', revision="refs/pr/13")
model_ort = ORTModelForFeatureExtraction.from_pretrained('BAAI/bge-large-en-v1.5', revision="refs/pr/13", file_name="onnx/model.onnx")

# Sentences we want sentence embeddings for
sentences = ["sample data-1", "sample data-2"]

# Tokenize sentences
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
# For an s2p (short query to long passage) retrieval task, add an instruction to each query (but not to passages):
# encoded_input = tokenizer([instruction + q for q in queries], padding=True, truncation=True, return_tensors='pt')

# Run the ONNX model
model_output_ort = model_ort(**encoded_input)
# Compute token embeddings with the PyTorch model
with torch.no_grad():
    model_output = model(**encoded_input)

# model_output and model_output_ort should match up to small numerical differences
```

It's also possible to deploy the ONNX files with the [infinity_emb](https://github.com/michaelfeil/infinity) pip package.
```python
import asyncio
from infinity_emb import AsyncEmbeddingEngine, EngineArgs

sentences = ["Embed this sentence via Infinity.", "Paris is in France."]
engine = AsyncEmbeddingEngine.from_args(
    EngineArgs(model_name_or_path="BAAI/bge-large-en-v1.5", device="cpu", engine="optimum")  # or engine="torch"
)

async def main():
    async with engine:
        embeddings, usage = await engine.embed(sentences=sentences)

asyncio.run(main())
```

### Usage for Reranker

Unlike an embedding model, a reranker takes a question and a document as input and directly outputs a similarity score instead of an embedding.
You can get a relevance score by passing a query and a passage to the reranker.
The reranker is optimized with a cross-entropy loss, so the relevance score is not bounded to a specific range.
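
If a score in (0, 1) is more convenient, one common option is to pass the raw score through a sigmoid. This is a general technique rather than something this README prescribes; the mapping is monotonic, so it does not change the ranking. The raw scores below are invented for illustration.

```python
import math

def sigmoid(x):
    # Numerically stable form that avoids overflow for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

raw_scores = [-5.6, 0.0, 5.3]  # hypothetical reranker outputs
bounded = [sigmoid(s) for s in raw_scores]
print(bounded)
```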


#### Using FlagEmbedding
```
pip install -U FlagEmbedding
```

Get relevance scores (higher scores indicate more relevance):
```python
from FlagEmbedding import FlagReranker
reranker = FlagReranker('BAAI/bge-reranker-large', use_fp16=True)  # use_fp16=True speeds up computation with a slight performance degradation

# Score a single (query, passage) pair
score = reranker.compute_score(['query', 'passage'])
print(score)

# Score several pairs at once
scores = reranker.compute_score([['what is panda?', 'hi'], ['what is panda?', 'The giant panda (Ailuropoda melanoleuca), sometimes called a panda bear or simply panda, is a bear species endemic to China.']])
print(scores)
```


#### Using Huggingface transformers

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('BAAI/bge-reranker-large')
model = AutoModelForSequenceClassification.from_pretrained('BAAI/bge-reranker-large')
model.eval()

pairs = [['what is panda?', 'hi'], ['what is panda?', 'The giant panda (Ailuropoda melanoleuca), sometimes called a panda bear or simply panda, is a bear species endemic to China.']]
with torch.no_grad():
    inputs = tokenizer(pairs, padding=True, truncation=True, return_tensors='pt', max_length=512)
    # One logit per pair; higher means more relevant
    scores = model(**inputs, return_dict=True).logits.view(-1).float()
    print(scores)
```

## Evaluation

`baai-general-embedding` models achieve **state-of-the-art performance on both the MTEB and C-MTEB leaderboards!**
For more details and evaluation tools, see our [scripts](https://github.com/FlagOpen/FlagEmbedding/blob/master/C_MTEB/README.md).

- **MTEB**:

| Model Name | Dimension | Sequence Length | Average (56) | Retrieval (15) | Clustering (11) | Pair Classification (3) | Reranking (4) | STS (10) | Summarization (1) | Classification (12) |
|:----:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| [BAAI/bge-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5) | 1024 | 512 | **64.23** | **54.29** | 46.08 | 87.12 | 60.03 | 83.11 | 31.61 | 75.97 |
| [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) | 768 | 512 | 63.55 | 53.25 | 45.77 | 86.55 | 58.86 | 82.4 | 31.07 | 75.53 |
| [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) | 384 | 512 | 62.17 | 51.68 | 43.82 | 84.92 | 58.36 | 81.59 | 30.12 | 74.14 |
| [bge-large-en](https://huggingface.co/BAAI/bge-large-en) | 1024 | 512 | 63.98 | 53.9 | 46.98 | 85.8 | 59.48 | 81.56 | 32.06 | 76.21 |
| [bge-base-en](https://huggingface.co/BAAI/bge-base-en) | 768 | 512 | 63.36 | 53.0 | 46.32 | 85.86 | 58.7 | 81.84 | 29.27 | 75.27 |
| [gte-large](https://huggingface.co/thenlper/gte-large) | 1024 | 512 | 63.13 | 52.22 | 46.84 | 85.00 | 59.13 | 83.35 | 31.66 | 73.33 |
| [gte-base](https://huggingface.co/thenlper/gte-base) | 768 | 512 | 62.39 | 51.14 | 46.2 | 84.57 | 58.61 | 82.3 | 31.17 | 73.01 |
| [e5-large-v2](https://huggingface.co/intfloat/e5-large-v2) | 1024 | 512 | 62.25 | 50.56 | 44.49 | 86.03 | 56.61 | 82.05 | 30.19 | 75.24 |
| [bge-small-en](https://huggingface.co/BAAI/bge-small-en) | 384 | 512 | 62.11 | 51.82 | 44.31 | 83.78 | 57.97 | 80.72 | 30.53 | 74.37 |
| [instructor-xl](https://huggingface.co/hkunlp/instructor-xl) | 768 | 512 | 61.79 | 49.26 | 44.74 | 86.62 | 57.29 | 83.06 | 32.32 | 61.79 |
| [e5-base-v2](https://huggingface.co/intfloat/e5-base-v2) | 768 | 512 | 61.5 | 50.29 | 43.80 | 85.73 | 55.91 | 81.05 | 30.28 | 73.84 |
| [gte-small](https://huggingface.co/thenlper/gte-small) | 384 | 512 | 61.36 | 49.46 | 44.89 | 83.54 | 57.7 | 82.07 | 30.42 | 72.31 |
| [text-embedding-ada-002](https://platform.openai.com/docs/guides/embeddings) | 1536 | 8192 | 60.99 | 49.25 | 45.9 | 84.89 | 56.32 | 80.97 | 30.8 | 70.93 |
| [e5-small-v2](https://huggingface.co/intfloat/e5-small-v2) | 384 | 512 | 59.93 | 49.04 | 39.92 | 84.67 | 54.32 | 80.39 | 31.16 | 72.94 |
| [sentence-t5-xxl](https://huggingface.co/sentence-transformers/sentence-t5-xxl) | 768 | 512 | 59.51 | 42.24 | 43.72 | 85.06 | 56.42 | 82.63 | 30.08 | 73.42 |
| [all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) | 768 | 514 | 57.78 | 43.81 | 43.69 | 83.04 | 59.36 | 80.28 | 27.49 | 65.07 |
| [sgpt-bloom-7b1-msmarco](https://huggingface.co/bigscience/sgpt-bloom-7b1-msmarco) | 4096 | 2048 | 57.59 | 48.22 | 38.93 | 81.9 | 55.65 | 77.74 | 33.6 | 66.19 |



- **C-MTEB**:
We created the C-MTEB benchmark for Chinese text embedding; it consists of 31 datasets across 6 tasks.
Please refer to [C_MTEB](https://github.com/FlagOpen/FlagEmbedding/blob/master/C_MTEB/README.md) for a detailed introduction.

| Model | Embedding dimension | Avg | Retrieval | STS | PairClassification | Classification | Reranking | Clustering |
|:-------------------------------|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|
| [**BAAI/bge-large-zh-v1.5**](https://huggingface.co/BAAI/bge-large-zh-v1.5) | 1024 | **64.53** | 70.46 | 56.25 | 81.6 | 69.13 | 65.84 | 48.99 |
| [BAAI/bge-base-zh-v1.5](https://huggingface.co/BAAI/bge-base-zh-v1.5) | 768 | 63.13 | 69.49 | 53.72 | 79.75 | 68.07 | 65.39 | 47.53 |
| [BAAI/bge-small-zh-v1.5](https://huggingface.co/BAAI/bge-small-zh-v1.5) | 512 | 57.82 | 61.77 | 49.11 | 70.41 | 63.96 | 60.92 | 44.18 |
| [BAAI/bge-large-zh](https://huggingface.co/BAAI/bge-large-zh) | 1024 | 64.20 | 71.53 | 54.98 | 78.94 | 68.32 | 65.11 | 48.39 |
| [bge-large-zh-noinstruct](https://huggingface.co/BAAI/bge-large-zh-noinstruct) | 1024 | 63.53 | 70.55 | 53 | 76.77 | 68.58 | 64.91 | 50.01 |
| [BAAI/bge-base-zh](https://huggingface.co/BAAI/bge-base-zh) | 768 | 62.96 | 69.53 | 54.12 | 77.5 | 67.07 | 64.91 | 47.63 |
| [multilingual-e5-large](https://huggingface.co/intfloat/multilingual-e5-large) | 1024 | 58.79 | 63.66 | 48.44 | 69.89 | 67.34 | 56.00 | 48.23 |
| [BAAI/bge-small-zh](https://huggingface.co/BAAI/bge-small-zh) | 512 | 58.27 | 63.07 | 49.45 | 70.35 | 63.64 | 61.48 | 45.09 |
| [m3e-base](https://huggingface.co/moka-ai/m3e-base) | 768 | 57.10 | 56.91 | 50.47 | 63.99 | 67.52 | 59.34 | 47.68 |
| [m3e-large](https://huggingface.co/moka-ai/m3e-large) | 1024 | 57.05 | 54.75 | 50.42 | 64.3 | 68.2 | 59.66 | 48.88 |
| [multilingual-e5-base](https://huggingface.co/intfloat/multilingual-e5-base) | 768 | 55.48 | 61.63 | 46.49 | 67.07 | 65.35 | 54.35 | 40.68 |
| [multilingual-e5-small](https://huggingface.co/intfloat/multilingual-e5-small) | 384 | 55.38 | 59.95 | 45.27 | 66.45 | 65.85 | 53.86 | 45.26 |
| [text-embedding-ada-002(OpenAI)](https://platform.openai.com/docs/guides/embeddings/what-are-embeddings) | 1536 | 53.02 | 52.0 | 43.35 | 69.56 | 64.31 | 54.28 | 45.68 |
| [luotuo](https://huggingface.co/silk-road/luotuo-bert-medium) | 1024 | 49.37 | 44.4 | 42.78 | 66.62 | 61 | 49.25 | 44.39 |
| [text2vec-base](https://huggingface.co/shibing624/text2vec-base-chinese) | 768 | 47.63 | 38.79 | 43.41 | 67.41 | 62.19 | 49.45 | 37.66 |
| [text2vec-large](https://huggingface.co/GanymedeNil/text2vec-large-chinese) | 1024 | 47.36 | 41.94 | 44.97 | 70.86 | 60.66 | 49.16 | 30.02 |


- **Reranking**:
See [C_MTEB](https://github.com/FlagOpen/FlagEmbedding/blob/master/C_MTEB/) for the evaluation script.

| Model | T2Reranking | T2RerankingZh2En\* | T2RerankingEn2Zh\* | MMarcoReranking | CMedQAv1 | CMedQAv2 | Avg |
|:-------------------------------|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|
| text2vec-base-multilingual | 64.66 | 62.94 | 62.51 | 14.37 | 48.46 | 48.6 | 50.26 |
| multilingual-e5-small | 65.62 | 60.94 | 56.41 | 29.91 | 67.26 | 66.54 | 57.78 |
| multilingual-e5-large | 64.55 | 61.61 | 54.28 | 28.6 | 67.42 | 67.92 | 57.4 |
| multilingual-e5-base | 64.21 | 62.13 | 54.68 | 29.5 | 66.23 | 66.98 | 57.29 |
| m3e-base | 66.03 | 62.74 | 56.07 | 17.51 | 77.05 | 76.76 | 59.36 |
| m3e-large | 66.13 | 62.72 | 56.1 | 16.46 | 77.76 | 78.27 | 59.57 |
| bge-base-zh-v1.5 | 66.49 | 63.25 | 57.02 | 29.74 | 80.47 | 84.88 | 63.64 |
| bge-large-zh-v1.5 | 65.74 | 63.39 | 57.03 | 28.74 | 83.45 | 85.44 | 63.97 |
| [BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base) | 67.28 | 63.95 | 60.45 | 35.46 | 81.26 | 84.1 | 65.42 |
| [BAAI/bge-reranker-large](https://huggingface.co/BAAI/bge-reranker-large) | 67.6 | 64.03 | 61.44 | 37.16 | 82.15 | 84.18 | 66.09 |

\* : T2RerankingZh2En and T2RerankingEn2Zh are cross-language retrieval tasks.

## Train

### BAAI Embedding

We pre-train the models using [RetroMAE](https://github.com/staoxiao/RetroMAE) and train them on large-scale pair data using contrastive learning.
**You can fine-tune the embedding model on your data following our [examples](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune).**
We also provide a [pre-train example](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/pretrain).
Note that the goal of pre-training is to reconstruct the text, so the pre-trained model cannot be used for similarity calculation directly; it needs to be fine-tuned first.
For more training details for bge, see [baai_general_embedding](https://github.com/FlagOpen/FlagEmbedding/blob/master/FlagEmbedding/baai_general_embedding/README.md).
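
To make the role of the temperature in contrastive learning concrete, here is a sketch of an InfoNCE-style loss for a single query. This is a common formulation assumed here for illustration, not code taken from the training scripts. The default temperature 0.01 is the value quoted in the FAQ; the similarity values and the helper name `info_nce` are made up.

```python
import math

def info_nce(sim_positive, sim_negatives, temperature=0.01):
    """Negative log softmax probability of the positive among all candidates."""
    logits = [sim_positive / temperature] + [s / temperature for s in sim_negatives]
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum_exp - logits[0]

# Cosine similarities between one query and its candidate passages (made up).
loss_small_gap = info_nce(0.80, [0.78, 0.75])  # negatives almost as close as the positive
loss_large_gap = info_nce(0.80, [0.60, 0.55])  # negatives clearly further away
print(loss_small_gap, loss_large_gap)
```

With a temperature this small, a difference of a few hundredths in cosine similarity already shifts the logits by several units, which is consistent with the compressed score range noted in the FAQ.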



### BGE Reranker

A cross-encoder performs full attention over the input pair,
which is more accurate than an embedding model (i.e., a bi-encoder) but also more time-consuming.
Therefore, it can be used to re-rank the top-k documents returned by an embedding model.
We train the cross-encoder on multilingual pair data.
The data format is the same as for the embedding model, so you can fine-tune it easily following our [example](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/reranker).
For more details, please refer to [./FlagEmbedding/reranker/README.md](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/reranker).
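
The two-stage use described above can be sketched as follows. This is an illustration under assumptions, not the project's pipeline: `retrieve_then_rerank` is a hypothetical helper, the embeddings are tiny hand-written vectors, and `toy_rerank_score` stands in for a real cross-encoder call such as the `compute_score` usage shown in the Usage section.

```python
def retrieve_then_rerank(query, query_vec, passages, passage_vecs, rerank_score,
                         k_retrieve=100, k_final=3):
    """Stage 1: cheap dot-product retrieval. Stage 2: rerank only the shortlist."""
    sims = [sum(q * p for q, p in zip(query_vec, vec)) for vec in passage_vecs]
    shortlist = sorted(range(len(passages)), key=lambda i: sims[i], reverse=True)[:k_retrieve]
    # The expensive scorer sees at most k_retrieve (query, passage) pairs.
    reranked = sorted(shortlist, key=lambda i: rerank_score(query, passages[i]), reverse=True)
    return [passages[i] for i in reranked[:k_final]]

def toy_rerank_score(query, passage):
    # Placeholder scorer (word overlap); a real setup would call a cross-encoder here.
    return len(set(query.lower().split()) & set(passage.lower().split()))

passages = ["pandas eat bamboo", "the panda is a bear", "paris is in france", "bears live in forests"]
passage_vecs = [[0.9, 0.1], [0.8, 0.3], [0.0, 1.0], [0.6, 0.5]]  # made-up embeddings
top = retrieve_then_rerank("what is a panda", [1.0, 0.0], passages, passage_vecs,
                           toy_rerank_score, k_retrieve=3, k_final=1)
print(top)
```

The design point is that the cross-encoder cost grows with `k_retrieve` rather than with the corpus size, which is why the shortlist is kept small.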


## Contact
If you have any questions or suggestions related to this project, feel free to open an issue or pull request.
You can also email Shitao Xiao (stxiao@baai.ac.cn) and Zheng Liu (liuzheng@baai.ac.cn).


## Citation

If you find this repository useful, please consider giving it a star :star: and citing it:

```
@misc{bge_embedding,
      title={C-Pack: Packaged Resources To Advance General Chinese Embedding},
      author={Shitao Xiao and Zheng Liu and Peitian Zhang and Niklas Muennighoff},
      year={2023},
      eprint={2309.07597},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```

## License
FlagEmbedding is licensed under the [MIT License](https://github.com/FlagOpen/FlagEmbedding/blob/master/LICENSE). The released models can be used for commercial purposes free of charge.