---
tags:
- sentence-transformers
- feature-extraction
- sentence-similarity
- transformers
- mteb
model-index:
- name: bge-base-en-v1.5
  results:
  - task:
      type: Classification
    dataset:
      type: mteb/amazon_counterfactual
      name: MTEB AmazonCounterfactualClassification (en)
      config: en
      split: test
      revision: e8379541af4e31359cca9fbcf4b00f2671dba205
    metrics:
    - type: accuracy
      value: 76.14925373134328
    - type: ap
      value: 39.32336517995478
    - type: f1
      value: 70.16902252611425
  - task:
      type: Classification
    dataset:
      type: mteb/amazon_polarity
      name: MTEB AmazonPolarityClassification
      config: default
      split: test
      revision: e2d317d38cd51312af73b3d32a06d1a08b442046
    metrics:
    - type: accuracy
      value: 93.386825
    - type: ap
      value: 90.21276917991995
    - type: f1
      value: 93.37741030006174
  - task:
      type: Classification
    dataset:
      type: mteb/amazon_reviews_multi
      name: MTEB AmazonReviewsClassification (en)
      config: en
      split: test
      revision: 1399c76144fd37290681b995c656ef9b2e06e26d
    metrics:
    - type: accuracy
      value: 48.846000000000004
    - type: f1
      value: 48.14646269778261
  - task:
      type: Retrieval
    dataset:
      type: arguana
      name: MTEB ArguAna
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 40.754000000000005
    - type: map_at_10
      value: 55.761
    - type: map_at_100
      value: 56.330999999999996
    - type: map_at_1000
      value: 56.333999999999996
    - type: map_at_3
      value: 51.92
    - type: map_at_5
      value: 54.010999999999996
    - type: mrr_at_1
      value: 41.181
    - type: mrr_at_10
      value: 55.967999999999996
    - type: mrr_at_100
      value: 56.538
    - type: mrr_at_1000
      value: 56.542
    - type: mrr_at_3
      value: 51.980000000000004
    - type: mrr_at_5
      value: 54.208999999999996
    - type: ndcg_at_1
      value: 40.754000000000005
    - type: ndcg_at_10
      value: 63.605000000000004
    - type: ndcg_at_100
      value: 66.05199999999999
    - type: ndcg_at_1000
      value: 66.12
    - type: ndcg_at_3
      value: 55.708
    - type: ndcg_at_5
      value: 59.452000000000005
    - type: precision_at_1
      value: 40.754000000000005
    - type: precision_at_10
      value: 8.841000000000001
    - type: precision_at_100
      value: 0.991
    - type: precision_at_1000
      value: 0.1
    - type: precision_at_3
      value: 22.238
    - type: precision_at_5
      value: 15.149000000000001
    - type: recall_at_1
      value: 40.754000000000005
    - type: recall_at_10
      value: 88.407
    - type: recall_at_100
      value: 99.14699999999999
    - type: recall_at_1000
      value: 99.644
    - type: recall_at_3
      value: 66.714
    - type: recall_at_5
      value: 75.747
  - task:
      type: Clustering
    dataset:
      type: mteb/arxiv-clustering-p2p
      name: MTEB ArxivClusteringP2P
      config: default
      split: test
      revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d
    metrics:
    - type: v_measure
      value: 48.74884539679369
  - task:
      type: Clustering
    dataset:
      type: mteb/arxiv-clustering-s2s
      name: MTEB ArxivClusteringS2S
      config: default
      split: test
      revision: f910caf1a6075f7329cdf8c1a6135696f37dbd53
    metrics:
    - type: v_measure
      value: 42.8075893810716
  - task:
      type: Reranking
    dataset:
      type: mteb/askubuntudupquestions-reranking
      name: MTEB AskUbuntuDupQuestions
      config: default
      split: test
      revision: 2000358ca161889fa9c082cb41daa8dcfb161a54
    metrics:
    - type: map
      value: 62.128470519187736
    - type: mrr
      value: 74.28065778481289
  - task:
      type: STS
    dataset:
      type: mteb/biosses-sts
      name: MTEB BIOSSES
      config: default
      split: test
      revision: d3fb88f8f02e40887cd149695127462bbcf29b4a
    metrics:
    - type: cos_sim_pearson
      value: 89.24629081484655
    - type: cos_sim_spearman
      value: 86.93752309911496
    - type: euclidean_pearson
      value: 87.58589628573816
    - type: euclidean_spearman
      value: 88.05622328825284
    - type: manhattan_pearson
      value: 87.5594959805773
    - type: manhattan_spearman
      value: 88.19658793233961
  - task:
      type: Classification
    dataset:
      type: mteb/banking77
      name: MTEB Banking77Classification
      config: default
      split: test
      revision: 0fd18e25b25c072e09e0d92ab615fda904d66300
    metrics:
    - type: accuracy
      value: 86.9512987012987
    - type: f1
      value: 86.92515357973708
  - task:
      type: Clustering
    dataset:
      type: mteb/biorxiv-clustering-p2p
      name: MTEB BiorxivClusteringP2P
      config: default
      split: test
      revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40
    metrics:
    - type: v_measure
      value: 39.10263762928872
  - task:
      type: Clustering
    dataset:
      type: mteb/biorxiv-clustering-s2s
      name: MTEB BiorxivClusteringS2S
      config: default
      split: test
      revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908
    metrics:
    - type: v_measure
      value: 36.69711517426737
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackAndroidRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 32.327
    - type: map_at_10
      value: 44.099
    - type: map_at_100
      value: 45.525
    - type: map_at_1000
      value: 45.641999999999996
    - type: map_at_3
      value: 40.47
    - type: map_at_5
      value: 42.36
    - type: mrr_at_1
      value: 39.199
    - type: mrr_at_10
      value: 49.651
    - type: mrr_at_100
      value: 50.29
    - type: mrr_at_1000
      value: 50.329
    - type: mrr_at_3
      value: 46.924
    - type: mrr_at_5
      value: 48.548
    - type: ndcg_at_1
      value: 39.199
    - type: ndcg_at_10
      value: 50.773
    - type: ndcg_at_100
      value: 55.67999999999999
    - type: ndcg_at_1000
      value: 57.495
    - type: ndcg_at_3
      value: 45.513999999999996
    - type: ndcg_at_5
      value: 47.703
    - type: precision_at_1
      value: 39.199
    - type: precision_at_10
      value: 9.914000000000001
    - type: precision_at_100
      value: 1.5310000000000001
    - type: precision_at_1000
      value: 0.198
    - type: precision_at_3
      value: 21.984
    - type: precision_at_5
      value: 15.737000000000002
    - type: recall_at_1
      value: 32.327
    - type: recall_at_10
      value: 63.743
    - type: recall_at_100
      value: 84.538
    - type: recall_at_1000
      value: 96.089
    - type: recall_at_3
      value: 48.065000000000005
    - type: recall_at_5
      value: 54.519
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackEnglishRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 32.671
    - type: map_at_10
      value: 42.954
    - type: map_at_100
      value: 44.151
    - type: map_at_1000
      value: 44.287
    - type: map_at_3
      value: 39.912
    - type: map_at_5
      value: 41.798
    - type: mrr_at_1
      value: 41.465
    - type: mrr_at_10
      value: 49.351
    - type: mrr_at_100
      value: 49.980000000000004
    - type: mrr_at_1000
      value: 50.016000000000005
    - type: mrr_at_3
      value: 47.144000000000005
    - type: mrr_at_5
      value: 48.592999999999996
    - type: ndcg_at_1
      value: 41.465
    - type: ndcg_at_10
      value: 48.565999999999995
    - type: ndcg_at_100
      value: 52.76499999999999
    - type: ndcg_at_1000
      value: 54.749
    - type: ndcg_at_3
      value: 44.57
    - type: ndcg_at_5
      value: 46.759
    - type: precision_at_1
      value: 41.465
    - type: precision_at_10
      value: 9.107999999999999
    - type: precision_at_100
      value: 1.433
    - type: precision_at_1000
      value: 0.191
    - type: precision_at_3
      value: 21.423000000000002
    - type: precision_at_5
      value: 15.414
    - type: recall_at_1
      value: 32.671
    - type: recall_at_10
      value: 57.738
    - type: recall_at_100
      value: 75.86500000000001
    - type: recall_at_1000
      value: 88.36
    - type: recall_at_3
      value: 45.626
    - type: recall_at_5
      value: 51.812000000000005
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackGamingRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 41.185
    - type: map_at_10
      value: 53.929
    - type: map_at_100
      value: 54.92
    - type: map_at_1000
      value: 54.967999999999996
    - type: map_at_3
      value: 50.70400000000001
    - type: map_at_5
      value: 52.673
    - type: mrr_at_1
      value: 47.398
    - type: mrr_at_10
      value: 57.303000000000004
    - type: mrr_at_100
      value: 57.959
    - type: mrr_at_1000
      value: 57.985
    - type: mrr_at_3
      value: 54.932
    - type: mrr_at_5
      value: 56.464999999999996
    - type: ndcg_at_1
      value: 47.398
    - type: ndcg_at_10
      value: 59.653
    - type: ndcg_at_100
      value: 63.627
    - type: ndcg_at_1000
      value: 64.596
    - type: ndcg_at_3
      value: 54.455
    - type: ndcg_at_5
      value: 57.245000000000005
    - type: precision_at_1
      value: 47.398
    - type: precision_at_10
      value: 9.524000000000001
    - type: precision_at_100
      value: 1.243
    - type: precision_at_1000
      value: 0.13699999999999998
    - type: precision_at_3
      value: 24.389
    - type: precision_at_5
      value: 16.752
    - type: recall_at_1
      value: 41.185
    - type: recall_at_10
      value: 73.193
    - type: recall_at_100
      value: 90.357
    - type: recall_at_1000
      value: 97.253
    - type: recall_at_3
      value: 59.199999999999996
    - type: recall_at_5
      value: 66.118
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackGisRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 27.27
    - type: map_at_10
      value: 36.223
    - type: map_at_100
      value: 37.218
    - type: map_at_1000
      value: 37.293
    - type: map_at_3
      value: 33.503
    - type: map_at_5
      value: 35.097
    - type: mrr_at_1
      value: 29.492
    - type: mrr_at_10
      value: 38.352000000000004
    - type: mrr_at_100
      value: 39.188
    - type: mrr_at_1000
      value: 39.247
    - type: mrr_at_3
      value: 35.876000000000005
    - type: mrr_at_5
      value: 37.401
    - type: ndcg_at_1
      value: 29.492
    - type: ndcg_at_10
      value: 41.239
    - type: ndcg_at_100
      value: 46.066
    - type: ndcg_at_1000
      value: 47.992000000000004
    - type: ndcg_at_3
      value: 36.11
    - type: ndcg_at_5
      value: 38.772
    - type: precision_at_1
      value: 29.492
    - type: precision_at_10
      value: 6.260000000000001
    - type: precision_at_100
      value: 0.914
    - type: precision_at_1000
      value: 0.11100000000000002
    - type: precision_at_3
      value: 15.104000000000001
    - type: precision_at_5
      value: 10.644
    - type: recall_at_1
      value: 27.27
    - type: recall_at_10
      value: 54.589
    - type: recall_at_100
      value: 76.70700000000001
    - type: recall_at_1000
      value: 91.158
    - type: recall_at_3
      value: 40.974
    - type: recall_at_5
      value: 47.327000000000005
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackMathematicaRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 17.848
    - type: map_at_10
      value: 26.207
    - type: map_at_100
      value: 27.478
    - type: map_at_1000
      value: 27.602
    - type: map_at_3
      value: 23.405
    - type: map_at_5
      value: 24.98
    - type: mrr_at_1
      value: 21.891
    - type: mrr_at_10
      value: 31.041999999999998
    - type: mrr_at_100
      value: 32.092
    - type: mrr_at_1000
      value: 32.151999999999994
    - type: mrr_at_3
      value: 28.358
    - type: mrr_at_5
      value: 29.969
    - type: ndcg_at_1
      value: 21.891
    - type: ndcg_at_10
      value: 31.585
    - type: ndcg_at_100
      value: 37.531
    - type: ndcg_at_1000
      value: 40.256
    - type: ndcg_at_3
      value: 26.508
    - type: ndcg_at_5
      value: 28.894
    - type: precision_at_1
      value: 21.891
    - type: precision_at_10
      value: 5.795999999999999
    - type: precision_at_100
      value: 0.9990000000000001
    - type: precision_at_1000
      value: 0.13799999999999998
    - type: precision_at_3
      value: 12.769
    - type: precision_at_5
      value: 9.279
    - type: recall_at_1
      value: 17.848
    - type: recall_at_10
      value: 43.452
    - type: recall_at_100
      value: 69.216
    - type: recall_at_1000
      value: 88.102
    - type: recall_at_3
      value: 29.18
    - type: recall_at_5
      value: 35.347
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackPhysicsRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 30.94
    - type: map_at_10
      value: 41.248000000000005
    - type: map_at_100
      value: 42.495
    - type: map_at_1000
      value: 42.602000000000004
    - type: map_at_3
      value: 37.939
    - type: map_at_5
      value: 39.924
    - type: mrr_at_1
      value: 37.824999999999996
    - type: mrr_at_10
      value: 47.041
    - type: mrr_at_100
      value: 47.83
    - type: mrr_at_1000
      value: 47.878
    - type: mrr_at_3
      value: 44.466
    - type: mrr_at_5
      value: 46.111999999999995
    - type: ndcg_at_1
      value: 37.824999999999996
    - type: ndcg_at_10
      value: 47.223
    - type: ndcg_at_100
      value: 52.394
    - type: ndcg_at_1000
      value: 54.432
    - type: ndcg_at_3
      value: 42.032000000000004
    - type: ndcg_at_5
      value: 44.772
    - type: precision_at_1
      value: 37.824999999999996
    - type: precision_at_10
      value: 8.393
    - type: precision_at_100
      value: 1.2890000000000001
    - type: precision_at_1000
      value: 0.164
    - type: precision_at_3
      value: 19.698
    - type: precision_at_5
      value: 14.013
    - type: recall_at_1
      value: 30.94
    - type: recall_at_10
      value: 59.316
    - type: recall_at_100
      value: 80.783
    - type: recall_at_1000
      value: 94.15400000000001
    - type: recall_at_3
      value: 44.712
    - type: recall_at_5
      value: 51.932
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackProgrammersRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 27.104
    - type: map_at_10
      value: 36.675999999999995
    - type: map_at_100
      value: 38.076
    - type: map_at_1000
      value: 38.189
    - type: map_at_3
      value: 33.733999999999995
    - type: map_at_5
      value: 35.287
    - type: mrr_at_1
      value: 33.904
    - type: mrr_at_10
      value: 42.55
    - type: mrr_at_100
      value: 43.434
    - type: mrr_at_1000
      value: 43.494
    - type: mrr_at_3
      value: 40.126
    - type: mrr_at_5
      value: 41.473
    - type: ndcg_at_1
      value: 33.904
    - type: ndcg_at_10
      value: 42.414
    - type: ndcg_at_100
      value: 48.203
    - type: ndcg_at_1000
      value: 50.437
    - type: ndcg_at_3
      value: 37.633
    - type: ndcg_at_5
      value: 39.67
    - type: precision_at_1
      value: 33.904
    - type: precision_at_10
      value: 7.82
    - type: precision_at_100
      value: 1.2409999999999999
    - type: precision_at_1000
      value: 0.159
    - type: precision_at_3
      value: 17.884
    - type: precision_at_5
      value: 12.648000000000001
    - type: recall_at_1
      value: 27.104
    - type: recall_at_10
      value: 53.563
    - type: recall_at_100
      value: 78.557
    - type: recall_at_1000
      value: 93.533
    - type: recall_at_3
      value: 39.92
    - type: recall_at_5
      value: 45.457
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 27.707749999999997
    - type: map_at_10
      value: 36.961
    - type: map_at_100
      value: 38.158833333333334
    - type: map_at_1000
      value: 38.270333333333326
    - type: map_at_3
      value: 34.07183333333334
    - type: map_at_5
      value: 35.69533333333334
    - type: mrr_at_1
      value: 32.81875
    - type: mrr_at_10
      value: 41.293
    - type: mrr_at_100
      value: 42.116499999999995
    - type: mrr_at_1000
      value: 42.170249999999996
    - type: mrr_at_3
      value: 38.83983333333333
    - type: mrr_at_5
      value: 40.29775
    - type: ndcg_at_1
      value: 32.81875
    - type: ndcg_at_10
      value: 42.355
    - type: ndcg_at_100
      value: 47.41374999999999
    - type: ndcg_at_1000
      value: 49.5805
    - type: ndcg_at_3
      value: 37.52825
    - type: ndcg_at_5
      value: 39.83266666666667
    - type: precision_at_1
      value: 32.81875
    - type: precision_at_10
      value: 7.382416666666666
    - type: precision_at_100
      value: 1.1640833333333334
    - type: precision_at_1000
      value: 0.15383333333333335
    - type: precision_at_3
      value: 17.134166666666665
    - type: precision_at_5
      value: 12.174833333333336
    - type: recall_at_1
      value: 27.707749999999997
    - type: recall_at_10
      value: 53.945
    - type: recall_at_100
      value: 76.191
    - type: recall_at_1000
      value: 91.101
    - type: recall_at_3
      value: 40.39083333333334
    - type: recall_at_5
      value: 46.40083333333333
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackStatsRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 26.482
    - type: map_at_10
      value: 33.201
    - type: map_at_100
      value: 34.107
    - type: map_at_1000
      value: 34.197
    - type: map_at_3
      value: 31.174000000000003
    - type: map_at_5
      value: 32.279
    - type: mrr_at_1
      value: 29.908
    - type: mrr_at_10
      value: 36.235
    - type: mrr_at_100
      value: 37.04
    - type: mrr_at_1000
      value: 37.105
    - type: mrr_at_3
      value: 34.355999999999995
    - type: mrr_at_5
      value: 35.382999999999996
    - type: ndcg_at_1
      value: 29.908
    - type: ndcg_at_10
      value: 37.325
    - type: ndcg_at_100
      value: 41.795
    - type: ndcg_at_1000
      value: 44.105
    - type: ndcg_at_3
      value: 33.555
    - type: ndcg_at_5
      value: 35.266999999999996
    - type: precision_at_1
      value: 29.908
    - type: precision_at_10
      value: 5.721
    - type: precision_at_100
      value: 0.8630000000000001
    - type: precision_at_1000
      value: 0.11299999999999999
    - type: precision_at_3
      value: 14.008000000000001
    - type: precision_at_5
      value: 9.754999999999999
    - type: recall_at_1
      value: 26.482
    - type: recall_at_10
      value: 47.072
    - type: recall_at_100
      value: 67.27
    - type: recall_at_1000
      value: 84.371
    - type: recall_at_3
      value: 36.65
    - type: recall_at_5
      value: 40.774
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackTexRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 18.815
    - type: map_at_10
      value: 26.369999999999997
    - type: map_at_100
      value: 27.458
    - type: map_at_1000
      value: 27.588
    - type: map_at_3
      value: 23.990000000000002
    - type: map_at_5
      value: 25.345000000000002
    - type: mrr_at_1
      value: 22.953000000000003
    - type: mrr_at_10
      value: 30.342999999999996
    - type: mrr_at_100
      value: 31.241000000000003
    - type: mrr_at_1000
      value: 31.319000000000003
    - type: mrr_at_3
      value: 28.16
    - type: mrr_at_5
      value: 29.406
    - type: ndcg_at_1
      value: 22.953000000000003
    - type: ndcg_at_10
      value: 31.151
    - type: ndcg_at_100
      value: 36.309000000000005
    - type: ndcg_at_1000
      value: 39.227000000000004
    - type: ndcg_at_3
      value: 26.921
    - type: ndcg_at_5
      value: 28.938000000000002
    - type: precision_at_1
      value: 22.953000000000003
    - type: precision_at_10
      value: 5.602
    - type: precision_at_100
      value: 0.9530000000000001
    - type: precision_at_1000
      value: 0.13899999999999998
    - type: precision_at_3
      value: 12.606
    - type: precision_at_5
      value: 9.119
    - type: recall_at_1
      value: 18.815
    - type: recall_at_10
      value: 41.574
    - type: recall_at_100
      value: 64.84400000000001
    - type: recall_at_1000
      value: 85.406
    - type: recall_at_3
      value: 29.694
    - type: recall_at_5
      value: 34.935
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackUnixRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 27.840999999999998
    - type: map_at_10
      value: 36.797999999999995
    - type: map_at_100
      value: 37.993
    - type: map_at_1000
      value: 38.086999999999996
    - type: map_at_3
      value: 34.050999999999995
    - type: map_at_5
      value: 35.379
    - type: mrr_at_1
      value: 32.649
    - type: mrr_at_10
      value: 41.025
    - type: mrr_at_100
      value: 41.878
    - type: mrr_at_1000
      value: 41.929
    - type: mrr_at_3
      value: 38.573
    - type: mrr_at_5
      value: 39.715
    - type: ndcg_at_1
      value: 32.649
    - type: ndcg_at_10
      value: 42.142
    - type: ndcg_at_100
      value: 47.558
    - type: ndcg_at_1000
      value: 49.643
    - type: ndcg_at_3
      value: 37.12
    - type: ndcg_at_5
      value: 38.983000000000004
    - type: precision_at_1
      value: 32.649
    - type: precision_at_10
      value: 7.08
    - type: precision_at_100
      value: 1.1039999999999999
    - type: precision_at_1000
      value: 0.13899999999999998
    - type: precision_at_3
      value: 16.698
    - type: precision_at_5
      value: 11.511000000000001
    - type: recall_at_1
      value: 27.840999999999998
    - type: recall_at_10
      value: 54.245
    - type: recall_at_100
      value: 77.947
    - type: recall_at_1000
      value: 92.36999999999999
    - type: recall_at_3
      value: 40.146
    - type: recall_at_5
      value: 44.951
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackWebmastersRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 26.529000000000003
    - type: map_at_10
      value: 35.010000000000005
    - type: map_at_100
      value: 36.647
    - type: map_at_1000
      value: 36.857
    - type: map_at_3
      value: 31.968000000000004
    - type: map_at_5
      value: 33.554
    - type: mrr_at_1
      value: 31.818
    - type: mrr_at_10
      value: 39.550999999999995
    - type: mrr_at_100
      value: 40.54
    - type: mrr_at_1000
      value: 40.596
    - type: mrr_at_3
      value: 36.726
    - type: mrr_at_5
      value: 38.416
    - type: ndcg_at_1
      value: 31.818
    - type: ndcg_at_10
      value: 40.675
    - type: ndcg_at_100
      value: 46.548
    - type: ndcg_at_1000
      value: 49.126
    - type: ndcg_at_3
      value: 35.829
    - type: ndcg_at_5
      value: 38.0
    - type: precision_at_1
      value: 31.818
    - type: precision_at_10
      value: 7.826
    - type: precision_at_100
      value: 1.538
    - type: precision_at_1000
      value: 0.24
    - type: precision_at_3
      value: 16.601
    - type: precision_at_5
      value: 12.095
    - type: recall_at_1
      value: 26.529000000000003
    - type: recall_at_10
      value: 51.03
    - type: recall_at_100
      value: 77.556
    - type: recall_at_1000
      value: 93.804
    - type: recall_at_3
      value: 36.986000000000004
    - type: recall_at_5
      value: 43.096000000000004
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackWordpressRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 23.480999999999998
    - type: map_at_10
      value: 30.817
    - type: map_at_100
      value: 31.838
    - type: map_at_1000
      value: 31.932
    - type: map_at_3
      value: 28.011999999999997
    - type: map_at_5
      value: 29.668
    - type: mrr_at_1
      value: 25.323
    - type: mrr_at_10
      value: 33.072
    - type: mrr_at_100
      value: 33.926
    - type: mrr_at_1000
      value: 33.993
    - type: mrr_at_3
      value: 30.436999999999998
    - type: mrr_at_5
      value: 32.092
    - type: ndcg_at_1
      value: 25.323
    - type: ndcg_at_10
      value: 35.514
    - type: ndcg_at_100
      value: 40.489000000000004
    - type: ndcg_at_1000
      value: 42.908
    - type: ndcg_at_3
      value: 30.092000000000002
    - type: ndcg_at_5
      value: 32.989000000000004
    - type: precision_at_1
      value: 25.323
    - type: precision_at_10
      value: 5.545
    - type: precision_at_100
      value: 0.861
    - type: precision_at_1000
      value: 0.117
    - type: precision_at_3
      value: 12.446
    - type: precision_at_5
      value: 9.131
    - type: recall_at_1
      value: 23.480999999999998
    - type: recall_at_10
      value: 47.825
    - type: recall_at_100
      value: 70.652
    - type: recall_at_1000
      value: 88.612
    - type: recall_at_3
      value: 33.537
    - type: recall_at_5
      value: 40.542
  - task:
      type: Retrieval
    dataset:
      type: climate-fever
      name: MTEB ClimateFEVER
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 13.333999999999998
    - type: map_at_10
      value: 22.524
    - type: map_at_100
      value: 24.506
    - type: map_at_1000
      value: 24.715
    - type: map_at_3
      value: 19.022
    - type: map_at_5
      value: 20.693
    - type: mrr_at_1
      value: 29.186
    - type: mrr_at_10
      value: 41.22
    - type: mrr_at_100
      value: 42.16
    - type: mrr_at_1000
      value: 42.192
    - type: mrr_at_3
      value: 38.013000000000005
    - type: mrr_at_5
      value: 39.704
    - type: ndcg_at_1
      value: 29.186
    - type: ndcg_at_10
      value: 31.167
    - type: ndcg_at_100
      value: 38.879000000000005
    - type: ndcg_at_1000
      value: 42.376000000000005
    - type: ndcg_at_3
      value: 25.817
    - type: ndcg_at_5
      value: 27.377000000000002
    - type: precision_at_1
      value: 29.186
    - type: precision_at_10
      value: 9.693999999999999
    - type: precision_at_100
      value: 1.8030000000000002
    - type: precision_at_1000
      value: 0.246
    - type: precision_at_3
      value: 19.11
    - type: precision_at_5
      value: 14.344999999999999
    - type: recall_at_1
      value: 13.333999999999998
    - type: recall_at_10
      value: 37.092000000000006
    - type: recall_at_100
      value: 63.651
    - type: recall_at_1000
      value: 83.05
    - type: recall_at_3
      value: 23.74
    - type: recall_at_5
      value: 28.655
  - task:
      type: Retrieval
    dataset:
      type: dbpedia-entity
      name: MTEB DBPedia
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 9.151
    - type: map_at_10
      value: 19.653000000000002
    - type: map_at_100
      value: 28.053
    - type: map_at_1000
      value: 29.709000000000003
    - type: map_at_3
      value: 14.191
    - type: map_at_5
      value: 16.456
    - type: mrr_at_1
      value: 66.25
    - type: mrr_at_10
      value: 74.4
    - type: mrr_at_100
      value: 74.715
    - type: mrr_at_1000
      value: 74.726
    - type: mrr_at_3
      value: 72.417
    - type: mrr_at_5
      value: 73.667
    - type: ndcg_at_1
      value: 54.25
    - type: ndcg_at_10
      value: 40.77
    - type: ndcg_at_100
      value: 46.359
    - type: ndcg_at_1000
      value: 54.193000000000005
    - type: ndcg_at_3
      value: 44.832
    - type: ndcg_at_5
      value: 42.63
    - type: precision_at_1
      value: 66.25
    - type: precision_at_10
      value: 32.175
    - type: precision_at_100
      value: 10.668
    - type: precision_at_1000
      value: 2.067
    - type: precision_at_3
      value: 47.667
    - type: precision_at_5
      value: 41.3
    - type: recall_at_1
      value: 9.151
    - type: recall_at_10
      value: 25.003999999999998
    - type: recall_at_100
      value: 52.976
    - type: recall_at_1000
      value: 78.315
    - type: recall_at_3
      value: 15.487
    - type: recall_at_5
      value: 18.999
  - task:
      type: Classification
    dataset:
      type: mteb/emotion
      name: MTEB EmotionClassification
      config: default
      split: test
      revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37
    metrics:
    - type: accuracy
      value: 51.89999999999999
    - type: f1
      value: 46.47777925067403
  - task:
      type: Retrieval
    dataset:
      type: fever
      name: MTEB FEVER
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 73.706
    - type: map_at_10
      value: 82.423
    - type: map_at_100
      value: 82.67999999999999
    - type: map_at_1000
      value: 82.694
    - type: map_at_3
      value: 81.328
    - type: map_at_5
      value: 82.001
    - type: mrr_at_1
      value: 79.613
    - type: mrr_at_10
      value: 87.07000000000001
    - type: mrr_at_100
      value: 87.169
    - type: mrr_at_1000
      value: 87.17
    - type: mrr_at_3
      value: 86.404
    - type: mrr_at_5
      value: 86.856
    - type: ndcg_at_1
      value: 79.613
    - type: ndcg_at_10
      value: 86.289
    - type: ndcg_at_100
      value: 87.201
    - type: ndcg_at_1000
      value: 87.428
    - type: ndcg_at_3
      value: 84.625
    - type: ndcg_at_5
      value: 85.53699999999999
    - type: precision_at_1
      value: 79.613
    - type: precision_at_10
      value: 10.399
    - type: precision_at_100
      value: 1.1079999999999999
    - type: precision_at_1000
      value: 0.11499999999999999
    - type: precision_at_3
      value: 32.473
    - type: precision_at_5
      value: 20.132
    - type: recall_at_1
      value: 73.706
    - type: recall_at_10
      value: 93.559
    - type: recall_at_100
      value: 97.188
    - type: recall_at_1000
      value: 98.555
    - type: recall_at_3
      value: 88.98700000000001
    - type: recall_at_5
      value: 91.373
  - task:
      type: Retrieval
    dataset:
      type: fiqa
      name: MTEB FiQA2018
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 19.841
    - type: map_at_10
      value: 32.643
    - type: map_at_100
      value: 34.575
    - type: map_at_1000
      value: 34.736
    - type: map_at_3
      value: 28.317999999999998
    - type: map_at_5
      value: 30.964000000000002
    - type: mrr_at_1
      value: 39.660000000000004
    - type: mrr_at_10
      value: 48.620000000000005
    - type: mrr_at_100
      value: 49.384
    - type: mrr_at_1000
      value: 49.415
    - type: mrr_at_3
      value: 45.988
    - type: mrr_at_5
      value: 47.361
    - type: ndcg_at_1
      value: 39.660000000000004
    - type: ndcg_at_10
      value: 40.646
    - type: ndcg_at_100
      value: 47.657
    - type: ndcg_at_1000
      value: 50.428
    - type: ndcg_at_3
      value: 36.689
    - type: ndcg_at_5
      value: 38.211
    - type: precision_at_1
      value: 39.660000000000004
    - type: precision_at_10
      value: 11.235000000000001
    - type: precision_at_100
      value: 1.8530000000000002
    - type: precision_at_1000
      value: 0.23600000000000002
    - type: precision_at_3
      value: 24.587999999999997
    - type: precision_at_5
      value: 18.395
    - type: recall_at_1
      value: 19.841
    - type: recall_at_10
      value: 48.135
    - type: recall_at_100
      value: 74.224
    - type: recall_at_1000
      value: 90.826
    - type: recall_at_3
      value: 33.536
    - type: recall_at_5
      value: 40.311
  - task:
      type: Retrieval
    dataset:
      type: hotpotqa
      name: MTEB HotpotQA
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 40.358
    - type: map_at_10
      value: 64.497
    - type: map_at_100
      value: 65.362
    - type: map_at_1000
      value: 65.41900000000001
    - type: map_at_3
      value: 61.06700000000001
    - type: map_at_5
      value: 63.317
    - type: mrr_at_1
      value: 80.716
    - type: mrr_at_10
      value: 86.10799999999999
    - type: mrr_at_100
      value: 86.265
    - type: mrr_at_1000
      value: 86.27
    - type: mrr_at_3
      value: 85.271
    - type: mrr_at_5
      value: 85.82499999999999
    - type: ndcg_at_1
      value: 80.716
    - type: ndcg_at_10
      value: 72.597
    - type: ndcg_at_100
      value: 75.549
    - type: ndcg_at_1000
      value: 76.61
    - type: ndcg_at_3
      value: 67.874
    - type: ndcg_at_5
      value: 70.655
    - type: precision_at_1
      value: 80.716
    - type: precision_at_10
      value: 15.148
    - type: precision_at_100
      value: 1.745
    - type: precision_at_1000
      value: 0.188
    - type: precision_at_3
      value: 43.597
    - type: precision_at_5
      value: 28.351
    - type: recall_at_1
      value: 40.358
    - type: recall_at_10
      value: 75.739
    - type: recall_at_100
      value: 87.259
    - type: recall_at_1000
      value: 94.234
    - type: recall_at_3
      value: 65.39500000000001
    - type: recall_at_5
      value: 70.878
1469 - task:
1470 type: Classification
1471 dataset:
1472 type: mteb/imdb
1473 name: MTEB ImdbClassification
1474 config: default
1475 split: test
1476 revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7
1477 metrics:
1478 - type: accuracy
1479 value: 90.80799999999998
1480 - type: ap
1481 value: 86.81350378180757
1482 - type: f1
1483 value: 90.79901248314215
1484 - task:
1485 type: Retrieval
1486 dataset:
1487 type: msmarco
1488 name: MTEB MSMARCO
1489 config: default
1490 split: dev
1491 revision: None
1492 metrics:
1493 - type: map_at_1
1494 value: 22.096
1495 - type: map_at_10
1496 value: 34.384
1497 - type: map_at_100
1498 value: 35.541
1499 - type: map_at_1000
1500 value: 35.589999999999996
1501 - type: map_at_3
1502 value: 30.496000000000002
1503 - type: map_at_5
1504 value: 32.718
1505 - type: mrr_at_1
1506 value: 22.750999999999998
1507 - type: mrr_at_10
1508 value: 35.024
1509 - type: mrr_at_100
1510 value: 36.125
1511 - type: mrr_at_1000
1512 value: 36.168
1513 - type: mrr_at_3
1514 value: 31.225
1515 - type: mrr_at_5
1516 value: 33.416000000000004
1517 - type: ndcg_at_1
1518 value: 22.750999999999998
1519 - type: ndcg_at_10
1520 value: 41.351
1521 - type: ndcg_at_100
1522 value: 46.92
1523 - type: ndcg_at_1000
1524 value: 48.111
1525 - type: ndcg_at_3
1526 value: 33.439
1527 - type: ndcg_at_5
1528 value: 37.407000000000004
1529 - type: precision_at_1
1530 value: 22.750999999999998
1531 - type: precision_at_10
1532 value: 6.564
1533 - type: precision_at_100
1534 value: 0.935
1535 - type: precision_at_1000
1536 value: 0.104
1537 - type: precision_at_3
1538 value: 14.288
1539 - type: precision_at_5
1540 value: 10.581999999999999
1541 - type: recall_at_1
1542 value: 22.096
1543 - type: recall_at_10
1544 value: 62.771
1545 - type: recall_at_100
1546 value: 88.529
1547 - type: recall_at_1000
1548 value: 97.55
1549 - type: recall_at_3
1550 value: 41.245
1551 - type: recall_at_5
1552 value: 50.788
1553 - task:
1554 type: Classification
1555 dataset:
1556 type: mteb/mtop_domain
1557 name: MTEB MTOPDomainClassification (en)
1558 config: en
1559 split: test
1560 revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf
1561 metrics:
1562 - type: accuracy
1563 value: 94.16780665754673
1564 - type: f1
1565 value: 93.96331194859894
1566 - task:
1567 type: Classification
1568 dataset:
1569 type: mteb/mtop_intent
1570 name: MTEB MTOPIntentClassification (en)
1571 config: en
1572 split: test
1573 revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
1574 metrics:
1575 - type: accuracy
1576 value: 76.90606475148198
1577 - type: f1
1578 value: 58.58344986604187
1579 - task:
1580 type: Classification
1581 dataset:
1582 type: mteb/amazon_massive_intent
1583 name: MTEB MassiveIntentClassification (en)
1584 config: en
1585 split: test
1586 revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
1587 metrics:
1588 - type: accuracy
1589 value: 76.14660390047075
1590 - type: f1
1591 value: 74.31533923533614
1592 - task:
1593 type: Classification
1594 dataset:
1595 type: mteb/amazon_massive_scenario
1596 name: MTEB MassiveScenarioClassification (en)
1597 config: en
1598 split: test
1599 revision: 7d571f92784cd94a019292a1f45445077d0ef634
1600 metrics:
1601 - type: accuracy
1602 value: 80.16139878950908
1603 - type: f1
1604 value: 80.18532656824924
1605 - task:
1606 type: Clustering
1607 dataset:
1608 type: mteb/medrxiv-clustering-p2p
1609 name: MTEB MedrxivClusteringP2P
1610 config: default
1611 split: test
1612 revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73
1613 metrics:
1614 - type: v_measure
1615 value: 32.949880906135085
1616 - task:
1617 type: Clustering
1618 dataset:
1619 type: mteb/medrxiv-clustering-s2s
1620 name: MTEB MedrxivClusteringS2S
1621 config: default
1622 split: test
1623 revision: 35191c8c0dca72d8ff3efcd72aa802307d469663
1624 metrics:
1625 - type: v_measure
1626 value: 31.56300351524862
1627 - task:
1628 type: Reranking
1629 dataset:
1630 type: mteb/mind_small
1631 name: MTEB MindSmallReranking
1632 config: default
1633 split: test
1634 revision: 3bdac13927fdc888b903db93b2ffdbd90b295a69
1635 metrics:
1636 - type: map
1637 value: 31.196521894371315
1638 - type: mrr
1639 value: 32.22644231694389
1640 - task:
1641 type: Retrieval
1642 dataset:
1643 type: nfcorpus
1644 name: MTEB NFCorpus
1645 config: default
1646 split: test
1647 revision: None
1648 metrics:
1649 - type: map_at_1
1650 value: 6.783
1651 - type: map_at_10
1652 value: 14.549000000000001
1653 - type: map_at_100
1654 value: 18.433
1655 - type: map_at_1000
1656 value: 19.949
1657 - type: map_at_3
1658 value: 10.936
1659 - type: map_at_5
1660 value: 12.514
1661 - type: mrr_at_1
1662 value: 47.368
1663 - type: mrr_at_10
1664 value: 56.42
1665 - type: mrr_at_100
1666 value: 56.908
1667 - type: mrr_at_1000
1668 value: 56.95
1669 - type: mrr_at_3
1670 value: 54.283
1671 - type: mrr_at_5
1672 value: 55.568
1673 - type: ndcg_at_1
1674 value: 45.666000000000004
1675 - type: ndcg_at_10
1676 value: 37.389
1677 - type: ndcg_at_100
1678 value: 34.253
1679 - type: ndcg_at_1000
1680 value: 43.059999999999995
1681 - type: ndcg_at_3
1682 value: 42.725
1683 - type: ndcg_at_5
1684 value: 40.193
1685 - type: precision_at_1
1686 value: 47.368
1687 - type: precision_at_10
1688 value: 27.988000000000003
1689 - type: precision_at_100
1690 value: 8.672
1691 - type: precision_at_1000
1692 value: 2.164
1693 - type: precision_at_3
1694 value: 40.248
1695 - type: precision_at_5
1696 value: 34.737
1697 - type: recall_at_1
1698 value: 6.783
1699 - type: recall_at_10
1700 value: 17.838
1701 - type: recall_at_100
1702 value: 33.672000000000004
1703 - type: recall_at_1000
1704 value: 66.166
1705 - type: recall_at_3
1706 value: 11.849
1707 - type: recall_at_5
1708 value: 14.205000000000002
1709 - task:
1710 type: Retrieval
1711 dataset:
1712 type: nq
1713 name: MTEB NQ
1714 config: default
1715 split: test
1716 revision: None
1717 metrics:
1718 - type: map_at_1
1719 value: 31.698999999999998
1720 - type: map_at_10
1721 value: 46.556
1722 - type: map_at_100
1723 value: 47.652
1724 - type: map_at_1000
1725 value: 47.68
1726 - type: map_at_3
1727 value: 42.492000000000004
1728 - type: map_at_5
1729 value: 44.763999999999996
1730 - type: mrr_at_1
1731 value: 35.747
1732 - type: mrr_at_10
1733 value: 49.242999999999995
1734 - type: mrr_at_100
1735 value: 50.052
1736 - type: mrr_at_1000
1737 value: 50.068
1738 - type: mrr_at_3
1739 value: 45.867000000000004
1740 - type: mrr_at_5
1741 value: 47.778999999999996
1742 - type: ndcg_at_1
1743 value: 35.717999999999996
1744 - type: ndcg_at_10
1745 value: 54.14600000000001
1746 - type: ndcg_at_100
1747 value: 58.672999999999995
1748 - type: ndcg_at_1000
1749 value: 59.279
1750 - type: ndcg_at_3
1751 value: 46.407
1752 - type: ndcg_at_5
1753 value: 50.181
1754 - type: precision_at_1
1755 value: 35.717999999999996
1756 - type: precision_at_10
1757 value: 8.844000000000001
1758 - type: precision_at_100
1759 value: 1.139
1760 - type: precision_at_1000
1761 value: 0.12
1762 - type: precision_at_3
1763 value: 20.993000000000002
1764 - type: precision_at_5
1765 value: 14.791000000000002
1766 - type: recall_at_1
1767 value: 31.698999999999998
1768 - type: recall_at_10
1769 value: 74.693
1770 - type: recall_at_100
1771 value: 94.15299999999999
1772 - type: recall_at_1000
1773 value: 98.585
1774 - type: recall_at_3
1775 value: 54.388999999999996
1776 - type: recall_at_5
1777 value: 63.08200000000001
1778 - task:
1779 type: Retrieval
1780 dataset:
1781 type: quora
1782 name: MTEB QuoraRetrieval
1783 config: default
1784 split: test
1785 revision: None
1786 metrics:
1787 - type: map_at_1
1788 value: 71.283
1789 - type: map_at_10
1790 value: 85.24000000000001
1791 - type: map_at_100
1792 value: 85.882
1793 - type: map_at_1000
1794 value: 85.897
1795 - type: map_at_3
1796 value: 82.326
1797 - type: map_at_5
1798 value: 84.177
1799 - type: mrr_at_1
1800 value: 82.21000000000001
1801 - type: mrr_at_10
1802 value: 88.228
1803 - type: mrr_at_100
1804 value: 88.32
1805 - type: mrr_at_1000
1806 value: 88.32
1807 - type: mrr_at_3
1808 value: 87.323
1809 - type: mrr_at_5
1810 value: 87.94800000000001
1811 - type: ndcg_at_1
1812 value: 82.17999999999999
1813 - type: ndcg_at_10
1814 value: 88.9
1815 - type: ndcg_at_100
1816 value: 90.079
1817 - type: ndcg_at_1000
1818 value: 90.158
1819 - type: ndcg_at_3
1820 value: 86.18299999999999
1821 - type: ndcg_at_5
1822 value: 87.71799999999999
1823 - type: precision_at_1
1824 value: 82.17999999999999
1825 - type: precision_at_10
1826 value: 13.464
1827 - type: precision_at_100
1828 value: 1.533
1829 - type: precision_at_1000
1830 value: 0.157
1831 - type: precision_at_3
1832 value: 37.693
1833 - type: precision_at_5
1834 value: 24.792
1835 - type: recall_at_1
1836 value: 71.283
1837 - type: recall_at_10
1838 value: 95.742
1839 - type: recall_at_100
1840 value: 99.67200000000001
1841 - type: recall_at_1000
1842 value: 99.981
1843 - type: recall_at_3
1844 value: 87.888
1845 - type: recall_at_5
1846 value: 92.24
1847 - task:
1848 type: Clustering
1849 dataset:
1850 type: mteb/reddit-clustering
1851 name: MTEB RedditClustering
1852 config: default
1853 split: test
1854 revision: 24640382cdbf8abc73003fb0fa6d111a705499eb
1855 metrics:
1856 - type: v_measure
1857 value: 56.24267063669042
1858 - task:
1859 type: Clustering
1860 dataset:
1861 type: mteb/reddit-clustering-p2p
1862 name: MTEB RedditClusteringP2P
1863 config: default
1864 split: test
1865 revision: 282350215ef01743dc01b456c7f5241fa8937f16
1866 metrics:
1867 - type: v_measure
1868 value: 62.88056988932578
1869 - task:
1870 type: Retrieval
1871 dataset:
1872 type: scidocs
1873 name: MTEB SCIDOCS
1874 config: default
1875 split: test
1876 revision: None
1877 metrics:
1878 - type: map_at_1
1879 value: 4.903
1880 - type: map_at_10
1881 value: 13.202
1882 - type: map_at_100
1883 value: 15.5
1884 - type: map_at_1000
1885 value: 15.870999999999999
1886 - type: map_at_3
1887 value: 9.407
1888 - type: map_at_5
1889 value: 11.238
1890 - type: mrr_at_1
1891 value: 24.2
1892 - type: mrr_at_10
1893 value: 35.867
1894 - type: mrr_at_100
1895 value: 37.001
1896 - type: mrr_at_1000
1897 value: 37.043
1898 - type: mrr_at_3
1899 value: 32.5
1900 - type: mrr_at_5
1901 value: 34.35
1902 - type: ndcg_at_1
1903 value: 24.2
1904 - type: ndcg_at_10
1905 value: 21.731
1906 - type: ndcg_at_100
1907 value: 30.7
1908 - type: ndcg_at_1000
1909 value: 36.618
1910 - type: ndcg_at_3
1911 value: 20.72
1912 - type: ndcg_at_5
1913 value: 17.954
1914 - type: precision_at_1
1915 value: 24.2
1916 - type: precision_at_10
1917 value: 11.33
1918 - type: precision_at_100
1919 value: 2.4410000000000003
1920 - type: precision_at_1000
1921 value: 0.386
1922 - type: precision_at_3
1923 value: 19.667
1924 - type: precision_at_5
1925 value: 15.86
1926 - type: recall_at_1
1927 value: 4.903
1928 - type: recall_at_10
1929 value: 22.962
1930 - type: recall_at_100
1931 value: 49.563
1932 - type: recall_at_1000
1933 value: 78.238
1934 - type: recall_at_3
1935 value: 11.953
1936 - type: recall_at_5
1937 value: 16.067999999999998
1938 - task:
1939 type: STS
1940 dataset:
1941 type: mteb/sickr-sts
1942 name: MTEB SICK-R
1943 config: default
1944 split: test
1945 revision: a6ea5a8cab320b040a23452cc28066d9beae2cee
1946 metrics:
1947 - type: cos_sim_pearson
1948 value: 84.12694254604078
1949 - type: cos_sim_spearman
1950 value: 80.30141815181918
1951 - type: euclidean_pearson
1952 value: 81.34015449877128
1953 - type: euclidean_spearman
1954 value: 80.13984197010849
1955 - type: manhattan_pearson
1956 value: 81.31767068124086
1957 - type: manhattan_spearman
1958 value: 80.11720513114103
1959 - task:
1960 type: STS
1961 dataset:
1962 type: mteb/sts12-sts
1963 name: MTEB STS12
1964 config: default
1965 split: test
1966 revision: a0d554a64d88156834ff5ae9920b964011b16384
1967 metrics:
1968 - type: cos_sim_pearson
1969 value: 86.13112984010417
1970 - type: cos_sim_spearman
1971 value: 78.03063573402875
1972 - type: euclidean_pearson
1973 value: 83.51928418844804
1974 - type: euclidean_spearman
1975 value: 78.4045235411144
1976 - type: manhattan_pearson
1977 value: 83.49981637388689
1978 - type: manhattan_spearman
1979 value: 78.4042575139372
1980 - task:
1981 type: STS
1982 dataset:
1983 type: mteb/sts13-sts
1984 name: MTEB STS13
1985 config: default
1986 split: test
1987 revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca
1988 metrics:
1989 - type: cos_sim_pearson
1990 value: 82.50327987379504
1991 - type: cos_sim_spearman
1992 value: 84.18556767756205
1993 - type: euclidean_pearson
1994 value: 82.69684424327679
1995 - type: euclidean_spearman
1996 value: 83.5368106038335
1997 - type: manhattan_pearson
1998 value: 82.57967581007374
1999 - type: manhattan_spearman
2000 value: 83.43009053133697
2001 - task:
2002 type: STS
2003 dataset:
2004 type: mteb/sts14-sts
2005 name: MTEB STS14
2006 config: default
2007 split: test
2008 revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375
2009 metrics:
2010 - type: cos_sim_pearson
2011 value: 82.50756863007814
2012 - type: cos_sim_spearman
2013 value: 82.27204331279108
2014 - type: euclidean_pearson
2015 value: 81.39535251429741
2016 - type: euclidean_spearman
2017 value: 81.84386626336239
2018 - type: manhattan_pearson
2019 value: 81.34281737280695
2020 - type: manhattan_spearman
2021 value: 81.81149375673166
2022 - task:
2023 type: STS
2024 dataset:
2025 type: mteb/sts15-sts
2026 name: MTEB STS15
2027 config: default
2028 split: test
2029 revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3
2030 metrics:
2031 - type: cos_sim_pearson
2032 value: 86.8727714856726
2033 - type: cos_sim_spearman
2034 value: 87.95738287792312
2035 - type: euclidean_pearson
2036 value: 86.62920602795887
2037 - type: euclidean_spearman
2038 value: 87.05207355381243
2039 - type: manhattan_pearson
2040 value: 86.53587918472225
2041 - type: manhattan_spearman
2042 value: 86.95382961029586
2043 - task:
2044 type: STS
2045 dataset:
2046 type: mteb/sts16-sts
2047 name: MTEB STS16
2048 config: default
2049 split: test
2050 revision: 4d8694f8f0e0100860b497b999b3dbed754a0513
2051 metrics:
2052 - type: cos_sim_pearson
2053 value: 83.52240359769479
2054 - type: cos_sim_spearman
2055 value: 85.47685776238286
2056 - type: euclidean_pearson
2057 value: 84.25815333483058
2058 - type: euclidean_spearman
2059 value: 85.27415639683198
2060 - type: manhattan_pearson
2061 value: 84.29127757025637
2062 - type: manhattan_spearman
2063 value: 85.30226224917351
2064 - task:
2065 type: STS
2066 dataset:
2067 type: mteb/sts17-crosslingual-sts
2068 name: MTEB STS17 (en-en)
2069 config: en-en
2070 split: test
2071 revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d
2072 metrics:
2073 - type: cos_sim_pearson
2074 value: 86.42501708915708
2075 - type: cos_sim_spearman
2076 value: 86.42276182795041
2077 - type: euclidean_pearson
2078 value: 86.5408207354761
2079 - type: euclidean_spearman
2080 value: 85.46096321750838
2081 - type: manhattan_pearson
2082 value: 86.54177303026881
2083 - type: manhattan_spearman
2084 value: 85.50313151916117
2085 - task:
2086 type: STS
2087 dataset:
2088 type: mteb/sts22-crosslingual-sts
2089 name: MTEB STS22 (en)
2090 config: en
2091 split: test
2092 revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
2093 metrics:
2094 - type: cos_sim_pearson
2095 value: 64.86521089250766
2096 - type: cos_sim_spearman
2097 value: 65.94868540323003
2098 - type: euclidean_pearson
2099 value: 67.16569626533084
2100 - type: euclidean_spearman
2101 value: 66.37667004134917
2102 - type: manhattan_pearson
2103 value: 67.1482365102333
2104 - type: manhattan_spearman
2105 value: 66.53240122580029
2106 - task:
2107 type: STS
2108 dataset:
2109 type: mteb/stsbenchmark-sts
2110 name: MTEB STSBenchmark
2111 config: default
2112 split: test
2113 revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831
2114 metrics:
2115 - type: cos_sim_pearson
2116 value: 84.64746265365318
2117 - type: cos_sim_spearman
2118 value: 86.41888825906786
2119 - type: euclidean_pearson
2120 value: 85.27453642725811
2121 - type: euclidean_spearman
2122 value: 85.94095796602544
2123 - type: manhattan_pearson
2124 value: 85.28643660505334
2125 - type: manhattan_spearman
2126 value: 85.95028003260744
2127 - task:
2128 type: Reranking
2129 dataset:
2130 type: mteb/scidocs-reranking
2131 name: MTEB SciDocsRR
2132 config: default
2133 split: test
2134 revision: d3c5e1fc0b855ab6097bf1cda04dd73947d7caab
2135 metrics:
2136 - type: map
2137 value: 87.48903153618527
2138 - type: mrr
2139 value: 96.41081503826601
2140 - task:
2141 type: Retrieval
2142 dataset:
2143 type: scifact
2144 name: MTEB SciFact
2145 config: default
2146 split: test
2147 revision: None
2148 metrics:
2149 - type: map_at_1
2150 value: 58.594
2151 - type: map_at_10
2152 value: 69.296
2153 - type: map_at_100
2154 value: 69.782
2155 - type: map_at_1000
2156 value: 69.795
2157 - type: map_at_3
2158 value: 66.23
2159 - type: map_at_5
2160 value: 68.293
2161 - type: mrr_at_1
2162 value: 61.667
2163 - type: mrr_at_10
2164 value: 70.339
2165 - type: mrr_at_100
2166 value: 70.708
2167 - type: mrr_at_1000
2168 value: 70.722
2169 - type: mrr_at_3
2170 value: 68.0
2171 - type: mrr_at_5
2172 value: 69.56700000000001
2173 - type: ndcg_at_1
2174 value: 61.667
2175 - type: ndcg_at_10
2176 value: 74.039
2177 - type: ndcg_at_100
2178 value: 76.103
2179 - type: ndcg_at_1000
2180 value: 76.47800000000001
2181 - type: ndcg_at_3
2182 value: 68.967
2183 - type: ndcg_at_5
2184 value: 71.96900000000001
2185 - type: precision_at_1
2186 value: 61.667
2187 - type: precision_at_10
2188 value: 9.866999999999999
2189 - type: precision_at_100
2190 value: 1.097
2191 - type: precision_at_1000
2192 value: 0.11299999999999999
2193 - type: precision_at_3
2194 value: 27.111
2195 - type: precision_at_5
2196 value: 18.2
2197 - type: recall_at_1
2198 value: 58.594
2199 - type: recall_at_10
2200 value: 87.422
2201 - type: recall_at_100
2202 value: 96.667
2203 - type: recall_at_1000
2204 value: 99.667
2205 - type: recall_at_3
2206 value: 74.217
2207 - type: recall_at_5
2208 value: 81.539
2209 - task:
2210 type: PairClassification
2211 dataset:
2212 type: mteb/sprintduplicatequestions-pairclassification
2213 name: MTEB SprintDuplicateQuestions
2214 config: default
2215 split: test
2216 revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46
2217 metrics:
2218 - type: cos_sim_accuracy
2219 value: 99.85049504950496
2220 - type: cos_sim_ap
2221 value: 96.33111544137081
2222 - type: cos_sim_f1
2223 value: 92.35443037974684
2224 - type: cos_sim_precision
2225 value: 93.53846153846153
2226 - type: cos_sim_recall
2227 value: 91.2
2228 - type: dot_accuracy
2229 value: 99.82376237623762
2230 - type: dot_ap
2231 value: 95.38082527310888
2232 - type: dot_f1
2233 value: 90.90909090909092
2234 - type: dot_precision
2235 value: 92.90187891440502
2236 - type: dot_recall
2237 value: 89.0
2238 - type: euclidean_accuracy
2239 value: 99.84851485148515
2240 - type: euclidean_ap
2241 value: 96.32316003996347
2242 - type: euclidean_f1
2243 value: 92.2071392659628
2244 - type: euclidean_precision
2245 value: 92.71991911021233
2246 - type: euclidean_recall
2247 value: 91.7
2248 - type: manhattan_accuracy
2249 value: 99.84851485148515
2250 - type: manhattan_ap
2251 value: 96.3655668249217
2252 - type: manhattan_f1
2253 value: 92.18356026222895
2254 - type: manhattan_precision
2255 value: 92.98067141403867
2256 - type: manhattan_recall
2257 value: 91.4
2258 - type: max_accuracy
2259 value: 99.85049504950496
2260 - type: max_ap
2261 value: 96.3655668249217
2262 - type: max_f1
2263 value: 92.35443037974684
2264 - task:
2265 type: Clustering
2266 dataset:
2267 type: mteb/stackexchange-clustering
2268 name: MTEB StackExchangeClustering
2269 config: default
2270 split: test
2271 revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259
2272 metrics:
2273 - type: v_measure
2274 value: 65.94861371629051
2275 - task:
2276 type: Clustering
2277 dataset:
2278 type: mteb/stackexchange-clustering-p2p
2279 name: MTEB StackExchangeClusteringP2P
2280 config: default
2281 split: test
2282 revision: 815ca46b2622cec33ccafc3735d572c266efdb44
2283 metrics:
2284 - type: v_measure
2285 value: 35.009430451385
2286 - task:
2287 type: Reranking
2288 dataset:
2289 type: mteb/stackoverflowdupquestions-reranking
2290 name: MTEB StackOverflowDupQuestions
2291 config: default
2292 split: test
2293 revision: e185fbe320c72810689fc5848eb6114e1ef5ec69
2294 metrics:
2295 - type: map
2296 value: 54.61164066427969
2297 - type: mrr
2298 value: 55.49710603938544
2299 - task:
2300 type: Summarization
2301 dataset:
2302 type: mteb/summeval
2303 name: MTEB SummEval
2304 config: default
2305 split: test
2306 revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c
2307 metrics:
2308 - type: cos_sim_pearson
2309 value: 30.622620124907662
2310 - type: cos_sim_spearman
2311 value: 31.0678351356163
2312 - type: dot_pearson
2313 value: 30.863727693306814
2314 - type: dot_spearman
2315 value: 31.230306567021255
2316 - task:
2317 type: Retrieval
2318 dataset:
2319 type: trec-covid
2320 name: MTEB TRECCOVID
2321 config: default
2322 split: test
2323 revision: None
2324 metrics:
2325 - type: map_at_1
2326 value: 0.22
2327 - type: map_at_10
2328 value: 2.011
2329 - type: map_at_100
2330 value: 10.974
2331 - type: map_at_1000
2332 value: 25.819
2333 - type: map_at_3
2334 value: 0.6649999999999999
2335 - type: map_at_5
2336 value: 1.076
2337 - type: mrr_at_1
2338 value: 86.0
2339 - type: mrr_at_10
2340 value: 91.8
2341 - type: mrr_at_100
2342 value: 91.8
2343 - type: mrr_at_1000
2344 value: 91.8
2345 - type: mrr_at_3
2346 value: 91.0
2347 - type: mrr_at_5
2348 value: 91.8
2349 - type: ndcg_at_1
2350 value: 82.0
2351 - type: ndcg_at_10
2352 value: 78.07300000000001
2353 - type: ndcg_at_100
2354 value: 58.231
2355 - type: ndcg_at_1000
2356 value: 51.153000000000006
2357 - type: ndcg_at_3
2358 value: 81.123
2359 - type: ndcg_at_5
2360 value: 81.059
2361 - type: precision_at_1
2362 value: 86.0
2363 - type: precision_at_10
2364 value: 83.0
2365 - type: precision_at_100
2366 value: 59.38
2367 - type: precision_at_1000
2368 value: 22.55
2369 - type: precision_at_3
2370 value: 87.333
2371 - type: precision_at_5
2372 value: 86.8
2373 - type: recall_at_1
2374 value: 0.22
2375 - type: recall_at_10
2376 value: 2.2079999999999997
2377 - type: recall_at_100
2378 value: 14.069
2379 - type: recall_at_1000
2380 value: 47.678
2381 - type: recall_at_3
2382 value: 0.7040000000000001
2383 - type: recall_at_5
2384 value: 1.161
2385 - task:
2386 type: Retrieval
2387 dataset:
2388 type: webis-touche2020
2389 name: MTEB Touche2020
2390 config: default
2391 split: test
2392 revision: None
2393 metrics:
2394 - type: map_at_1
2395 value: 2.809
2396 - type: map_at_10
2397 value: 10.394
2398 - type: map_at_100
2399 value: 16.598
2400 - type: map_at_1000
2401 value: 18.142
2402 - type: map_at_3
2403 value: 5.572
2404 - type: map_at_5
2405 value: 7.1370000000000005
2406 - type: mrr_at_1
2407 value: 32.653
2408 - type: mrr_at_10
2409 value: 46.564
2410 - type: mrr_at_100
2411 value: 47.469
2412 - type: mrr_at_1000
2413 value: 47.469
2414 - type: mrr_at_3
2415 value: 42.177
2416 - type: mrr_at_5
2417 value: 44.524
2418 - type: ndcg_at_1
2419 value: 30.612000000000002
2420 - type: ndcg_at_10
2421 value: 25.701
2422 - type: ndcg_at_100
2423 value: 37.532
2424 - type: ndcg_at_1000
2425 value: 48.757
2426 - type: ndcg_at_3
2427 value: 28.199999999999996
2428 - type: ndcg_at_5
2429 value: 25.987
2430 - type: precision_at_1
2431 value: 32.653
2432 - type: precision_at_10
2433 value: 23.469
2434 - type: precision_at_100
2435 value: 7.9799999999999995
2436 - type: precision_at_1000
2437 value: 1.5350000000000001
2438 - type: precision_at_3
2439 value: 29.932
2440 - type: precision_at_5
2441 value: 26.122
2442 - type: recall_at_1
2443 value: 2.809
2444 - type: recall_at_10
2445 value: 16.887
2446 - type: recall_at_100
2447 value: 48.67
2448 - type: recall_at_1000
2449 value: 82.89699999999999
2450 - type: recall_at_3
2451 value: 6.521000000000001
2452 - type: recall_at_5
2453 value: 9.609
2454 - task:
2455 type: Classification
2456 dataset:
2457 type: mteb/toxic_conversations_50k
2458 name: MTEB ToxicConversationsClassification
2459 config: default
2460 split: test
2461 revision: d7c0de2777da35d6aae2200a62c6e0e5af397c4c
2462 metrics:
2463 - type: accuracy
2464 value: 71.57860000000001
2465 - type: ap
2466 value: 13.82629211536393
2467 - type: f1
2468 value: 54.59860966183956
2469 - task:
2470 type: Classification
2471 dataset:
2472 type: mteb/tweet_sentiment_extraction
2473 name: MTEB TweetSentimentExtractionClassification
2474 config: default
2475 split: test
2476 revision: d604517c81ca91fe16a244d1248fc021f9ecee7a
2477 metrics:
2478 - type: accuracy
2479 value: 59.38030560271647
2480 - type: f1
2481 value: 59.69685552567865
2482 - task:
2483 type: Clustering
2484 dataset:
2485 type: mteb/twentynewsgroups-clustering
2486 name: MTEB TwentyNewsgroupsClustering
2487 config: default
2488 split: test
2489 revision: 6125ec4e24fa026cec8a478383ee943acfbd5449
2490 metrics:
2491 - type: v_measure
2492 value: 51.4736717043405
2493 - task:
2494 type: PairClassification
2495 dataset:
2496 type: mteb/twittersemeval2015-pairclassification
2497 name: MTEB TwitterSemEval2015
2498 config: default
2499 split: test
2500 revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1
2501 metrics:
2502 - type: cos_sim_accuracy
2503 value: 86.92853311080646
2504 - type: cos_sim_ap
2505 value: 77.67872502591382
2506 - type: cos_sim_f1
2507 value: 70.33941236068895
2508 - type: cos_sim_precision
2509 value: 67.63273258645884
2510 - type: cos_sim_recall
2511 value: 73.27176781002639
2512 - type: dot_accuracy
2513 value: 85.79603027954938
2514 - type: dot_ap
2515 value: 73.73786190233379
2516 - type: dot_f1
2517 value: 67.3437901774235
2518 - type: dot_precision
2519 value: 65.67201604814443
2520 - type: dot_recall
2521 value: 69.10290237467018
2522 - type: euclidean_accuracy
2523 value: 86.94045419324074
2524 - type: euclidean_ap
2525 value: 77.6687791535167
2526 - type: euclidean_f1
2527 value: 70.47209214023542
2528 - type: euclidean_precision
2529 value: 67.7207492094381
2530 - type: euclidean_recall
2531 value: 73.45646437994723
2532 - type: manhattan_accuracy
2533 value: 86.87488823985218
2534 - type: manhattan_ap
2535 value: 77.63373392430728
2536 - type: manhattan_f1
2537 value: 70.40920716112532
2538 - type: manhattan_precision
2539 value: 68.31265508684864
2540 - type: manhattan_recall
2541 value: 72.63852242744063
2542 - type: max_accuracy
2543 value: 86.94045419324074
2544 - type: max_ap
2545 value: 77.67872502591382
2546 - type: max_f1
2547 value: 70.47209214023542
2548 - task:
2549 type: PairClassification
2550 dataset:
2551 type: mteb/twitterurlcorpus-pairclassification
2552 name: MTEB TwitterURLCorpus
2553 config: default
2554 split: test
2555 revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf
2556 metrics:
2557 - type: cos_sim_accuracy
2558 value: 88.67155664221679
2559 - type: cos_sim_ap
2560 value: 85.64591703003417
2561 - type: cos_sim_f1
2562 value: 77.59531005352656
2563 - type: cos_sim_precision
2564 value: 73.60967184801382
2565 - type: cos_sim_recall
2566 value: 82.03726516784724
2567 - type: dot_accuracy
2568 value: 88.41541506578181
2569 - type: dot_ap
2570 value: 84.6482788957769
2571 - type: dot_f1
2572 value: 77.04748541466657
2573 - type: dot_precision
2574 value: 74.02440754931176
2575 - type: dot_recall
2576 value: 80.3279950723745
2577 - type: euclidean_accuracy
2578 value: 88.63080684596576
2579 - type: euclidean_ap
2580 value: 85.44570045321562
2581 - type: euclidean_f1
2582 value: 77.28769403336106
2583 - type: euclidean_precision
2584 value: 72.90600040958427
2585 - type: euclidean_recall
2586 value: 82.22975053895904
2587 - type: manhattan_accuracy
2588 value: 88.59393798269105
2589 - type: manhattan_ap
2590 value: 85.40271361038187
2591 - type: manhattan_f1
2592 value: 77.17606419344392
2593 - type: manhattan_precision
2594 value: 72.4447747078295
2595 - type: manhattan_recall
2596 value: 82.5685247921158
2597 - type: max_accuracy
2598 value: 88.67155664221679
2599 - type: max_ap
2600 value: 85.64591703003417
2601 - type: max_f1
2602 value: 77.59531005352656
2603 license: mit
2604 language:
2605 - en
2606 ---


<h1 align="center">FlagEmbedding</h1>


<h4 align="center">
    <p>
        <a href="#model-list">Model List</a> |
        <a href="#frequently-asked-questions">FAQ</a> |
        <a href="#usage">Usage</a> |
        <a href="#evaluation">Evaluation</a> |
        <a href="#train">Train</a> |
        <a href="#contact">Contact</a> |
        <a href="#citation">Citation</a> |
        <a href="#license">License</a>
    </p>
</h4>


For more details, please refer to our GitHub repository: [FlagEmbedding](https://github.com/FlagOpen/FlagEmbedding).

If you are looking for a model that supports more languages, longer texts, and other retrieval methods, try [bge-m3](https://huggingface.co/BAAI/bge-m3).


[English](README.md) | [中文](https://github.com/FlagOpen/FlagEmbedding/blob/master/README_zh.md)

FlagEmbedding focuses on retrieval-augmented LLMs and currently consists of the following projects:

- **Long-Context LLM**: [Activation Beacon](https://github.com/FlagOpen/FlagEmbedding/tree/master/Long_LLM/activation_beacon)
- **Fine-tuning of LM**: [LM-Cocktail](https://github.com/FlagOpen/FlagEmbedding/tree/master/LM_Cocktail)
- **Dense Retrieval**: [BGE-M3](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/BGE_M3), [LLM Embedder](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/llm_embedder), [BGE Embedding](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/baai_general_embedding)
- **Reranker Model**: [BGE Reranker](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/reranker)
- **Benchmark**: [C-MTEB](https://github.com/FlagOpen/FlagEmbedding/tree/master/C_MTEB)

## News
- 01/30/2024: Released **BGE-M3**, a new member of the BGE model series! M3 stands for **M**ulti-linguality (100+ languages), **M**ulti-granularities (input length up to 8192 tokens), and **M**ulti-Functionality (unification of dense, lexical, and multi-vec/colbert retrieval).
  It is the first embedding model that supports all three retrieval methods, achieving new SOTA on multi-lingual (MIRACL) and cross-lingual (MKQA) benchmarks.
  [Technical Report](https://github.com/FlagOpen/FlagEmbedding/blob/master/FlagEmbedding/BGE_M3/BGE_M3.pdf) and [Code](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/BGE_M3). :fire:
- 01/09/2024: Released [Activation-Beacon](https://github.com/FlagOpen/FlagEmbedding/tree/master/Long_LLM/activation_beacon), an effective, efficient, compatible, and low-cost (training) method to extend the context length of LLMs. [Technical Report](https://arxiv.org/abs/2401.03462) :fire:
- 12/24/2023: Released **LLaRA**, a LLaMA-7B-based dense retriever that achieves state-of-the-art performance on MS MARCO and BEIR. The model and code will be open-sourced; please stay tuned. [Technical Report](https://arxiv.org/abs/2312.15503) :fire:
- 11/23/2023: Released [LM-Cocktail](https://github.com/FlagOpen/FlagEmbedding/tree/master/LM_Cocktail), a method to maintain general capabilities during fine-tuning by merging multiple language models. [Technical Report](https://arxiv.org/abs/2311.13534) :fire:
- 10/12/2023: Released [LLM-Embedder](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/llm_embedder), a unified embedding model to support diverse retrieval augmentation needs for LLMs. [Technical Report](https://arxiv.org/pdf/2310.07554.pdf)
- 09/15/2023: The [technical report](https://arxiv.org/pdf/2309.07597.pdf) and [massive training data](https://data.baai.ac.cn/details/BAAI-MTP) of BGE have been released.
- 09/12/2023: New models:
    - **New reranker models**: released the cross-encoder models `BAAI/bge-reranker-base` and `BAAI/bge-reranker-large`, which are more powerful than the embedding models. We recommend using or fine-tuning them to re-rank the top-k documents returned by embedding models.
    - **Updated embedding models**: released the `bge-*-v1.5` embedding models to alleviate the similarity-distribution issue and to improve retrieval ability without an instruction.
2653
2654
<details>
  <summary>More</summary>
<!-- ### More -->

- 09/07/2023: Update [fine-tune code](https://github.com/FlagOpen/FlagEmbedding/blob/master/FlagEmbedding/baai_general_embedding/README.md): add a script to mine hard negatives and support adding an instruction during fine-tuning.
- 08/09/2023: BGE models are integrated into **Langchain**; you can use them like [this](#using-langchain). The C-MTEB **leaderboard** is [available](https://huggingface.co/spaces/mteb/leaderboard).
- 08/05/2023: Release base-scale and small-scale models, **best performance among the models of the same size 🤗**
- 08/02/2023: Release `bge-large-*` (short for BAAI General Embedding) models, **rank 1st on MTEB and C-MTEB benchmark!** :tada: :tada:
- 08/01/2023: We release the [Chinese Massive Text Embedding Benchmark](https://github.com/FlagOpen/FlagEmbedding/blob/master/C_MTEB) (**C-MTEB**), consisting of 31 test datasets.

</details>


## Model List

`bge` is short for `BAAI general embedding`.

| Model | Language | Inference / Fine-tune | Description | Query instruction for retrieval [1] |
|:-------------------------------|:--------:| :--------:| :--------:|:--------:|
| [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3) | Multilingual | [Inference](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/BGE_M3#usage) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/BGE_M3) | Multi-Functionality (dense retrieval, sparse retrieval, multi-vector (colbert)), Multi-Linguality, and Multi-Granularity (8192 tokens) | |
| [BAAI/llm-embedder](https://huggingface.co/BAAI/llm-embedder) | English | [Inference](./FlagEmbedding/llm_embedder/README.md) [Fine-tune](./FlagEmbedding/llm_embedder/README.md) | a unified embedding model to support diverse retrieval augmentation needs for LLMs | See [README](./FlagEmbedding/llm_embedder/README.md) |
| [BAAI/bge-reranker-large](https://huggingface.co/BAAI/bge-reranker-large) | Chinese and English | [Inference](#usage-for-reranker) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/reranker) | a cross-encoder model which is more accurate but less efficient [2] | |
| [BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base) | Chinese and English | [Inference](#usage-for-reranker) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/reranker) | a cross-encoder model which is more accurate but less efficient [2] | |
| [BAAI/bge-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5) | English | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | version 1.5 with more reasonable similarity distribution | `Represent this sentence for searching relevant passages: ` |
| [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) | English | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | version 1.5 with more reasonable similarity distribution | `Represent this sentence for searching relevant passages: ` |
| [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) | English | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | version 1.5 with more reasonable similarity distribution | `Represent this sentence for searching relevant passages: ` |
| [BAAI/bge-large-zh-v1.5](https://huggingface.co/BAAI/bge-large-zh-v1.5) | Chinese | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | version 1.5 with more reasonable similarity distribution | `为这个句子生成表示以用于检索相关文章:` |
| [BAAI/bge-base-zh-v1.5](https://huggingface.co/BAAI/bge-base-zh-v1.5) | Chinese | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | version 1.5 with more reasonable similarity distribution | `为这个句子生成表示以用于检索相关文章:` |
| [BAAI/bge-small-zh-v1.5](https://huggingface.co/BAAI/bge-small-zh-v1.5) | Chinese | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | version 1.5 with more reasonable similarity distribution | `为这个句子生成表示以用于检索相关文章:` |
| [BAAI/bge-large-en](https://huggingface.co/BAAI/bge-large-en) | English | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | :trophy: rank **1st** in [MTEB](https://huggingface.co/spaces/mteb/leaderboard) leaderboard | `Represent this sentence for searching relevant passages: ` |
| [BAAI/bge-base-en](https://huggingface.co/BAAI/bge-base-en) | English | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | a base-scale model but with similar ability to `bge-large-en` | `Represent this sentence for searching relevant passages: ` |
| [BAAI/bge-small-en](https://huggingface.co/BAAI/bge-small-en) | English | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | a small-scale model but with competitive performance | `Represent this sentence for searching relevant passages: ` |
| [BAAI/bge-large-zh](https://huggingface.co/BAAI/bge-large-zh) | Chinese | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | :trophy: rank **1st** in [C-MTEB](https://github.com/FlagOpen/FlagEmbedding/tree/master/C_MTEB) benchmark | `为这个句子生成表示以用于检索相关文章:` |
| [BAAI/bge-base-zh](https://huggingface.co/BAAI/bge-base-zh) | Chinese | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | a base-scale model but with similar ability to `bge-large-zh` | `为这个句子生成表示以用于检索相关文章:` |
| [BAAI/bge-small-zh](https://huggingface.co/BAAI/bge-small-zh) | Chinese | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | a small-scale model but with competitive performance | `为这个句子生成表示以用于检索相关文章:` |


[1\]: If you need to retrieve passages relevant to a query, we suggest adding the instruction to the query; in other cases, no instruction is needed and you can use the original query directly. In all cases, **no instruction** needs to be added to passages.

[2\]: Unlike an embedding model, a reranker takes a question and a document as input and directly outputs a similarity score instead of an embedding. To balance accuracy and time cost, a cross-encoder is widely used to re-rank the top-k documents retrieved by simpler models.
For example, use a bge embedding model to retrieve the top 100 relevant documents, and then use the bge reranker to re-rank those 100 documents to get the final top-3 results.
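
The two-stage flow described in footnote [2] can be sketched as follows. This is a minimal illustration and not part of FlagEmbedding: `retrieve_then_rerank`, `toy_embed_score`, and `toy_rerank_score` are hypothetical names, and the toy scorers only stand in for an embedding-model similarity and a reranker score so that the sketch runs without downloading any model.

```python
from typing import Callable, List, Tuple


def retrieve_then_rerank(
    query: str,
    corpus: List[str],
    embed_score: Callable[[str, str], float],   # hypothetical: cheap bi-encoder similarity
    rerank_score: Callable[[str, str], float],  # hypothetical: costly cross-encoder score
    first_stage_k: int = 100,
    final_k: int = 3,
) -> List[Tuple[str, float]]:
    """Retrieve candidates with the cheap scorer, then re-rank only those."""
    # Stage 1: score every document with the cheap scorer and keep the top candidates.
    candidates = sorted(corpus, key=lambda doc: embed_score(query, doc), reverse=True)
    candidates = candidates[:first_stage_k]
    # Stage 2: run the expensive scorer only on the surviving candidates.
    reranked = sorted(
        ((doc, rerank_score(query, doc)) for doc in candidates),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return reranked[:final_k]


# Toy stand-ins (assumptions for illustration only).
def toy_embed_score(query: str, doc: str) -> float:
    # Fraction of query words that also appear in the document.
    q = set(query.lower().split())
    return len(q & set(doc.lower().split())) / max(len(q), 1)


def toy_rerank_score(query: str, doc: str) -> float:
    # Word overlap, lightly penalized by document length.
    return toy_embed_score(query, doc) - 0.01 * len(doc.split())


docs = [
    "the giant panda is a bear",
    "hi",
    "panda bear species endemic to china",
    "a cat is a pet",
]
top = retrieve_then_rerank("what is panda", docs, toy_embed_score, toy_rerank_score,
                           first_stage_k=3, final_k=2)
print(top)
```

In a real pipeline the first scorer would be a dot product between bge embeddings and the second a call to the reranker; the point of the structure is that the expensive scorer only ever sees `first_stage_k` documents.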

All models have been uploaded to the Huggingface Hub, and you can find them at https://huggingface.co/BAAI.
If you cannot open the Huggingface Hub, you can also download the models at https://model.baai.ac.cn/models .


## Frequently asked questions

<details>
  <summary>1. How to fine-tune bge embedding model?</summary>

  <!-- ### How to fine-tune bge embedding model? -->
Follow this [example](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) to prepare data and fine-tune your model.
Some suggestions:
- Mine hard negatives following this [example](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune#hard-negatives), which can improve retrieval performance.
- If you pre-train bge on your data, the pre-trained model cannot be used directly to calculate similarity; it must first be fine-tuned with contrastive learning.
- If the accuracy of the fine-tuned model is still not high enough, we recommend using or fine-tuning the cross-encoder model (bge-reranker) to re-rank the top-k results. Hard negatives are also needed to fine-tune the reranker.


</details>

<details>
  <summary>2. The similarity score between two dissimilar sentences is higher than 0.5</summary>

  <!-- ### The similarity score between two dissimilar sentences is higher than 0.5 -->
**We suggest using bge v1.5, which alleviates the issue of the similarity distribution.**

Since we fine-tune the models by contrastive learning with a temperature of 0.01,
the similarity scores of the current BGE model fall roughly in the interval \[0.6, 1\].
So a similarity score greater than 0.5 does not indicate that the two sentences are similar.

For downstream tasks, such as passage retrieval or semantic similarity,
**what matters is the relative order of the scores, not the absolute value.**
If you need to filter similar sentences based on a similarity threshold,
please select an appropriate threshold based on the similarity distribution of your data (such as 0.8, 0.85, or even 0.9).
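
One way to pick such a threshold is to look at the scores your own data produces rather than at a fixed number. The sketch below assumes you have already computed similarity scores for a sample of pairs that you know are unrelated; `pick_threshold` and all score values are hypothetical and only illustrate the idea.

```python
from typing import Sequence


def pick_threshold(unrelated_scores: Sequence[float], quantile: float = 0.95) -> float:
    """Return the score at the given quantile of known-unrelated pairs."""
    if not unrelated_scores:
        raise ValueError("need at least one score")
    if not 0.0 <= quantile <= 1.0:
        raise ValueError("quantile must be in [0, 1]")
    ordered = sorted(unrelated_scores)
    # Nearest-rank style index, clamped to the last valid position.
    index = min(int(quantile * len(ordered)), len(ordered) - 1)
    return ordered[index]


# Made-up scores for pairs known to be unrelated; note that all exceed 0.5.
unrelated = [0.62, 0.66, 0.71, 0.64, 0.69, 0.73, 0.68, 0.65, 0.70, 0.78]
threshold = pick_threshold(unrelated, quantile=0.95)

# Keep only candidate pairs that score above what unrelated pairs usually reach.
candidate_scores = {"pair_a": 0.74, "pair_b": 0.91, "pair_c": 0.83}
similar = [name for name, score in candidate_scores.items() if score > threshold]
print(threshold, similar)
```

With a larger labeled sample you could instead sweep thresholds and keep the one that gives the best precision/recall trade-off for your task.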

</details>

<details>
  <summary>3. When does the query instruction need to be used?</summary>

  <!-- ### When does the query instruction need to be used -->

For `bge-*-v1.5`, we improved retrieval ability when no instruction is used.
Omitting the instruction causes only a slight degradation in retrieval performance compared with using it.
So, for convenience, you can generate embeddings without an instruction in all cases.

For a retrieval task that uses short queries to find long related documents,
it is recommended to add instructions to these short queries.
**The best way to decide whether to add instructions to queries is to choose the setting that achieves better performance on your task.**
In all cases, the documents/passages do not need the instruction.
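
To compare the two settings on your own task, it can help to build both query variants from the same data and evaluate each. The helper below is a hypothetical sketch (`prepare_inputs` is not a FlagEmbedding function); the instruction string is the English one from the Model List, and passages are always left unchanged.

```python
from typing import List, Tuple

# Instruction for the English bge models, taken from the Model List table.
EN_INSTRUCTION = "Represent this sentence for searching relevant passages: "


def prepare_inputs(
    queries: List[str],
    passages: List[str],
    instruction: str = "",
    use_instruction: bool = True,
) -> Tuple[List[str], List[str]]:
    """Prefix queries (never passages) with the instruction when enabled."""
    prefix = instruction if use_instruction else ""
    return [prefix + q for q in queries], list(passages)


queries = ["what is a panda?"]
passages = ["The giant panda is a bear species endemic to China."]

# Variant A: queries carry the instruction. Variant B: raw queries.
with_instr, docs = prepare_inputs(queries, passages, EN_INSTRUCTION, use_instruction=True)
without_instr, _ = prepare_inputs(queries, passages, EN_INSTRUCTION, use_instruction=False)
print(with_instr, without_instr, docs)
```

Encode each variant with the same model, score against the same passage embeddings, and keep whichever setting ranks your known-relevant passages higher.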

</details>


## Usage

### Usage for Embedding Model

Here are some examples of using `bge` models with
[FlagEmbedding](#using-flagembedding), [Sentence-Transformers](#using-sentence-transformers), [Langchain](#using-langchain), or [Huggingface Transformers](#using-huggingface-transformers).

#### Using FlagEmbedding
```
pip install -U FlagEmbedding
```
If it doesn't work for you, see [FlagEmbedding](https://github.com/FlagOpen/FlagEmbedding/blob/master/FlagEmbedding/baai_general_embedding/README.md) for other ways to install FlagEmbedding.

```python
from FlagEmbedding import FlagModel
sentences_1 = ["样例数据-1", "样例数据-2"]
sentences_2 = ["样例数据-3", "样例数据-4"]
model = FlagModel('BAAI/bge-large-zh-v1.5',
                  query_instruction_for_retrieval="为这个句子生成表示以用于检索相关文章:",
                  use_fp16=True) # Setting use_fp16 to True speeds up computation with a slight performance degradation
embeddings_1 = model.encode(sentences_1)
embeddings_2 = model.encode(sentences_2)
similarity = embeddings_1 @ embeddings_2.T
print(similarity)

# For an s2p (short query to long passage) retrieval task, we suggest using encode_queries(), which automatically adds the instruction to each query.
# The corpus in a retrieval task can still use encode() or encode_corpus(), since passages don't need the instruction.
queries = ['query_1', 'query_2']
passages = ["样例文档-1", "样例文档-2"]
q_embeddings = model.encode_queries(queries)
p_embeddings = model.encode(passages)
scores = q_embeddings @ p_embeddings.T
```
For the value of the argument `query_instruction_for_retrieval`, see the [Model List](https://github.com/FlagOpen/FlagEmbedding/tree/master#model-list).

By default, FlagModel uses all available GPUs when encoding. Set `os.environ["CUDA_VISIBLE_DEVICES"]` to select specific GPUs.
You can also set `os.environ["CUDA_VISIBLE_DEVICES"]=""` to make all GPUs unavailable.
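
As a small sketch of the GPU selection described above: the device index `"0"` is only an example value, and setting the variable before importing the model library is a cautious convention assumed here, since the variable is read when CUDA is initialized.

```python
import os

# Restrict this process to GPU 0 only (example index; adjust to your machine).
# Use "" instead to hide every GPU and run on CPU.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# Import the model library only after the variable is set, for example:
# from FlagEmbedding import FlagModel
print(os.environ["CUDA_VISIBLE_DEVICES"])
```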


#### Using Sentence-Transformers

You can also use the `bge` models with [sentence-transformers](https://www.SBERT.net):

```
pip install -U sentence-transformers
```
```python
from sentence_transformers import SentenceTransformer
sentences_1 = ["样例数据-1", "样例数据-2"]
sentences_2 = ["样例数据-3", "样例数据-4"]
model = SentenceTransformer('BAAI/bge-large-zh-v1.5')
embeddings_1 = model.encode(sentences_1, normalize_embeddings=True)
embeddings_2 = model.encode(sentences_2, normalize_embeddings=True)
similarity = embeddings_1 @ embeddings_2.T
print(similarity)
```
For an s2p (short query to long passage) retrieval task,
each short query should start with an instruction (see the [Model List](https://github.com/FlagOpen/FlagEmbedding/tree/master#model-list) for the instructions).
The instruction is not needed for passages.
```python
from sentence_transformers import SentenceTransformer
queries = ['query_1', 'query_2']
passages = ["样例文档-1", "样例文档-2"]
instruction = "为这个句子生成表示以用于检索相关文章:"

model = SentenceTransformer('BAAI/bge-large-zh-v1.5')
q_embeddings = model.encode([instruction+q for q in queries], normalize_embeddings=True)
p_embeddings = model.encode(passages, normalize_embeddings=True)
scores = q_embeddings @ p_embeddings.T
```

#### Using Langchain

You can use `bge` in langchain like this:
```python
from langchain.embeddings import HuggingFaceBgeEmbeddings
model_name = "BAAI/bge-large-en-v1.5"
model_kwargs = {'device': 'cuda'}
encode_kwargs = {'normalize_embeddings': True} # set True to compute cosine similarity
# English models use the English instruction from the Model List;
# for a `-zh` model, use "为这个句子生成表示以用于检索相关文章:" instead.
query_instruction = "Represent this sentence for searching relevant passages: "
model = HuggingFaceBgeEmbeddings(
    model_name=model_name,
    model_kwargs=model_kwargs,
    encode_kwargs=encode_kwargs,
    query_instruction=query_instruction
)
model.query_instruction = query_instruction
```


#### Using HuggingFace Transformers

With the transformers package, you can use the model like this: first, pass your input through the transformer model, then select the last hidden state of the first token (i.e., [CLS]) as the sentence embedding.

```python
from transformers import AutoTokenizer, AutoModel
import torch
# Sentences we want sentence embeddings for
sentences = ["样例数据-1", "样例数据-2"]

# Load model from HuggingFace Hub
tokenizer = AutoTokenizer.from_pretrained('BAAI/bge-large-zh-v1.5')
model = AutoModel.from_pretrained('BAAI/bge-large-zh-v1.5')
model.eval()

# Tokenize sentences
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
# For an s2p (short query to long passage) retrieval task, add an instruction to each query (do not add it to passages)
# encoded_input = tokenizer([instruction + q for q in queries], padding=True, truncation=True, return_tensors='pt')

# Compute token embeddings
with torch.no_grad():
    model_output = model(**encoded_input)
    # Perform pooling. In this case, cls pooling.
    sentence_embeddings = model_output[0][:, 0]
# normalize embeddings
sentence_embeddings = torch.nn.functional.normalize(sentence_embeddings, p=2, dim=1)
print("Sentence embeddings:", sentence_embeddings)
```
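
To make the pooling and normalization steps concrete without loading a model, here is a dependency-free sketch on made-up token vectors. The helper names and all numbers are hypothetical. It also shows why the earlier examples can use a plain dot product: after L2 normalization, the dot product of two embeddings equals their cosine similarity.

```python
import math
from typing import List


def cls_pool(token_embeddings: List[List[float]]) -> List[float]:
    """Take the vector of the first token ([CLS]) as the sentence embedding."""
    return list(token_embeddings[0])


def l2_normalize(vector: List[float]) -> List[float]:
    """Scale the vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


# Made-up "last hidden states": one list of token vectors per sentence.
sentence_a = [[3.0, 4.0], [0.1, 0.2], [0.5, 0.5]]
sentence_b = [[4.0, 3.0], [0.9, 0.1]]

emb_a = l2_normalize(cls_pool(sentence_a))
emb_b = l2_normalize(cls_pool(sentence_b))
print(emb_a, emb_b, dot(emb_a, emb_b))
```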


#### Usage of the ONNX files

```python
from optimum.onnxruntime import ORTModelForFeatureExtraction  # type: ignore

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('BAAI/bge-large-en-v1.5')
model = AutoModel.from_pretrained('BAAI/bge-large-en-v1.5', revision="refs/pr/13")
model_ort = ORTModelForFeatureExtraction.from_pretrained('BAAI/bge-large-en-v1.5', revision="refs/pr/13",file_name="onnx/model.onnx")

# Sentences we want sentence embeddings for
sentences = ["样例数据-1", "样例数据-2"]

# Tokenize sentences
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
# For an s2p (short query to long passage) retrieval task, add an instruction to each query (do not add it to passages)
# encoded_input = tokenizer([instruction + q for q in queries], padding=True, truncation=True, return_tensors='pt')

model_output_ort = model_ort(**encoded_input)
# Compute token embeddings
with torch.no_grad():
    model_output = model(**encoded_input)

# model_output and model_output_ort should match up to numerical tolerance

```

#### Usage via infinity
It's also possible to deploy the ONNX files with the [infinity_emb](https://github.com/michaelfeil/infinity) pip package.
```python
import asyncio
from infinity_emb import AsyncEmbeddingEngine, EngineArgs

sentences = ["Embed this sentence via Infinity.", "Paris is in France."]
engine = AsyncEmbeddingEngine.from_args(
    EngineArgs(model_name_or_path = "BAAI/bge-large-en-v1.5", device="cpu", engine="optimum" # or engine="torch"
))

async def main():
    async with engine:
        embeddings, usage = await engine.embed(sentences=sentences)
asyncio.run(main())
```

### Usage for Reranker

Unlike an embedding model, a reranker takes a question and a document as input and directly outputs a similarity score instead of an embedding.
You can get a relevance score by passing a query and a passage to the reranker.
The reranker is optimized with a cross-entropy loss, so the relevance score is not bounded to a specific range.
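
If you need scores in a fixed range, for example to apply a cutoff, one common option is to pass the raw score through a sigmoid, which maps any real number into (0, 1) and preserves the ordering. This is a generic post-processing sketch, not a step this README prescribes; the raw scores below are made up.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


raw_scores: List[float] = [-5.6, 0.0, 5.8]  # made-up unbounded reranker outputs
bounded = [sigmoid(s) for s in raw_scores]
print(bounded)
```

Because the mapping is monotonic, ranking by `bounded` gives the same order as ranking by the raw scores.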


#### Using FlagEmbedding
```
pip install -U FlagEmbedding
```

Get relevance scores (higher scores indicate more relevance):
```python
from FlagEmbedding import FlagReranker
reranker = FlagReranker('BAAI/bge-reranker-large', use_fp16=True) # Setting use_fp16 to True speeds up computation with a slight performance degradation

score = reranker.compute_score(['query', 'passage'])
print(score)

scores = reranker.compute_score([['what is panda?', 'hi'], ['what is panda?', 'The giant panda (Ailuropoda melanoleuca), sometimes called a panda bear or simply panda, is a bear species endemic to China.']])
print(scores)
```


#### Using Huggingface transformers

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('BAAI/bge-reranker-large')
model = AutoModelForSequenceClassification.from_pretrained('BAAI/bge-reranker-large')
model.eval()

pairs = [['what is panda?', 'hi'], ['what is panda?', 'The giant panda (Ailuropoda melanoleuca), sometimes called a panda bear or simply panda, is a bear species endemic to China.']]
with torch.no_grad():
    inputs = tokenizer(pairs, padding=True, truncation=True, return_tensors='pt', max_length=512)
    scores = model(**inputs, return_dict=True).logits.view(-1, ).float()
    print(scores)
```

## Evaluation

`baai-general-embedding` models achieve **state-of-the-art performance on both the MTEB and C-MTEB leaderboards!**
For more details and evaluation tools, see our [scripts](https://github.com/FlagOpen/FlagEmbedding/blob/master/C_MTEB/README.md).

- **MTEB**:

| Model Name | Dimension | Sequence Length | Average (56) | Retrieval (15) | Clustering (11) | Pair Classification (3) | Reranking (4) | STS (10) | Summarization (1) | Classification (12) |
|:----:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| [BAAI/bge-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5) | 1024 | 512 | **64.23** | **54.29** | 46.08 | 87.12 | 60.03 | 83.11 | 31.61 | 75.97 |
| [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) | 768 | 512 | 63.55 | 53.25 | 45.77 | 86.55 | 58.86 | 82.4 | 31.07 | 75.53 |
| [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) | 384 | 512 | 62.17 | 51.68 | 43.82 | 84.92 | 58.36 | 81.59 | 30.12 | 74.14 |
| [bge-large-en](https://huggingface.co/BAAI/bge-large-en) | 1024 | 512 | 63.98 | 53.9 | 46.98 | 85.8 | 59.48 | 81.56 | 32.06 | 76.21 |
| [bge-base-en](https://huggingface.co/BAAI/bge-base-en) | 768 | 512 | 63.36 | 53.0 | 46.32 | 85.86 | 58.7 | 81.84 | 29.27 | 75.27 |
| [gte-large](https://huggingface.co/thenlper/gte-large) | 1024 | 512 | 63.13 | 52.22 | 46.84 | 85.00 | 59.13 | 83.35 | 31.66 | 73.33 |
| [gte-base](https://huggingface.co/thenlper/gte-base) | 768 | 512 | 62.39 | 51.14 | 46.2 | 84.57 | 58.61 | 82.3 | 31.17 | 73.01 |
| [e5-large-v2](https://huggingface.co/intfloat/e5-large-v2) | 1024 | 512 | 62.25 | 50.56 | 44.49 | 86.03 | 56.61 | 82.05 | 30.19 | 75.24 |
| [bge-small-en](https://huggingface.co/BAAI/bge-small-en) | 384 | 512 | 62.11 | 51.82 | 44.31 | 83.78 | 57.97 | 80.72 | 30.53 | 74.37 |
| [instructor-xl](https://huggingface.co/hkunlp/instructor-xl) | 768 | 512 | 61.79 | 49.26 | 44.74 | 86.62 | 57.29 | 83.06 | 32.32 | 61.79 |
| [e5-base-v2](https://huggingface.co/intfloat/e5-base-v2) | 768 | 512 | 61.5 | 50.29 | 43.80 | 85.73 | 55.91 | 81.05 | 30.28 | 73.84 |
| [gte-small](https://huggingface.co/thenlper/gte-small) | 384 | 512 | 61.36 | 49.46 | 44.89 | 83.54 | 57.7 | 82.07 | 30.42 | 72.31 |
| [text-embedding-ada-002](https://platform.openai.com/docs/guides/embeddings) | 1536 | 8192 | 60.99 | 49.25 | 45.9 | 84.89 | 56.32 | 80.97 | 30.8 | 70.93 |
| [e5-small-v2](https://huggingface.co/intfloat/e5-small-v2) | 384 | 512 | 59.93 | 49.04 | 39.92 | 84.67 | 54.32 | 80.39 | 31.16 | 72.94 |
| [sentence-t5-xxl](https://huggingface.co/sentence-transformers/sentence-t5-xxl) | 768 | 512 | 59.51 | 42.24 | 43.72 | 85.06 | 56.42 | 82.63 | 30.08 | 73.42 |
| [all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) | 768 | 514 | 57.78 | 43.81 | 43.69 | 83.04 | 59.36 | 80.28 | 27.49 | 65.07 |
| [sgpt-bloom-7b1-msmarco](https://huggingface.co/bigscience/sgpt-bloom-7b1-msmarco) | 4096 | 2048 | 57.59 | 48.22 | 38.93 | 81.9 | 55.65 | 77.74 | 33.6 | 66.19 |



- **C-MTEB**:
We created the C-MTEB benchmark for Chinese text embedding; it consists of 31 datasets from 6 tasks.
Please refer to [C_MTEB](https://github.com/FlagOpen/FlagEmbedding/blob/master/C_MTEB/README.md) for a detailed introduction.

| Model | Embedding dimension | Avg | Retrieval | STS | PairClassification | Classification | Reranking | Clustering |
|:-------------------------------|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|
| [**BAAI/bge-large-zh-v1.5**](https://huggingface.co/BAAI/bge-large-zh-v1.5) | 1024 | **64.53** | 70.46 | 56.25 | 81.6 | 69.13 | 65.84 | 48.99 |
| [BAAI/bge-base-zh-v1.5](https://huggingface.co/BAAI/bge-base-zh-v1.5) | 768 | 63.13 | 69.49 | 53.72 | 79.75 | 68.07 | 65.39 | 47.53 |
| [BAAI/bge-small-zh-v1.5](https://huggingface.co/BAAI/bge-small-zh-v1.5) | 512 | 57.82 | 61.77 | 49.11 | 70.41 | 63.96 | 60.92 | 44.18 |
| [BAAI/bge-large-zh](https://huggingface.co/BAAI/bge-large-zh) | 1024 | 64.20 | 71.53 | 54.98 | 78.94 | 68.32 | 65.11 | 48.39 |
| [bge-large-zh-noinstruct](https://huggingface.co/BAAI/bge-large-zh-noinstruct) | 1024 | 63.53 | 70.55 | 53 | 76.77 | 68.58 | 64.91 | 50.01 |
| [BAAI/bge-base-zh](https://huggingface.co/BAAI/bge-base-zh) | 768 | 62.96 | 69.53 | 54.12 | 77.5 | 67.07 | 64.91 | 47.63 |
| [multilingual-e5-large](https://huggingface.co/intfloat/multilingual-e5-large) | 1024 | 58.79 | 63.66 | 48.44 | 69.89 | 67.34 | 56.00 | 48.23 |
| [BAAI/bge-small-zh](https://huggingface.co/BAAI/bge-small-zh) | 512 | 58.27 | 63.07 | 49.45 | 70.35 | 63.64 | 61.48 | 45.09 |
| [m3e-base](https://huggingface.co/moka-ai/m3e-base) | 768 | 57.10 | 56.91 | 50.47 | 63.99 | 67.52 | 59.34 | 47.68 |
| [m3e-large](https://huggingface.co/moka-ai/m3e-large) | 1024 | 57.05 | 54.75 | 50.42 | 64.3 | 68.2 | 59.66 | 48.88 |
| [multilingual-e5-base](https://huggingface.co/intfloat/multilingual-e5-base) | 768 | 55.48 | 61.63 | 46.49 | 67.07 | 65.35 | 54.35 | 40.68 |
| [multilingual-e5-small](https://huggingface.co/intfloat/multilingual-e5-small) | 384 | 55.38 | 59.95 | 45.27 | 66.45 | 65.85 | 53.86 | 45.26 |
| [text-embedding-ada-002(OpenAI)](https://platform.openai.com/docs/guides/embeddings/what-are-embeddings) | 1536 | 53.02 | 52.0 | 43.35 | 69.56 | 64.31 | 54.28 | 45.68 |
| [luotuo](https://huggingface.co/silk-road/luotuo-bert-medium) | 1024 | 49.37 | 44.4 | 42.78 | 66.62 | 61 | 49.25 | 44.39 |
| [text2vec-base](https://huggingface.co/shibing624/text2vec-base-chinese) | 768 | 47.63 | 38.79 | 43.41 | 67.41 | 62.19 | 49.45 | 37.66 |
| [text2vec-large](https://huggingface.co/GanymedeNil/text2vec-large-chinese) | 1024 | 47.36 | 41.94 | 44.97 | 70.86 | 60.66 | 49.16 | 30.02 |


- **Reranking**:
See [C_MTEB](https://github.com/FlagOpen/FlagEmbedding/blob/master/C_MTEB/) for the evaluation script.

| Model | T2Reranking | T2RerankingZh2En\* | T2RerankingEn2Zh\* | MMarcoReranking | CMedQAv1 | CMedQAv2 | Avg |
|:-------------------------------|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|
| text2vec-base-multilingual | 64.66 | 62.94 | 62.51 | 14.37 | 48.46 | 48.6 | 50.26 |
| multilingual-e5-small | 65.62 | 60.94 | 56.41 | 29.91 | 67.26 | 66.54 | 57.78 |
| multilingual-e5-large | 64.55 | 61.61 | 54.28 | 28.6 | 67.42 | 67.92 | 57.4 |
| multilingual-e5-base | 64.21 | 62.13 | 54.68 | 29.5 | 66.23 | 66.98 | 57.29 |
| m3e-base | 66.03 | 62.74 | 56.07 | 17.51 | 77.05 | 76.76 | 59.36 |
| m3e-large | 66.13 | 62.72 | 56.1 | 16.46 | 77.76 | 78.27 | 59.57 |
| bge-base-zh-v1.5 | 66.49 | 63.25 | 57.02 | 29.74 | 80.47 | 84.88 | 63.64 |
| bge-large-zh-v1.5 | 65.74 | 63.39 | 57.03 | 28.74 | 83.45 | 85.44 | 63.97 |
| [BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base) | 67.28 | 63.95 | 60.45 | 35.46 | 81.26 | 84.1 | 65.42 |
| [BAAI/bge-reranker-large](https://huggingface.co/BAAI/bge-reranker-large) | 67.6 | 64.03 | 61.44 | 37.16 | 82.15 | 84.18 | 66.09 |

\* : T2RerankingZh2En and T2RerankingEn2Zh are cross-language retrieval tasks.

## Train

### BAAI Embedding

We pre-train the models using [retromae](https://github.com/staoxiao/RetroMAE) and train them on large-scale pair data using contrastive learning.
**You can fine-tune the embedding model on your data following our [examples](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune).**
We also provide a [pre-train example](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/pretrain).
Note that the goal of pre-training is to reconstruct the text, so the pre-trained model cannot be used for similarity calculation directly; it needs to be fine-tuned first.
For more training details for bge, see [baai_general_embedding](https://github.com/FlagOpen/FlagEmbedding/blob/master/FlagEmbedding/baai_general_embedding/README.md).



### BGE Reranker

A cross-encoder performs full attention over the input pair,
which is more accurate than an embedding model (i.e., a bi-encoder) but also more time-consuming.
Therefore, it can be used to re-rank the top-k documents returned by an embedding model.
We train the cross-encoder on multilingual pair data.
The data format is the same as for the embedding model, so you can fine-tune it easily following our [example](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/reranker).
For more details, please refer to [./FlagEmbedding/reranker/README.md](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/reranker).


## Contact
If you have any questions or suggestions related to this project, feel free to open an issue or pull request.
You can also email Shitao Xiao (stxiao@baai.ac.cn) and Zheng Liu (liuzheng@baai.ac.cn).


## Citation

If you find this repository useful, please consider giving it a star :star: and citing it:

```
@misc{bge_embedding,
      title={C-Pack: Packaged Resources To Advance General Chinese Embedding},
      author={Shitao Xiao and Zheng Liu and Peitian Zhang and Niklas Muennighoff},
      year={2023},
      eprint={2309.07597},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```

## License
FlagEmbedding is licensed under the [MIT License](https://github.com/FlagOpen/FlagEmbedding/blob/master/LICENSE). The released models can be used for commercial purposes free of charge.