---
tags:
- sentence-transformers
- feature-extraction
- sentence-similarity
- transformers
- mteb
model-index:
- name: bge-small-en-v1.5
  results:
  - task:
      type: Classification
    dataset:
      type: mteb/amazon_counterfactual
      name: MTEB AmazonCounterfactualClassification (en)
      config: en
      split: test
      revision: e8379541af4e31359cca9fbcf4b00f2671dba205
    metrics:
    - type: accuracy
      value: 73.79104477611939
    - type: ap
      value: 37.21923821573361
    - type: f1
      value: 68.0914945617093
  - task:
      type: Classification
    dataset:
      type: mteb/amazon_polarity
      name: MTEB AmazonPolarityClassification
      config: default
      split: test
      revision: e2d317d38cd51312af73b3d32a06d1a08b442046
    metrics:
    - type: accuracy
      value: 92.75377499999999
    - type: ap
      value: 89.46766124546022
    - type: f1
      value: 92.73884001331487
  - task:
      type: Classification
    dataset:
      type: mteb/amazon_reviews_multi
      name: MTEB AmazonReviewsClassification (en)
      config: en
      split: test
      revision: 1399c76144fd37290681b995c656ef9b2e06e26d
    metrics:
    - type: accuracy
      value: 46.986
    - type: f1
      value: 46.55936786727896
  - task:
      type: Retrieval
    dataset:
      type: arguana
      name: MTEB ArguAna
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 35.846000000000004
    - type: map_at_10
      value: 51.388
    - type: map_at_100
      value: 52.132999999999996
    - type: map_at_1000
      value: 52.141000000000005
    - type: map_at_3
      value: 47.037
    - type: map_at_5
      value: 49.579
    - type: mrr_at_1
      value: 36.558
    - type: mrr_at_10
      value: 51.658
    - type: mrr_at_100
      value: 52.402
    - type: mrr_at_1000
      value: 52.410000000000004
    - type: mrr_at_3
      value: 47.345
    - type: mrr_at_5
      value: 49.797999999999995
    - type: ndcg_at_1
      value: 35.846000000000004
    - type: ndcg_at_10
      value: 59.550000000000004
    - type: ndcg_at_100
      value: 62.596
    - type: ndcg_at_1000
      value: 62.759
    - type: ndcg_at_3
      value: 50.666999999999994
    - type: ndcg_at_5
      value: 55.228
    - type: precision_at_1
      value: 35.846000000000004
    - type: precision_at_10
      value: 8.542
    - type: precision_at_100
      value: 0.984
    - type: precision_at_1000
      value: 0.1
    - type: precision_at_3
      value: 20.389
    - type: precision_at_5
      value: 14.438
    - type: recall_at_1
      value: 35.846000000000004
    - type: recall_at_10
      value: 85.42
    - type: recall_at_100
      value: 98.43499999999999
    - type: recall_at_1000
      value: 99.644
    - type: recall_at_3
      value: 61.166
    - type: recall_at_5
      value: 72.191
  - task:
      type: Clustering
    dataset:
      type: mteb/arxiv-clustering-p2p
      name: MTEB ArxivClusteringP2P
      config: default
      split: test
      revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d
    metrics:
    - type: v_measure
      value: 47.402770198163594
  - task:
      type: Clustering
    dataset:
      type: mteb/arxiv-clustering-s2s
      name: MTEB ArxivClusteringS2S
      config: default
      split: test
      revision: f910caf1a6075f7329cdf8c1a6135696f37dbd53
    metrics:
    - type: v_measure
      value: 40.01545436974177
  - task:
      type: Reranking
    dataset:
      type: mteb/askubuntudupquestions-reranking
      name: MTEB AskUbuntuDupQuestions
      config: default
      split: test
      revision: 2000358ca161889fa9c082cb41daa8dcfb161a54
    metrics:
    - type: map
      value: 62.586465273207196
    - type: mrr
      value: 74.42169019038825
  - task:
      type: STS
    dataset:
      type: mteb/biosses-sts
      name: MTEB BIOSSES
      config: default
      split: test
      revision: d3fb88f8f02e40887cd149695127462bbcf29b4a
    metrics:
    - type: cos_sim_pearson
      value: 85.1891186537969
    - type: cos_sim_spearman
      value: 83.75492046087288
    - type: euclidean_pearson
      value: 84.11766204805357
    - type: euclidean_spearman
      value: 84.01456493126516
    - type: manhattan_pearson
      value: 84.2132950502772
    - type: manhattan_spearman
      value: 83.89227298813377
  - task:
      type: Classification
    dataset:
      type: mteb/banking77
      name: MTEB Banking77Classification
      config: default
      split: test
      revision: 0fd18e25b25c072e09e0d92ab615fda904d66300
    metrics:
    - type: accuracy
      value: 85.74025974025975
    - type: f1
      value: 85.71493566466381
  - task:
      type: Clustering
    dataset:
      type: mteb/biorxiv-clustering-p2p
      name: MTEB BiorxivClusteringP2P
      config: default
      split: test
      revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40
    metrics:
    - type: v_measure
      value: 38.467181385006434
  - task:
      type: Clustering
    dataset:
      type: mteb/biorxiv-clustering-s2s
      name: MTEB BiorxivClusteringS2S
      config: default
      split: test
      revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908
    metrics:
    - type: v_measure
      value: 34.719496037339056
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackAndroidRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 29.587000000000003
    - type: map_at_10
      value: 41.114
    - type: map_at_100
      value: 42.532
    - type: map_at_1000
      value: 42.661
    - type: map_at_3
      value: 37.483
    - type: map_at_5
      value: 39.652
    - type: mrr_at_1
      value: 36.338
    - type: mrr_at_10
      value: 46.763
    - type: mrr_at_100
      value: 47.393
    - type: mrr_at_1000
      value: 47.445
    - type: mrr_at_3
      value: 43.538
    - type: mrr_at_5
      value: 45.556000000000004
    - type: ndcg_at_1
      value: 36.338
    - type: ndcg_at_10
      value: 47.658
    - type: ndcg_at_100
      value: 52.824000000000005
    - type: ndcg_at_1000
      value: 54.913999999999994
    - type: ndcg_at_3
      value: 41.989
    - type: ndcg_at_5
      value: 44.944
    - type: precision_at_1
      value: 36.338
    - type: precision_at_10
      value: 9.156
    - type: precision_at_100
      value: 1.4789999999999999
    - type: precision_at_1000
      value: 0.196
    - type: precision_at_3
      value: 20.076
    - type: precision_at_5
      value: 14.85
    - type: recall_at_1
      value: 29.587000000000003
    - type: recall_at_10
      value: 60.746
    - type: recall_at_100
      value: 82.157
    - type: recall_at_1000
      value: 95.645
    - type: recall_at_3
      value: 44.821
    - type: recall_at_5
      value: 52.819
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackEnglishRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 30.239
    - type: map_at_10
      value: 39.989000000000004
    - type: map_at_100
      value: 41.196
    - type: map_at_1000
      value: 41.325
    - type: map_at_3
      value: 37.261
    - type: map_at_5
      value: 38.833
    - type: mrr_at_1
      value: 37.516
    - type: mrr_at_10
      value: 46.177
    - type: mrr_at_100
      value: 46.806
    - type: mrr_at_1000
      value: 46.849000000000004
    - type: mrr_at_3
      value: 44.002
    - type: mrr_at_5
      value: 45.34
    - type: ndcg_at_1
      value: 37.516
    - type: ndcg_at_10
      value: 45.586
    - type: ndcg_at_100
      value: 49.897000000000006
    - type: ndcg_at_1000
      value: 51.955
    - type: ndcg_at_3
      value: 41.684
    - type: ndcg_at_5
      value: 43.617
    - type: precision_at_1
      value: 37.516
    - type: precision_at_10
      value: 8.522
    - type: precision_at_100
      value: 1.374
    - type: precision_at_1000
      value: 0.184
    - type: precision_at_3
      value: 20.105999999999998
    - type: precision_at_5
      value: 14.152999999999999
    - type: recall_at_1
      value: 30.239
    - type: recall_at_10
      value: 55.03
    - type: recall_at_100
      value: 73.375
    - type: recall_at_1000
      value: 86.29599999999999
    - type: recall_at_3
      value: 43.269000000000005
    - type: recall_at_5
      value: 48.878
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackGamingRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 38.338
    - type: map_at_10
      value: 50.468999999999994
    - type: map_at_100
      value: 51.553000000000004
    - type: map_at_1000
      value: 51.608
    - type: map_at_3
      value: 47.107
    - type: map_at_5
      value: 49.101
    - type: mrr_at_1
      value: 44.201
    - type: mrr_at_10
      value: 54.057
    - type: mrr_at_100
      value: 54.764
    - type: mrr_at_1000
      value: 54.791000000000004
    - type: mrr_at_3
      value: 51.56699999999999
    - type: mrr_at_5
      value: 53.05
    - type: ndcg_at_1
      value: 44.201
    - type: ndcg_at_10
      value: 56.379000000000005
    - type: ndcg_at_100
      value: 60.645
    - type: ndcg_at_1000
      value: 61.73499999999999
    - type: ndcg_at_3
      value: 50.726000000000006
    - type: ndcg_at_5
      value: 53.58500000000001
    - type: precision_at_1
      value: 44.201
    - type: precision_at_10
      value: 9.141
    - type: precision_at_100
      value: 1.216
    - type: precision_at_1000
      value: 0.135
    - type: precision_at_3
      value: 22.654
    - type: precision_at_5
      value: 15.723999999999998
    - type: recall_at_1
      value: 38.338
    - type: recall_at_10
      value: 70.30499999999999
    - type: recall_at_100
      value: 88.77199999999999
    - type: recall_at_1000
      value: 96.49799999999999
    - type: recall_at_3
      value: 55.218
    - type: recall_at_5
      value: 62.104000000000006
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackGisRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 25.682
    - type: map_at_10
      value: 33.498
    - type: map_at_100
      value: 34.461000000000006
    - type: map_at_1000
      value: 34.544000000000004
    - type: map_at_3
      value: 30.503999999999998
    - type: map_at_5
      value: 32.216
    - type: mrr_at_1
      value: 27.683999999999997
    - type: mrr_at_10
      value: 35.467999999999996
    - type: mrr_at_100
      value: 36.32
    - type: mrr_at_1000
      value: 36.386
    - type: mrr_at_3
      value: 32.618
    - type: mrr_at_5
      value: 34.262
    - type: ndcg_at_1
      value: 27.683999999999997
    - type: ndcg_at_10
      value: 38.378
    - type: ndcg_at_100
      value: 43.288
    - type: ndcg_at_1000
      value: 45.413
    - type: ndcg_at_3
      value: 32.586
    - type: ndcg_at_5
      value: 35.499
    - type: precision_at_1
      value: 27.683999999999997
    - type: precision_at_10
      value: 5.864
    - type: precision_at_100
      value: 0.882
    - type: precision_at_1000
      value: 0.11
    - type: precision_at_3
      value: 13.446
    - type: precision_at_5
      value: 9.718
    - type: recall_at_1
      value: 25.682
    - type: recall_at_10
      value: 51.712
    - type: recall_at_100
      value: 74.446
    - type: recall_at_1000
      value: 90.472
    - type: recall_at_3
      value: 36.236000000000004
    - type: recall_at_5
      value: 43.234
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackMathematicaRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 16.073999999999998
    - type: map_at_10
      value: 24.352999999999998
    - type: map_at_100
      value: 25.438
    - type: map_at_1000
      value: 25.545
    - type: map_at_3
      value: 21.614
    - type: map_at_5
      value: 23.104
    - type: mrr_at_1
      value: 19.776
    - type: mrr_at_10
      value: 28.837000000000003
    - type: mrr_at_100
      value: 29.755
    - type: mrr_at_1000
      value: 29.817
    - type: mrr_at_3
      value: 26.201999999999998
    - type: mrr_at_5
      value: 27.714
    - type: ndcg_at_1
      value: 19.776
    - type: ndcg_at_10
      value: 29.701
    - type: ndcg_at_100
      value: 35.307
    - type: ndcg_at_1000
      value: 37.942
    - type: ndcg_at_3
      value: 24.764
    - type: ndcg_at_5
      value: 27.025
    - type: precision_at_1
      value: 19.776
    - type: precision_at_10
      value: 5.659
    - type: precision_at_100
      value: 0.971
    - type: precision_at_1000
      value: 0.133
    - type: precision_at_3
      value: 12.065
    - type: precision_at_5
      value: 8.905000000000001
    - type: recall_at_1
      value: 16.073999999999998
    - type: recall_at_10
      value: 41.647
    - type: recall_at_100
      value: 66.884
    - type: recall_at_1000
      value: 85.91499999999999
    - type: recall_at_3
      value: 27.916
    - type: recall_at_5
      value: 33.729
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackPhysicsRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 28.444999999999997
    - type: map_at_10
      value: 38.218999999999994
    - type: map_at_100
      value: 39.595
    - type: map_at_1000
      value: 39.709
    - type: map_at_3
      value: 35.586
    - type: map_at_5
      value: 36.895
    - type: mrr_at_1
      value: 34.841
    - type: mrr_at_10
      value: 44.106
    - type: mrr_at_100
      value: 44.98
    - type: mrr_at_1000
      value: 45.03
    - type: mrr_at_3
      value: 41.979
    - type: mrr_at_5
      value: 43.047999999999995
    - type: ndcg_at_1
      value: 34.841
    - type: ndcg_at_10
      value: 43.922
    - type: ndcg_at_100
      value: 49.504999999999995
    - type: ndcg_at_1000
      value: 51.675000000000004
    - type: ndcg_at_3
      value: 39.858
    - type: ndcg_at_5
      value: 41.408
    - type: precision_at_1
      value: 34.841
    - type: precision_at_10
      value: 7.872999999999999
    - type: precision_at_100
      value: 1.2449999999999999
    - type: precision_at_1000
      value: 0.161
    - type: precision_at_3
      value: 18.993
    - type: precision_at_5
      value: 13.032
    - type: recall_at_1
      value: 28.444999999999997
    - type: recall_at_10
      value: 54.984
    - type: recall_at_100
      value: 78.342
    - type: recall_at_1000
      value: 92.77
    - type: recall_at_3
      value: 42.842999999999996
    - type: recall_at_5
      value: 47.247
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackProgrammersRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 23.072
    - type: map_at_10
      value: 32.354
    - type: map_at_100
      value: 33.800000000000004
    - type: map_at_1000
      value: 33.908
    - type: map_at_3
      value: 29.232000000000003
    - type: map_at_5
      value: 31.049
    - type: mrr_at_1
      value: 29.110000000000003
    - type: mrr_at_10
      value: 38.03
    - type: mrr_at_100
      value: 39.032
    - type: mrr_at_1000
      value: 39.086999999999996
    - type: mrr_at_3
      value: 35.407
    - type: mrr_at_5
      value: 36.76
    - type: ndcg_at_1
      value: 29.110000000000003
    - type: ndcg_at_10
      value: 38.231
    - type: ndcg_at_100
      value: 44.425
    - type: ndcg_at_1000
      value: 46.771
    - type: ndcg_at_3
      value: 33.095
    - type: ndcg_at_5
      value: 35.459
    - type: precision_at_1
      value: 29.110000000000003
    - type: precision_at_10
      value: 7.215000000000001
    - type: precision_at_100
      value: 1.2109999999999999
    - type: precision_at_1000
      value: 0.157
    - type: precision_at_3
      value: 16.058
    - type: precision_at_5
      value: 11.644
    - type: recall_at_1
      value: 23.072
    - type: recall_at_10
      value: 50.285999999999994
    - type: recall_at_100
      value: 76.596
    - type: recall_at_1000
      value: 92.861
    - type: recall_at_3
      value: 35.702
    - type: recall_at_5
      value: 42.152
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 24.937916666666666
    - type: map_at_10
      value: 33.755250000000004
    - type: map_at_100
      value: 34.955999999999996
    - type: map_at_1000
      value: 35.070499999999996
    - type: map_at_3
      value: 30.98708333333333
    - type: map_at_5
      value: 32.51491666666666
    - type: mrr_at_1
      value: 29.48708333333333
    - type: mrr_at_10
      value: 37.92183333333334
    - type: mrr_at_100
      value: 38.76583333333333
    - type: mrr_at_1000
      value: 38.82466666666667
    - type: mrr_at_3
      value: 35.45125
    - type: mrr_at_5
      value: 36.827000000000005
    - type: ndcg_at_1
      value: 29.48708333333333
    - type: ndcg_at_10
      value: 39.05225
    - type: ndcg_at_100
      value: 44.25983333333334
    - type: ndcg_at_1000
      value: 46.568333333333335
    - type: ndcg_at_3
      value: 34.271583333333325
    - type: ndcg_at_5
      value: 36.483916666666666
    - type: precision_at_1
      value: 29.48708333333333
    - type: precision_at_10
      value: 6.865749999999999
    - type: precision_at_100
      value: 1.1195833333333332
    - type: precision_at_1000
      value: 0.15058333333333335
    - type: precision_at_3
      value: 15.742083333333333
    - type: precision_at_5
      value: 11.221916666666667
    - type: recall_at_1
      value: 24.937916666666666
    - type: recall_at_10
      value: 50.650416666666665
    - type: recall_at_100
      value: 73.55383333333334
    - type: recall_at_1000
      value: 89.61691666666667
    - type: recall_at_3
      value: 37.27808333333334
    - type: recall_at_5
      value: 42.99475
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackStatsRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 23.947
    - type: map_at_10
      value: 30.575000000000003
    - type: map_at_100
      value: 31.465
    - type: map_at_1000
      value: 31.558000000000003
    - type: map_at_3
      value: 28.814
    - type: map_at_5
      value: 29.738999999999997
    - type: mrr_at_1
      value: 26.994
    - type: mrr_at_10
      value: 33.415
    - type: mrr_at_100
      value: 34.18
    - type: mrr_at_1000
      value: 34.245
    - type: mrr_at_3
      value: 31.621
    - type: mrr_at_5
      value: 32.549
    - type: ndcg_at_1
      value: 26.994
    - type: ndcg_at_10
      value: 34.482
    - type: ndcg_at_100
      value: 38.915
    - type: ndcg_at_1000
      value: 41.355
    - type: ndcg_at_3
      value: 31.139
    - type: ndcg_at_5
      value: 32.589
    - type: precision_at_1
      value: 26.994
    - type: precision_at_10
      value: 5.322
    - type: precision_at_100
      value: 0.8160000000000001
    - type: precision_at_1000
      value: 0.11100000000000002
    - type: precision_at_3
      value: 13.344000000000001
    - type: precision_at_5
      value: 8.988
    - type: recall_at_1
      value: 23.947
    - type: recall_at_10
      value: 43.647999999999996
    - type: recall_at_100
      value: 63.851
    - type: recall_at_1000
      value: 82.0
    - type: recall_at_3
      value: 34.288000000000004
    - type: recall_at_5
      value: 38.117000000000004
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackTexRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 16.197
    - type: map_at_10
      value: 22.968
    - type: map_at_100
      value: 24.095
    - type: map_at_1000
      value: 24.217
    - type: map_at_3
      value: 20.771
    - type: map_at_5
      value: 21.995
    - type: mrr_at_1
      value: 19.511
    - type: mrr_at_10
      value: 26.55
    - type: mrr_at_100
      value: 27.500999999999998
    - type: mrr_at_1000
      value: 27.578999999999997
    - type: mrr_at_3
      value: 24.421
    - type: mrr_at_5
      value: 25.604
    - type: ndcg_at_1
      value: 19.511
    - type: ndcg_at_10
      value: 27.386
    - type: ndcg_at_100
      value: 32.828
    - type: ndcg_at_1000
      value: 35.739
    - type: ndcg_at_3
      value: 23.405
    - type: ndcg_at_5
      value: 25.255
    - type: precision_at_1
      value: 19.511
    - type: precision_at_10
      value: 5.017
    - type: precision_at_100
      value: 0.91
    - type: precision_at_1000
      value: 0.133
    - type: precision_at_3
      value: 11.023
    - type: precision_at_5
      value: 8.025
    - type: recall_at_1
      value: 16.197
    - type: recall_at_10
      value: 37.09
    - type: recall_at_100
      value: 61.778
    - type: recall_at_1000
      value: 82.56599999999999
    - type: recall_at_3
      value: 26.034000000000002
    - type: recall_at_5
      value: 30.762
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackUnixRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 25.41
    - type: map_at_10
      value: 33.655
    - type: map_at_100
      value: 34.892
    - type: map_at_1000
      value: 34.995
    - type: map_at_3
      value: 30.94
    - type: map_at_5
      value: 32.303
    - type: mrr_at_1
      value: 29.477999999999998
    - type: mrr_at_10
      value: 37.443
    - type: mrr_at_100
      value: 38.383
    - type: mrr_at_1000
      value: 38.440000000000005
    - type: mrr_at_3
      value: 34.949999999999996
    - type: mrr_at_5
      value: 36.228
    - type: ndcg_at_1
      value: 29.477999999999998
    - type: ndcg_at_10
      value: 38.769
    - type: ndcg_at_100
      value: 44.245000000000005
    - type: ndcg_at_1000
      value: 46.593
    - type: ndcg_at_3
      value: 33.623
    - type: ndcg_at_5
      value: 35.766
    - type: precision_at_1
      value: 29.477999999999998
    - type: precision_at_10
      value: 6.455
    - type: precision_at_100
      value: 1.032
    - type: precision_at_1000
      value: 0.135
    - type: precision_at_3
      value: 14.893999999999998
    - type: precision_at_5
      value: 10.485
    - type: recall_at_1
      value: 25.41
    - type: recall_at_10
      value: 50.669
    - type: recall_at_100
      value: 74.084
    - type: recall_at_1000
      value: 90.435
    - type: recall_at_3
      value: 36.679
    - type: recall_at_5
      value: 41.94
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackWebmastersRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 23.339
    - type: map_at_10
      value: 31.852000000000004
    - type: map_at_100
      value: 33.411
    - type: map_at_1000
      value: 33.62
    - type: map_at_3
      value: 28.929
    - type: map_at_5
      value: 30.542
    - type: mrr_at_1
      value: 28.063
    - type: mrr_at_10
      value: 36.301
    - type: mrr_at_100
      value: 37.288
    - type: mrr_at_1000
      value: 37.349
    - type: mrr_at_3
      value: 33.663
    - type: mrr_at_5
      value: 35.165
    - type: ndcg_at_1
      value: 28.063
    - type: ndcg_at_10
      value: 37.462
    - type: ndcg_at_100
      value: 43.620999999999995
    - type: ndcg_at_1000
      value: 46.211
    - type: ndcg_at_3
      value: 32.68
    - type: ndcg_at_5
      value: 34.981
    - type: precision_at_1
      value: 28.063
    - type: precision_at_10
      value: 7.1739999999999995
    - type: precision_at_100
      value: 1.486
    - type: precision_at_1000
      value: 0.23500000000000001
    - type: precision_at_3
      value: 15.217
    - type: precision_at_5
      value: 11.265
    - type: recall_at_1
      value: 23.339
    - type: recall_at_10
      value: 48.376999999999995
    - type: recall_at_100
      value: 76.053
    - type: recall_at_1000
      value: 92.455
    - type: recall_at_3
      value: 34.735
    - type: recall_at_5
      value: 40.71
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackWordpressRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 18.925
    - type: map_at_10
      value: 26.017000000000003
    - type: map_at_100
      value: 27.034000000000002
    - type: map_at_1000
      value: 27.156000000000002
    - type: map_at_3
      value: 23.604
    - type: map_at_5
      value: 24.75
    - type: mrr_at_1
      value: 20.333000000000002
    - type: mrr_at_10
      value: 27.915
    - type: mrr_at_100
      value: 28.788000000000004
    - type: mrr_at_1000
      value: 28.877999999999997
    - type: mrr_at_3
      value: 25.446999999999996
    - type: mrr_at_5
      value: 26.648
    - type: ndcg_at_1
      value: 20.333000000000002
    - type: ndcg_at_10
      value: 30.673000000000002
    - type: ndcg_at_100
      value: 35.618
    - type: ndcg_at_1000
      value: 38.517
    - type: ndcg_at_3
      value: 25.71
    - type: ndcg_at_5
      value: 27.679
    - type: precision_at_1
      value: 20.333000000000002
    - type: precision_at_10
      value: 4.9910000000000005
    - type: precision_at_100
      value: 0.8130000000000001
    - type: precision_at_1000
      value: 0.117
    - type: precision_at_3
      value: 11.029
    - type: precision_at_5
      value: 7.8740000000000006
    - type: recall_at_1
      value: 18.925
    - type: recall_at_10
      value: 43.311
    - type: recall_at_100
      value: 66.308
    - type: recall_at_1000
      value: 87.49
    - type: recall_at_3
      value: 29.596
    - type: recall_at_5
      value: 34.245
  - task:
      type: Retrieval
    dataset:
      type: climate-fever
      name: MTEB ClimateFEVER
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 13.714
    - type: map_at_10
      value: 23.194
    - type: map_at_100
      value: 24.976000000000003
    - type: map_at_1000
      value: 25.166
    - type: map_at_3
      value: 19.709
    - type: map_at_5
      value: 21.523999999999997
    - type: mrr_at_1
      value: 30.619000000000003
    - type: mrr_at_10
      value: 42.563
    - type: mrr_at_100
      value: 43.386
    - type: mrr_at_1000
      value: 43.423
    - type: mrr_at_3
      value: 39.555
    - type: mrr_at_5
      value: 41.268
    - type: ndcg_at_1
      value: 30.619000000000003
    - type: ndcg_at_10
      value: 31.836
    - type: ndcg_at_100
      value: 38.652
    - type: ndcg_at_1000
      value: 42.088
    - type: ndcg_at_3
      value: 26.733
    - type: ndcg_at_5
      value: 28.435
    - type: precision_at_1
      value: 30.619000000000003
    - type: precision_at_10
      value: 9.751999999999999
    - type: precision_at_100
      value: 1.71
    - type: precision_at_1000
      value: 0.23500000000000001
    - type: precision_at_3
      value: 19.935
    - type: precision_at_5
      value: 14.984
    - type: recall_at_1
      value: 13.714
    - type: recall_at_10
      value: 37.26
    - type: recall_at_100
      value: 60.546
    - type: recall_at_1000
      value: 79.899
    - type: recall_at_3
      value: 24.325
    - type: recall_at_5
      value: 29.725
  - task:
      type: Retrieval
    dataset:
      type: dbpedia-entity
      name: MTEB DBPedia
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 8.462
    - type: map_at_10
      value: 18.637
    - type: map_at_100
      value: 26.131999999999998
    - type: map_at_1000
      value: 27.607
    - type: map_at_3
      value: 13.333
    - type: map_at_5
      value: 15.654000000000002
    - type: mrr_at_1
      value: 66.25
    - type: mrr_at_10
      value: 74.32600000000001
    - type: mrr_at_100
      value: 74.60900000000001
    - type: mrr_at_1000
      value: 74.62
    - type: mrr_at_3
      value: 72.667
    - type: mrr_at_5
      value: 73.817
    - type: ndcg_at_1
      value: 53.87499999999999
    - type: ndcg_at_10
      value: 40.028999999999996
    - type: ndcg_at_100
      value: 44.199
    - type: ndcg_at_1000
      value: 51.629999999999995
    - type: ndcg_at_3
      value: 44.113
    - type: ndcg_at_5
      value: 41.731
    - type: precision_at_1
      value: 66.25
    - type: precision_at_10
      value: 31.900000000000002
    - type: precision_at_100
      value: 10.043000000000001
    - type: precision_at_1000
      value: 1.926
    - type: precision_at_3
      value: 47.417
    - type: precision_at_5
      value: 40.65
    - type: recall_at_1
      value: 8.462
    - type: recall_at_10
      value: 24.293
    - type: recall_at_100
      value: 50.146
    - type: recall_at_1000
      value: 74.034
    - type: recall_at_3
      value: 14.967
    - type: recall_at_5
      value: 18.682000000000002
  - task:
      type: Classification
    dataset:
      type: mteb/emotion
      name: MTEB EmotionClassification
      config: default
      split: test
      revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37
    metrics:
    - type: accuracy
      value: 47.84499999999999
    - type: f1
      value: 42.48106691979349
  - task:
      type: Retrieval
    dataset:
      type: fever
      name: MTEB FEVER
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 74.034
    - type: map_at_10
      value: 82.76
    - type: map_at_100
      value: 82.968
    - type: map_at_1000
      value: 82.98299999999999
    - type: map_at_3
      value: 81.768
    - type: map_at_5
      value: 82.418
    - type: mrr_at_1
      value: 80.048
    - type: mrr_at_10
      value: 87.64999999999999
    - type: mrr_at_100
      value: 87.712
    - type: mrr_at_1000
      value: 87.713
    - type: mrr_at_3
      value: 87.01100000000001
    - type: mrr_at_5
      value: 87.466
    - type: ndcg_at_1
      value: 80.048
    - type: ndcg_at_10
      value: 86.643
    - type: ndcg_at_100
      value: 87.361
    - type: ndcg_at_1000
      value: 87.606
    - type: ndcg_at_3
      value: 85.137
    - type: ndcg_at_5
      value: 86.016
    - type: precision_at_1
      value: 80.048
    - type: precision_at_10
      value: 10.372
    - type: precision_at_100
      value: 1.093
    - type: precision_at_1000
      value: 0.11299999999999999
    - type: precision_at_3
      value: 32.638
    - type: precision_at_5
      value: 20.177
    - type: recall_at_1
      value: 74.034
    - type: recall_at_10
      value: 93.769
    - type: recall_at_100
      value: 96.569
    - type: recall_at_1000
      value: 98.039
    - type: recall_at_3
      value: 89.581
    - type: recall_at_5
      value: 91.906
  - task:
      type: Retrieval
    dataset:
      type: fiqa
      name: MTEB FiQA2018
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 20.5
    - type: map_at_10
      value: 32.857
    - type: map_at_100
      value: 34.589
    - type: map_at_1000
      value: 34.778
    - type: map_at_3
      value: 29.160999999999998
    - type: map_at_5
      value: 31.033
    - type: mrr_at_1
      value: 40.123
    - type: mrr_at_10
      value: 48.776
    - type: mrr_at_100
      value: 49.495
    - type: mrr_at_1000
      value: 49.539
    - type: mrr_at_3
      value: 46.605000000000004
    - type: mrr_at_5
      value: 47.654
    - type: ndcg_at_1
      value: 40.123
    - type: ndcg_at_10
      value: 40.343
    - type: ndcg_at_100
      value: 46.56
    - type: ndcg_at_1000
      value: 49.777
    - type: ndcg_at_3
      value: 37.322
    - type: ndcg_at_5
      value: 37.791000000000004
    - type: precision_at_1
      value: 40.123
    - type: precision_at_10
      value: 11.08
    - type: precision_at_100
      value: 1.752
    - type: precision_at_1000
      value: 0.232
    - type: precision_at_3
      value: 24.897
    - type: precision_at_5
      value: 17.809
    - type: recall_at_1
      value: 20.5
    - type: recall_at_10
      value: 46.388
    - type: recall_at_100
      value: 69.552
    - type: recall_at_1000
      value: 89.011
    - type: recall_at_3
      value: 33.617999999999995
    - type: recall_at_5
      value: 38.211
  - task:
      type: Retrieval
    dataset:
      type: hotpotqa
      name: MTEB HotpotQA
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 39.135999999999996
    - type: map_at_10
      value: 61.673
    - type: map_at_100
      value: 62.562
    - type: map_at_1000
      value: 62.62
    - type: map_at_3
      value: 58.467999999999996
    - type: map_at_5
      value: 60.463
    - type: mrr_at_1
      value: 78.271
    - type: mrr_at_10
      value: 84.119
    - type: mrr_at_100
      value: 84.29299999999999
    - type: mrr_at_1000
      value: 84.299
    - type: mrr_at_3
      value: 83.18900000000001
    - type: mrr_at_5
      value: 83.786
    - type: ndcg_at_1
      value: 78.271
    - type: ndcg_at_10
      value: 69.935
    - type: ndcg_at_100
      value: 73.01299999999999
    - type: ndcg_at_1000
      value: 74.126
    - type: ndcg_at_3
      value: 65.388
    - type: ndcg_at_5
      value: 67.906
    - type: precision_at_1
      value: 78.271
    - type: precision_at_10
      value: 14.562
    - type: precision_at_100
      value: 1.6969999999999998
    - type: precision_at_1000
      value: 0.184
    - type: precision_at_3
      value: 41.841
    - type: precision_at_5
      value: 27.087
    - type: recall_at_1
      value: 39.135999999999996
    - type: recall_at_10
      value: 72.809
    - type: recall_at_100
      value: 84.86200000000001
    - type: recall_at_1000
      value: 92.208
    - type: recall_at_3
      value: 62.76199999999999
    - type: recall_at_5
      value: 67.718
1469 - task:
1470 type: Classification
1471 dataset:
1472 type: mteb/imdb
1473 name: MTEB ImdbClassification
1474 config: default
1475 split: test
1476 revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7
1477 metrics:
1478 - type: accuracy
1479 value: 90.60600000000001
1480 - type: ap
1481 value: 86.6579587804335
1482 - type: f1
1483 value: 90.5938853929307
1484 - task:
1485 type: Retrieval
1486 dataset:
1487 type: msmarco
1488 name: MTEB MSMARCO
1489 config: default
1490 split: dev
1491 revision: None
1492 metrics:
1493 - type: map_at_1
1494 value: 21.852
1495 - type: map_at_10
1496 value: 33.982
1497 - type: map_at_100
1498 value: 35.116
1499 - type: map_at_1000
1500 value: 35.167
1501 - type: map_at_3
1502 value: 30.134
1503 - type: map_at_5
1504 value: 32.340999999999994
1505 - type: mrr_at_1
1506 value: 22.479
1507 - type: mrr_at_10
1508 value: 34.594
1509 - type: mrr_at_100
1510 value: 35.672
1511 - type: mrr_at_1000
1512 value: 35.716
1513 - type: mrr_at_3
1514 value: 30.84
1515 - type: mrr_at_5
1516 value: 32.998
1517 - type: ndcg_at_1
1518 value: 22.493
1519 - type: ndcg_at_10
1520 value: 40.833000000000006
1521 - type: ndcg_at_100
1522 value: 46.357
1523 - type: ndcg_at_1000
1524 value: 47.637
1525 - type: ndcg_at_3
1526 value: 32.995999999999995
1527 - type: ndcg_at_5
1528 value: 36.919000000000004
1529 - type: precision_at_1
1530 value: 22.493
1531 - type: precision_at_10
1532 value: 6.465999999999999
1533 - type: precision_at_100
1534 value: 0.9249999999999999
1535 - type: precision_at_1000
1536 value: 0.104
1537 - type: precision_at_3
1538 value: 14.030999999999999
1539 - type: precision_at_5
1540 value: 10.413
1541 - type: recall_at_1
1542 value: 21.852
1543 - type: recall_at_10
1544 value: 61.934999999999995
1545 - type: recall_at_100
1546 value: 87.611
1547 - type: recall_at_1000
1548 value: 97.441
1549 - type: recall_at_3
1550 value: 40.583999999999996
1551 - type: recall_at_5
1552 value: 49.992999999999995
1553 - task:
1554 type: Classification
1555 dataset:
1556 type: mteb/mtop_domain
1557 name: MTEB MTOPDomainClassification (en)
1558 config: en
1559 split: test
1560 revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf
1561 metrics:
1562 - type: accuracy
1563 value: 93.36069311445507
1564 - type: f1
1565 value: 93.16456330371453
1566 - task:
1567 type: Classification
1568 dataset:
1569 type: mteb/mtop_intent
1570 name: MTEB MTOPIntentClassification (en)
1571 config: en
1572 split: test
1573 revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
1574 metrics:
1575 - type: accuracy
1576 value: 74.74692202462381
1577 - type: f1
1578 value: 58.17903579421599
1579 - task:
1580 type: Classification
1581 dataset:
1582 type: mteb/amazon_massive_intent
1583 name: MTEB MassiveIntentClassification (en)
1584 config: en
1585 split: test
1586 revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
1587 metrics:
1588 - type: accuracy
1589 value: 74.80833893745796
1590 - type: f1
1591 value: 72.70786592684664
1592 - task:
1593 type: Classification
1594 dataset:
1595 type: mteb/amazon_massive_scenario
1596 name: MTEB MassiveScenarioClassification (en)
1597 config: en
1598 split: test
1599 revision: 7d571f92784cd94a019292a1f45445077d0ef634
1600 metrics:
1601 - type: accuracy
1602 value: 78.69872225958305
1603 - type: f1
1604 value: 78.61626934504731
1605 - task:
1606 type: Clustering
1607 dataset:
1608 type: mteb/medrxiv-clustering-p2p
1609 name: MTEB MedrxivClusteringP2P
1610 config: default
1611 split: test
1612 revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73
1613 metrics:
1614 - type: v_measure
1615 value: 33.058658628717694
1616 - task:
1617 type: Clustering
1618 dataset:
1619 type: mteb/medrxiv-clustering-s2s
1620 name: MTEB MedrxivClusteringS2S
1621 config: default
1622 split: test
1623 revision: 35191c8c0dca72d8ff3efcd72aa802307d469663
1624 metrics:
1625 - type: v_measure
1626 value: 30.85561739360599
1627 - task:
1628 type: Reranking
1629 dataset:
1630 type: mteb/mind_small
1631 name: MTEB MindSmallReranking
1632 config: default
1633 split: test
1634 revision: 3bdac13927fdc888b903db93b2ffdbd90b295a69
1635 metrics:
1636 - type: map
1637 value: 31.290259910144385
1638 - type: mrr
1639 value: 32.44223046102856
1640 - task:
1641 type: Retrieval
1642 dataset:
1643 type: nfcorpus
1644 name: MTEB NFCorpus
1645 config: default
1646 split: test
1647 revision: None
1648 metrics:
1649 - type: map_at_1
1650 value: 5.288
1651 - type: map_at_10
1652 value: 12.267999999999999
1653 - type: map_at_100
1654 value: 15.557000000000002
1655 - type: map_at_1000
1656 value: 16.98
1657 - type: map_at_3
1658 value: 8.866
1659 - type: map_at_5
1660 value: 10.418
1661 - type: mrr_at_1
1662 value: 43.653
1663 - type: mrr_at_10
1664 value: 52.681
1665 - type: mrr_at_100
1666 value: 53.315999999999995
1667 - type: mrr_at_1000
1668 value: 53.357
1669 - type: mrr_at_3
1670 value: 51.393
1671 - type: mrr_at_5
1672 value: 51.903999999999996
1673 - type: ndcg_at_1
1674 value: 42.415000000000006
1675 - type: ndcg_at_10
1676 value: 34.305
1677 - type: ndcg_at_100
1678 value: 30.825999999999997
1679 - type: ndcg_at_1000
1680 value: 39.393
1681 - type: ndcg_at_3
1682 value: 39.931
1683 - type: ndcg_at_5
1684 value: 37.519999999999996
1685 - type: precision_at_1
1686 value: 43.653
1687 - type: precision_at_10
1688 value: 25.728
1689 - type: precision_at_100
1690 value: 7.932
1691 - type: precision_at_1000
1692 value: 2.07
1693 - type: precision_at_3
1694 value: 38.184000000000005
1695 - type: precision_at_5
1696 value: 32.879000000000005
1697 - type: recall_at_1
1698 value: 5.288
1699 - type: recall_at_10
1700 value: 16.195
1701 - type: recall_at_100
1702 value: 31.135
1703 - type: recall_at_1000
1704 value: 61.531000000000006
1705 - type: recall_at_3
1706 value: 10.313
1707 - type: recall_at_5
1708 value: 12.754999999999999
1709 - task:
1710 type: Retrieval
1711 dataset:
1712 type: nq
1713 name: MTEB NQ
1714 config: default
1715 split: test
1716 revision: None
1717 metrics:
1718 - type: map_at_1
1719 value: 28.216
1720 - type: map_at_10
1721 value: 42.588
1722 - type: map_at_100
1723 value: 43.702999999999996
1724 - type: map_at_1000
1725 value: 43.739
1726 - type: map_at_3
1727 value: 38.177
1728 - type: map_at_5
1729 value: 40.754000000000005
1730 - type: mrr_at_1
1731 value: 31.866
1732 - type: mrr_at_10
1733 value: 45.189
1734 - type: mrr_at_100
1735 value: 46.056000000000004
1736 - type: mrr_at_1000
1737 value: 46.081
1738 - type: mrr_at_3
1739 value: 41.526999999999994
1740 - type: mrr_at_5
1741 value: 43.704
1742 - type: ndcg_at_1
1743 value: 31.837
1744 - type: ndcg_at_10
1745 value: 50.178
1746 - type: ndcg_at_100
1747 value: 54.98800000000001
1748 - type: ndcg_at_1000
1749 value: 55.812
1750 - type: ndcg_at_3
1751 value: 41.853
1752 - type: ndcg_at_5
1753 value: 46.153
1754 - type: precision_at_1
1755 value: 31.837
1756 - type: precision_at_10
1757 value: 8.43
1758 - type: precision_at_100
1759 value: 1.1119999999999999
1760 - type: precision_at_1000
1761 value: 0.11900000000000001
1762 - type: precision_at_3
1763 value: 19.023
1764 - type: precision_at_5
1765 value: 13.911000000000001
1766 - type: recall_at_1
1767 value: 28.216
1768 - type: recall_at_10
1769 value: 70.8
1770 - type: recall_at_100
1771 value: 91.857
1772 - type: recall_at_1000
1773 value: 97.941
1774 - type: recall_at_3
1775 value: 49.196
1776 - type: recall_at_5
1777 value: 59.072
1778 - task:
1779 type: Retrieval
1780 dataset:
1781 type: quora
1782 name: MTEB QuoraRetrieval
1783 config: default
1784 split: test
1785 revision: None
1786 metrics:
1787 - type: map_at_1
1788 value: 71.22800000000001
1789 - type: map_at_10
1790 value: 85.115
1791 - type: map_at_100
1792 value: 85.72
1793 - type: map_at_1000
1794 value: 85.737
1795 - type: map_at_3
1796 value: 82.149
1797 - type: map_at_5
1798 value: 84.029
1799 - type: mrr_at_1
1800 value: 81.96
1801 - type: mrr_at_10
1802 value: 88.00200000000001
1803 - type: mrr_at_100
1804 value: 88.088
1805 - type: mrr_at_1000
1806 value: 88.089
1807 - type: mrr_at_3
1808 value: 87.055
1809 - type: mrr_at_5
1810 value: 87.715
1811 - type: ndcg_at_1
1812 value: 82.01
1813 - type: ndcg_at_10
1814 value: 88.78
1815 - type: ndcg_at_100
1816 value: 89.91
1817 - type: ndcg_at_1000
1818 value: 90.013
1819 - type: ndcg_at_3
1820 value: 85.957
1821 - type: ndcg_at_5
1822 value: 87.56
1823 - type: precision_at_1
1824 value: 82.01
1825 - type: precision_at_10
1826 value: 13.462
1827 - type: precision_at_100
1828 value: 1.528
1829 - type: precision_at_1000
1830 value: 0.157
1831 - type: precision_at_3
1832 value: 37.553
1833 - type: precision_at_5
1834 value: 24.732000000000003
1835 - type: recall_at_1
1836 value: 71.22800000000001
1837 - type: recall_at_10
1838 value: 95.69
1839 - type: recall_at_100
1840 value: 99.531
1841 - type: recall_at_1000
1842 value: 99.98
1843 - type: recall_at_3
1844 value: 87.632
1845 - type: recall_at_5
1846 value: 92.117
1847 - task:
1848 type: Clustering
1849 dataset:
1850 type: mteb/reddit-clustering
1851 name: MTEB RedditClustering
1852 config: default
1853 split: test
1854 revision: 24640382cdbf8abc73003fb0fa6d111a705499eb
1855 metrics:
1856 - type: v_measure
1857 value: 52.31768034366916
1858 - task:
1859 type: Clustering
1860 dataset:
1861 type: mteb/reddit-clustering-p2p
1862 name: MTEB RedditClusteringP2P
1863 config: default
1864 split: test
1865 revision: 282350215ef01743dc01b456c7f5241fa8937f16
1866 metrics:
1867 - type: v_measure
1868 value: 60.640266772723606
1869 - task:
1870 type: Retrieval
1871 dataset:
1872 type: scidocs
1873 name: MTEB SCIDOCS
1874 config: default
1875 split: test
1876 revision: None
1877 metrics:
1878 - type: map_at_1
1879 value: 4.7780000000000005
1880 - type: map_at_10
1881 value: 12.299
1882 - type: map_at_100
1883 value: 14.363000000000001
1884 - type: map_at_1000
1885 value: 14.71
1886 - type: map_at_3
1887 value: 8.738999999999999
1888 - type: map_at_5
1889 value: 10.397
1890 - type: mrr_at_1
1891 value: 23.599999999999998
1892 - type: mrr_at_10
1893 value: 34.845
1894 - type: mrr_at_100
1895 value: 35.916
1896 - type: mrr_at_1000
1897 value: 35.973
1898 - type: mrr_at_3
1899 value: 31.7
1900 - type: mrr_at_5
1901 value: 33.535
1902 - type: ndcg_at_1
1903 value: 23.599999999999998
1904 - type: ndcg_at_10
1905 value: 20.522000000000002
1906 - type: ndcg_at_100
1907 value: 28.737000000000002
1908 - type: ndcg_at_1000
1909 value: 34.596
1910 - type: ndcg_at_3
1911 value: 19.542
1912 - type: ndcg_at_5
1913 value: 16.958000000000002
1914 - type: precision_at_1
1915 value: 23.599999999999998
1916 - type: precision_at_10
1917 value: 10.67
1918 - type: precision_at_100
1919 value: 2.259
1920 - type: precision_at_1000
1921 value: 0.367
1922 - type: precision_at_3
1923 value: 18.333
1924 - type: precision_at_5
1925 value: 14.879999999999999
1926 - type: recall_at_1
1927 value: 4.7780000000000005
1928 - type: recall_at_10
1929 value: 21.617
1930 - type: recall_at_100
1931 value: 45.905
1932 - type: recall_at_1000
1933 value: 74.42
1934 - type: recall_at_3
1935 value: 11.148
1936 - type: recall_at_5
1937 value: 15.082999999999998
1938 - task:
1939 type: STS
1940 dataset:
1941 type: mteb/sickr-sts
1942 name: MTEB SICK-R
1943 config: default
1944 split: test
1945 revision: a6ea5a8cab320b040a23452cc28066d9beae2cee
1946 metrics:
1947 - type: cos_sim_pearson
1948 value: 83.22372750297885
1949 - type: cos_sim_spearman
1950 value: 79.40972617119405
1951 - type: euclidean_pearson
1952 value: 80.6101072020434
1953 - type: euclidean_spearman
1954 value: 79.53844217225202
1955 - type: manhattan_pearson
1956 value: 80.57265975286111
1957 - type: manhattan_spearman
1958 value: 79.46335611792958
1959 - task:
1960 type: STS
1961 dataset:
1962 type: mteb/sts12-sts
1963 name: MTEB STS12
1964 config: default
1965 split: test
1966 revision: a0d554a64d88156834ff5ae9920b964011b16384
1967 metrics:
1968 - type: cos_sim_pearson
1969 value: 85.43713315520749
1970 - type: cos_sim_spearman
1971 value: 77.44128693329532
1972 - type: euclidean_pearson
1973 value: 81.63869928101123
1974 - type: euclidean_spearman
1975 value: 77.29512977961515
1976 - type: manhattan_pearson
1977 value: 81.63704185566183
1978 - type: manhattan_spearman
1979 value: 77.29909412738657
1980 - task:
1981 type: STS
1982 dataset:
1983 type: mteb/sts13-sts
1984 name: MTEB STS13
1985 config: default
1986 split: test
1987 revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca
1988 metrics:
1989 - type: cos_sim_pearson
1990 value: 81.59451537860527
1991 - type: cos_sim_spearman
1992 value: 82.97994638856723
1993 - type: euclidean_pearson
1994 value: 82.89478688288412
1995 - type: euclidean_spearman
1996 value: 83.58740751053104
1997 - type: manhattan_pearson
1998 value: 82.69140840941608
1999 - type: manhattan_spearman
2000 value: 83.33665956040555
2001 - task:
2002 type: STS
2003 dataset:
2004 type: mteb/sts14-sts
2005 name: MTEB STS14
2006 config: default
2007 split: test
2008 revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375
2009 metrics:
2010 - type: cos_sim_pearson
2011 value: 82.00756527711764
2012 - type: cos_sim_spearman
2013 value: 81.83560996841379
2014 - type: euclidean_pearson
2015 value: 82.07684151976518
2016 - type: euclidean_spearman
2017 value: 82.00913052060511
2018 - type: manhattan_pearson
2019 value: 82.05690778488794
2020 - type: manhattan_spearman
2021 value: 82.02260252019525
2022 - task:
2023 type: STS
2024 dataset:
2025 type: mteb/sts15-sts
2026 name: MTEB STS15
2027 config: default
2028 split: test
2029 revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3
2030 metrics:
2031 - type: cos_sim_pearson
2032 value: 86.13710262895447
2033 - type: cos_sim_spearman
2034 value: 87.26412811156248
2035 - type: euclidean_pearson
2036 value: 86.94151453230228
2037 - type: euclidean_spearman
2038 value: 87.5363796699571
2039 - type: manhattan_pearson
2040 value: 86.86989424083748
2041 - type: manhattan_spearman
2042 value: 87.47315940781353
2043 - task:
2044 type: STS
2045 dataset:
2046 type: mteb/sts16-sts
2047 name: MTEB STS16
2048 config: default
2049 split: test
2050 revision: 4d8694f8f0e0100860b497b999b3dbed754a0513
2051 metrics:
2052 - type: cos_sim_pearson
2053 value: 83.0230597603627
2054 - type: cos_sim_spearman
2055 value: 84.93344499318864
2056 - type: euclidean_pearson
2057 value: 84.23754743431141
2058 - type: euclidean_spearman
2059 value: 85.09707376597099
2060 - type: manhattan_pearson
2061 value: 84.04325160987763
2062 - type: manhattan_spearman
2063 value: 84.89353071339909
2064 - task:
2065 type: STS
2066 dataset:
2067 type: mteb/sts17-crosslingual-sts
2068 name: MTEB STS17 (en-en)
2069 config: en-en
2070 split: test
2071 revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d
2072 metrics:
2073 - type: cos_sim_pearson
2074 value: 86.75620824563921
2075 - type: cos_sim_spearman
2076 value: 87.15065513706398
2077 - type: euclidean_pearson
2078 value: 88.26281533633521
2079 - type: euclidean_spearman
2080 value: 87.51963738643983
2081 - type: manhattan_pearson
2082 value: 88.25599267618065
2083 - type: manhattan_spearman
2084 value: 87.58048736047483
2085 - task:
2086 type: STS
2087 dataset:
2088 type: mteb/sts22-crosslingual-sts
2089 name: MTEB STS22 (en)
2090 config: en
2091 split: test
2092 revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
2093 metrics:
2094 - type: cos_sim_pearson
2095 value: 64.74645319195137
2096 - type: cos_sim_spearman
2097 value: 65.29996325037214
2098 - type: euclidean_pearson
2099 value: 67.04297794086443
2100 - type: euclidean_spearman
2101 value: 65.43841726694343
2102 - type: manhattan_pearson
2103 value: 67.39459955690904
2104 - type: manhattan_spearman
2105 value: 65.92864704413651
2106 - task:
2107 type: STS
2108 dataset:
2109 type: mteb/stsbenchmark-sts
2110 name: MTEB STSBenchmark
2111 config: default
2112 split: test
2113 revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831
2114 metrics:
2115 - type: cos_sim_pearson
2116 value: 84.31291020270801
2117 - type: cos_sim_spearman
2118 value: 85.86473738688068
2119 - type: euclidean_pearson
2120 value: 85.65537275064152
2121 - type: euclidean_spearman
2122 value: 86.13087454209642
2123 - type: manhattan_pearson
2124 value: 85.43946955047609
2125 - type: manhattan_spearman
2126 value: 85.91568175344916
2127 - task:
2128 type: Reranking
2129 dataset:
2130 type: mteb/scidocs-reranking
2131 name: MTEB SciDocsRR
2132 config: default
2133 split: test
2134 revision: d3c5e1fc0b855ab6097bf1cda04dd73947d7caab
2135 metrics:
2136 - type: map
2137 value: 85.93798118350695
2138 - type: mrr
2139 value: 95.93536274908824
2140 - task:
2141 type: Retrieval
2142 dataset:
2143 type: scifact
2144 name: MTEB SciFact
2145 config: default
2146 split: test
2147 revision: None
2148 metrics:
2149 - type: map_at_1
2150 value: 57.594
2151 - type: map_at_10
2152 value: 66.81899999999999
2153 - type: map_at_100
2154 value: 67.368
2155 - type: map_at_1000
2156 value: 67.4
2157 - type: map_at_3
2158 value: 64.061
2159 - type: map_at_5
2160 value: 65.47
2161 - type: mrr_at_1
2162 value: 60.667
2163 - type: mrr_at_10
2164 value: 68.219
2165 - type: mrr_at_100
2166 value: 68.655
2167 - type: mrr_at_1000
2168 value: 68.684
2169 - type: mrr_at_3
2170 value: 66.22200000000001
2171 - type: mrr_at_5
2172 value: 67.289
2173 - type: ndcg_at_1
2174 value: 60.667
2175 - type: ndcg_at_10
2176 value: 71.275
2177 - type: ndcg_at_100
2178 value: 73.642
2179 - type: ndcg_at_1000
2180 value: 74.373
2181 - type: ndcg_at_3
2182 value: 66.521
2183 - type: ndcg_at_5
2184 value: 68.581
2185 - type: precision_at_1
2186 value: 60.667
2187 - type: precision_at_10
2188 value: 9.433
2189 - type: precision_at_100
2190 value: 1.0699999999999998
2191 - type: precision_at_1000
2192 value: 0.11299999999999999
2193 - type: precision_at_3
2194 value: 25.556
2195 - type: precision_at_5
2196 value: 16.8
2197 - type: recall_at_1
2198 value: 57.594
2199 - type: recall_at_10
2200 value: 83.622
2201 - type: recall_at_100
2202 value: 94.167
2203 - type: recall_at_1000
2204 value: 99.667
2205 - type: recall_at_3
2206 value: 70.64399999999999
2207 - type: recall_at_5
2208 value: 75.983
2209 - task:
2210 type: PairClassification
2211 dataset:
2212 type: mteb/sprintduplicatequestions-pairclassification
2213 name: MTEB SprintDuplicateQuestions
2214 config: default
2215 split: test
2216 revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46
2217 metrics:
2218 - type: cos_sim_accuracy
2219 value: 99.85841584158416
2220 - type: cos_sim_ap
2221 value: 96.66996142314342
2222 - type: cos_sim_f1
2223 value: 92.83208020050125
2224 - type: cos_sim_precision
2225 value: 93.06532663316584
2226 - type: cos_sim_recall
2227 value: 92.60000000000001
2228 - type: dot_accuracy
2229 value: 99.85841584158416
2230 - type: dot_ap
2231 value: 96.6775307676576
2232 - type: dot_f1
2233 value: 92.69289729177312
2234 - type: dot_precision
2235 value: 94.77533960292581
2236 - type: dot_recall
2237 value: 90.7
2238 - type: euclidean_accuracy
2239 value: 99.86138613861387
2240 - type: euclidean_ap
2241 value: 96.6338454403108
2242 - type: euclidean_f1
2243 value: 92.92214357937311
2244 - type: euclidean_precision
2245 value: 93.96728016359918
2246 - type: euclidean_recall
2247 value: 91.9
2248 - type: manhattan_accuracy
2249 value: 99.86237623762376
2250 - type: manhattan_ap
2251 value: 96.60370449645053
2252 - type: manhattan_f1
2253 value: 92.91177970423253
2254 - type: manhattan_precision
2255 value: 94.7970863683663
2256 - type: manhattan_recall
2257 value: 91.10000000000001
2258 - type: max_accuracy
2259 value: 99.86237623762376
2260 - type: max_ap
2261 value: 96.6775307676576
2262 - type: max_f1
2263 value: 92.92214357937311
2264 - task:
2265 type: Clustering
2266 dataset:
2267 type: mteb/stackexchange-clustering
2268 name: MTEB StackExchangeClustering
2269 config: default
2270 split: test
2271 revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259
2272 metrics:
2273 - type: v_measure
2274 value: 60.77977058695198
2275 - task:
2276 type: Clustering
2277 dataset:
2278 type: mteb/stackexchange-clustering-p2p
2279 name: MTEB StackExchangeClusteringP2P
2280 config: default
2281 split: test
2282 revision: 815ca46b2622cec33ccafc3735d572c266efdb44
2283 metrics:
2284 - type: v_measure
2285 value: 35.2725272535638
2286 - task:
2287 type: Reranking
2288 dataset:
2289 type: mteb/stackoverflowdupquestions-reranking
2290 name: MTEB StackOverflowDupQuestions
2291 config: default
2292 split: test
2293 revision: e185fbe320c72810689fc5848eb6114e1ef5ec69
2294 metrics:
2295 - type: map
2296 value: 53.64052466362125
2297 - type: mrr
2298 value: 54.533067014684654
2299 - task:
2300 type: Summarization
2301 dataset:
2302 type: mteb/summeval
2303 name: MTEB SummEval
2304 config: default
2305 split: test
2306 revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c
2307 metrics:
2308 - type: cos_sim_pearson
2309 value: 30.677624219206578
2310 - type: cos_sim_spearman
2311 value: 30.121368518123447
2312 - type: dot_pearson
2313 value: 30.69870088041608
2314 - type: dot_spearman
2315 value: 29.61284927093751
2316 - task:
2317 type: Retrieval
2318 dataset:
2319 type: trec-covid
2320 name: MTEB TRECCOVID
2321 config: default
2322 split: test
2323 revision: None
2324 metrics:
2325 - type: map_at_1
2326 value: 0.22
2327 - type: map_at_10
2328 value: 1.855
2329 - type: map_at_100
2330 value: 9.885
2331 - type: map_at_1000
2332 value: 23.416999999999998
2333 - type: map_at_3
2334 value: 0.637
2335 - type: map_at_5
2336 value: 1.024
2337 - type: mrr_at_1
2338 value: 88.0
2339 - type: mrr_at_10
2340 value: 93.067
2341 - type: mrr_at_100
2342 value: 93.067
2343 - type: mrr_at_1000
2344 value: 93.067
2345 - type: mrr_at_3
2346 value: 92.667
2347 - type: mrr_at_5
2348 value: 93.067
2349 - type: ndcg_at_1
2350 value: 82.0
2351 - type: ndcg_at_10
2352 value: 75.899
2353 - type: ndcg_at_100
2354 value: 55.115
2355 - type: ndcg_at_1000
2356 value: 48.368
2357 - type: ndcg_at_3
2358 value: 79.704
2359 - type: ndcg_at_5
2360 value: 78.39699999999999
2361 - type: precision_at_1
2362 value: 88.0
2363 - type: precision_at_10
2364 value: 79.60000000000001
2365 - type: precision_at_100
2366 value: 56.06
2367 - type: precision_at_1000
2368 value: 21.206
2369 - type: precision_at_3
2370 value: 84.667
2371 - type: precision_at_5
2372 value: 83.2
2373 - type: recall_at_1
2374 value: 0.22
2375 - type: recall_at_10
2376 value: 2.078
2377 - type: recall_at_100
2378 value: 13.297
2379 - type: recall_at_1000
2380 value: 44.979
2381 - type: recall_at_3
2382 value: 0.6689999999999999
2383 - type: recall_at_5
2384 value: 1.106
2385 - task:
2386 type: Retrieval
2387 dataset:
2388 type: webis-touche2020
2389 name: MTEB Touche2020
2390 config: default
2391 split: test
2392 revision: None
2393 metrics:
2394 - type: map_at_1
2395 value: 2.258
2396 - type: map_at_10
2397 value: 10.439
2398 - type: map_at_100
2399 value: 16.89
2400 - type: map_at_1000
2401 value: 18.407999999999998
2402 - type: map_at_3
2403 value: 5.668
2404 - type: map_at_5
2405 value: 7.718
2406 - type: mrr_at_1
2407 value: 32.653
2408 - type: mrr_at_10
2409 value: 51.159
2410 - type: mrr_at_100
2411 value: 51.714000000000006
2412 - type: mrr_at_1000
2413 value: 51.714000000000006
2414 - type: mrr_at_3
2415 value: 47.959
2416 - type: mrr_at_5
2417 value: 50.407999999999994
2418 - type: ndcg_at_1
2419 value: 29.592000000000002
2420 - type: ndcg_at_10
2421 value: 26.037
2422 - type: ndcg_at_100
2423 value: 37.924
2424 - type: ndcg_at_1000
2425 value: 49.126999999999995
2426 - type: ndcg_at_3
2427 value: 30.631999999999998
2428 - type: ndcg_at_5
2429 value: 28.571
2430 - type: precision_at_1
2431 value: 32.653
2432 - type: precision_at_10
2433 value: 22.857
2434 - type: precision_at_100
2435 value: 7.754999999999999
2436 - type: precision_at_1000
2437 value: 1.529
2438 - type: precision_at_3
2439 value: 34.014
2440 - type: precision_at_5
2441 value: 29.796
2442 - type: recall_at_1
2443 value: 2.258
2444 - type: recall_at_10
2445 value: 16.554
2446 - type: recall_at_100
2447 value: 48.439
2448 - type: recall_at_1000
2449 value: 82.80499999999999
2450 - type: recall_at_3
2451 value: 7.283
2452 - type: recall_at_5
2453 value: 10.732
2454 - task:
2455 type: Classification
2456 dataset:
2457 type: mteb/toxic_conversations_50k
2458 name: MTEB ToxicConversationsClassification
2459 config: default
2460 split: test
2461 revision: d7c0de2777da35d6aae2200a62c6e0e5af397c4c
2462 metrics:
2463 - type: accuracy
2464 value: 69.8858
2465 - type: ap
2466 value: 13.835684144362109
2467 - type: f1
2468 value: 53.803351693244586
2469 - task:
2470 type: Classification
2471 dataset:
2472 type: mteb/tweet_sentiment_extraction
2473 name: MTEB TweetSentimentExtractionClassification
2474 config: default
2475 split: test
2476 revision: d604517c81ca91fe16a244d1248fc021f9ecee7a
2477 metrics:
2478 - type: accuracy
2479 value: 60.50650820599886
2480 - type: f1
2481 value: 60.84357825979259
2482 - task:
2483 type: Clustering
2484 dataset:
2485 type: mteb/twentynewsgroups-clustering
2486 name: MTEB TwentyNewsgroupsClustering
2487 config: default
2488 split: test
2489 revision: 6125ec4e24fa026cec8a478383ee943acfbd5449
2490 metrics:
2491 - type: v_measure
2492 value: 48.52131044852134
2493 - task:
2494 type: PairClassification
2495 dataset:
2496 type: mteb/twittersemeval2015-pairclassification
2497 name: MTEB TwitterSemEval2015
2498 config: default
2499 split: test
2500 revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1
2501 metrics:
2502 - type: cos_sim_accuracy
2503 value: 85.59337187816654
2504 - type: cos_sim_ap
2505 value: 73.23925826533437
2506 - type: cos_sim_f1
2507 value: 67.34693877551021
2508 - type: cos_sim_precision
2509 value: 62.40432237730752
2510 - type: cos_sim_recall
2511 value: 73.13984168865434
2512 - type: dot_accuracy
2513 value: 85.31322644096085
2514 - type: dot_ap
2515 value: 72.30723963807422
2516 - type: dot_f1
2517 value: 66.47051612112296
2518 - type: dot_precision
2519 value: 62.0792305930845
2520 - type: dot_recall
2521 value: 71.53034300791556
2522 - type: euclidean_accuracy
2523 value: 85.61125350181797
2524 - type: euclidean_ap
2525 value: 73.32843720487845
2526 - type: euclidean_f1
2527 value: 67.36549633745895
2528 - type: euclidean_precision
2529 value: 64.60755813953489
2530 - type: euclidean_recall
2531 value: 70.36939313984169
2532 - type: manhattan_accuracy
2533 value: 85.63509566668654
2534 - type: manhattan_ap
2535 value: 73.16658488311325
2536 - type: manhattan_f1
2537 value: 67.20597386434349
2538 - type: manhattan_precision
2539 value: 63.60424028268551
2540 - type: manhattan_recall
2541 value: 71.2401055408971
2542 - type: max_accuracy
2543 value: 85.63509566668654
2544 - type: max_ap
2545 value: 73.32843720487845
2546 - type: max_f1
2547 value: 67.36549633745895
2548 - task:
2549 type: PairClassification
2550 dataset:
2551 type: mteb/twitterurlcorpus-pairclassification
2552 name: MTEB TwitterURLCorpus
2553 config: default
2554 split: test
2555 revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf
2556 metrics:
2557 - type: cos_sim_accuracy
2558 value: 88.33779640625606
2559 - type: cos_sim_ap
2560 value: 84.83868375898157
2561 - type: cos_sim_f1
2562 value: 77.16506154017773
2563 - type: cos_sim_precision
2564 value: 74.62064005753327
2565 - type: cos_sim_recall
2566 value: 79.88912842623961
2567 - type: dot_accuracy
2568 value: 88.02732176815307
2569 - type: dot_ap
2570 value: 83.95089283763002
2571 - type: dot_f1
2572 value: 76.29635101196631
2573 - type: dot_precision
2574 value: 73.31771720613288
2575 - type: dot_recall
2576 value: 79.52725592854944
2577 - type: euclidean_accuracy
2578 value: 88.44452206310397
2579 - type: euclidean_ap
2580 value: 84.98384576824827
2581 - type: euclidean_f1
2582 value: 77.29311047696697
2583 - type: euclidean_precision
2584 value: 74.51232583065381
2585 - type: euclidean_recall
2586 value: 80.28949799815214
2587 - type: manhattan_accuracy
2588 value: 88.47362906042613
2589 - type: manhattan_ap
2590 value: 84.91421462218432
2591 - type: manhattan_f1
2592 value: 77.05107637204792
2593 - type: manhattan_precision
2594 value: 74.74484256243214
2595 - type: manhattan_recall
2596 value: 79.50415768401602
2597 - type: max_accuracy
2598 value: 88.47362906042613
2599 - type: max_ap
2600 value: 84.98384576824827
2601 - type: max_f1
2602 value: 77.29311047696697
2603 license: mit
2604 language:
2605 - en
2606 ---


<h1 align="center">FlagEmbedding</h1>


<h4 align="center">
    <p>
        <a href="#model-list">Model List</a> |
        <a href="#frequently-asked-questions">FAQ</a> |
        <a href="#usage">Usage</a> |
        <a href="#evaluation">Evaluation</a> |
        <a href="#train">Train</a> |
        <a href="#contact">Contact</a> |
        <a href="#citation">Citation</a> |
        <a href="#license">License</a>
    </p>
</h4>

For more details, please refer to our GitHub repository: [FlagEmbedding](https://github.com/FlagOpen/FlagEmbedding).

If you are looking for a model that supports more languages, longer texts, and other retrieval methods, try [bge-m3](https://huggingface.co/BAAI/bge-m3).


[English](README.md) | [中文](https://github.com/FlagOpen/FlagEmbedding/blob/master/README_zh.md)

FlagEmbedding focuses on retrieval-augmented LLMs and currently consists of the following projects:

- **Long-Context LLM**: [Activation Beacon](https://github.com/FlagOpen/FlagEmbedding/tree/master/Long_LLM/activation_beacon)
- **Fine-tuning of LM**: [LM-Cocktail](https://github.com/FlagOpen/FlagEmbedding/tree/master/LM_Cocktail)
- **Dense Retrieval**: [BGE-M3](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/BGE_M3), [LLM Embedder](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/llm_embedder), [BGE Embedding](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/baai_general_embedding)
- **Reranker Model**: [BGE Reranker](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/reranker)
- **Benchmark**: [C-MTEB](https://github.com/FlagOpen/FlagEmbedding/tree/master/C_MTEB)

2640 ## News
2641 - 1/30/2024: Release **BGE-M3**, a new member to BGE model series! M3 stands for **M**ulti-linguality (100+ languages), **M**ulti-granularities (input length up to 8192), **M**ulti-Functionality (unification of dense, lexical, multi-vec/colbert retrieval).
2642 It is the first embedding model which supports all three retrieval methods, achieving new SOTA on multi-lingual (MIRACL) and cross-lingual (MKQA) benchmarks.
2643 [Technical Report](https://github.com/FlagOpen/FlagEmbedding/blob/master/FlagEmbedding/BGE_M3/BGE_M3.pdf) and [Code](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/BGE_M3). :fire:
2644 - 1/9/2024: Release [Activation-Beacon](https://github.com/FlagOpen/FlagEmbedding/tree/master/Long_LLM/activation_beacon), an effective, efficient, compatible, and low-cost (training) method to extend the context length of LLM. [Technical Report](https://arxiv.org/abs/2401.03462) :fire:
2645 - 12/24/2023: Release **LLaRA**, a LLaMA-7B based dense retriever, leading to state-of-the-art performances on MS MARCO and BEIR. Model and code will be open-sourced. Please stay tuned. [Technical Report](https://arxiv.org/abs/2312.15503) :fire:
2646 - 11/23/2023: Release [LM-Cocktail](https://github.com/FlagOpen/FlagEmbedding/tree/master/LM_Cocktail), a method to maintain general capabilities during fine-tuning by merging multiple language models. [Technical Report](https://arxiv.org/abs/2311.13534) :fire:
2647 - 10/12/2023: Release [LLM-Embedder](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/llm_embedder), a unified embedding model to support diverse retrieval augmentation needs for LLMs. [Technical Report](https://arxiv.org/pdf/2310.07554.pdf)
2648 - 09/15/2023: The [technical report](https://arxiv.org/pdf/2309.07597.pdf) of BGE has been released
2649 - 09/15/2023: The [massive training data](https://data.baai.ac.cn/details/BAAI-MTP) of BGE has been released
- 09/12/2023: New models:
    - **New reranker model**: release cross-encoder models `BAAI/bge-reranker-base` and `BAAI/bge-reranker-large`, which are more powerful than the embedding models. We recommend using or fine-tuning them to re-rank the top-k documents returned by embedding models.
    - **Updated embedding model**: release the `bge-*-v1.5` embedding models to alleviate the issue of the similarity distribution and to enhance retrieval ability without instruction.
2653
2654
2655 <details>
2656 <summary>More</summary>
2657 <!-- ### More -->
2658
- 09/07/2023: Update [fine-tune code](https://github.com/FlagOpen/FlagEmbedding/blob/master/FlagEmbedding/baai_general_embedding/README.md): add a script to mine hard negatives and support adding instructions during fine-tuning.
- 08/09/2023: BGE models are integrated into **LangChain**; you can use them like [this](#using-langchain). The C-MTEB **leaderboard** is [available](https://huggingface.co/spaces/mteb/leaderboard).
- 08/05/2023: Release base-scale and small-scale models, **best performance among the models of the same size 🤗**
- 08/02/2023: Release `bge-large-*` (short for BAAI General Embedding) models, **rank 1st on MTEB and C-MTEB benchmark!** :tada: :tada:
- 08/01/2023: We release the [Chinese Massive Text Embedding Benchmark](https://github.com/FlagOpen/FlagEmbedding/blob/master/C_MTEB) (**C-MTEB**), consisting of 31 test datasets.
2664
2665 </details>
2666
2667
2668 ## Model List
2669
2670 `bge` is short for `BAAI general embedding`.
2671
2672 | Model | Language | | Description | query instruction for retrieval [1] |
2673 |:-------------------------------|:--------:| :--------:| :--------:|:--------:|
2674 | [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3) | Multilingual | [Inference](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/BGE_M3#usage) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/BGE_M3) | Multi-Functionality(dense retrieval, sparse retrieval, multi-vector(colbert)), Multi-Linguality, and Multi-Granularity(8192 tokens) | |
2675 | [BAAI/llm-embedder](https://huggingface.co/BAAI/llm-embedder) | English | [Inference](./FlagEmbedding/llm_embedder/README.md) [Fine-tune](./FlagEmbedding/llm_embedder/README.md) | a unified embedding model to support diverse retrieval augmentation needs for LLMs | See [README](./FlagEmbedding/llm_embedder/README.md) |
2676 | [BAAI/bge-reranker-large](https://huggingface.co/BAAI/bge-reranker-large) | Chinese and English | [Inference](#usage-for-reranker) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/reranker) | a cross-encoder model which is more accurate but less efficient [2] | |
2677 | [BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base) | Chinese and English | [Inference](#usage-for-reranker) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/reranker) | a cross-encoder model which is more accurate but less efficient [2] | |
2678 | [BAAI/bge-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5) | English | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | version 1.5 with more reasonable similarity distribution | `Represent this sentence for searching relevant passages: ` |
2679 | [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) | English | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | version 1.5 with more reasonable similarity distribution | `Represent this sentence for searching relevant passages: ` |
2680 | [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) | English | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | version 1.5 with more reasonable similarity distribution | `Represent this sentence for searching relevant passages: ` |
2681 | [BAAI/bge-large-zh-v1.5](https://huggingface.co/BAAI/bge-large-zh-v1.5) | Chinese | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | version 1.5 with more reasonable similarity distribution | `为这个句子生成表示以用于检索相关文章:` |
2682 | [BAAI/bge-base-zh-v1.5](https://huggingface.co/BAAI/bge-base-zh-v1.5) | Chinese | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | version 1.5 with more reasonable similarity distribution | `为这个句子生成表示以用于检索相关文章:` |
2683 | [BAAI/bge-small-zh-v1.5](https://huggingface.co/BAAI/bge-small-zh-v1.5) | Chinese | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | version 1.5 with more reasonable similarity distribution | `为这个句子生成表示以用于检索相关文章:` |
2684 | [BAAI/bge-large-en](https://huggingface.co/BAAI/bge-large-en) | English | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | :trophy: rank **1st** in [MTEB](https://huggingface.co/spaces/mteb/leaderboard) leaderboard | `Represent this sentence for searching relevant passages: ` |
2685 | [BAAI/bge-base-en](https://huggingface.co/BAAI/bge-base-en) | English | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | a base-scale model but with similar ability to `bge-large-en` | `Represent this sentence for searching relevant passages: ` |
2686 | [BAAI/bge-small-en](https://huggingface.co/BAAI/bge-small-en) | English | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) |a small-scale model but with competitive performance | `Represent this sentence for searching relevant passages: ` |
2687 | [BAAI/bge-large-zh](https://huggingface.co/BAAI/bge-large-zh) | Chinese | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | :trophy: rank **1st** in [C-MTEB](https://github.com/FlagOpen/FlagEmbedding/tree/master/C_MTEB) benchmark | `为这个句子生成表示以用于检索相关文章:` |
2688 | [BAAI/bge-base-zh](https://huggingface.co/BAAI/bge-base-zh) | Chinese | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | a base-scale model but with similar ability to `bge-large-zh` | `为这个句子生成表示以用于检索相关文章:` |
2689 | [BAAI/bge-small-zh](https://huggingface.co/BAAI/bge-small-zh) | Chinese | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | a small-scale model but with competitive performance | `为这个句子生成表示以用于检索相关文章:` |
2690
[1\]: If you need to search for passages relevant to a query, we suggest adding the instruction to the query; in other cases, no instruction is needed and you can use the original query directly. In all cases, **no instruction** needs to be added to passages.

[2\]: Unlike an embedding model, a reranker takes a question and a document as input and directly outputs a similarity score instead of an embedding. To balance accuracy and time cost, a cross-encoder is widely used to re-rank the top-k documents retrieved by simpler models.
For example, use a bge embedding model to retrieve the top 100 relevant documents, and then use a bge reranker to re-rank those 100 documents to get the final top-3 results.
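
The two-stage flow described in note [2] can be sketched in a few lines. This is a minimal illustration, not the FlagEmbedding implementation: `retrieve_then_rerank` and `rerank_fn` are hypothetical names, the embeddings are random placeholders, and the reranker is a stand-in callable that you would replace with a real cross-encoder.

```python
import numpy as np

def retrieve_then_rerank(query_emb, passage_embs, rerank_fn, top_k=100, final_k=3):
    """Return indices of the final_k passages after two-stage ranking.

    query_emb: (d,) L2-normalized query embedding.
    passage_embs: (n, d) L2-normalized passage embeddings.
    rerank_fn: hypothetical callable mapping a list of candidate indices to
        one relevance score per candidate (stand-in for a cross-encoder).
    """
    # Stage 1: cheap bi-encoder scoring via inner product over the whole corpus.
    sims = passage_embs @ query_emb
    top_k = min(top_k, len(sims))
    candidates = np.argsort(-sims)[:top_k]
    # Stage 2: expensive scoring only on the shortlisted candidates.
    rerank_scores = np.asarray(rerank_fn(candidates.tolist()), dtype=float)
    order = np.argsort(-rerank_scores)[:final_k]
    return candidates[order].tolist()

# Toy data: 5 random unit vectors; the reranker stand-in simply prefers larger indices.
rng = np.random.default_rng(0)
passages = rng.normal(size=(5, 8))
passages /= np.linalg.norm(passages, axis=1, keepdims=True)
query = passages[2]
result = retrieve_then_rerank(query, passages, lambda idx: [float(i) for i in idx], top_k=4, final_k=2)
print(result)
```

The design point is that the cross-encoder only ever sees `top_k` candidates, so its cost stays fixed as the corpus grows.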
2695
All models have been uploaded to the Hugging Face Hub, and you can find them at https://huggingface.co/BAAI.
If you cannot open the Hugging Face Hub, you can also download the models at https://model.baai.ac.cn/models .
2698
2699
2700 ## Frequently asked questions
2701
2702 <details>
2703 <summary>1. How to fine-tune bge embedding model?</summary>
2704
2705 <!-- ### How to fine-tune bge embedding model? -->
Follow this [example](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) to prepare data and fine-tune your model.
Some suggestions:
- Mine hard negatives following this [example](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune#hard-negatives), which can improve retrieval performance.
- If you pre-train bge on your data, the pre-trained model cannot be used directly to calculate similarity; it must first be fine-tuned with contrastive learning.
- If the accuracy of the fine-tuned model is still not high, we recommend using or fine-tuning the cross-encoder model (bge-reranker) to re-rank the top-k results. Hard negatives are also needed to fine-tune the reranker.
2711
2712
2713 </details>
2714
2715 <details>
2716 <summary>2. The similarity score between two dissimilar sentences is higher than 0.5</summary>
2717
2718 <!-- ### The similarity score between two dissimilar sentences is higher than 0.5 -->
**We suggest using bge v1.5, which alleviates the issue of the similarity distribution.**

Since we fine-tune the models by contrastive learning with a temperature of 0.01,
the similarity scores of the current BGE model fall roughly in the interval \[0.6, 1\].
So a similarity score greater than 0.5 does not indicate that the two sentences are similar.

For downstream tasks, such as passage retrieval or semantic similarity,
**what matters is the relative order of the scores, not the absolute value.**
If you need to filter similar sentences based on a similarity threshold,
please select an appropriate threshold based on the similarity distribution on your data (such as 0.8, 0.85, or even 0.9).
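
One possible way to pick such a threshold is to look at a percentile of the pairwise similarities on a sample of your own data. The sketch below is only an illustration: `suggest_threshold` and its `percentile` parameter are hypothetical, and the random array stands in for real model embeddings.

```python
import numpy as np

def suggest_threshold(embeddings, percentile=95.0):
    """Return the given percentile of pairwise cosine similarities.

    embeddings: (n, d) array of sentence embeddings from your own data.
    percentile: hypothetical tuning knob; 95 keeps roughly the top 5% of pairs.
    """
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = normed @ normed.T
    # Use only the upper triangle so self-similarity and duplicate pairs are excluded.
    upper = sims[np.triu_indices(len(sims), k=1)]
    return float(np.percentile(upper, percentile))

rng = np.random.default_rng(0)
fake_embeddings = rng.normal(size=(50, 16))  # placeholder for real model outputs
threshold = suggest_threshold(fake_embeddings)
print(f"suggested threshold: {threshold:.3f}")
```

With real BGE embeddings the resulting value would be expected to sit well above 0.5, which is why a fixed cut-off chosen without looking at the data tends to be misleading.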
2729
2730 </details>
2731
2732 <details>
2733 <summary>3. When does the query instruction need to be used</summary>
2734
2735 <!-- ### When does the query instruction need to be used -->
2736
For `bge-*-v1.5`, we improved retrieval ability when no instruction is used.
Omitting the instruction causes only a slight degradation in retrieval performance compared with using it.
So, for convenience, you can generate embeddings without an instruction in all cases.

For a retrieval task that uses short queries to find long related documents,
it is recommended to add instructions to these short queries.
**The best way to decide whether to add instructions to queries is to choose the setting that achieves better performance on your task.**
In all cases, no instruction needs to be added to the documents/passages.
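
To compare the two settings on your own task, a small evaluation loop is usually enough. The following is a rough sketch under stated assumptions: `recall_at_1` is a hypothetical helper, and `toy_encode` is a letter-count stand-in that you would replace with a wrapper around your actual model returning L2-normalized embeddings.

```python
import numpy as np

def recall_at_1(encode, queries, passages, gold, instruction=""):
    """Fraction of queries whose top-scoring passage is the labeled one.

    encode: hypothetical callable mapping a list of strings to an (n, d) array
        of L2-normalized embeddings (e.g. a wrapper around your model).
    gold: gold[i] is the index of the relevant passage for queries[i].
    """
    q = encode([instruction + text for text in queries])  # instruction on queries only
    p = encode(passages)                                   # passages stay unchanged
    predicted = np.argmax(q @ p.T, axis=1)
    return float(np.mean(predicted == np.asarray(gold)))

def toy_encode(texts):
    """Stand-in encoder: normalized letter counts. Replace with a real model."""
    vecs = np.zeros((len(texts), 26))
    for row, text in enumerate(texts):
        for ch in text.lower():
            if "a" <= ch <= "z":
                vecs[row, ord(ch) - ord("a")] += 1.0
    norms = np.linalg.norm(vecs, axis=1, keepdims=True)
    return vecs / np.maximum(norms, 1e-12)

queries = ["panda", "paris"]
passages = ["the giant panda is a bear", "paris is in france"]
gold = [0, 1]
instruction = "Represent this sentence for searching relevant passages: "
plain = recall_at_1(toy_encode, queries, passages, gold)
with_instruction = recall_at_1(toy_encode, queries, passages, gold, instruction)
print(plain, with_instruction)
```

Whichever setting scores higher on a held-out sample of your queries is the one to keep.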
2745
2746 </details>
2747
2748
2749 ## Usage
2750
2751 ### Usage for Embedding Model
2752
2753 Here are some examples for using `bge` models with
2754 [FlagEmbedding](#using-flagembedding), [Sentence-Transformers](#using-sentence-transformers), [Langchain](#using-langchain), or [Huggingface Transformers](#using-huggingface-transformers).
2755
2756 #### Using FlagEmbedding
```
pip install -U FlagEmbedding
```
If that does not work for you, see [FlagEmbedding](https://github.com/FlagOpen/FlagEmbedding/blob/master/FlagEmbedding/baai_general_embedding/README.md) for other ways to install FlagEmbedding.

```python
from FlagEmbedding import FlagModel

# Chinese placeholder sentences ("sample data 1" ... "sample data 4")
sentences_1 = ["样例数据-1", "样例数据-2"]
sentences_2 = ["样例数据-3", "样例数据-4"]
model = FlagModel('BAAI/bge-large-zh-v1.5',
                  query_instruction_for_retrieval="为这个句子生成表示以用于检索相关文章:",
                  use_fp16=True)  # Setting use_fp16 to True speeds up computation with a slight performance degradation
embeddings_1 = model.encode(sentences_1)
embeddings_2 = model.encode(sentences_2)
similarity = embeddings_1 @ embeddings_2.T
print(similarity)

# For s2p (short query to long passage) retrieval tasks, we suggest using encode_queries(),
# which automatically adds the instruction to each query.
# The corpus can still use encode() or encode_corpus(), since passages don't need the instruction.
queries = ['query_1', 'query_2']
passages = ["样例文档-1", "样例文档-2"]  # "sample document 1", "sample document 2"
q_embeddings = model.encode_queries(queries)
p_embeddings = model.encode(passages)
scores = q_embeddings @ p_embeddings.T
```
2782 For the value of the argument `query_instruction_for_retrieval`, see [Model List](https://github.com/FlagOpen/FlagEmbedding/tree/master#model-list).
2783
By default, FlagModel will use all available GPUs when encoding. Set `os.environ["CUDA_VISIBLE_DEVICES"]` to select specific GPUs.
You can also set `os.environ["CUDA_VISIBLE_DEVICES"]=""` to make all GPUs unavailable.
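
As a small illustration of the GPU selection above, the variable is normally set at the very top of the script, before any CUDA-using library is imported. The device IDs `0,1` below are example values, not a recommendation.

```python
import os

# Must be set before any CUDA-using library initializes the GPU runtime.
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"  # expose only GPUs 0 and 1 (example IDs)
# os.environ["CUDA_VISIBLE_DEVICES"] = ""   # hide all GPUs and fall back to CPU

visible = [d for d in os.environ["CUDA_VISIBLE_DEVICES"].split(",") if d]
print(f"{len(visible)} GPU id(s) exposed: {visible}")
```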
2786
2787
2788 #### Using Sentence-Transformers
2789
2790 You can also use the `bge` models with [sentence-transformers](https://www.SBERT.net):
2791
```
pip install -U sentence-transformers
```
```python
from sentence_transformers import SentenceTransformer

sentences_1 = ["样例数据-1", "样例数据-2"]
sentences_2 = ["样例数据-3", "样例数据-4"]
model = SentenceTransformer('BAAI/bge-large-zh-v1.5')
embeddings_1 = model.encode(sentences_1, normalize_embeddings=True)
embeddings_2 = model.encode(sentences_2, normalize_embeddings=True)
similarity = embeddings_1 @ embeddings_2.T
print(similarity)
```
For s2p (short query to long passage) retrieval tasks,
each short query should start with an instruction (see the [Model List](https://github.com/FlagOpen/FlagEmbedding/tree/master#model-list) for the instructions).
The instruction is not needed for passages.
```python
from sentence_transformers import SentenceTransformer

queries = ['query_1', 'query_2']
passages = ["样例文档-1", "样例文档-2"]
instruction = "为这个句子生成表示以用于检索相关文章:"

model = SentenceTransformer('BAAI/bge-large-zh-v1.5')
# Prepend the instruction to queries only; passages are encoded as-is
q_embeddings = model.encode([instruction + q for q in queries], normalize_embeddings=True)
p_embeddings = model.encode(passages, normalize_embeddings=True)
scores = q_embeddings @ p_embeddings.T
```
2819
2820 #### Using Langchain
2821
You can use `bge` in LangChain like this:
```python
from langchain.embeddings import HuggingFaceBgeEmbeddings

model_name = "BAAI/bge-large-en-v1.5"
model_kwargs = {'device': 'cuda'}
encode_kwargs = {'normalize_embeddings': True}  # set True to compute cosine similarity
model = HuggingFaceBgeEmbeddings(
    model_name=model_name,
    model_kwargs=model_kwargs,
    encode_kwargs=encode_kwargs,
    # English instruction from the Model List, matching the English model above
    query_instruction="Represent this sentence for searching relevant passages: "
)
```
2836
2837
2838 #### Using HuggingFace Transformers
2839
2840 With the transformers package, you can use the model like this: First, you pass your input through the transformer model, then you select the last hidden state of the first token (i.e., [CLS]) as the sentence embedding.
2841
```python
from transformers import AutoTokenizer, AutoModel
import torch

# Sentences we want sentence embeddings for
sentences = ["样例数据-1", "样例数据-2"]

# Load model from HuggingFace Hub
tokenizer = AutoTokenizer.from_pretrained('BAAI/bge-large-zh-v1.5')
model = AutoModel.from_pretrained('BAAI/bge-large-zh-v1.5')
model.eval()

# Tokenize sentences
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
# For s2p (short query to long passage) retrieval tasks, add an instruction to each query (but not to passages)
# encoded_input = tokenizer([instruction + q for q in queries], padding=True, truncation=True, return_tensors='pt')

# Compute token embeddings
with torch.no_grad():
    model_output = model(**encoded_input)
    # Perform pooling. In this case, cls pooling.
    sentence_embeddings = model_output[0][:, 0]
# normalize embeddings
sentence_embeddings = torch.nn.functional.normalize(sentence_embeddings, p=2, dim=1)
print("Sentence embeddings:", sentence_embeddings)
```
2867
2868 ### Usage for Reranker
2869
Unlike an embedding model, a reranker takes a question and a document as input and directly outputs a similarity score instead of an embedding.
You can get a relevance score by feeding a query and a passage to the reranker.
The reranker is optimized with a cross-entropy loss, so the relevance score is not bounded to a specific range.
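
If a bounded score is more convenient downstream, one common option is to pass the raw score through a logistic sigmoid. This is a generic post-processing sketch, not part of the reranker itself; `to_unit_interval` is a hypothetical helper and the raw scores are made-up values.

```python
import math

def to_unit_interval(score):
    """Map an unbounded relevance score (logit) to (0, 1) with the logistic sigmoid."""
    # Branching keeps the computation numerically stable for large negative scores.
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    z = math.exp(score)
    return z / (1.0 + z)

raw_scores = [-5.6, 0.0, 5.3]  # made-up values for illustration
probs = [to_unit_interval(s) for s in raw_scores]
print(probs)
```

The sigmoid is monotonic, so it leaves the ranking unchanged and only rescales the values.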
2873
2874
2875 #### Using FlagEmbedding
```
pip install -U FlagEmbedding
```

Get relevance scores (higher scores indicate more relevance):
```python
from FlagEmbedding import FlagReranker

reranker = FlagReranker('BAAI/bge-reranker-large', use_fp16=True)  # Setting use_fp16 to True speeds up computation with a slight performance degradation

# Score a single (query, passage) pair
score = reranker.compute_score(['query', 'passage'])
print(score)

# Score a batch of (query, passage) pairs
scores = reranker.compute_score([['what is panda?', 'hi'], ['what is panda?', 'The giant panda (Ailuropoda melanoleuca), sometimes called a panda bear or simply panda, is a bear species endemic to China.']])
print(scores)
```
2891
2892
2893 #### Using Huggingface transformers
2894
```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('BAAI/bge-reranker-large')
model = AutoModelForSequenceClassification.from_pretrained('BAAI/bge-reranker-large')
model.eval()

pairs = [['what is panda?', 'hi'], ['what is panda?', 'The giant panda (Ailuropoda melanoleuca), sometimes called a panda bear or simply panda, is a bear species endemic to China.']]
with torch.no_grad():
    inputs = tokenizer(pairs, padding=True, truncation=True, return_tensors='pt', max_length=512)
    # One logit per pair; flatten to a 1-D tensor of relevance scores
    scores = model(**inputs, return_dict=True).logits.view(-1, ).float()
    print(scores)
```
2909
2910 #### Usage of the ONNX files
2911
```python
from optimum.onnxruntime import ORTModelForFeatureExtraction  # type: ignore

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('BAAI/bge-small-en-v1.5')
model = AutoModel.from_pretrained('BAAI/bge-small-en-v1.5')
model_ort = ORTModelForFeatureExtraction.from_pretrained('BAAI/bge-small-en-v1.5', file_name="onnx/model.onnx")

# Sentences we want sentence embeddings for
sentences = ["样例数据-1", "样例数据-2"]

# Tokenize sentences
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
# For s2p (short query to long passage) retrieval tasks, add an instruction to each query (but not to passages)
# encoded_input = tokenizer([instruction + q for q in queries], padding=True, truncation=True, return_tensors='pt')

# Compute token embeddings with ONNX Runtime
model_output_ort = model_ort(**encoded_input)
# Compute token embeddings with PyTorch
with torch.no_grad():
    model_output = model(**encoded_input)

# model_output and model_output_ort should match (up to small floating-point differences)
```
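
To check that the PyTorch and ONNX outputs agree, a tolerance-based comparison is generally safer than exact equality. The helper below is a hypothetical sketch: `outputs_match` and the `atol` default are assumptions, and the two small arrays stand in for the last hidden states of both models converted to NumPy.

```python
import numpy as np

def outputs_match(reference, candidate, atol=1e-4):
    """Return (is_close, max_abs_diff) for two arrays of the same shape."""
    reference = np.asarray(reference, dtype=np.float64)
    candidate = np.asarray(candidate, dtype=np.float64)
    max_diff = float(np.max(np.abs(reference - candidate)))
    return bool(max_diff <= atol), max_diff

# Placeholder arrays standing in for the PyTorch and ONNX last hidden states.
torch_like = np.array([[0.10, -0.20], [0.30, 0.40]])
onnx_like = torch_like + 1e-6
ok, diff = outputs_match(torch_like, onnx_like)
print(ok, diff)
```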
2938
2939 #### Usage via infinity
It's also possible to deploy the ONNX files with the [infinity_emb](https://github.com/michaelfeil/infinity) pip package.
We recommend `device="cuda", engine="torch"` with flash attention on GPU, and `device="cpu", engine="optimum"` for ONNX inference.

```python
import asyncio
from infinity_emb import AsyncEmbeddingEngine, EngineArgs

sentences = ["Embed this sentence via Infinity.", "Paris is in France."]
engine = AsyncEmbeddingEngine.from_args(
    EngineArgs(model_name_or_path="BAAI/bge-small-en-v1.5", device="cpu", engine="optimum")  # or engine="torch"
)

async def main():
    # The engine is started and stopped by the async context manager
    async with engine:
        embeddings, usage = await engine.embed(sentences=sentences)

asyncio.run(main())
```
2957
2958
2959 ## Evaluation
2960
`baai-general-embedding` models achieve **state-of-the-art performance on both the MTEB and C-MTEB leaderboards!**
For more details and evaluation tools, see our [scripts](https://github.com/FlagOpen/FlagEmbedding/blob/master/C_MTEB/README.md).
2963
2964 - **MTEB**:
2965
| Model Name | Dimension | Sequence Length | Average (56) | Retrieval (15) | Clustering (11) | Pair Classification (3) | Reranking (4) | STS (10) | Summarization (1) | Classification (12) |
|:----:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| [BAAI/bge-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5) | 1024 | 512 | **64.23** | **54.29** | 46.08 | 87.12 | 60.03 | 83.11 | 31.61 | 75.97 |
| [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) | 768 | 512 | 63.55 | 53.25 | 45.77 | 86.55 | 58.86 | 82.4 | 31.07 | 75.53 |
| [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) | 384 | 512 | 62.17 | 51.68 | 43.82 | 84.92 | 58.36 | 81.59 | 30.12 | 74.14 |
| [bge-large-en](https://huggingface.co/BAAI/bge-large-en) | 1024 | 512 | 63.98 | 53.9 | 46.98 | 85.8 | 59.48 | 81.56 | 32.06 | 76.21 |
| [bge-base-en](https://huggingface.co/BAAI/bge-base-en) | 768 | 512 | 63.36 | 53.0 | 46.32 | 85.86 | 58.7 | 81.84 | 29.27 | 75.27 |
| [gte-large](https://huggingface.co/thenlper/gte-large) | 1024 | 512 | 63.13 | 52.22 | 46.84 | 85.00 | 59.13 | 83.35 | 31.66 | 73.33 |
| [gte-base](https://huggingface.co/thenlper/gte-base) | 768 | 512 | 62.39 | 51.14 | 46.2 | 84.57 | 58.61 | 82.3 | 31.17 | 73.01 |
| [e5-large-v2](https://huggingface.co/intfloat/e5-large-v2) | 1024 | 512 | 62.25 | 50.56 | 44.49 | 86.03 | 56.61 | 82.05 | 30.19 | 75.24 |
| [bge-small-en](https://huggingface.co/BAAI/bge-small-en) | 384 | 512 | 62.11 | 51.82 | 44.31 | 83.78 | 57.97 | 80.72 | 30.53 | 74.37 |
| [instructor-xl](https://huggingface.co/hkunlp/instructor-xl) | 768 | 512 | 61.79 | 49.26 | 44.74 | 86.62 | 57.29 | 83.06 | 32.32 | 61.79 |
| [e5-base-v2](https://huggingface.co/intfloat/e5-base-v2) | 768 | 512 | 61.5 | 50.29 | 43.80 | 85.73 | 55.91 | 81.05 | 30.28 | 73.84 |
| [gte-small](https://huggingface.co/thenlper/gte-small) | 384 | 512 | 61.36 | 49.46 | 44.89 | 83.54 | 57.7 | 82.07 | 30.42 | 72.31 |
| [text-embedding-ada-002](https://platform.openai.com/docs/guides/embeddings) | 1536 | 8192 | 60.99 | 49.25 | 45.9 | 84.89 | 56.32 | 80.97 | 30.8 | 70.93 |
| [e5-small-v2](https://huggingface.co/intfloat/e5-small-v2) | 384 | 512 | 59.93 | 49.04 | 39.92 | 84.67 | 54.32 | 80.39 | 31.16 | 72.94 |
| [sentence-t5-xxl](https://huggingface.co/sentence-transformers/sentence-t5-xxl) | 768 | 512 | 59.51 | 42.24 | 43.72 | 85.06 | 56.42 | 82.63 | 30.08 | 73.42 |
| [all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) | 768 | 514 | 57.78 | 43.81 | 43.69 | 83.04 | 59.36 | 80.28 | 27.49 | 65.07 |
| [sgpt-bloom-7b1-msmarco](https://huggingface.co/bigscience/sgpt-bloom-7b1-msmarco) | 4096 | 2048 | 57.59 | 48.22 | 38.93 | 81.9 | 55.65 | 77.74 | 33.6 | 66.19 |
2985
2986
2987
2988 - **C-MTEB**:
We created C-MTEB, a benchmark for Chinese text embedding that consists of 31 datasets from 6 tasks.
Please refer to [C_MTEB](https://github.com/FlagOpen/FlagEmbedding/blob/master/C_MTEB/README.md) for a detailed introduction.
2991
2992 | Model | Embedding dimension | Avg | Retrieval | STS | PairClassification | Classification | Reranking | Clustering |
2993 |:-------------------------------|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|
2994 | [**BAAI/bge-large-zh-v1.5**](https://huggingface.co/BAAI/bge-large-zh-v1.5) | 1024 | **64.53** | 70.46 | 56.25 | 81.6 | 69.13 | 65.84 | 48.99 |
2995 | [BAAI/bge-base-zh-v1.5](https://huggingface.co/BAAI/bge-base-zh-v1.5) | 768 | 63.13 | 69.49 | 53.72 | 79.75 | 68.07 | 65.39 | 47.53 |
2996 | [BAAI/bge-small-zh-v1.5](https://huggingface.co/BAAI/bge-small-zh-v1.5) | 512 | 57.82 | 61.77 | 49.11 | 70.41 | 63.96 | 60.92 | 44.18 |
2997 | [BAAI/bge-large-zh](https://huggingface.co/BAAI/bge-large-zh) | 1024 | 64.20 | 71.53 | 54.98 | 78.94 | 68.32 | 65.11 | 48.39 |
2998 | [bge-large-zh-noinstruct](https://huggingface.co/BAAI/bge-large-zh-noinstruct) | 1024 | 63.53 | 70.55 | 53 | 76.77 | 68.58 | 64.91 | 50.01 |
2999 | [BAAI/bge-base-zh](https://huggingface.co/BAAI/bge-base-zh) | 768 | 62.96 | 69.53 | 54.12 | 77.5 | 67.07 | 64.91 | 47.63 |
3000 | [multilingual-e5-large](https://huggingface.co/intfloat/multilingual-e5-large) | 1024 | 58.79 | 63.66 | 48.44 | 69.89 | 67.34 | 56.00 | 48.23 |
3001 | [BAAI/bge-small-zh](https://huggingface.co/BAAI/bge-small-zh) | 512 | 58.27 | 63.07 | 49.45 | 70.35 | 63.64 | 61.48 | 45.09 |
3002 | [m3e-base](https://huggingface.co/moka-ai/m3e-base) | 768 | 57.10 | 56.91 | 50.47 | 63.99 | 67.52 | 59.34 | 47.68 |
3003 | [m3e-large](https://huggingface.co/moka-ai/m3e-large) | 1024 | 57.05 | 54.75 | 50.42 | 64.3 | 68.2 | 59.66 | 48.88 |
3004 | [multilingual-e5-base](https://huggingface.co/intfloat/multilingual-e5-base) | 768 | 55.48 | 61.63 | 46.49 | 67.07 | 65.35 | 54.35 | 40.68 |
3005 | [multilingual-e5-small](https://huggingface.co/intfloat/multilingual-e5-small) | 384 | 55.38 | 59.95 | 45.27 | 66.45 | 65.85 | 53.86 | 45.26 |
3006 | [text-embedding-ada-002(OpenAI)](https://platform.openai.com/docs/guides/embeddings/what-are-embeddings) | 1536 | 53.02 | 52.0 | 43.35 | 69.56 | 64.31 | 54.28 | 45.68 |
3007 | [luotuo](https://huggingface.co/silk-road/luotuo-bert-medium) | 1024 | 49.37 | 44.4 | 42.78 | 66.62 | 61 | 49.25 | 44.39 |
3008 | [text2vec-base](https://huggingface.co/shibing624/text2vec-base-chinese) | 768 | 47.63 | 38.79 | 43.41 | 67.41 | 62.19 | 49.45 | 37.66 |
3009 | [text2vec-large](https://huggingface.co/GanymedeNil/text2vec-large-chinese) | 1024 | 47.36 | 41.94 | 44.97 | 70.86 | 60.66 | 49.16 | 30.02 |
3010
3011
3012 - **Reranking**:
See [C_MTEB](https://github.com/FlagOpen/FlagEmbedding/blob/master/C_MTEB/) for the evaluation script.
3014
3015 | Model | T2Reranking | T2RerankingZh2En\* | T2RerankingEn2Zh\* | MMarcoReranking | CMedQAv1 | CMedQAv2 | Avg |
3016 |:-------------------------------|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|
3017 | text2vec-base-multilingual | 64.66 | 62.94 | 62.51 | 14.37 | 48.46 | 48.6 | 50.26 |
3018 | multilingual-e5-small | 65.62 | 60.94 | 56.41 | 29.91 | 67.26 | 66.54 | 57.78 |
3019 | multilingual-e5-large | 64.55 | 61.61 | 54.28 | 28.6 | 67.42 | 67.92 | 57.4 |
3020 | multilingual-e5-base | 64.21 | 62.13 | 54.68 | 29.5 | 66.23 | 66.98 | 57.29 |
3021 | m3e-base | 66.03 | 62.74 | 56.07 | 17.51 | 77.05 | 76.76 | 59.36 |
3022 | m3e-large | 66.13 | 62.72 | 56.1 | 16.46 | 77.76 | 78.27 | 59.57 |
3023 | bge-base-zh-v1.5 | 66.49 | 63.25 | 57.02 | 29.74 | 80.47 | 84.88 | 63.64 |
3024 | bge-large-zh-v1.5 | 65.74 | 63.39 | 57.03 | 28.74 | 83.45 | 85.44 | 63.97 |
3025 | [BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base) | 67.28 | 63.95 | 60.45 | 35.46 | 81.26 | 84.1 | 65.42 |
3026 | [BAAI/bge-reranker-large](https://huggingface.co/BAAI/bge-reranker-large) | 67.6 | 64.03 | 61.44 | 37.16 | 82.15 | 84.18 | 66.09 |
3027
\* : T2RerankingZh2En and T2RerankingEn2Zh are cross-language retrieval tasks.
3029
3030 ## Train
3031
3032 ### BAAI Embedding
3033
We pre-train the models using [RetroMAE](https://github.com/staoxiao/RetroMAE) and train them on large-scale pair data using contrastive learning.
**You can fine-tune the embedding model on your data following our [examples](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune).**
We also provide a [pre-train example](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/pretrain).
Note that the goal of pre-training is to reconstruct the text, so the pre-trained model cannot be used for similarity calculation directly; it needs to be fine-tuned first.
For more training details for bge, see [baai_general_embedding](https://github.com/FlagOpen/FlagEmbedding/blob/master/FlagEmbedding/baai_general_embedding/README.md).
3039
3040
3041
3042 ### BGE Reranker
3043
A cross-encoder performs full attention over the input pair,
which is more accurate than an embedding model (i.e., a bi-encoder) but also more time-consuming.
Therefore, it can be used to re-rank the top-k documents returned by an embedding model.
We train the cross-encoder on multilingual pair data.
The data format is the same as for the embedding model, so you can fine-tune it easily following our [example](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/reranker).
For more details, please refer to [./FlagEmbedding/reranker/README.md](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/reranker).
3050
3051
3052 ## Contact
If you have any questions or suggestions related to this project, feel free to open an issue or pull request.
You can also email Shitao Xiao (stxiao@baai.ac.cn) and Zheng Liu (liuzheng@baai.ac.cn).
3055
3056
3057 ## Citation
3058
If you find this repository useful, please consider giving it a star :star: and citing it:
3060
```
@misc{bge_embedding,
      title={C-Pack: Packaged Resources To Advance General Chinese Embedding},
      author={Shitao Xiao and Zheng Liu and Peitian Zhang and Niklas Muennighoff},
      year={2023},
      eprint={2309.07597},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```
3071
3072 ## License
3073 FlagEmbedding is licensed under the [MIT License](https://github.com/FlagOpen/FlagEmbedding/blob/master/LICENSE). The released models can be used for commercial purposes free of charge.
3074
3075