---
license: mit
datasets:
- wmt/europarl
metrics:
- f1
- recall
- precision
---
This model is based on [Oliver Guhr's work](https://huggingface.co/oliverguhr/fullstop-punctuation-multilang-large). The difference is that it is a fine-tuned xlm-roberta-base instead of an xlm-roberta-large, and that it covers twelve languages instead of four: English, German, French, Spanish, Bulgarian, Italian, Polish, Dutch, Czech, Portuguese, Slovak, and Slovenian.

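The model is a token classifier: for each word it predicts which punctuation mark, if any, should follow (the class `0` in the report below means "no punctuation"). A minimal sketch of turning such per-word labels back into punctuated text; the helper name is illustrative, not part of the model's API:

```python
def restore_punctuation(words, labels):
    """Rejoin words, appending each word's predicted punctuation mark.

    `labels` uses the classes from the report below: "0" means no mark
    after the word; otherwise the label is the mark itself
    (".", ",", "?", "-", ":").
    """
    out = []
    for word, label in zip(words, labels):
        out.append(word if label == "0" else word + label)
    return " ".join(out)


# Example: the classifier would tag the last word with "?"
print(restore_punctuation(["hello", "how", "are", "you"],
                          ["0", "0", "0", "?"]))  # hello how are you?
```

In practice the labels would come from running the model with a `token-classification` pipeline and aggregating sub-word predictions per word; this sketch only shows the post-processing step.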
----- report -----

```
              precision    recall  f1-score   support

           0       0.99      0.99      0.99  73317475
           .       0.94      0.95      0.95   4484845
           ,       0.86      0.86      0.86   6100650
           ?       0.88      0.85      0.86    136479
           -       0.60      0.29      0.39    233630
           :       0.71      0.49      0.58    152424

    accuracy                           0.98  84425503
   macro avg       0.83      0.74      0.77  84425503
weighted avg       0.98      0.98      0.98  84425503
```

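The macro average is the unweighted mean of the per-class scores, while the weighted average weights each class by its support, which is why the weighted F1 stays high despite the weak `-` and `:` classes. A quick check against the F1 column above:

```python
# Per-class F1 and support, in the order of the report above
f1 = [0.99, 0.95, 0.86, 0.86, 0.39, 0.58]
support = [73317475, 4484845, 6100650, 136479, 233630, 152424]

macro = sum(f1) / len(f1)
weighted = sum(v * s for v, s in zip(f1, support)) / sum(support)

print(round(macro, 2))     # 0.77 -- dragged down by the rare "-" and ":" classes
print(round(weighted, 2))  # 0.98 -- dominated by the huge "0" class
```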
----- confusion matrix -----

```
t/p    0    .    ,    ?    -    :
0    1.0  0.0  0.0  0.0  0.0  0.0
.    0.0  1.0  0.0  0.0  0.0  0.0
,    0.1  0.0  0.9  0.0  0.0  0.0
?    0.0  0.1  0.0  0.8  0.0  0.0
-    0.1  0.1  0.5  0.0  0.3  0.0
:    0.0  0.3  0.1  0.0  0.0  0.5
```