---
language:
- ko
- en

tags:
- translation

license: apache-2.0
---

### kor-eng

* source group: Korean
* target group: English
* OPUS readme: [kor-eng](https://github.com/Helsinki-NLP/Tatoeba-Challenge/tree/master/models/kor-eng/README.md)

* model: transformer-align
* source language(s): kor kor_Hang kor_Latn
* target language(s): eng
* pre-processing: normalization + SentencePiece (spm32k,spm32k)
* download original weights: [opus-2020-06-17.zip](https://object.pouta.csc.fi/Tatoeba-MT-models/kor-eng/opus-2020-06-17.zip)
* test set translations: [opus-2020-06-17.test.txt](https://object.pouta.csc.fi/Tatoeba-MT-models/kor-eng/opus-2020-06-17.test.txt)
* test set scores: [opus-2020-06-17.eval.txt](https://object.pouta.csc.fi/Tatoeba-MT-models/kor-eng/opus-2020-06-17.eval.txt)
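
A minimal sketch of how the ported model might be used for Korean-to-English translation with the `transformers` library. The Hub ID `Helsinki-NLP/opus-mt-ko-en` is an assumption inferred from the `ko-en` short pair and the Helsinki-NLP naming convention, not something this card states; `translate` and `MODEL_NAME` are hypothetical names chosen for illustration.

```python
# Assumed Hub ID (not stated in this card); adjust if the model lives elsewhere.
MODEL_NAME = "Helsinki-NLP/opus-mt-ko-en"


def translate(texts, model_name=MODEL_NAME):
    """Translate a list of Korean sentences to English with a Marian model."""
    # Imported lazily so this module loads even where transformers/torch
    # are not installed; both are required to actually call the function.
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)

    # The tokenizer applies the SentencePiece segmentation shipped with the model.
    batch = tokenizer(texts, return_tensors="pt", padding=True)
    generated = model.generate(**batch)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

Calling `translate(["안녕하세요."])` downloads the weights on first use and returns a list with one English string per input sentence.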

## Benchmarks

| testset              | BLEU | chr-F |
|----------------------|------|-------|
| Tatoeba-test.kor.eng | 41.3 | 0.588 |
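
To give a feel for what the chr-F column measures, here is a simplified sketch of a character n-gram F-score with beta = 2 (the "chrF2" variant). It scores a single sentence pair, ignores whitespace, and averages precision and recall over n-gram orders 1 to 6; these choices are assumptions of the sketch, and it is not expected to reproduce the 0.588 reported above, which comes from the official evaluation over the whole test set. `char_ngrams` and `chrf` are hypothetical helper names.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams, ignoring whitespace (an assumption of this sketch)."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def chrf(hypothesis: str, reference: str, max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified sentence-level chrF on a 0-1 scale; beta=2 weights recall over precision."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # skip orders longer than either string
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta**2) * p * r / (beta**2 * p + r)
```

An exact match scores 1.0, strings with no characters in common score 0.0, and partial overlaps fall in between.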


### System Info:
- hf_name: kor-eng

- source_languages: kor

- target_languages: eng

- opus_readme_url: https://github.com/Helsinki-NLP/Tatoeba-Challenge/tree/master/models/kor-eng/README.md

- original_repo: Tatoeba-Challenge

- tags: ['translation']

- languages: ['ko', 'en']

- src_constituents: {'kor_Hani', 'kor_Hang', 'kor_Latn', 'kor'}

- tgt_constituents: {'eng'}

- src_multilingual: False

- tgt_multilingual: False

- prepro: normalization + SentencePiece (spm32k,spm32k)

- url_model: https://object.pouta.csc.fi/Tatoeba-MT-models/kor-eng/opus-2020-06-17.zip

- url_test_set: https://object.pouta.csc.fi/Tatoeba-MT-models/kor-eng/opus-2020-06-17.test.txt

- src_alpha3: kor

- tgt_alpha3: eng

- short_pair: ko-en

- chrF2_score: 0.588

- bleu: 41.3

- brevity_penalty: 0.9590000000000001

- ref_len: 17711.0

- src_name: Korean

- tgt_name: English

- train_date: 2020-06-17

- src_alpha2: ko

- tgt_alpha2: en

- prefer_old: False

- long_pair: kor-eng

- helsinki_git_sha: 480fcbe0ee1bf4774bcbe6226ad9f58e63f6c535

- transformers_git_sha: 2207e5d8cb224e954a7cba69fa4ac2309e9ff30b

- port_machine: brutasse

- port_time: 2020-08-21-14:41
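
The `brevity_penalty` and `ref_len` fields above are linked by BLEU's standard brevity penalty, BP = exp(1 - r/c) when the system output length c is shorter than the reference length r. As a rough worked example, the sketch below inverts that formula to estimate the total output length on the test set. It assumes the reported penalty (rounded here to 0.959) follows this standard definition; `hypothesis_length` is a hypothetical helper name.

```python
import math


def hypothesis_length(ref_len: float, brevity_penalty: float) -> float:
    """Invert BP = exp(1 - r/c), valid only when 0 < BP < 1 (i.e. c < r)."""
    if not 0.0 < brevity_penalty < 1.0:
        raise ValueError("the inversion only applies when 0 < BP < 1")
    return ref_len / (1.0 - math.log(brevity_penalty))


ref_len = 17711.0  # ref_len from System Info
bp = 0.959         # brevity_penalty from System Info, rounded

hyp_len = hypothesis_length(ref_len, bp)
print(f"estimated output length: {hyp_len:.0f} tokens")
print(f"output/reference ratio:  {hyp_len / ref_len:.3f}")
```

Under these assumptions the system output comes to about 17,000 tokens, roughly 4% shorter than the reference. That shortfall is what the penalty below 1.0 reflects.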