## ELECTRA_large_discriminator language model fine-tuned on SQuAD2.0

### with the following results:

```
"exact": 87.09677419354838,
"f1": 89.98343832723452,
"total": 11873,
"HasAns_exact": 84.66599190283401,
"HasAns_f1": 90.44759839056285,
"HasAns_total": 5928,
"NoAns_exact": 89.52060555088309,
"NoAns_f1": 89.52060555088309,
"NoAns_total": 5945,
"best_exact": 87.09677419354838,
"best_exact_thresh": 0.0,
"best_f1": 89.98343832723432,
"best_f1_thresh": 0.0
```
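As a sanity check, the overall `exact` score is the support-weighted average of the `HasAns` and `NoAns` splits. A minimal sketch, using only the values reported in the results block above:

```python
# Recompute "exact" as the weighted average of the HasAns/NoAns splits,
# using the per-split scores and totals reported above.
has_ans_exact, has_ans_total = 84.66599190283401, 5928
no_ans_exact, no_ans_total = 89.52060555088309, 5945

total = has_ans_total + no_ans_total  # 11873 dev examples
overall = (has_ans_exact * has_ans_total + no_ans_exact * no_ans_total) / total

# Matches the reported "exact": 87.09677419354838 (up to float rounding).
assert abs(overall - 87.09677419354838) < 1e-6
```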
### from script:
```
python ${EXAMPLES}/run_squad.py \
  --model_type electra \
  --model_name_or_path google/electra-large-discriminator \
  --do_train \
  --do_eval \
  --train_file ${SQUAD}/train-v2.0.json \
  --predict_file ${SQUAD}/dev-v2.0.json \
  --version_2_with_negative \
  --do_lower_case \
  --num_train_epochs 3 \
  --warmup_steps 306 \
  --weight_decay 0.01 \
  --learning_rate 3e-5 \
  --max_grad_norm 0.5 \
  --adam_epsilon 1e-6 \
  --max_seq_length 512 \
  --doc_stride 128 \
  --per_gpu_train_batch_size 8 \
  --gradient_accumulation_steps 16 \
  --per_gpu_eval_batch_size 128 \
  --fp16 \
  --fp16_opt_level O1 \
  --threads 12 \
  --logging_steps 50 \
  --save_steps 1000 \
  --overwrite_output_dir \
  --output_dir ${MODEL_PATH}
```
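Note that with gradient accumulation the effective train batch size is `per_gpu_train_batch_size × gradient_accumulation_steps`, not the per-step value. A minimal sketch with the values from the script (single-GPU setup, as in the hardware listed below):

```python
# Effective train batch size implied by the flags above (single GPU).
per_gpu_train_batch_size = 8
gradient_accumulation_steps = 16

effective_batch_size = per_gpu_train_batch_size * gradient_accumulation_steps
assert effective_batch_size == 128
```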
### using the following system & software:
```
Transformers: 2.11.0
PyTorch: 1.5.0
TensorFlow: 2.2.0
Python: 3.8.1
OS/Platform: Linux-5.3.0-59-generic-x86_64-with-glibc2.10
CPU/GPU: Intel i9-9900K / NVIDIA Titan RTX 24GB
```