---
language: en
license: cc-by-4.0
datasets:
- squad_v2
model-index:
- name: deepset/roberta-base-squad2
  results:
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squad_v2
      type: squad_v2
      config: squad_v2
      split: validation
    metrics:
    - type: exact_match
      value: 79.9309
      name: Exact Match
      verified: true
      verifyToken: >-
        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMDhhNjg5YzNiZGQ1YTIyYTAwZGUwOWEzZTRiYzdjM2QzYjA3ZTUxNDM1NjE1MTUyMjE1MGY1YzEzMjRjYzVjYiIsInZlcnNpb24iOjF9.EH5JJo8EEFwU7osPz3s7qanw_tigeCFhCXjSfyN0Y1nWVnSfulSxIk_DbAEI5iE80V4EKLyp5-mYFodWvL2KDA
    - type: f1
      value: 82.9501
      name: F1
      verified: true
      verifyToken: >-
        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMjk5ZDYwOGQyNjNkMWI0OTE4YzRmOTlkY2JjNjQ0YTZkNTMzMzNkYTA0MDFmNmI3NjA3NjNlMjhiMDQ2ZjJjNSIsInZlcnNpb24iOjF9.DDm0LNTkdLbGsue58bg1aH_s67KfbcmkvL-6ZiI2s8IoxhHJMSf29H_uV2YLyevwx900t-MwTVOW3qfFnMMEAQ
    - type: total
      value: 11869
      name: total
      verified: true
      verifyToken: >-
        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMGFkMmI2ODM0NmY5NGNkNmUxYWViOWYxZDNkY2EzYWFmOWI4N2VhYzY5MGEzMTVhOTU4Zjc4YWViOGNjOWJjMCIsInZlcnNpb24iOjF9.fexrU1icJK5_MiifBtZWkeUvpmFISqBLDXSQJ8E6UnrRof-7cU0s4tX_dIsauHWtUpIHMPZCf5dlMWQKXZuAAA
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squad
      type: squad
      config: plain_text
      split: validation
    metrics:
    - type: exact_match
      value: 85.289
      name: Exact Match
    - type: f1
      value: 91.841
      name: F1
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: adversarial_qa
      type: adversarial_qa
      config: adversarialQA
      split: validation
    metrics:
    - type: exact_match
      value: 29.5
      name: Exact Match
    - type: f1
      value: 40.367
      name: F1
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squad_adversarial
      type: squad_adversarial
      config: AddOneSent
      split: validation
    metrics:
    - type: exact_match
      value: 78.567
      name: Exact Match
    - type: f1
      value: 84.469
      name: F1
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squadshifts amazon
      type: squadshifts
      config: amazon
      split: test
    metrics:
    - type: exact_match
      value: 69.924
      name: Exact Match
    - type: f1
      value: 83.284
      name: F1
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squadshifts new_wiki
      type: squadshifts
      config: new_wiki
      split: test
    metrics:
    - type: exact_match
      value: 81.204
      name: Exact Match
    - type: f1
      value: 90.595
      name: F1
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squadshifts nyt
      type: squadshifts
      config: nyt
      split: test
    metrics:
    - type: exact_match
      value: 82.931
      name: Exact Match
    - type: f1
      value: 90.756
      name: F1
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squadshifts reddit
      type: squadshifts
      config: reddit
      split: test
    metrics:
    - type: exact_match
      value: 71.55
      name: Exact Match
    - type: f1
      value: 82.939
      name: F1
base_model:
- FacebookAI/roberta-base
---

# roberta-base for Extractive QA

This is the [roberta-base](https://huggingface.co/roberta-base) model, fine-tuned on the [SQuAD 2.0](https://huggingface.co/datasets/squad_v2) dataset. It has been trained on question-answer pairs, including unanswerable questions, for the task of extractive question answering.
We have also released a distilled version of this model, [deepset/tinyroberta-squad2](https://huggingface.co/deepset/tinyroberta-squad2), which offers comparable prediction quality at twice the speed of [deepset/roberta-base-squad2](https://huggingface.co/deepset/roberta-base-squad2).

## Overview
**Language model:** roberta-base
**Language:** English
**Downstream task:** Extractive QA
**Training data:** SQuAD 2.0
**Eval data:** SQuAD 2.0
**Code:** See [an example extractive QA pipeline built with Haystack](https://haystack.deepset.ai/tutorials/34_extractive_qa_pipeline)
**Infrastructure:** 4x Tesla V100
## Hyperparameters

```
batch_size = 96
n_epochs = 2
base_LM_model = "roberta-base"
max_seq_len = 386
learning_rate = 3e-5
lr_schedule = LinearWarmup
warmup_proportion = 0.2
doc_stride = 128
max_query_length = 64
```
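
A note on the sliding-window settings: `max_seq_len` caps the total tokens per model input, and in the Hugging Face tokenizers `doc_stride` is the number of overlapping tokens shared between consecutive windows when a long context is chunked. As a rough, illustrative sketch (the token budget reserved for the question and special tokens is an assumption here, not a fixed value), the number of windows a long context produces can be estimated as:

```python
import math

def num_windows(context_tokens: int, max_seq_len: int = 386,
                doc_stride: int = 128, query_budget: int = 20) -> int:
    """Estimate how many overlapping windows a long context is split into.

    Each window holds roughly (max_seq_len - query_budget) context tokens;
    consecutive windows share doc_stride tokens, so each extra window advances
    by (capacity - doc_stride) fresh tokens. `query_budget` (question plus
    special tokens) is an illustrative assumption.
    """
    capacity = max_seq_len - query_budget  # context tokens per window
    if context_tokens <= capacity:
        return 1
    advance = capacity - doc_stride        # new tokens covered per extra window
    return 1 + math.ceil((context_tokens - capacity) / advance)

print(num_windows(300))   # fits in a single window -> 1
print(num_windows(1000))  # long context -> 4 overlapping windows
```

The overlap exists so that an answer span falling on a window boundary is still seen whole in at least one window.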

## Usage

### In Haystack
Haystack is an AI orchestration framework for building customizable, production-ready LLM applications. You can use this model in Haystack to do extractive question answering on documents.
To load and run the model with [Haystack](https://github.com/deepset-ai/haystack/):
```python
# After running pip install haystack-ai "transformers[torch,sentencepiece]"

from haystack import Document
from haystack.components.readers import ExtractiveReader

docs = [
    Document(content="Python is a popular programming language"),
    Document(content="python ist eine beliebte Programmiersprache"),
]

reader = ExtractiveReader(model="deepset/roberta-base-squad2")
reader.warm_up()

question = "What is a popular programming language?"
result = reader.run(query=question, documents=docs)
# {'answers': [ExtractedAnswer(query='What is a popular programming language?', score=0.5740374326705933, data='python', document=Document(id=..., content: '...'), context=None, document_offset=ExtractedAnswer.Span(start=0, end=6),...)]}
```
For a complete example of an extractive question answering pipeline that scales over many documents, check out the [corresponding Haystack tutorial](https://haystack.deepset.ai/tutorials/34_extractive_qa_pipeline).

### In Transformers
```python
from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline

model_name = "deepset/roberta-base-squad2"

# a) Get predictions
nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)
QA_input = {
    'question': 'Why is model conversion important?',
    'context': 'The option to convert models between FARM and transformers gives freedom to the user and lets people easily switch between frameworks.',
}
res = nlp(QA_input)

# b) Load model & tokenizer
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```
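
Because the model is trained on SQuAD 2.0, it can predict that a question has no answer in the given context. In the Transformers `question-answering` pipeline this is exposed via the `handle_impossible_answer` flag; a small sketch (the question and context texts are illustrative):

```python
from transformers import pipeline

model_name = "deepset/roberta-base-squad2"
nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)

# The context does not contain the answer. With handle_impossible_answer=True
# the pipeline may return an empty answer string when the no-answer score
# wins, instead of being forced to pick some span from the context.
res = nlp(
    question='What is the capital of France?',
    context='Berlin is the capital of Germany.',
    handle_impossible_answer=True,
)
print(res)  # a dict with 'score', 'start', 'end', and 'answer' keys
```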

## Performance
Evaluated on the SQuAD 2.0 dev set with the [official eval script](https://worksheets.codalab.org/rest/bundles/0x6b567e1cf2e041ec80d7098f031c5c9e/contents/blob/).

```
"exact": 79.87029394424324,
"f1": 82.91251169582613,

"total": 11873,
"HasAns_exact": 77.93522267206478,
"HasAns_f1": 84.02838248389763,
"HasAns_total": 5928,
"NoAns_exact": 81.79983179142137,
"NoAns_f1": 81.79983179142137,
"NoAns_total": 5945
```
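
The `exact` and `f1` figures above use the official SQuAD answer normalization: lowercase, strip punctuation and the articles a/an/the, and collapse whitespace. A minimal re-implementation of the per-example metrics, as a sketch of what the eval script computes:

```python
import re
import string
from collections import Counter

def normalize_answer(s: str) -> str:
    """Lowercase, drop punctuation and articles, collapse whitespace (SQuAD-style)."""
    s = s.lower()
    s = "".join(ch for ch in s if ch not in set(string.punctuation))
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())

def exact_match(prediction: str, gold: str) -> int:
    """1 if the normalized strings are identical, else 0."""
    return int(normalize_answer(prediction) == normalize_answer(gold))

def f1_score(prediction: str, gold: str) -> float:
    """Token-level F1 between normalized prediction and gold answer."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(gold).split()
    if not pred_tokens or not gold_tokens:
        # SQuAD 2.0 convention: empty vs. empty (a correct "no answer") scores 1
        return float(pred_tokens == gold_tokens)
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("The Eiffel Tower", "eiffel tower"))              # 1
print(round(f1_score("the Eiffel Tower in Paris", "Eiffel Tower"), 2))  # 0.67
```

The dataset-level `exact` and `f1` are the averages of these per-example scores over all 11873 dev examples, with the `HasAns_*`/`NoAns_*` rows restricting the average to answerable and unanswerable questions respectively.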

## Authors
**Branden Chan:** branden.chan@deepset.ai
**Timo Möller:** timo.moeller@deepset.ai
**Malte Pietsch:** malte.pietsch@deepset.ai
**Tanay Soni:** tanay.soni@deepset.ai
## About us

<div class="grid lg:grid-cols-2 gap-x-4 gap-y-3">
    <div class="w-full h-40 object-cover mb-2 rounded-lg flex items-center justify-center">
        <img alt="" src="https://raw.githubusercontent.com/deepset-ai/.github/main/deepset-logo-colored.png" class="w-40"/>
    </div>
    <div class="w-full h-40 object-cover mb-2 rounded-lg flex items-center justify-center">
        <img alt="" src="https://raw.githubusercontent.com/deepset-ai/.github/main/haystack-logo-colored.png" class="w-40"/>
    </div>
</div>

[deepset](http://deepset.ai/) is the company behind the production-ready open-source AI framework [Haystack](https://haystack.deepset.ai/).

Some of our other work:
- [Distilled roberta-base-squad2 (aka "tinyroberta-squad2")](https://huggingface.co/deepset/tinyroberta-squad2)
- [German BERT](https://deepset.ai/german-bert), [GermanQuAD and GermanDPR](https://deepset.ai/germanquad), [German embedding model](https://huggingface.co/mixedbread-ai/deepset-mxbai-embed-de-large-v1)
- [deepset Cloud](https://www.deepset.ai/deepset-cloud-product)
- [deepset Studio](https://www.deepset.ai/deepset-studio)

## Get in touch and join the Haystack community

<p>For more info on Haystack, visit our <strong><a href="https://github.com/deepset-ai/haystack">GitHub</a></strong> repo and <strong><a href="https://docs.haystack.deepset.ai">Documentation</a></strong>.

We also have a <strong><a class="h-7" href="https://haystack.deepset.ai/community">Discord community open to everyone!</a></strong></p>

[Twitter](https://twitter.com/Haystack_AI) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Discord](https://haystack.deepset.ai/community) | [GitHub Discussions](https://github.com/deepset-ai/haystack/discussions) | [Website](https://haystack.deepset.ai/) | [YouTube](https://www.youtube.com/@deepset_ai)

By the way: [we're hiring!](http://www.deepset.ai/jobs)