---
license: mit
---

# LLMLingua-2-Bert-base-Multilingual-Cased-MeetingBank

This model was introduced in the paper [**LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression** (Pan et al., 2024)](https://arxiv.org/abs/2403.12968). It is a [BERT multilingual base model (cased)](https://huggingface.co/google-bert/bert-base-multilingual-cased) finetuned to perform token classification for task-agnostic prompt compression. The predicted probability $p_{\text{preserve}}$ of each token $x_i$ is used as the metric for compression. The model was trained on [an extractive text compression dataset](https://huggingface.co/datasets/microsoft/MeetingBank-LLMCompressed) constructed with the methodology proposed in [**LLMLingua-2**](https://arxiv.org/abs/2403.12968), using training examples from [MeetingBank (Hu et al., 2023)](https://meetingbank.github.io/) as the seed data.

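As a minimal illustration of this metric (a sketch with made-up probabilities, not the library's actual implementation), compression amounts to keeping the fraction of tokens with the highest $p_{\text{preserve}}$ that matches the target rate, in their original order:

```python
# Sketch of compression driven by per-token preserve probabilities.
# The probabilities below are made up; in practice they come from the
# token-classification head of this model.

def compress_by_preserve_prob(tokens, probs, rate):
    """Keep the top `rate` fraction of tokens by preserve probability,
    keeping the survivors in their original order."""
    n_keep = max(1, int(len(tokens) * rate))
    # Indices of the n_keep highest-probability tokens, restored to document order
    top = sorted(range(len(tokens)), key=lambda i: probs[i], reverse=True)[:n_keep]
    return [tokens[i] for i in sorted(top)]

tokens = ["So", ",", "um", ",", "we", "should", "revise", "the", "timeline", "."]
probs = [0.2, 0.6, 0.1, 0.5, 0.4, 0.7, 0.9, 0.3, 0.95, 0.8]
print(compress_by_preserve_prob(tokens, probs, 0.5))
# → [',', 'should', 'revise', 'timeline', '.']
```

Filler words ("So", "um") get low preserve probabilities and are dropped first, while content-bearing tokens survive.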
You can evaluate the model on downstream tasks such as question answering (QA) and summarization over compressed meeting transcripts using [this dataset](https://huggingface.co/datasets/microsoft/MeetingBank-QA-Summary).

For more details, please check the home pages of [LLMLingua-2](https://llmlingua.com/llmlingua2.html) and the [LLMLingua Series](https://llmlingua.com/).

## Usage
```python
from llmlingua import PromptCompressor

compressor = PromptCompressor(
    model_name="microsoft/llmlingua-2-bert-base-multilingual-cased-meetingbank",
    use_llmlingua2=True,
)

original_prompt = """John: So, um, I've been thinking about the project, you know, and I believe we need to, uh, make some changes. I mean, we want the project to succeed, right? So, like, I think we should consider maybe revising the timeline.
Sarah: I totally agree, John. I mean, we have to be realistic, you know. The timeline is, like, too tight. You know what I mean? We should definitely extend it.
"""
results = compressor.compress_prompt_llmlingua2(
    original_prompt,
    rate=0.6,
    force_tokens=['\n', '.', '!', '?', ','],
    chunk_end_tokens=['.', '\n'],
    return_word_label=True,
    drop_consecutive=True,
)

print(results.keys())
print(f"Compressed prompt: {results['compressed_prompt']}")
print(f"Original tokens: {results['origin_tokens']}")
print(f"Compressed tokens: {results['compressed_tokens']}")
print(f"Compression rate: {results['rate']}")

# Recover the word-level preserve/drop labels over the original prompt
word_sep = "\t\t|\t\t"
label_sep = " "
lines = results["fn_labeled_original_prompt"].split(word_sep)
annotated_results = []
for line in lines:
    word, label = line.split(label_sep)
    # list of tuples: (word, '+') if preserved, (word, '-') if dropped
    annotated_results.append((word, '+') if label == '1' else (word, '-'))
print("Annotated results:")
for word, label in annotated_results[:10]:
    print(f"{word} {label}")
```

## Citation
```
@article{wu2024llmlingua2,
    title = "{LLML}ingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression",
    author = "Zhuoshi Pan and Qianhui Wu and Huiqiang Jiang and Menglin Xia and Xufang Luo and Jue Zhang and Qingwei Lin and Victor Ruhle and Yuqing Yang and Chin-Yew Lin and H. Vicky Zhao and Lili Qiu and Dongmei Zhang",
    url = "https://arxiv.org/abs/2403.12968",
    journal = "ArXiv preprint",
    volume = "abs/2403.12968",
    year = "2024",
}
```