---
pipeline_tag: text-classification
---

<br>

# RADAR Model Card

## Model Details

RADAR-Vicuna-7B is an AI-text detector trained via adversarial learning between the detector and a paraphraser, using a human-text corpus ([OpenWebText](https://huggingface.co/datasets/Skylion007/openwebtext)) and an AI-text corpus generated from [OpenWebText](https://huggingface.co/datasets/Skylion007/openwebtext).

- **Developed by:** [TrustSafeAI](https://huggingface.co/TrustSafeAI)
- **Model type:** An encoder-only language model based on the transformer architecture (RoBERTa).
- **License:** [Non-commercial license](https://huggingface.co/lmsys/vicuna-7b-v1.1#model-details) (inherited from Vicuna-7B-v1.1)
- **Trained from model:** [RoBERTa](https://arxiv.org/abs/1907.11692)


### Model Sources

- **Project Page:** https://radar.vizhub.ai/
- **Paper:** https://arxiv.org/abs/2307.03838
- **IBM Blog Post:** https://research.ibm.com/blog/AI-forensics-attribution

## Uses
Users can use this detector to help identify text generated by large language models.
Please note that this detector is trained on AI-text generated by Vicuna-7B-v1.1. Because that model only supports [non-commercial use](https://huggingface.co/lmsys/vicuna-7b-v1.1#model-details), users are **not permitted to use this detector for commercial purposes**.

## Get Started with the Model
Please refer to the following resources to see how to run the downloaded model locally or use our API service hosted on Hugging Face Spaces.
- Google Colab Demo: https://colab.research.google.com/drive/1r7mLEfVynChUUgIfw1r4WZyh9b0QBQdo?usp=sharing
- Huggingface API Documentation: https://trustsafeai-radar-ai-text-detector.hf.space/?view=api
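
For a quick local test, the sketch below shows how the detector could be loaded with the `transformers` library. It assumes the checkpoint id `TrustSafeAI/RADAR-Vicuna-7B` and that logit index 0 corresponds to the AI-generated class; please verify both against the Colab demo above.

```python
# Minimal sketch of local inference (assumptions: checkpoint id and class index 0
# for "AI-generated"; verify both against the Colab demo).
import torch
import torch.nn.functional as F
from transformers import AutoModelForSequenceClassification, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
model_id = "TrustSafeAI/RADAR-Vicuna-7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
detector = AutoModelForSequenceClassification.from_pretrained(model_id).to(device).eval()

texts = ["Paste the passage you want to check here."]
with torch.no_grad():
    inputs = tokenizer(texts, padding=True, truncation=True, max_length=512,
                       return_tensors="pt").to(device)
    # Probability that each passage is AI-generated (class index 0 assumed).
    ai_probs = F.softmax(detector(**inputs).logits, dim=-1)[:, 0].tolist()

for text, prob in zip(texts, ai_probs):
    print(f"P(AI-generated) = {prob:.3f} | {text[:60]}")
```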

## Training Pipeline

We propose adversarial learning between a paraphraser and our detector. The paraphraser's goal is to make AI-generated text read more like human writing, while the detector's goal is to improve its ability to identify AI-text.

- **(Step 1) Training data preparation**: Before training, we use Vicuna-7B to generate AI-text by performing text completion on the prefix span of human-text in [OpenWebText](https://huggingface.co/datasets/Skylion007/openwebtext).

- **(Step 2) Update the paraphraser**: During training, the paraphraser paraphrases the AI-text generated in **Step 1**, then collects the reward returned by the detector and is updated with a Proximal Policy Optimization (PPO) loss.

- **(Step 3) Update the detector**: The detector is optimized using the logistic loss on human-text, AI-text, and paraphrased AI-text.

See more details in Sections 3 and 4 of this [paper](https://arxiv.org/pdf/2307.03838.pdf).

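To make the loop above concrete, here is a minimal, self-contained toy sketch of one adversarial round. It is not the RADAR implementation: the `featurize`, `generate_ai_text`, and `paraphrase` helpers are stand-ins for the RoBERTa encoder, Vicuna-7B completion, and the PPO-tuned paraphraser, and the paraphraser's own update is omitted; the actual objectives are defined in Sections 3 and 4 of the paper.

```python
# Toy sketch of one adversarial round (Steps 1-3); all components are stand-ins.
import random
import torch
import torch.nn as nn
import torch.nn.functional as F

def featurize(texts, dim=512):
    """Toy bag-of-hashed-words features, standing in for RoBERTa encodings."""
    feats = torch.zeros(len(texts), dim)
    for i, text in enumerate(texts):
        for tok in text.lower().split():
            feats[i, hash(tok) % dim] += 1.0
    return feats

def generate_ai_text(human_texts):
    """Step 1 (toy): the real pipeline completes a human-text prefix with Vicuna-7B."""
    return [" ".join(t.split()[: max(1, len(t.split()) // 2)]) + " [model completion]"
            for t in human_texts]

def paraphrase(ai_texts):
    """Step 2 (toy): the real paraphraser is a seq2seq model tuned with a PPO-style
    loss, rewarded when the detector scores its output as human-written."""
    return [" ".join(random.sample(t.split(), len(t.split()))) for t in ai_texts]

detector = nn.Linear(512, 1)  # toy detector head; emits a logit for "AI-generated"
optimizer = torch.optim.Adam(detector.parameters(), lr=1e-3)

human = [
    "The committee met on Tuesday to review the budget proposal in detail.",
    "Local volunteers repaired the hiking trail after last month's storm.",
]
for _ in range(50):
    ai = generate_ai_text(human)   # Step 1
    para = paraphrase(ai)          # Step 2 (paraphraser update not shown here)

    # Step 3: logistic (binary cross-entropy) loss on human, AI, and paraphrased AI text.
    x = featurize(human + ai + para)
    y = torch.cat([torch.zeros(len(human)), torch.ones(len(ai) + len(para))])
    loss = F.binary_cross_entropy_with_logits(detector(x).squeeze(-1), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(f"final detector loss on the toy data: {loss.item():.4f}")
```
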
## Ethical Considerations
We suggest that users apply our tool to assist with identifying AI-written content at scale and with discretion. If a detection result is to be used as evidence, further validation steps are necessary, as RADAR cannot always make correct predictions.