---
pipeline_tag: text-classification
---

<br>

# RADAR Model Card

## Model Details

RADAR-Vicuna-7B is an AI-text detector trained via adversarial learning between the detector and a paraphraser, using a human-text corpus ([OpenWebText](https://huggingface.co/datasets/Skylion007/openwebtext)) and an AI-text corpus generated from [OpenWebText](https://huggingface.co/datasets/Skylion007/openwebtext).

- **Developed by:** [TrustSafeAI](https://huggingface.co/TrustSafeAI)
- **Model type:** An encoder-only language model based on the transformer architecture (RoBERTa).
- **License:** [Non-commercial license](https://huggingface.co/lmsys/vicuna-7b-v1.1#model-details) (inherited from Vicuna-7B-v1.1)
- **Trained from model:** [RoBERTa](https://arxiv.org/abs/1907.11692)

### Model Sources

- **Project Page:** https://radar.vizhub.ai/
- **Paper:** https://arxiv.org/abs/2307.03838
- **IBM Blog Post:** https://research.ibm.com/blog/AI-forensics-attribution

## Uses
Users can use this detector to help identify text generated by large language models.
Please note that this detector is trained on AI-text generated by Vicuna-7B-v1.1. As that model only supports [non-commercial use](https://huggingface.co/lmsys/vicuna-7b-v1.1#model-details), users are **not allowed to use this detector in commercial activities**.

## Get Started with the Model
The following resources show how to run the downloaded model locally or use our API service hosted on a Hugging Face Space.
- Google Colab Demo: https://colab.research.google.com/drive/1r7mLEfVynChUUgIfw1r4WZyh9b0QBQdo?usp=sharing
- Hugging Face API Documentation: https://trustsafeai-radar-ai-text-detector.hf.space/?view=api
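
For local use, a minimal sketch with the `transformers` library might look like the following. Several details are assumptions not stated in this card: the Hub id `TrustSafeAI/RADAR-Vicuna-7B`, the choice of class index 0 as the AI-generated class, and the 512-token truncation length. Check them against the Colab demo before relying on the scores. The helper names `ai_probability` and `detect` are illustrative only.

```python
import math
from typing import List, Sequence

MODEL_ID = "TrustSafeAI/RADAR-Vicuna-7B"  # assumed Hub id, verify before use
AI_LABEL_INDEX = 0  # assumption: index 0 is the "AI-generated" class


def ai_probability(logits: Sequence[float], ai_index: int = AI_LABEL_INDEX) -> float:
    """Softmax over one row of logits; return the probability at `ai_index`."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    return exps[ai_index] / sum(exps)


def detect(texts: List[str], model_id: str = MODEL_ID, max_length: int = 512) -> List[float]:
    """Return one AI-text probability per input string."""
    # Imported lazily so the helper above works without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    inputs = tokenizer(
        texts,
        padding=True,
        truncation=True,
        max_length=max_length,
        return_tensors="pt",
    )
    with torch.no_grad():
        logits = model(**inputs).logits
    return [ai_probability(row) for row in logits.tolist()]
```

Calling `detect(["some passage to check"])` downloads the weights on first use and returns a list with one score per input; how to threshold that score is left to the user.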

## Training Pipeline

We propose adversarial learning between a paraphraser and our detector. The paraphraser's goal is to make AI-generated text read more like human-written text, and the detector's goal is to improve its ability to identify AI-text.

- **(Step 1) Training data preparation:** Before training, we use Vicuna-7B to generate AI-text by performing text completion based on the prefix span of human-text in [OpenWebText](https://huggingface.co/datasets/Skylion007/openwebtext).

- **(Step 2) Update the paraphraser:** During training, the paraphraser paraphrases the AI-text generated in **Step 1**. The reward returned by the detector is then collected and used to update the paraphraser with the Proximal Policy Optimization (PPO) loss.

- **(Step 3) Update the detector:** The detector is optimized using the logistic loss on the human-text, AI-text, and paraphrased AI-text.
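
The data preparation in Step 1 can be illustrated with a small sketch. It is not the authors' script: the whitespace tokenization, the 30-word prefix length, the choice to keep the human prefix at the front of the AI-text sample, and the names `take_prefix`, `build_training_pairs`, and `complete` are all assumptions made for illustration. `complete` stands in for a call to the generator (Vicuna-7B in this card).

```python
from typing import Callable, Dict, List


def take_prefix(text: str, prefix_words: int = 30) -> str:
    """Return the first `prefix_words` whitespace-separated words of `text`."""
    return " ".join(text.split()[:prefix_words])


def build_training_pairs(
    human_texts: List[str],
    complete: Callable[[str], str],
    prefix_words: int = 30,
) -> List[Dict[str, str]]:
    """Pair each human text with an AI text completed from its prefix.

    `complete` is a hypothetical stand-in for the generator: it receives
    the prefix and returns only the continuation.
    """
    pairs = []
    for text in human_texts:
        prefix = take_prefix(text, prefix_words)
        if not prefix:  # skip empty or whitespace-only documents
            continue
        continuation = complete(prefix)
        # Assumption: the AI-text sample keeps the human prefix in front.
        pairs.append({"human": text, "ai": f"{prefix} {continuation}".strip()})
    return pairs
```

With a stub such as `complete = lambda prefix: "GEN"`, each returned pair holds the original human text and the prefix followed by the generated continuation.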

See more details in Sections 3 and 4 of this [paper](https://arxiv.org/pdf/2307.03838.pdf).
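
The two update rules in Steps 2 and 3 can be written down in a few lines. The sketch below is a generic illustration under stated assumptions, not the paper's exact objective: it assumes the detector outputs the probability that a text is human-written, uses a hypothetical `ai_weight` to balance the AI-text terms, and shows the standard clipped PPO surrogate for a single sample without any reward normalization or advantage estimation.

```python
import math
from typing import List


def _mean(values: List[float]) -> float:
    return sum(values) / len(values)


def detector_logistic_loss(
    p_human: List[float],
    p_ai: List[float],
    p_paraphrased: List[float],
    ai_weight: float = 0.5,  # hypothetical balancing weight
) -> float:
    """Logistic loss over the three text sources.

    Each argument holds the detector's predicted probability of
    "human-written" for one batch (an assumed output convention).
    """
    eps = 1e-12  # guards against log(0)
    loss_human = _mean([-math.log(max(p, eps)) for p in p_human])
    loss_ai = _mean([-math.log(max(1.0 - p, eps)) for p in p_ai])
    loss_para = _mean([-math.log(max(1.0 - p, eps)) for p in p_paraphrased])
    return loss_human + ai_weight * (loss_ai + loss_para)


def ppo_clipped_loss(ratio: float, advantage: float, clip_eps: float = 0.2) -> float:
    """Standard clipped PPO surrogate for one sample, negated for minimization.

    `ratio` is new-policy probability over old-policy probability for the
    sampled paraphrase; `advantage` would be derived from the detector's reward.
    """
    clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
    return -min(ratio * advantage, clipped_ratio * advantage)
```

The clipping keeps a single update from moving the paraphraser too far from the policy that produced the sample, while the detector's loss pushes its score up on human-text and down on both original and paraphrased AI-text.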

## Ethical Considerations
We suggest users use our tool to assist with identifying AI-written content at scale, and to do so with discretion. If a detection result is to be used as evidence, further validation steps are necessary, as RADAR cannot always make correct predictions.