Dataset Card for "openwebtext"

Dataset Summary
An open-source replication of OpenAI's WebText dataset, which was used to train GPT-2. This distribution was created by Aaron Gokaslan and Vanya Cohen of Brown University.

Supported Tasks and Leaderboards
More Information Needed

Languages
More Information Needed

Dataset Structure
Data Instances
plain_text
Size of downloaded dataset files: 13.51 GB
Size of the…

See the full description on the dataset page: https://huggingface.co/datasets/Skylion007/openwebtext.