Model Hub

Browse PQC-verified AI models, datasets, and tools

✨ Note: For all FineInstructions resources please visit: https://huggingface.co/fineinstructions This dataset is ~1B+ synthetic instruction-answer pairs or ~300B tokens created using the FineInstructions pipeline. The FineInstructions pipeline was run over the raw pre-training documents in the Nemotron-CC pre-training corpus (a subset of high-quality documents from CommonCrawl). See our paper for more details. Each .parquet file in the data folderhas a corresponding judge-*.json file that… See the full description on the dataset page: https://huggingface.co/datasets/fineinstructions/fineinstructions_nemotron.

Language:enSize_categories:1B<n<10BFormat:parquetModality:tabularModality:textLibrary:datasets

1.2M 9

Updated 2026-05-08 Source available

Showing 1 of 1 items (page 1 of 1)

Prev Next