Model Hub

Browse PQC-verified AI models, datasets, and tools

HuggingFaceH4/MATH-500 HF Unverified

Dataset Card for MATH-500. This dataset contains the 500-problem subset of the MATH benchmark that OpenAI created for their Let's Verify Step by Step paper. See their GitHub repo for the source file: https://github.com/openai/prm800k/tree/main?tab=readme-ov-file#math-splits

Task_categories: text-generation | Language: en | Size_categories: n<1K | Format: json | Modality: text | Library: datasets
OpenSQZ/AutoMathText-V2 HF Unverified

AutoMathText-V2: A 2.46 Trillion Token AI-Curated STEM Pretraining Dataset. AutoMathText-V2 has surpassed 1.5 million downloads. AutoMathText-V2 consists of 2.46 trillion tokens of high-quality, deduplicated text spanning web content, mathematics, code, reasoning, and… See the full description on the dataset page: https://huggingface.co/datasets/OpenSQZ/AutoMathText-V2.

Task_categories: text-generation, question-answering | Language: en, zh | Size_categories: 100M<n<1B | Modality: tabular
nvidia/OpenMathInstruct-2 HF Unverified

OpenMathInstruct-2 is a math instruction tuning dataset with 14M problem-solution pairs generated using the Llama3.1-405B-Instruct model. The training set problems of GSM8K and MATH are used to construct the dataset in the following ways. Solution augmentation: generating chain-of-thought solutions for training set problems in GSM8K and MATH. Problem-solution augmentation: generating new problems, followed by solutions for these new problems.… See the full description on the dataset page: https://huggingface.co/datasets/nvidia/OpenMathInstruct-2.

Task_categories: question-answering, text-generation | Language: en | Size_categories: 10M<n<100M | Format: parquet | Modality: text