Model Hub

Browse PQC-verified AI models, datasets, and tools

arXiv Papers by Subject. A reorganised version of the nick007x/arxiv-papers dataset, partitioned by subject code, year, and month for efficient selective access. Dataset Description: This dataset contains metadata for over 2.5 million arXiv papers, organised into a hierarchical directory structure that allows users to download only the specific subjects and time periods they need, rather than the entire dataset. Motivation: The original nick007x/arxiv-papers… See the full description on the dataset page: https://huggingface.co/datasets/permutans/arxiv-papers-by-subject.
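The selective access described above can be sketched as a set of glob patterns, one per partition of interest. This is only an illustration: the `<subject>/<year>/<month>/` layout, the zero-padded month, and the helper name `partition_patterns` are assumptions, not the dataset's documented structure, so check the dataset page for the real directory names before relying on it.

```python
from fnmatch import fnmatchcase


def partition_patterns(subjects, years, months=None):
    """Build glob patterns for a HYPOTHETICAL <subject>/<year>/<month>/ layout.

    The real directory naming of the dataset may differ; this only
    illustrates selecting a subset of partitions.
    """
    patterns = []
    for subject in subjects:
        for year in years:
            if months is None:
                # No month filter: match every file under the subject/year.
                patterns.append(f"{subject}/{year}/*")
            else:
                for month in months:
                    # Assumed zero-padded month directories (01..12).
                    patterns.append(f"{subject}/{year}/{month:02d}/*")
    return patterns


# Example: machine-learning papers from January and February 2023.
patterns = partition_patterns(["cs.LG"], [2023], months=[1, 2])

# Check a made-up file path against the first pattern.
matches = fnmatchcase("cs.LG/2023/01/part-0.parquet", patterns[0])
```

Patterns like these could then be passed as `allow_patterns` to `huggingface_hub.snapshot_download` (with `repo_id="permutans/arxiv-papers-by-subject"` and `repo_type="dataset"`) so that only the matching files are fetched.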

task_categories:text-generation, task_categories:feature-extraction, source_datasets:nick007x/arxiv-papers, language:en, size_categories:1M<n<10M, arxiv

358K 23

Updated 2026-06-29 Source available

AlgorithmicResearchGroup/arxiv_s2orc_parsed HF Unverified

Dataset Card for "ArtifactAI/arxiv_s2orc_parsed". Dataset Description: https://huggingface.co/datasets/AlgorithmicResearchGroup/arxiv_s2orc_parsed Dataset Summary: AlgorithmicResearchGroup/arxiv_s2orc_parsed is a subset of the AllenAI S2ORC dataset, a general-purpose corpus for NLP and text mining research over scientific papers. The dataset is filtered strictly for arXiv papers and includes the full text of each paper. GitHub links have been extracted from each… See the full description on the dataset page: https://huggingface.co/datasets/AlgorithmicResearchGroup/arxiv_s2orc_parsed.

task_categories:text-generation, task_categories:zero-shot-classification, language:en, size_categories:1M<n<10M, format:parquet, modality:text

92K 27

Updated 2026-06-27 Source available

bluuebunny/arxiv_metadata_by_year HF Unverified

The dataset card is an unfilled template: every field, including curator, funder, language, license, and repository, still reads "[More Information Needed]". Dataset page: https://huggingface.co/datasets/bluuebunny/arxiv_metadata_by_year.

language:en, size_categories:1M<n<10M, format:parquet, modality:text, library:datasets, library:dask

91K 9

Updated 2026-06-29 Source available
