Model Hub
Browse PQC-verified AI models, datasets, and tools
GroundCUA: Grounding Computer Use Agents on Human Demonstrations 🌐 Website | 📑 Paper | 🤗 Dataset | 🤖 Models GroundCUA Dataset GroundCUA is a large and diverse dataset of real UI screenshots paired with structured annotations for building multimodal computer use agents. It covers 87 software platforms across productivity tools, browsers, creative tools, communication apps, development environments, and system utilities. GroundCUA is designed for research on GUI… See the full description on the dataset page: https://huggingface.co/datasets/ServiceNow/GroundCUA.
Dataset Card for ImageNet Dataset Summary ILSVRC 2012, commonly known as 'ImageNet', is an image dataset organized according to the WordNet hierarchy. Each meaningful concept in WordNet, possibly described by multiple words or word phrases, is called a "synonym set" or "synset". There are more than 100,000 synsets in WordNet, the majority of them nouns (80,000+). ImageNet aims to provide on average 1,000 images to illustrate each synset. Images of each concept are… See the full description on the dataset page: https://huggingface.co/datasets/ILSVRC/imagenet-1k.
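The imagenet-1k repository is gated, so access must be requested on the dataset page and you must be authenticated before loading. A minimal sketch, assuming the splits and the image/label feature names exposed by the datasets library (field names are an assumption to verify in the dataset viewer):

```python
from datasets import load_dataset

# Access is gated: request access on the dataset page and authenticate first,
# e.g. by running `huggingface-cli login` once in the shell.
# Streaming avoids downloading the full training set just to inspect samples.
ds = load_dataset("ILSVRC/imagenet-1k", split="validation", streaming=True)

sample = next(iter(ds))
print(sample["label"])       # integer index into the 1,000 classes/synsets
print(sample["image"].size)  # PIL image; resolution varies per sample
```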
Dataset Card for truthful_qa Dataset Summary TruthfulQA is a benchmark to measure whether a language model is truthful in generating answers to questions. The benchmark comprises 817 questions that span 38 categories, including health, law, finance and politics. Questions are crafted so that some humans would answer falsely due to a false belief or misconception. To perform well, models must avoid generating false answers learned from imitating human texts.… See the full description on the dataset page: https://huggingface.co/datasets/truthfulqa/truthful_qa.
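A minimal loading sketch, assuming the 'generation' and 'multiple_choice' configurations and a single 'validation' split; the field names used below are assumptions to verify against the dataset viewer:

```python
from datasets import load_dataset

# 'generation' holds the free-form QA format; 'multiple_choice' holds the
# multiple-choice targets (config names are an assumption).
ds = load_dataset("truthfulqa/truthful_qa", "generation", split="validation")

print(len(ds))  # expected: 817 questions
example = ds[0]
print(example["category"])           # e.g. health, law, finance, politics
print(example["question"])
print(example["best_answer"])        # reference truthful answer
print(example["incorrect_answers"])  # common false answers a model should avoid
```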
FineVision FineVision is a massive collection of datasets with 17.3M images, 24.3M samples, 88.9M turns, and 9.5B answer tokens, designed for training state-of-the-art open vision-language models. More details can be found in the blog post: https://huggingface.co/spaces/HuggingFaceM4/FineVision Load the data: from datasets import load_dataset, get_dataset_config_names # Get all subset names and load the first one available_subsets =… See the full description on the dataset page: https://huggingface.co/datasets/HuggingFaceM4/FineVision.
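The snippet on the card is cut off above; a minimal completion, assuming each subset exposes a 'train' split (the split name is an assumption, check the dataset viewer):

```python
from datasets import load_dataset, get_dataset_config_names

# Get all subset names and load the first one
available_subsets = get_dataset_config_names("HuggingFaceM4/FineVision")
print(available_subsets[:5])

# Split name "train" is an assumption; verify the splits published per subset.
ds = load_dataset("HuggingFaceM4/FineVision", available_subsets[0], split="train")
print(ds[0].keys())
```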
Dataset Card for PAWS: Paraphrase Adversaries from Word Scrambling Dataset Summary PAWS: Paraphrase Adversaries from Word Scrambling This dataset contains 108,463 human-labeled and 656k noisily labeled pairs that highlight the importance of modeling structure, context, and word order information for the problem of paraphrase identification. The dataset has two subsets, one based on Wikipedia and the other based on the Quora Question Pairs (QQP) dataset. For further… See the full description on the dataset page: https://huggingface.co/datasets/google-research-datasets/paws.
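A minimal loading sketch, assuming the 'labeled_final', 'labeled_swap', and 'unlabeled_final' configurations and sentence1/sentence2/label fields; both the config and field names are assumptions to verify on the dataset page:

```python
from datasets import load_dataset

# 'labeled_final' holds human-labeled pairs; 'unlabeled_final' holds the
# noisily labeled pairs (config names are an assumption).
ds = load_dataset("google-research-datasets/paws", "labeled_final", split="train")

pair = ds[0]
print(pair["sentence1"])
print(pair["sentence2"])
print(pair["label"])  # 1 = paraphrase, 0 = not a paraphrase
```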
🍃 MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens 🍃 MINT-1T is an open-source Multimodal INTerleaved dataset with 1 trillion text tokens and 3.4 billion images, a 10x scale-up from existing open-source datasets. Additionally, we include previously untapped sources such as PDFs and ArXiv papers. 🍃 MINT-1T is designed to facilitate research in multimodal pretraining. 🍃 MINT-1T is created by a team from the University of Washington in… See the full description on the dataset page: https://huggingface.co/datasets/mlfoundations/MINT-1T-PDF-CC-2023-50.
Dataset containing synthetically generated (by GPT-3.5 and GPT-4) short stories that only use a small vocabulary. Described in the following paper: https://arxiv.org/abs/2305.07759. The models referred to in the paper were trained on TinyStories-train.txt (the file tinystories-valid.txt can be used for validation loss). These models can be found on Hugging Face at roneneldan/TinyStories-1M/3M/8M/28M/33M/1Layer-21M. Additional resources: tinystories_all_data.tar.gz - contains a superset of… See the full description on the dataset page: https://huggingface.co/datasets/roneneldan/TinyStories.
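A minimal sketch for loading the stories through the datasets library rather than the raw text files, assuming 'train' and 'validation' splits with a single 'text' field (split and field names are assumptions):

```python
from datasets import load_dataset

# One short story per row; the splits correspond to TinyStories-train.txt
# and tinystories-valid.txt mentioned above (an assumption to verify).
ds = load_dataset("roneneldan/TinyStories")

print(ds)                              # split names and row counts
print(ds["train"][0]["text"][:200])    # first 200 characters of one story
```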
LongBench is a comprehensive multilingual, multi-task benchmark designed to measure and evaluate the ability of pre-trained language models to understand long text. The dataset consists of twenty different tasks, covering key long-text application scenarios such as multi-document QA, single-document QA, summarization, few-shot learning, synthetic tasks, and code completion.
Dataset Card for MNIST Dataset Summary The MNIST dataset consists of 70,000 28x28 black-and-white images of handwritten digits extracted from two NIST databases. There are 60,000 images in the training dataset and 10,000 images in the validation dataset, one class per digit for a total of 10 classes, with 7,000 images (6,000 train images and 1,000 test images) per class. Half of the images were drawn by Census Bureau employees and the other half by high school students… See the full description on the dataset page: https://huggingface.co/datasets/ylecun/mnist.
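A minimal sketch that loads the dataset and checks the split sizes and per-class counts described above, assuming the splits are exposed as 'train' and 'test' (the card calls the 10,000-image split the validation set) with 'image' and 'label' features:

```python
from collections import Counter
from datasets import load_dataset

ds = load_dataset("ylecun/mnist")

print(len(ds["train"]), len(ds["test"]))  # expected: 60000 10000

# Per-digit counts in the training split; the card says roughly 6,000
# training images per class.
counts = Counter(ds["train"]["label"])
print(sorted(counts.items()))

img = ds["train"][0]["image"]   # 28x28 grayscale PIL image
print(img.size, ds["train"][0]["label"])
```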
Dataset Card for "hotpot_qa" Dataset Summary HotpotQA is a new dataset with 113k Wikipedia-based question-answer pairs with four key features: (1) the questions require finding and reasoning over multiple supporting documents to answer; (2) the questions are diverse and not constrained to any pre-existing knowledge bases or knowledge schemas; (3) we provide sentence-level supporting facts required for reasoning, allowingQA systems to reason… See the full description on the dataset page: https://huggingface.co/datasets/hotpotqa/hotpot_qa.
LLaVA-OneVision-1.5 Instruction Data Paper | Code 📌 Introduction This dataset, LLaVA-OneVision-1.5-Instruct, was collected and integrated during the development of LLaVA-OneVision-1.5. LLaVA-OneVision-1.5 is a novel family of Large Multimodal Models (LMMs) that achieve state-of-the-art performance with significantly reduced computational and financial costs. This meticulously curated 22M instruction dataset (LLaVA-OneVision-1.5-Instruct) is part of a comprehensive and… See the full description on the dataset page: https://huggingface.co/datasets/mvp-lab/LLaVA-OneVision-1.5-Instruct-Data.
MedThinkVQA MedThinkVQA is an expert-annotated benchmark for multi-image diagnostic reasoning in radiology. Unlike prior medical VQA benchmarks that typically contain at most one image per case, MedThinkVQA requires models to extract evidence from each image, integrate cross-view information, and perform differential-diagnosis reasoning. Links GitHub: https://github.com/benluwang/MedThinkVQA Leaderboard: https://benluwang.github.io/MedThinkVQA/ Submission Guide:… See the full description on the dataset page: https://huggingface.co/datasets/bio-nlp-umass/MedThinkVQA.
This is a partial copy of the CoVoST2 dataset. The main difference is that the audio data is included in the dataset, which makes usage easier and allows browsing the samples with the HF Dataset Viewer. The trade-off is that all audio samples of the EN_XX subsets are duplicated, so the dataset is larger. Consequently, not all of the data is included: only the validation and test subsets are available, and from the XX_EN subsets, only fr, es, and zh-CN are included.
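No repository ID is given above, so the sketch below uses a placeholder path; the config naming (e.g. fr_en) and the audio/sentence/translation fields follow the original CoVoST2 schema and are assumptions to verify against the actual card:

```python
from datasets import load_dataset

# Placeholder repo ID and config; substitute the real dataset path and one of
# the available language pairs (e.g. fr_en, es_en, zh-CN_en).
ds = load_dataset("user/covost2-with-audio", "fr_en", split="validation")

ex = ds[0]
audio = ex["audio"]              # audio is embedded directly in the dataset
print(audio["sampling_rate"], len(audio["array"]))
print(ex["sentence"])            # source-language transcript
print(ex["translation"])         # English translation for the XX_EN direction
```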
Introduction TL;DR: DreamDojo is a generalist robot world model pretrained on 44k hours of human egocentric data, showing unprecedented generalization to diverse objects and environments. Project page: https://dreamdojo-world.github.io/ Paper: https://arxiv.org/abs/2602.06949 Code: https://github.com/NVIDIA/DreamDojo How to Use Check out https://github.com/NVIDIA/DreamDojo Citation @article{gao2026dreamdojo, title={DreamDojo: A Generalist Robot… See the full description on the dataset page: https://huggingface.co/datasets/nvidia/PhysicalAI-Robotics-GR00T-Teleop-GR1.