Model Hub
Browse PQC-verified AI models, datasets, and tools
SAGE-10k SAGE-10k is a large-scale interactive indoor scene dataset featuring realistic layouts, generated by the agentic-driven pipeline introduced in "SAGE: Scalable Agentic 3D Scene Generation for Embodied AI". The dataset contains 10,000 diverse scenes spanning 50 room types and styles, along with 565K uniquely generated 3D objects. 🔑 Key Features SAGE-10k integrates a wide variety of scenes, and particularly, preserves small items for… See the full description on the dataset page: https://huggingface.co/datasets/nvidia/SAGE-10k.
Further cleaning done. Please look through the dataset and ensure that I didn't miss anything. Update: Confirmed working method for training the model: https://huggingface.co/AlekseyKorshuk/vicuna-7b/discussions/4#64346c08ef6d5abefe42c12c Two choices: Removes instances of "I'm sorry, but": https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/blob/main/ShareGPT_V3_unfiltered_cleaned_split_no_imsorry.json Has instances of "I'm sorry, but":… See the full description on the dataset page: https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered.
Nemotron-Pre-Training-Dataset-v1 Release Data Overview This pretraining dataset, for generative AI model training, preserves high-value math and code while enriching it with diverse multilingual Q&A, fueling the next generation of intelligent, globally-capable models. This dataset supports NVIDIA Nemotron Nano 2, a family of large language models (LLMs) that consists of the NVIDIA-Nemotron-Nano-9B-v2, NVIDIA-Nemotron-Nano-9B-v2-Base, and NVIDIA-Nemotron-Nano-12B-v2-Base… See the full description on the dataset page: https://huggingface.co/datasets/nvidia/Nemotron-CC-v2.
🚀 LLaVA-One-Vision-1.5-Mid-Training-85M Dataset is being uploaded 🚀 Upload Status All Completed: ImageNet-21k、LAIONCN、DataComp-1B、Zero250M、COYO700M、SA-1B、MINT、Obelics 📜 Cite If you find LLaVA-One-Vision-1.5-Mid-Training-85M useful in your research, please consider to cite the following related papers: @misc{an2025llavaonevision15fullyopenframework, title={LLaVA-OneVision-1.5: Fully Open Framework for Democratized Multimodal Training}… See the full description on the dataset page: https://huggingface.co/datasets/mvp-lab/LLaVA-OneVision-1.5-Mid-Training-85M.
Dataset Card for GPQA GPQA is a multiple-choice, Q&A dataset of very hard questions written and validated by experts in biology, physics, and chemistry. When attempting questions out of their own domain (e.g., a physicist answers a chemistry question), these experts get only 34% accuracy, despite spending >30m with full access to Google. We request that you do not reveal examples from this dataset in plain text or images online, to reduce the risk of leakage into foundation model… See the full description on the dataset page: https://huggingface.co/datasets/Idavidrein/gpqa.
GroundCUA: Grounding Computer Use Agents on Human Demonstrations 🌐 Website | 📑 Paper | 🤗 Dataset | 🤖 Models GroundCUA Dataset GroundCUA is a large and diverse dataset of real UI screenshots paired with structured annotations for building multimodal computer use agents. It covers 87 software platforms across productivity tools, browsers, creative tools, communication apps, development environments, and system utilities. GroundCUA is designed for research on GUI… See the full description on the dataset page: https://huggingface.co/datasets/ServiceNow/GroundCUA.
Fine Vision FineVision is a massive collection of datasets with 17.3M images, 24.3M samples, 88.9M turns, and 9.5B answer tokens, designed for training state-of-the-art open Vision-Language-Models. More detail can be found in the blog post: https://huggingface.co/spaces/HuggingFaceM4/FineVision Load the data from datasets import load_dataset, get_dataset_config_names # Get all subset names and load the first one available_subsets =… See the full description on the dataset page: https://huggingface.co/datasets/HuggingFaceM4/FineVision.
LLaVA-OneVision-1.5 Instruction Data Paper | Code 📌 Introduction This dataset, LLaVA-OneVision-1.5-Instruct, was collected and integrated during the development of LLaVA-OneVision-1.5. LLaVA-OneVision-1.5 is a novel family of Large Multimodal Models (LMMs) that achieve state-of-the-art performance with significantly reduced computational and financial costs. This meticulously curated 22M instruction dataset (LLaVA-OneVision-1.5-Instruct) is part of a comprehensive and… See the full description on the dataset page: https://huggingface.co/datasets/mvp-lab/LLaVA-OneVision-1.5-Instruct-Data.
Introduction TL;DR: DreamDojo is a generalist robot world model pretrained on 44k hours of human egocentric data, showing unprecedented generalization to diverse objects and environments. Project page: https://dreamdojo-world.github.io/ Paper: https://arxiv.org/abs/2602.06949 Code: https://github.com/NVIDIA/DreamDojo How to Use Check out https://github.com/NVIDIA/DreamDojo Citation @article{gao2026dreamdojo, title={DreamDojo: A Generalist Robot… See the full description on the dataset page: https://huggingface.co/datasets/nvidia/PhysicalAI-Robotics-GR00T-Teleop-GR1.