Model Hub

Browse PQC-verified AI models, datasets, and tools

microsoft/Phi-3.5-vision-instruct HF PQC Verified

Image-Text-to-Text, Transformers, Safetensors, Phi3_v, Text Generation, NLP, HIGH
mvp-lab/LLaVA-OneVision-1.5-Mid-Training-85M HF Unverified

🚀 LLaVA-One-Vision-1.5-Mid-Training-85M Dataset is being uploaded 🚀 Upload Status: All Completed: ImageNet-21k, LAIONCN, DataComp-1B, Zero250M, COYO700M, SA-1B, MINT, Obelics. 📜 Cite: If you find LLaVA-One-Vision-1.5-Mid-Training-85M useful in your research, please consider citing the following related papers: @misc{an2025llavaonevision15fullyopenframework, title={LLaVA-OneVision-1.5: Fully Open Framework for Democratized Multimodal Training}… See the full description on the dataset page: https://huggingface.co/datasets/mvp-lab/LLaVA-OneVision-1.5-Mid-Training-85M.

Size_categories:10M<n<100M, Format:parquet, Modality:image, Modality:text, Library:datasets, Library:dask
stanford-vision-lab/gpic HF Unverified

GPIC: A Giant Permissive Image Corpus for Visual Generation. Keshigeyan Chandrasegaran*1, Kyle Sargent*1, Suchir Agarwal1, Michael Jang1, Michael Poli1,2, Juan Carlos Niebles1,4, Justin Johnson3, Jiajun Wu1, Li Fei-Fei1. 1 Stanford University, 2 Radical Numerics, 3 University of Michigan, 4 Salesforce… See the full description on the dataset page: https://huggingface.co/datasets/stanford-vision-lab/gpic.

Language:en
HuggingFaceM4/FineVision HF Unverified

Fine Vision: FineVision is a massive collection of datasets with 17.3M images, 24.3M samples, 88.9M turns, and 9.5B answer tokens, designed for training state-of-the-art open Vision-Language Models. More detail can be found in the blog post: https://huggingface.co/spaces/HuggingFaceM4/FineVision. Load the data: from datasets import load_dataset, get_dataset_config_names # Get all subset names and load the first one available_subsets =… See the full description on the dataset page: https://huggingface.co/datasets/HuggingFaceM4/FineVision.

Size_categories:10M<n<100M, Format:parquet, Modality:image, Modality:text, Library:datasets, Library:dask
mvp-lab/LLaVA-OneVision-1.5-Instruct-Data HF Unverified

LLaVA-OneVision-1.5 Instruction Data. 📌 Introduction: This dataset, LLaVA-OneVision-1.5-Instruct, was collected and integrated during the development of LLaVA-OneVision-1.5. LLaVA-OneVision-1.5 is a novel family of Large Multimodal Models (LMMs) that achieve state-of-the-art performance with significantly reduced computational and financial costs. This meticulously curated 22M instruction dataset (LLaVA-OneVision-1.5-Instruct) is part of a comprehensive and… See the full description on the dataset page: https://huggingface.co/datasets/mvp-lab/LLaVA-OneVision-1.5-Instruct-Data.

Task_categories:image-Text-To-Text, Language:en, Size_categories:10M<n<100M, Modality:image, Modality:text, Multimodal
Showing 5 of 5 items (page 1 of 1)