Model Hub
Browse PQC-verified AI models, datasets, and tools
SynData. Overview: SynData is a large-scale, real-world multimodal dataset newly released by PsiBot. It covers vision, language, and action, and provides realistic, high-density human data as a foundation for embodied intelligence training. Powered by… See the full description on the dataset page: https://huggingface.co/datasets/PsiBotAI/SynData.
arXiv Papers by Subject. A reorganised version of the nick007x/arxiv-papers dataset, partitioned by subject code, year, and month for efficient selective access. Dataset description: the dataset contains metadata for over 2.5 million arXiv papers, organised into a hierarchical directory structure so that users can download only the subjects and time periods they need rather than the entire dataset. Motivation: The original nick007x/arxiv-papers… See the full description on the dataset page: https://huggingface.co/datasets/permutans/arxiv-papers-by-subject.
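The selective access described above could be sketched roughly as follows. This is a minimal illustration, not the dataset's documented usage: the directory layout `<subject>/<year>/<month>/` with zero-padded months is an assumption inferred from "partitioned by subject code, year, and month", and `build_patterns` and `download_subset` are hypothetical helper names. Check the dataset page for the real layout before relying on these patterns. The download itself uses `snapshot_download` from `huggingface_hub` with its `allow_patterns` filter.

```python
from typing import Iterable, List

# Repository id taken from the dataset page URL in the listing.
REPO_ID = "permutans/arxiv-papers-by-subject"


def build_patterns(
    subjects: Iterable[str],
    years: Iterable[int],
    months: Iterable[int] = range(1, 13),
) -> List[str]:
    """Build glob patterns for an ASSUMED layout <subject>/<year>/<MM>/*.

    The real directory names may differ; adjust the f-string to match
    the structure shown on the dataset page.
    """
    subjects = list(subjects)
    years = list(years)
    months = list(months)
    return [
        f"{subject}/{year}/{month:02d}/*"
        for subject in subjects
        for year in years
        for month in months
    ]


def download_subset(patterns: List[str], local_dir: str = "arxiv_subset") -> str:
    """Download only the files matching the given patterns; returns the local path."""
    # Imported lazily so pattern building works without the library installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",
        allow_patterns=patterns,
        local_dir=local_dir,
    )


if __name__ == "__main__":
    # Example: machine-learning papers from January and February 2023.
    patterns = build_patterns(["cs.LG"], [2023], months=[1, 2])
    print(patterns)
    # download_subset(patterns)  # uncomment to fetch; requires network access
```

Keeping pattern construction separate from the download call makes it easy to inspect what would be fetched before any data is transferred.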
Dataset Card for "jat-dataset-tokenized". More information needed.
🍃 MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens. MINT-1T is an open-source Multimodal INTerleaved dataset with 1 trillion text tokens and 3.4 billion images, a 10x scale-up from existing open-source datasets. It also includes previously untapped sources such as PDFs and ArXiv papers. MINT-1T is designed to facilitate research in multimodal pretraining. MINT-1T is created by a team from the University of Washington in… See the full description on the dataset page: https://huggingface.co/datasets/mlfoundations/MINT-1T-HTML.
GUI-360°: A Comprehensive Dataset And Benchmark For Computer-Using Agents. GUI-360° is a large-scale, comprehensive dataset and benchmark suite designed to advance Computer-Using Agents (CUAs). Key features: 1.2M+ executed action steps across thousands of trajectories; popular Windows office applications (Word, Excel, PowerPoint); full-resolution screenshots with accessibility metadata; multimodal trajectories with reasoning traces; Both… See the full description on the dataset page: https://huggingface.co/datasets/vyokky/GUI-360.