Model Hub | QuantaMrkt

PsiBotAI/SynData HF Unverified

SynData is a next-generation, large-scale, real-world multimodal dataset newly released by PsiBot. It covers vision, language, and action, and provides realistic, high-density, readily usable human data as a foundation for embodied intelligence training. Powered by… See the full description on the dataset page: https://huggingface.co/datasets/PsiBotAI/SynData.

Language:en, Size_categories:100K<n<1M, Format:parquet, Modality:3d, Modality:tabular, Modality:text

360K 182

Updated 2026-06-30 Source available

permutans/arxiv-papers-by-subject HF Unverified

arXiv Papers by Subject: a reorganised version of the nick007x/arxiv-papers dataset, partitioned by subject code, year, and month for efficient selective access. Dataset description: this dataset contains metadata for over 2.5 million arXiv papers, organised into a hierarchical directory structure that lets users download only the subjects and time periods they need rather than the entire dataset. Motivation: The original nick007x/arxiv-papers… See the full description on the dataset page: https://huggingface.co/datasets/permutans/arxiv-papers-by-subject.

Task_categories:text-generation, Task_categories:feature-extraction, Source_datasets:nick007x/arxiv-papers, Language:en, Size_categories:1M<n<10M, Arxiv

358K 23

Updated 2026-06-30 Source available

OpenMuQ/MuQ-large-msd-iter HF Unverified

Audio-Classification, PyTorch, Safetensors, Music, English, Chinese HIGH

347K 24

Updated 2026-06-30

cross-encoder/nli-deberta-v3-large HF Unverified

Zero-Shot Classification, Sentence-Transformers, PyTorch, ONNX, Safetensors, Deberta-V2 HIGH

344K 44

Updated 2026-06-30

tencent/HunyuanImage-3.0 HF Unverified

Text-to-Image, Transformers, Safetensors, Hunyuan_image_3_moe, Text Generation, Custom_code CRITICAL

342K 1,094

Updated 2026-06-30

diffusers/stable-diffusion-xl-1.0-inpainting-0.1 HF PQC Verified

Text-to-Image, Diffusers, Safetensors, Stable-Diffusion-Xl, Stable-Diffusion-Xl-Diffusers, Inpainting CRITICAL

340K 370

Updated 2026-05-08

MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7 HF Unverified

Zero-Shot Classification, Transformers, PyTorch, ONNX, Safetensors, Deberta-V2 HIGH

338K 377

Updated 2026-06-30

jat-project/jat-dataset-tokenized HF Unverified

Dataset Card for "jat-dataset-tokenized". More information needed.

Size_categories:10M<n<100M, Format:parquet, Modality:timeseries, Library:datasets, Library:dask, Library:mlcroissant

335K 15

Updated 2026-06-30 Source available

PaddlePaddle/PP-DocLayoutV3_safetensors HF PQC Verified

Object-Detection, Transformers, Safetensors, Pp_doclayout_v3, PaddleOCR, PaddlePaddle MEDIUM

332K 30

Updated 2026-06-30

onecxi/open-vakgyata HF Unverified

Audio-Classification, Transformers, ONNX, Safetensors, Wav2vec2, Language-Identification MEDIUM

327K 3

Updated 2026-06-24

aufklarer/WeSpeaker-ResNet34-LM-MLX HF Unverified

Audio-Classification, Mlx, Safetensors, Wespeaker-Resnet34-Lm, Speaker-Embedding, Speaker-Verification MEDIUM

325K 2

Updated 2026-05-08

briaai/RMBG-1.4 HF Unverified

Image-Segmentation, Transformers, PyTorch, ONNX, Safetensors, SegformerForSemanticSegmentation MEDIUM

324K 1,994

Updated 2026-06-30

xbgoose/hubert-large-speech-emotion-recognition-russian-dusha-finetuned HF Unverified

Audio-Classification, Transformers, PyTorch, Safetensors, Hubert, SER HIGH

314K 15

Updated 2026-06-30

Synthefy/Nori HF Unverified

Tabular-Regression, Synthefy-Nori, Features-Transformer, Tabular, Tabular-Foundation-Model, In-Context-Learning MEDIUM

314K 5

Updated 2026-06-30

deepset/roberta-large-squad2 HF Unverified

Question Answering, Transformers, PyTorch, JAX, Safetensors, Roberta HIGH

303K 29

Updated 2026-05-08

mlfoundations/MINT-1T-HTML HF Unverified

🍃 MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens. 🍃 MINT-1T is an open-source Multimodal INTerleaved dataset with 1 trillion text tokens and 3.4 billion images, a 10x scale-up from existing open-source datasets. It also includes previously untapped sources such as PDFs and ArXiv papers. 🍃 MINT-1T is designed to facilitate research in multimodal pretraining. 🍃 MINT-1T is created by a team from the University of Washington in… See the full description on the dataset page: https://huggingface.co/datasets/mlfoundations/MINT-1T-HTML.

Task_categories:image-to-text, Task_categories:text-generation, Language:en, Size_categories:100M<n<1B, Format:parquet, Modality:text

303K 97

Updated 2026-06-30 Source available

unsloth/LTX-2.3-GGUF HF Unverified

Image-To-Video, Ggml, GGUF, Unsloth, Text-To-Video, Video-To-Video CRITICAL

299K 487

Updated 2026-06-30

nvidia/segformer-b0-finetuned-ade-512-512 HF Unverified

Image-Segmentation, Transformers, PyTorch, Tf, Safetensors, Segformer MEDIUM

298K 190

Updated 2026-06-30

joeddav/xlm-roberta-large-xnli HF Unverified

Zero-Shot Classification, Transformers, PyTorch, Tf, Safetensors, Xlm-Roberta HIGH

296K 291

Updated 2026-06-30

vyokky/GUI-360 HF Unverified

GUI-360°: A Comprehensive Dataset and Benchmark for Computer-Using Agents. GUI-360° is a large-scale, comprehensive dataset and benchmark suite designed to advance Computer-Using Agents (CUAs). Key features: 1.2M+ executed action steps across thousands of trajectories; popular Windows office applications (Word, Excel, PowerPoint); full-resolution screenshots with accessibility metadata; multi-modal trajectories with reasoning traces; Both… See the full description on the dataset page: https://huggingface.co/datasets/vyokky/GUI-360.

Task_categories:image-text-to-text, Size_categories:1M<n<10M

285K 16

Updated 2026-06-30 Source available