Model Hub

Browse PQC-verified AI models, datasets, and tools

jat-project/jat-dataset-tokenized HF Unverified

Dataset Card for "jat-dataset-tokenized". More information needed.

Size_categories:10M<n<100M, Format:parquet, Modality:timeseries, Library:datasets, Library:dask, Library:mlcroissant
depth-anything/DA3NESTED-GIANT-LARGE-1.1 HF Unverified

Depth-Estimation, Depth-Anything-3, Safetensors, Computer-Vision, Monocular-Depth, Multi-View-Geometry HIGH
Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign HF PQC Verified

Text-To-Speech, Qwen-Tts, Safetensors, Qwen3_tts, Audio, Tts HIGH
nyu-mll/glue HF PQC Verified

Dataset Card for GLUE. Dataset Summary: GLUE, the General Language Understanding Evaluation benchmark (https://gluebenchmark.com/), is a collection of resources for training, evaluating, and analyzing natural language understanding systems. Supported Tasks and Leaderboards: The leaderboard for the GLUE benchmark can be found at this address. It comprises the following tasks: ax: A manually-curated evaluation dataset for fine-grained analysis of system… See the full description on the dataset page: https://huggingface.co/datasets/nyu-mll/glue.

Task_categories:text-Classification, Task_ids:acceptability-Classification, Task_ids:natural-Language-Inference, Task_ids:semantic-Similarity-Scoring, Task_ids:sentiment-Classification, Task_ids:text-Scoring
allenai/ai2_arc HF Unverified

Dataset Card for "ai2_arc". Dataset Summary: A new dataset of 7,787 genuine grade-school level, multiple-choice science questions, assembled to encourage research in advanced question-answering. The dataset is partitioned into a Challenge Set and an Easy Set, where the former contains only questions answered incorrectly by both a retrieval-based algorithm and a word co-occurrence algorithm. We are also including a corpus of over 14 million science sentences relevant to… See the full description on the dataset page: https://huggingface.co/datasets/allenai/ai2_arc.

Task_categories:question-Answering, Task_ids:open-Domain-Qa, Task_ids:multiple-Choice-Qa, Annotations_creators:found, Language_creators:found, Multilinguality:monolingual
Xenova/segformer-b0-finetuned-ade-512-512 HF Unverified

Image-Segmentation, Transformers.js, ONNX, Segformer, Base_model:nvidia/segformer-B0-Finetuned-Ade-512-512, Base_model:quantized:nvidia/segformer-B0-Finetuned-Ade-512-512 MEDIUM
google-t5/t5-large HF Unverified

Translation, Transformers, PyTorch, Tf, JAX, Safetensors HIGH
robbyant/mdm_depth HF Unverified

LingBot-Depth Dataset: Self-curated RGB-D dataset for training LingBot-Depth, a masked depth modeling approach (arxiv:2601.17895). Each sample contains an RGB image, raw sensor depth, and ground truth depth. Total size: 2.71 TB. Depth scale: millimeters (mm), stored as 16-bit PNG. License: CC BY-NC-SA 4.0. Sub-datasets (Name, Description, Samples): RobbyReal: Real-world indoor scenes captured with multiple RGB-D cameras, 1,400,000 samples. RobbyVla: Real-world data collected… See the full description on the dataset page: https://huggingface.co/datasets/robbyant/mdm_depth.

Task_categories:depth-Estimation, Language:en, Modality:3d, 3D, 3d, Depth
mlfoundations/MINT-1T-PDF-CC-2023-40 HF PQC Verified

🍃 MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens 🍃 MINT-1T is an open-source Multimodal INTerleaved dataset with 1 trillion text tokens and 3.4 billion images, a 10x scale-up from existing open-source datasets. Additionally, we include previously untapped sources such as PDFs and ArXiv papers. 🍃 MINT-1T is designed to facilitate research in multimodal pretraining. 🍃 MINT-1T is created by a team from the University of Washington in… See the full description on the dataset page: https://huggingface.co/datasets/mlfoundations/MINT-1T-PDF-CC-2023-40.

Task_categories:image-To-Text, Task_categories:text-Generation, Language:en, Size_categories:100B<n<1T, Multimodal
Qwen/Qwen3-TTS-12Hz-0.6B-CustomVoice HF Unverified

Text-To-Speech, Safetensors, Qwen3_tts, Tts, Qwen, Audio HIGH
diffusers/stable-diffusion-xl-1.0-inpainting-0.1 HF PQC Verified

Text-to-Image, Diffusers, Safetensors, Stable-Diffusion-Xl, Stable-Diffusion-Xl-Diffusers, Inpainting CRITICAL
MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7 HF Unverified

Zero-Shot Classification, Transformers, PyTorch, ONNX, Safetensors, Deberta-V2 HIGH
LiheYoung/depth-anything-large-hf HF PQC Verified

Depth-Estimation, Transformers, Safetensors, Depth_anything, Vision HIGH
nvidia/segformer-b0-finetuned-ade-512-512 HF Unverified

Image-Segmentation, Transformers, PyTorch, Tf, Safetensors, Segformer MEDIUM
cadene/droid HF Unverified

This dataset was created using LeRobot. DROID: A Large-Scale In-the-Wild Robot Manipulation Dataset. One of the biggest open-source datasets for robotics, with 27,044,326 frames, 92,223 episodes, and 31,308 unique task descriptions in natural language. Ported from TensorFlow Dataset format (2TB) to LeRobotDataset format (400GB) with help from IPEC-COMMUNITY. Visualization: LeRobot. Homepage: Droid. Paper: Arxiv. License: apache-2.0. Dataset Structure: meta/info.json: {… See the full description on the dataset page: https://huggingface.co/datasets/cadene/droid.

Task_categories:robotics, Language:en, Size_categories:10M<n<100M, Modality:video, LeRobot, Openx
aufklarer/WeSpeaker-ResNet34-LM-MLX HF Unverified

Audio-Classification, Mlx, Safetensors, Wespeaker-Resnet34-Lm, Speaker-Embedding, Speaker-Verification MEDIUM
PaddlePaddle/PP-DocLayoutV3_safetensors HF PQC Verified

Object-Detection, Transformers, Safetensors, Pp_doclayout_v3, PaddleOCR, PaddlePaddle MEDIUM
lightx2v/Qwen-Image-Lightning HF PQC Verified

Text-to-Image, Diffusers, Qwen-Image, Distillation, LoRA, Lora CRITICAL
philschmid/bart-large-cnn-samsum HF Unverified

Summarization, Transformers, PyTorch, Bart, Text2text-Generation, Sagemaker HIGH
jonathandinu/face-parsing HF Unverified

Image-Segmentation, Transformers, PyTorch, ONNX, Safetensors, Segformer HIGH
Showing 20 of 531 items (page 15 of 27)