Model Hub

Browse PQC-verified AI models, datasets, and tools


Dataset Card for GLUE

Dataset Summary: GLUE, the General Language Understanding Evaluation benchmark (https://gluebenchmark.com/), is a collection of resources for training, evaluating, and analyzing natural language understanding systems.

Supported Tasks and Leaderboards: The leaderboard for the GLUE benchmark can be found at this address. It comprises the following tasks: ax, a manually-curated evaluation dataset for fine-grained analysis of system…

See the full description on the dataset page: https://huggingface.co/datasets/nyu-mll/glue.

Task categories: text-classification
Task IDs: acceptability-classification, natural-language-inference, semantic-similarity-scoring, sentiment-classification, text-scoring

417K downloads, 508 likes

Updated 2026-06-29 Source available

nyu-mll/blimp HF Unverified

Dataset Card for "blimp"

Dataset Summary: BLiMP is a challenge set for evaluating what language models (LMs) know about major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each containing 1000 minimal pairs isolating specific contrasts in syntax, morphology, or semantics. The data is automatically generated according to expert-crafted grammars.

Supported Tasks and Leaderboards: More Information Needed

Languages: More Information…

See the full description on the dataset page: https://huggingface.co/datasets/nyu-mll/blimp.

Task categories: text-classification
Task IDs: acceptability-classification
Annotations creators: crowdsourced
Language creators: machine-generated
Multilinguality: monolingual
Source datasets: original

66K downloads, 38 likes

Updated 2026-05-07 Source available
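
The BLiMP summary above describes 67 sub-datasets of 1000 minimal pairs each, which works out to 67,000 pairs in total. One common way to use such pairs is to check whether a model scores the acceptable sentence above the unacceptable one; the sketch below illustrates that idea only and is not taken from the dataset card. The function `minimal_pair_accuracy`, the toy scorer `toy_score`, and the three example pairs are all hypothetical stand-ins, and a real evaluation would replace the scorer with a language model's sentence log-probability.

```python
from typing import Callable, Iterable, Tuple

# Figures stated in the dataset summary.
N_SUBSETS = 67
PAIRS_PER_SUBSET = 1000
TOTAL_PAIRS = N_SUBSETS * PAIRS_PER_SUBSET  # 67,000 minimal pairs overall


def minimal_pair_accuracy(
    pairs: Iterable[Tuple[str, str]],
    score: Callable[[str], float],
) -> float:
    """Fraction of (good, bad) pairs where the good sentence scores strictly higher."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no pairs to evaluate")
    correct = sum(1 for good, bad in pairs if score(good) > score(bad))
    return correct / len(pairs)


# Hypothetical toy scorer: penalizes a tiny hand-written set of agreement errors.
# A real setup would return a language model's log-probability for the sentence.
BAD_BIGRAMS = {("cats", "is"), ("dog", "are")}


def toy_score(sentence: str) -> float:
    words = sentence.lower().rstrip(".").split()
    return -float(sum(1 for bigram in zip(words, words[1:]) if bigram in BAD_BIGRAMS))


# Made-up illustrative pairs in (acceptable, unacceptable) order; not BLiMP data.
example_pairs = [
    ("The cats are asleep.", "The cats is asleep."),
    ("The dog is barking.", "The dog are barking."),
    ("Who did you see?", "Who did you saw?"),
]

accuracy = minimal_pair_accuracy(example_pairs, toy_score)
print(f"accuracy on toy pairs: {accuracy:.3f}")
```

The toy scorer cannot tell the third pair apart, so that pair counts as a miss: ties are deliberately scored as incorrect because the comparison is strict.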

