Datasets

Training datasets with quantum-safe provenance

nguha/legalbench HF Unverified

Dataset Card for Dataset Name
Homepage: https://hazyresearch.stanford.edu/legalbench/
Repository: https://github.com/HazyResearch/legalbench/
Paper: https://arxiv.org/abs/2308.11462
Dataset Summary: The LegalBench project is an ongoing open science effort to collaboratively curate tasks for evaluating legal reasoning in English large language models (LLMs). The benchmark currently consists of 162 tasks gathered from 40… See the full description on the dataset page: https://huggingface.co/datasets/nguha/legalbench.

Task_categories:text-Classification, Task_categories:question-Answering, Language:en, Size_categories:10K<n<100K, Format:csv, Modality:tabular
MrigLabIITRopar/GroMo25 HF Unverified

GroMo25: Multiview Time-Series Plant Image Dataset for Age Estimation and Leaf Counting. Dataset Summary: GroMo25 is a multiview, time-series plant image dataset designed for plant age estimation (in days) and leaf counting tasks in precision agriculture. It contains high-quality images of four crop species (Wheat, Okra, Radish, and Mustard) captured over multiple days under controlled conditions. Each plant is photographed from 24 angles across 5 vertical levels per day… See the full description on the dataset page: https://huggingface.co/datasets/MrigLabIITRopar/GroMo25.

Task_categories:image-Classification, Task_categories:text-To-Image, Task_categories:image-To-Text, Language:en, Size_categories:100K<n<1M, Format:csv
uoft-cs/cifar100 HF Unverified

Dataset Card for CIFAR-100. Dataset Summary: The CIFAR-100 dataset consists of 60000 32x32 colour images in 100 classes, with 600 images per class. Each class has 500 training images and 100 testing images, giving 50000 training images and 10000 test images in total. The 100 classes are grouped into 20 superclasses. Each image has two labels: a fine label (the actual class) and a coarse label (the superclass). Supported Tasks and Leaderboards: image-classification: The… See the full description on the dataset page: https://huggingface.co/datasets/uoft-cs/cifar100.

Task_categories:image-Classification, Annotations_creators:crowdsourced, Language_creators:found, Multilinguality:monolingual, Source_datasets:extended|other-80-Million-Tiny-Images, Language:en
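The counts quoted in the CIFAR-100 summary can be cross-checked with a little arithmetic. The following is a minimal sketch that uses only the numbers stated in the card excerpt. The variable names are illustrative, and the even split of classes across superclasses is an assumption that the excerpt itself does not state.

```python
# Figures quoted in the CIFAR-100 card excerpt.
NUM_CLASSES = 100
NUM_SUPERCLASSES = 20
TRAIN_PER_CLASS = 500
TEST_PER_CLASS = 100

# Totals implied by the per-class figures.
train_total = NUM_CLASSES * TRAIN_PER_CLASS
test_total = NUM_CLASSES * TEST_PER_CLASS
images_per_class = TRAIN_PER_CLASS + TEST_PER_CLASS
grand_total = train_total + test_total

# Assumption (not stated in the excerpt): classes are spread evenly
# across superclasses.
classes_per_superclass = NUM_CLASSES // NUM_SUPERCLASSES

print(f"train={train_total} test={test_total} total={grand_total}")
print(f"images per class={images_per_class}")
print(f"classes per superclass (assuming an even split)={classes_per_superclass}")
```

The totals agree with the 60000 images, 50000 training images, and 10000 test images given in the summary.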
RichardErkhov/DASP HF Unverified

Dataset Card for DASP. Dataset Description: The DASP (Distributed Analysis of Sentinel-2 Pixels) dataset consists of cloud-free satellite images captured by Sentinel-2 satellites. Each image represents the most recent, non-partial, and cloudless capture from over 30 million Sentinel-2 images in every band. The dataset provides a near-complete cloudless view of Earth's surface, ideal for various geospatial applications. Images were converted from JPEG2000 to JPEG-XL to… See the full description on the dataset page: https://huggingface.co/datasets/RichardErkhov/DASP.

Task_categories:image-Segmentation, Task_categories:image-Classification, Task_categories:object-Detection, Task_categories:other, Modality:geospatial, Satellite-Imagery
schwein69/hagrid-subset HF Unverified

HaGRID Gesture Recognition Subset. Dataset Description: A curated subset of the HaGRID (Hand Gesture Recognition Image Dataset) containing 24 gesture classes for training gesture recognition models. Dataset Summary: Total Images: 19,200; Gesture Classes: 24; Samples per Class: 800; Image Format: JPEG; Average Image Size: ~302 KB. Splits (images, percentage): Train 14,592 (76%); Val 1,728 (9%); Test 2,880 (15%). Gesture Classes: call… See the full description on the dataset page: https://huggingface.co/datasets/schwein69/hagrid-subset.

Task_categories:image-Classification, Task_categories:object-Detection, Size_categories:10K<n<100K, Gesture-Recognition, Computer-Vision, Hand-Gestures
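The split sizes and percentages in the HaGRID subset summary can be checked against the stated total. This is a minimal sketch built only from the figures in the card excerpt. The dictionary and variable names are illustrative and do not come from the dataset itself.

```python
# Split sizes quoted in the HaGRID subset card excerpt.
splits = {"Train": 14592, "Val": 1728, "Test": 2880}
GESTURE_CLASSES = 24

# Total image count and the share of each split, as whole percentages.
total_images = sum(splits.values())
shares = {name: round(100 * count / total_images) for name, count in splits.items()}

# Samples per class, assuming the images are spread evenly over the classes
# (the excerpt reports a single "Samples per Class" figure).
samples_per_class = total_images // GESTURE_CLASSES

for name, count in splits.items():
    print(f"{name}: {count} images ({shares[name]}%)")
print(f"total={total_images}, samples per class={samples_per_class}")
```

The computed total, shares, and samples per class agree with the 19,200 images, the 76% / 9% / 15% split, and the 800 samples per class given in the summary.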
tanganke/stanford_cars HF Unverified

Stanford Cars Dataset. Dataset Overview. Splits: Training: 8144 images used for model training. Test: 8041 images used for evaluation. Contrast: 8041 images with high contrast for robustness testing. Gaussian Noise: 8041 images corrupted by Gaussian noise for robustness testing. Impulse Noise: 8041 images corrupted by impulse noise for robustness testing. JPEG Compression: 8041 compressed images for robustness testing. Motion Blur: 8041 images with motion blur for… See the full description on the dataset page: https://huggingface.co/datasets/tanganke/stanford_cars.

Task_categories:image-Classification, Language:en, Size_categories:10K<n<100K, Format:parquet, Modality:image, Library:datasets