Model Hub
Browse PQC-verified AI models, datasets, and tools
Dataset Card for ACL Anthology Corpus. This repository provides full text and metadata for the ACL Anthology collection (80k articles/posters as of September 2022), including the .pdf files and GROBID extractions of the PDFs. How is this different from what the ACL Anthology provides and what already exists? We provide PDFs, full text, references, and other details extracted by GROBID from the PDFs, while the ACL Anthology only provides abstracts. There exists a similar corpus… See the full description on the dataset page: https://huggingface.co/datasets/WINGNUS/ACL-OCL.
About this dataset. Context: The datasets include player data for Career Mode from FIFA 15 to FIFA 23. The data allows multiple comparisons of the same players across the last 9 versions of the video game. Some ideas for possible analyses: a historical comparison between Messi and Ronaldo (which skill attributes changed the most over time, compared to real-life stats); the ideal budget to create a competitive team (at the level of the top n teams in Europe) and… See the full description on the dataset page: https://huggingface.co/datasets/jsulz/FIFA23.
GPIC: A Giant Permissive Image Corpus for Visual Generation. Authors: Keshigeyan Chandrasegaran*¹, Kyle Sargent*¹, Suchir Agarwal¹, Michael Jang¹, Michael Poli¹,², Juan Carlos Niebles¹,⁴, Justin Johnson³, Jiajun Wu¹, Li Fei-Fei¹. Affiliations: ¹Stanford University, ²Radical Numerics, ³University of Michigan, ⁴Salesforce… See the full description on the dataset page: https://huggingface.co/datasets/stanford-vision-lab/gpic.
ARKit Labelmaker: A New Scale for Indoor 3D Scene Understanding. [arxiv] [website] [checkpoints] [code] We complement the ARKitScenes dataset with dense semantic annotations that are automatically generated at scale. This produces the first large-scale, real-world 3D dataset with dense semantic annotations. Training on this auto-generated data, we advance the state of the art on ScanNet and ScanNet200 with prevalent 3D semantic segmentation models.
GIFT-Eval Pre-training Datasets. A pre-training dataset aligned with GIFT-Eval, comprising 71 univariate and 17 multivariate datasets that span seven domains and 13 frequencies, totaling 4.5 million time series and 230 billion data points. Notably, this collection has no leakage issue with the train/test split and can be used to pre-train foundation models that can then be fairly evaluated on GIFT-Eval. 📄 Paper 🖥️ Code 📔 Blog Post 🏎️ Leaderboard Ethical Considerations… See the full description on the dataset page: https://huggingface.co/datasets/Salesforce/GiftEvalPretrain.