Model Hub
Browse PQC-verified AI models, datasets, and tools
The ability to solve problems is a hallmark of intelligence and has been an enduring goal in AI. AI systems that can create programs as solutions to problems, or assist developers in writing them, can increase productivity and make programming more accessible. Recently, pre-trained large language models have shown impressive abilities in generating new code from natural language descriptions, repairing buggy code, translating code between languages, and retrieving relevant code segments. However, these models have often been evaluated in a scattered way: on only one or two specific tasks, in a few languages, at a partial granularity (e.g., function level), and in many cases without proper training data. Even more concerning, generated code has in most cases been evaluated by mere lexical overlap rather than actual execution, whereas the semantic similarity (or equivalence) of two code segments depends only on their "execution similarity", i.e., whether they produce the same output for a given input.
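The execution-similarity criterion described above can be sketched in a few lines: run both candidate programs on the same inputs and compare their outputs. This is a minimal illustration, not the benchmark's actual harness; the function names, the five-second timeout, and the toy programs are all assumptions for the example.

```python
import subprocess
import sys

def run_program(source: str, stdin_text: str) -> str:
    """Run a Python program in a fresh subprocess and capture its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", source],
        input=stdin_text,
        capture_output=True,
        text=True,
        timeout=5,  # illustrative guard against non-terminating candidates
    )
    return result.stdout

def execution_similar(src_a: str, src_b: str, test_inputs: list) -> bool:
    """Two programs are 'execution similar' if they produce identical
    stdout for every test input, regardless of lexical differences."""
    return all(run_program(src_a, t) == run_program(src_b, t) for t in test_inputs)

# Two lexically different programs that both double an integer read from stdin.
prog1 = "n = int(input()); print(n * 2)"
prog2 = "x = int(input()); print(x + x)"
print(execution_similar(prog1, prog2, ["3\n", "10\n"]))  # True
```

A lexical-overlap metric would penalize `prog2` for sharing few tokens with `prog1`, even though the two are behaviorally equivalent on these inputs.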
Helsinki-NLP/fineweb-edu-translated fineweb-edu-translated is a collection of automatically translated documents from fineweb-edu. Translations are based on OPUS-MT and HPLT-MT models. The data covers 36,704,000 documents with over 28 billion space-separated tokens of English data translated into 36 languages. The total dataset includes over 960 billion tokens, and the translated documents are aligned across all languages. More information about how the data has been produced can… See the full description on the dataset page: https://huggingface.co/datasets/Helsinki-NLP/fineweb-edu-translated.
Dataset Card for "imdb" Dataset Summary Large Movie Review Dataset. This is a dataset for binary sentiment classification containing substantially more data than previous benchmark datasets. We provide a set of 25,000 highly polar movie reviews for training, and 25,000 for testing. There is additional unlabeled data for use as well. Supported Tasks and Leaderboards More Information Needed Languages More Information Needed Dataset Structure… See the full description on the dataset page: https://huggingface.co/datasets/stanfordnlp/imdb.
2nd International Chinese Word Segmentation Bakeoff - Data Release Release 1, 2005-11-18 Introduction This directory contains the training, test, and gold-standard data used in the 2nd International Chinese Word Segmentation Bakeoff. Also included are the script used to score the results submitted by the bakeoff participants and the simple segmenter used to generate the baseline and topline data. File List gold/ Contains the gold standard… See the full description on the dataset page: https://huggingface.co/datasets/zeroMN/hanlp_date-zh.
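Bakeoff-style scoring compares a system's segmentation of a sentence against the gold standard at the word level. A minimal sketch of such a scorer is below: each word is mapped to its character-offset span, and precision/recall/F1 are computed over matching spans. The function names and the toy example are illustrative assumptions; this is not the bakeoff's own scoring script.

```python
def spans(words):
    """Map a word sequence to the set of (start, end) character-offset spans."""
    out, pos = set(), 0
    for w in words:
        out.add((pos, pos + len(w)))
        pos += len(w)
    return out

def score(gold_words, pred_words):
    """Word-level precision, recall, and F1 over segment spans:
    a predicted word counts as correct only if both its boundaries
    match the gold segmentation."""
    g, p = spans(gold_words), spans(pred_words)
    correct = len(g & p)
    precision = correct / len(p) if p else 0.0
    recall = correct / len(g) if g else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Toy example: the system merges two gold words into one.
gold = ["北京", "大学", "生"]
pred = ["北京", "大学生"]
print(score(gold, pred))  # (0.5, 0.333..., 0.4)
```

Only spans with both boundaries correct count as hits, so a single merge or split error costs every word it touches, which is why segmentation F1 is a stricter measure than per-boundary accuracy.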
MedThinkVQA MedThinkVQA is an expert-annotated benchmark for multi-image diagnostic reasoning in radiology. Unlike prior medical VQA benchmarks that typically contain at most one image per case, MedThinkVQA requires models to extract evidence from each image, integrate cross-view information, and perform differential-diagnosis reasoning. Links GitHub: https://github.com/benluwang/MedThinkVQA Leaderboard: https://benluwang.github.io/MedThinkVQA/ Submission Guide:… See the full description on the dataset page: https://huggingface.co/datasets/bio-nlp-umass/MedThinkVQA.