Model Hub

Browse PQC-verified AI models, datasets, and tools

cardiffnlp/twitter-roberta-base-sentiment-latest HF PQC Verified

Text Classification, Transformers, PyTorch, Tf, Roberta, English | MEDIUM
Alibaba-NLP/gte-multilingual-base HF PQC Verified

Sentence Similarity, Sentence-Transformers, Safetensors, New, Feature Extraction, Mteb | MEDIUM
cardiffnlp/twitter-xlm-roberta-base-sentiment HF PQC Verified

Text Classification, Transformers, PyTorch, Tf, Xlm-Roberta, Multilingual | HIGH
Alibaba-NLP/gte-reranker-modernbert-base HF PQC Verified

Text-Ranking, Transformers, ONNX, Safetensors, Modernbert, Text Classification | HIGH
nlptown/bert-base-multilingual-uncased-sentiment HF PQC Verified

Text Classification, Transformers, PyTorch, Tf, JAX, Safetensors | HIGH
Helsinki-NLP/opus-mt-nl-en HF Unverified

Translation, Transformers, PyTorch, Tf, Rust, Marian | HIGH
Helsinki-NLP/opus-mt-en-de HF Unverified

Translation, Transformers, PyTorch, Tf, JAX, Rust | HIGH
Helsinki-NLP/opus-mt-fr-en HF Unverified

Translation, Transformers, PyTorch, Tf, JAX, Safetensors | HIGH
Helsinki-NLP/opus-mt-tr-en HF Unverified

Translation, Transformers, PyTorch, Tf, Marian, Text2text-Generation | MEDIUM
NTU-NLP-sg/xCodeEval HF PQC Verified

The ability to solve problems is a hallmark of intelligence and has been an enduring goal in AI. AI systems that can create programs as solutions to problems, or assist developers in writing programs, can increase productivity and make programming more accessible. Recently, pre-trained large language models have shown impressive abilities in generating new code from natural language descriptions, repairing buggy code, translating code between languages, and retrieving relevant code segments. However, the evaluation of these models has often been performed in a scattered way: on only one or two specific tasks, in a few languages, at a partial granularity level (e.g., function), and in many cases without proper training data. Even more concerning, in most cases generated code has been evaluated by mere lexical overlap rather than actual execution, whereas the semantic similarity (or equivalence) of two code segments depends only on their "execution similarity", i.e., producing the same output for a given input.

Task_categories:translation, Task_categories:token-Classification, Task_categories:text-Retrieval, Task_categories:text-Generation, Task_categories:text-Classification, Task_categories:feature-Extraction
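
The xCodeEval description above contrasts lexical overlap with "execution similarity". The following is a minimal sketch of that distinction, not the benchmark's actual evaluation harness: the helper names (`token_overlap`, `execution_equivalent`, `load_total`) and the three toy snippets are hypothetical and chosen only for illustration.

```python
from typing import Callable, Iterable, List


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of whitespace-separated tokens (a crude lexical metric)."""
    ta, tb = set(a.split()), set(b.split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 1.0


def execution_equivalent(
    f: Callable[[List[int]], int],
    g: Callable[[List[int]], int],
    inputs: Iterable[List[int]],
) -> bool:
    """True if both functions return the same output on every test input."""
    return all(f(list(x)) == g(list(x)) for x in inputs)


def load_total(src: str) -> Callable[[List[int]], int]:
    """Execute a snippet and return the `total` function it defines."""
    namespace: dict = {}
    exec(src, namespace)  # acceptable here only because the snippets are trusted
    return namespace["total"]


# Three toy candidate programs (illustrative, not taken from xCodeEval).
SRC_LOOP = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s"
SRC_BUILTIN = "def total(xs):\n    return sum(xs)"
SRC_BUGGY = "def total(xs):\n    s = 0\n    for x in xs:\n        s -= x\n    return s"

CASES = [[], [1, 2, 3], [-4, 4], [10]]

loop, builtin, buggy = (load_total(s) for s in (SRC_LOOP, SRC_BUILTIN, SRC_BUGGY))

# The buggy variant looks almost identical to the reference but behaves differently;
# the built-in variant looks very different but behaves the same.
overlap_buggy = token_overlap(SRC_LOOP, SRC_BUGGY)
overlap_builtin = token_overlap(SRC_LOOP, SRC_BUILTIN)
same_as_builtin = execution_equivalent(loop, builtin, CASES)
same_as_buggy = execution_equivalent(loop, buggy, CASES)
```

In this toy setup the lexical score ranks the buggy program above the correct rewrite, while the execution check ranks them the other way around. Passing a finite set of test inputs is only evidence of equivalence, not proof.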
Helsinki-NLP/opus-mt-ko-en HF Unverified

Translation, Transformers, PyTorch, Tf, Marian, Text2text-Generation | MEDIUM
Helsinki-NLP/fineweb-edu-translated HF PQC Verified

Helsinki-NLP/fineweb-edu-translated: fineweb-edu-translated is a collection of automatically translated documents from fineweb-edu. Translations are based on OPUS-MT and HPLT-MT models. The data covers 36,704,000 documents with over 28 billion space-separated tokens of English data translated into 36 languages. The total data set comprises over 960 billion tokens, and the translated documents are aligned across all languages. More information about how the data has been produced can… See the full description on the dataset page: https://huggingface.co/datasets/Helsinki-NLP/fineweb-edu-translated.

Task_categories:translation, Task_categories:text-Generation, Language:bos, Language:bul, Language:cat, Language:ces
stanfordnlp/imdb HF Unverified

Dataset Card for "imdb". Dataset Summary: Large Movie Review Dataset. This is a dataset for binary sentiment classification containing substantially more data than previous benchmark datasets. We provide a set of 25,000 highly polar movie reviews for training, and 25,000 for testing. There is additional unlabeled data for use as well. Supported Tasks and Leaderboards: More Information Needed. Languages: More Information Needed. Dataset Structure… See the full description on the dataset page: https://huggingface.co/datasets/stanfordnlp/imdb.

Task_categories:text-Classification, Task_ids:sentiment-Classification, Annotations_creators:expert-Generated, Language_creators:expert-Generated, Multilinguality:monolingual, Source_datasets:original
zeroMN/hanlp_date-zh HF Unverified

2nd International Chinese Word Segmentation Bakeoff - Data Release. Release 1, 2005-11-18. Introduction: This directory contains the training, test, and gold-standard data used in the 2nd International Chinese Word Segmentation Bakeoff. Also included is the script used to score the results submitted by the bakeoff participants and the simple segmenter used to generate the baseline and topline data. File List: gold/ Contains the gold standard… See the full description on the dataset page: https://huggingface.co/datasets/zeroMN/hanlp_date-zh.

Task_categories:text-Classification, Language:zh, Size_categories:100M<n<1B, Code
bio-nlp-umass/MedThinkVQA HF Unverified

MedThinkVQA is an expert-annotated benchmark for multi-image diagnostic reasoning in radiology. Unlike prior medical VQA benchmarks that typically contain at most one image per case, MedThinkVQA requires models to extract evidence from each image, integrate cross-view information, and perform differential-diagnosis reasoning. Links: GitHub: https://github.com/benluwang/MedThinkVQA Leaderboard: https://benluwang.github.io/MedThinkVQA/ Submission Guide:… See the full description on the dataset page: https://huggingface.co/datasets/bio-nlp-umass/MedThinkVQA.

Task_categories:question-Answering, Task_categories:text-Generation, Language:en, Size_categories:1K<n<10K, Format:parquet, Modality:image
Showing 15 of 15 items (page 1 of 1)