H

Helsinki-NLP / fineweb-edu-translated

PQC Verified HuggingFace

Helsinki-NLP/fineweb-edu-translated fineweb-edu-tanslated is a collection of automatically translated documents from fineweb-edu. Translations are based on OPUS-MT and HPLT-MT models. The data covers 36,704,000 documents with over 28 billion space-searated tokens of English data translated into 36 languages. The total data set is incudes of over 960 billion tokens and the translated documents are aligned across all languages. More information about how the data has been produced can… See the full description on the dataset page: https://huggingface.co/datasets/Helsinki-NLP/fineweb-edu-translated.

15 263,695 2

PQC-Verified with ML-DSA-87

This model has a real FIPS 204 ML-DSA-87 (Dilithium5) signature from the platform signing authority. Signature chain includes 2 verification(s). Last verified 2026-05-08.

ML-DSA-87 Signer: did:web:quantamrkt.com:chain:authority View public key

README.md

fineweb-edu-translated

Helsinki-NLP/fineweb-edu-translated fineweb-edu-tanslated is a collection of automatically translated documents from fineweb-edu. Translations are based on OPUS-MT and HPLT-MT models. The data covers 36,704,000 documents with over 28 billion space-searated tokens of English data translated into 36 languages. The total data set is incudes of over 960 billion tokens and the translated documents are aligned across all languages. More information about how the data has been produced can… See the full description on the dataset page: https://huggingface.co/datasets/Helsinki-NLP/fineweb-edu-translated.

Intended Uses

This model is registered on the QuantaMrkt quantum-safe registry. All files have been cryptographically verified using post-quantum signatures.

Quick Start

# Install the CLI
pip install quantumshield

# Pull the model
quantumshield pull Helsinki-NLP/fineweb-edu-translated

# Verify file integrity
quantumshield verify Helsinki-NLP/fineweb-edu-translated

About

Helsinki-NLP/fineweb-edu-translated fineweb-edu-tanslated is a collection of automatically translated documents from fineweb-edu. Translations are based on OPUS-MT and HPLT-MT models. The data covers 36,704,000 documents with over 28 billion space-searated tokens of English data translated into 36 languages. The total data set is incudes of over 960 billion tokens and the translated documents are aligned across all languages. More information about how the data has been produced can… See the full description on the dataset page: https://huggingface.co/datasets/Helsinki-NLP/fineweb-edu-translated.

Created 2026-04-20
Downloads 263,695
Likes 15

Get this model

View on HuggingFace

Pull with QuantumShield

quantumshield pull Helsinki-NLP/fineweb-edu-translated

Verify signatures

quantumshield verify Helsinki-NLP/fineweb-edu-translated

Signers

V1
did:quantamrkt:regis...hield-v1
TY
did:web:quantamrkt.c...uthority