HuggingFaceFW / fineweb-edu

PQC Verified HuggingFace

📚 FineWeb-Edu: 1.3 trillion tokens of the finest educational data the 🌐 web has to offer.

Paper: https://arxiv.org/abs/2406.17557

What is it?

The 📚 FineWeb-Edu dataset consists of 1.3T tokens and 5.4T tokens (FineWeb-Edu-score-2) of educational web pages filtered from the 🍷 FineWeb dataset. This is the 1.3 trillion version. To enhance FineWeb's quality, we developed an educational quality classifier using annotations generated by Llama3-70B-Instruct. We then… See the full description on the dataset page: https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu.

PQC-Verified with ML-DSA-87

This dataset carries a FIPS 204 ML-DSA-87 (Dilithium5) signature from the platform signing authority. The signature chain includes 2 verifications. Last verified 2026-05-08.

ML-DSA-87 Signer: did:web:quantamrkt.com:chain:authority

README.md

fineweb-edu

Intended Uses

This dataset is registered on the QuantaMrkt quantum-safe registry. All files have been cryptographically verified using post-quantum signatures.

Quick Start

# Install the CLI
pip install quantumshield

# Pull the dataset
quantumshield pull HuggingFaceFW/fineweb-edu

# Verify file integrity
quantumshield verify HuggingFaceFW/fineweb-edu
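
The two commands above can also be chained from a script so that later steps run only if verification passes. The following is a minimal Python sketch, not part of the QuantumShield tooling: the helper names `build_commands` and `pull_and_verify` are hypothetical, and it assumes the CLI reports failure through a non-zero exit status, which this page does not confirm. It uses only the `pull` and `verify` subcommands shown in the Quick Start.

```python
import shutil
import subprocess

DATASET_ID = "HuggingFaceFW/fineweb-edu"


def build_commands(repo_id):
    """Return the pull and verify invocations from the Quick Start, in order."""
    return [
        ["quantumshield", "pull", repo_id],
        ["quantumshield", "verify", repo_id],
    ]


def pull_and_verify(repo_id, runner=subprocess.run):
    """Run pull, then verify, stopping at the first command that fails.

    Assumption: the CLI signals failure with a non-zero exit status.
    `runner` is injectable so the control flow can be exercised without the CLI.
    """
    for cmd in build_commands(repo_id):
        result = runner(cmd)
        if result.returncode != 0:
            return False
    return True


if __name__ == "__main__":
    if shutil.which("quantumshield") is None:
        print("quantumshield not found on PATH; install it first")
    else:
        ok = pull_and_verify(DATASET_ID)
        print("verified" if ok else "pull or verify failed")
```

Returning a boolean rather than raising keeps the sketch easy to drop into a larger pipeline; a stricter variant could pass `check=True` to `subprocess.run` and let the exception propagate.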

About

Created 2026-04-20
Downloads 536,493
Likes 1,061

Signers

did:quantamrkt:regis...hield-v1
did:web:quantamrkt.c...uthority