legacy-datasets / wikipedia

Unverified HuggingFace

Wikipedia dataset containing cleaned articles in all languages. The datasets are built from the Wikipedia dump (https://dumps.wikimedia.org/), with one split per language. Each example contains the content of one full Wikipedia article, cleaned to strip markdown and unwanted sections (references, etc.).

Unverified Model

This model has not been PQC-verified. File integrity cannot be guaranteed against quantum threats.

README.md

wikipedia

Intended Uses

This model is registered on the QuantaMrkt quantum-safe registry but has not yet been PQC-verified.

Quick Start

# Install the CLI
pip install quantumshield

# Pull the model
quantumshield pull legacy-datasets/wikipedia

# Verify file integrity
quantumshield verify legacy-datasets/wikipedia
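
The pull and verify steps above can also be scripted. The following is a minimal sketch, not part of the QuantumShield tooling: it shells out to the same two commands shown above and stops at the first one that fails. The helper names `build_commands` and `pull_and_verify` are hypothetical, and the sketch assumes the CLI signals failure with a non-zero exit status, which this page does not confirm.

```python
import subprocess

# Repository name exactly as it appears on this page.
REPO_ID = "legacy-datasets/wikipedia"


def build_commands(repo_id):
    """Return the pull and verify invocations in Quick Start order.

    Hypothetical helper: it only assembles the two commands listed
    in the Quick Start section and adds no extra flags.
    """
    return [
        ["quantumshield", "pull", repo_id],
        ["quantumshield", "verify", repo_id],
    ]


def pull_and_verify(repo_id, runner=subprocess.run):
    """Run pull, then verify, stopping at the first failing command.

    Assumption: the CLI exits with a non-zero status on failure, so
    check=True makes subprocess.run raise CalledProcessError.
    The runner parameter exists only so the function can be exercised
    without the CLI installed.
    """
    for command in build_commands(repo_id):
        runner(command, check=True)
```

With the CLI installed (`pip install quantumshield`), calling `pull_and_verify(REPO_ID)` would run both steps in order. Since this entry is listed as not yet PQC-verified, the verify step may well report a failure, in which case the call raises instead of returning.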

About

Created 2026-06-27
Downloads 120,455
Likes 645

Signers

V1
did:quantamrkt:regis...hield-v1