Wikipedia dataset containing cleaned articles of all languages. The datasets are built from the Wikipedia dump (https://dumps.wikimedia.org/) with one split per language. Each example contains the content of one full Wikipedia article with cleaning to strip markdown and unwanted sections (references, etc.).
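As a rough illustration of what "cleaning to strip unwanted sections" can look like, the sketch below drops a few named sections from raw wikitext and packs the result into one record per article. It is not the pipeline used to build this dataset. The section list, the helper names `strip_unwanted_sections` and `make_example`, and the record fields `id`, `url`, `title`, and `text` are all assumptions made for the example.

```python
import re

# Hypothetical list of sections to drop; the source only says "references, etc."
UNWANTED_SECTIONS = {"references", "external links", "see also"}

# Matches wikitext headings such as "== History ==" or "=== Notes ===".
HEADING = re.compile(r"^(={2,6})\s*(.+?)\s*\1\s*$")


def strip_unwanted_sections(wikitext: str) -> str:
    """Remove unwanted sections (with their subsections) and heading markers."""
    kept = []
    skip_level = None  # heading depth of the section currently being dropped
    for line in wikitext.splitlines():
        match = HEADING.match(line)
        if match:
            level = len(match.group(1))
            title = match.group(2)
            if skip_level is not None and level > skip_level:
                continue  # subsection of a dropped section
            skip_level = level if title.lower() in UNWANTED_SECTIONS else None
            if skip_level is None:
                kept.append(title)  # keep the heading text, drop the "==" markers
            continue
        if skip_level is None:
            kept.append(line)
    return "\n".join(kept).strip()


def make_example(article_id: str, url: str, title: str, wikitext: str) -> dict:
    """Build one record per article (field names are illustrative)."""
    return {
        "id": article_id,
        "url": url,
        "title": title,
        "text": strip_unwanted_sections(wikitext),
    }
```

A real cleaner would also need to handle templates, links, tables, and inline markup, which this sketch leaves untouched.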
Use this model
Pull with QuantumShield
quantumshield pull legacy-datasets/wikipedia
Verify integrity
quantumshield verify legacy-datasets/wikipedia
pip install
pip install quantumshield && quantumshield pull legacy-datasets/wikipedia
Unverified Model
This model has not been PQC-verified. File integrity cannot be guaranteed against quantum threats.
README.md
wikipedia
Intended Uses
This model is registered on the QuantaMrkt quantum-safe registry but has not yet been PQC-verified.
Quick Start
# Install the CLI
pip install quantumshield
# Pull the model
quantumshield pull legacy-datasets/wikipedia
# Verify file integrity
quantumshield verify legacy-datasets/wikipedia