natgillin / translations-raw

Unverified HuggingFace

Frozen, canonical raw bitext consolidated from upstream alvations/mtdata-raw* snapshots (since deleted). This is the read-only source-of-truth for downstream quality-filtering pipelines.

31,663 parquet files (1566.8 GB)
49 language pairs under data/<src-tgt>/
Schema: 5 columns (see below)
Read-only for downstream pipelines. Do not delete or modify.

Schema

Each parquet has 5 columns:

column | type | description
source | string | …

See the full description on the dataset page: https://huggingface.co/datasets/natgillin/translations-raw.

Unverified Model

This model has not been PQC-verified. File integrity cannot be guaranteed against quantum threats.

README.md

Intended Uses

This model is registered on the QuantaMrkt quantum-safe registry. This model has not yet been PQC-verified.

Quick Start

# Install the CLI
pip install quantumshield

# Pull the model
quantumshield pull natgillin/translations-raw

# Verify file integrity
quantumshield verify natgillin/translations-raw
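
Once the files are available locally, the layout stated in the description (parquet files grouped by language pair under data/<src-tgt>/) can be walked with a few lines of Python. The following is a minimal sketch, not part of the registry tooling: the helper names (`pair_dir`, `list_pair_files`, `read_pair_schema`), the local root directory, and the `en-de` pair used in the usage note are all hypothetical, and it assumes the files end in `.parquet`. The schema helper needs the third-party `pyarrow` package and only reads file metadata, so it does not modify the read-only data.

```python
from pathlib import Path


def pair_dir(root, src, tgt):
    """Directory for one language pair, following the data/<src-tgt>/ layout."""
    return Path(root) / "data" / f"{src}-{tgt}"


def list_pair_files(root, src, tgt):
    """Sorted list of parquet files for one language pair (empty if absent)."""
    directory = pair_dir(root, src, tgt)
    if not directory.is_dir():
        return []
    # rglob also covers the case where files sit in nested subdirectories.
    return sorted(directory.rglob("*.parquet"))


def read_pair_schema(root, src, tgt):
    """Schema of the first parquet file for the pair, or None if none exist.

    pyarrow is imported lazily so the path helpers work without it.
    """
    files = list_pair_files(root, src, tgt)
    if not files:
        return None
    import pyarrow.parquet as pq  # third-party; assumed to be installed

    # read_schema reads only the file footer, not the row data.
    return pq.read_schema(str(files[0]))
```

For example, `read_pair_schema("/path/to/local/copy", "en", "de")` would return the column names and types of the first file for that pair, which is a quick way to check the 5-column schema before running a filtering pipeline.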

About

Created 2026-06-23
Downloads 86,248
Likes 5

Signers

V1
did:quantamrkt:regis...hield-v1