# PQC eBPF Attestation

![PQC Native](https://img.shields.io/badge/PQC-Native-blue)
![ML-DSA-65](https://img.shields.io/badge/ML--DSA--65-FIPS%20204-green)
![eBPF](https://img.shields.io/badge/eBPF-load%20gate-purple)
![License](https://img.shields.io/badge/License-Apache%202.0-orange)
![Version](https://img.shields.io/badge/version-0.1.0-lightgrey)

**A post-quantum signed load gate for eBPF programs on AI inference servers.** eBPF lets code run inside the Linux kernel; it is phenomenal for observability and security, and catastrophic as a supply-chain attack vector. A malicious eBPF program loaded on an inference host can silently intercept model calls, exfiltrate weights out of `/dev/nvidia*` handles, or rewrite output tokens on the way back to the user - with no trace at the application layer. This library is the cryptographic envelope that sits *before* `bpf_prog_load()`: every program is ML-DSA signed, every signer has a DID, every load attempt is matched against a `LoadPolicy` and appended to an audit log. The actual kernel integration (an LSM hook, a pre-load userspace verifier, or a Kubernetes admission controller) is the user's job; this library gives you the data structures, signing, verification, and policy engine to plug into that integration.

## The Problem

AI inference servers are a uniquely sensitive target for eBPF-based attacks:

- **Weight exfiltration.** A `kprobe` or `tracing` program attached to a CUDA or cgroup memory path can read GPU-adjacent pages and stream out model weights over a socket.
- **Silent tampering.** An `XDP` or `sched_cls` program can rewrite outbound JSON from an inference API mid-flight.
- **Observability poisoning.** A `perf_event` program can filter out its own footprint from the very telemetry the defender relies on.
- **Supply chain.** eBPF programs shipped as `.bpf.o` objects by vendors, operators, or observability agents have no standard signing model today; the kernel happily loads anything submitted by a process holding `CAP_BPF`.

Pre-quantum signatures (RSA, ECDSA, Ed25519) also carry a long-term forgery risk: signed eBPF binaries stay trusted for years, well into the period in which a cryptographically relevant quantum computer could recover the signing key from its public key and forge signatures on new programs.

## The Solution

- **ML-DSA (FIPS 204)** signatures over a canonical manifest: `metadata + SHA3-256(bytecode) + size`. The signature does not bloat with the bytecode.
- **DID-based signer identity** via `quantumshield.identity.agent.AgentIdentity` - every signer has a stable `did:pqaid:<hash>` that policies can reason about.
- **`LoadPolicy` with ordered rules** - each rule covers a set of `BPFProgramType`s, an allow-list of signer DIDs, an optional signature requirement, and a max-size cap. First matching rule wins; no match falls through to a configurable default (deny by default).
- **Append-only `AttestationLog`** - every load attempt is recorded with its signer, hash, decision, reason, and actor, regardless of whether it was accepted.
- **CLI (`pqc-bpf sign | verify | info`)** for ops teams.

## Installation

```bash
pip install pqc-ebpf-attestation
```

Development:

```bash
pip install -e ".[dev]"
```

## Quick Start

```python
from quantumshield.identity.agent import AgentIdentity

from pqc_ebpf_attestation import (
    AttestationLog,
    BPFProgram,
    BPFProgramMetadata,
    BPFProgramType,
    BPFSigner,
    BPFVerifier,
    LoadPolicy,
    PolicyRule,
)

# 1. Load and sign a compiled eBPF object.
metadata = BPFProgramMetadata(
    name="trace_sys_enter_read",
    program_type=BPFProgramType.KPROBE,
    attach_point="sys_enter_read",
    author="ops-team",
)
program = BPFProgram.from_file(metadata, "trace.bpf.o")

identity = AgentIdentity.create("bpf-signer", capabilities=["sign"])
signer = BPFSigner(identity)
signed = signer.sign(program)

# 2. Verify independently.
result = BPFVerifier.verify(signed)
assert result.valid

# 3. Enforce a policy at load time.
policy = LoadPolicy().add_rule(
    PolicyRule(
        program_types=(BPFProgramType.KPROBE, BPFProgramType.TRACING),
        allowed_signers=frozenset({identity.did}),
    )
)
log = AttestationLog()
decision, reason = policy.evaluate(signed)
log.log(signed, decision, reason, actor="admission-controller")

if decision.value == "deny":
    raise SystemExit(f"blocked: {reason}")

# ... now hand `signed.program.bytecode` to bpf_prog_load().
```

## Architecture

```
+-----------------+     +-----------+     +--------------------+
| Dev / CI        | --> | bpftool,  | --> | .bpf.o object      |
| writes BPF C    |     | clang BPF |     | (compiled)         |
+-----------------+     +-----------+     +---------+----------+
                                                    |
                                                    v
                                         +---------------------+
                                         | BPFSigner (ML-DSA)  |
                                         | signs canonical     |
                                         | manifest (hash+meta)|
                                         +----------+----------+
                                                    |
                                                    v
                                         +---------------------+
                                         | SignedBPFProgram    |
                                         | ships with .bpf.o   |
                                         +----------+----------+
                                                    |
    deployment / OCI bundle / admission webhook     |
                                                    v
+------------------+    +----------------+   +----------------+
| LoadPolicy       |--->| BPFVerifier    |-->| AttestationLog |
| (rules, allow-   |    | checks sig +   |   | (append-only)  |
| list, size caps) |    | hash match     |   +----------------+
+--------+---------+    +-------+--------+
         |                      |
         | allow                | deny
         v                      v
+-------------------+   +------------------+
| bpf_prog_load()   |   | rejected before  |
| kernel accepts    |   | reaching kernel  |
+-------------------+   +------------------+
```

## Cryptography

| Primitive                | Algorithm                  | Source / notes                  |
|--------------------------|----------------------------|---------------------------------|
| Digital signature        | ML-DSA-65 (FIPS 204)       | `quantumshield.core.signatures` |
| Bytecode hash            | SHA3-256                   | `hashlib.sha3_256`              |
| Canonical manifest       | Sorted JSON, UTF-8         | Deterministic, compact          |
| Identity                 | `did:pqaid:<sha3-256(pk)>` | `quantumshield.identity.agent`  |

The signature does not cover the raw bytecode - only `metadata + SHA3-256(bytecode) + size`. Bytecode integrity is checked by recomputing the hash at verification time. This keeps the signature envelope small and stable (a few hundred bytes of metadata plus a roughly 3.3 KB ML-DSA-65 signature), regardless of the size of the eBPF object.
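
As a rough illustration of the manifest described above, the following sketch builds a sorted, compact JSON manifest and recomputes the hash at verification time, using only the standard library. The field names (`metadata`, `bytecode_sha3_256`, `bytecode_size`) and both helper functions are assumptions made for this example; they are not the library's actual schema or API.

```python
import hashlib
import json


def canonical_manifest(metadata: dict, bytecode: bytes) -> bytes:
    """Build deterministic bytes to sign: metadata + SHA3-256(bytecode) + size.

    Field names are hypothetical, not the library's real wire format.
    """
    manifest = {
        "metadata": metadata,
        "bytecode_sha3_256": hashlib.sha3_256(bytecode).hexdigest(),
        "bytecode_size": len(bytecode),
    }
    # Sorted keys and compact separators make the encoding reproducible.
    return json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode("utf-8")


def bytecode_matches(manifest_bytes: bytes, bytecode: bytes) -> bool:
    """Recompute size and hash from the bytecode and compare to the manifest."""
    manifest = json.loads(manifest_bytes)
    return (
        manifest["bytecode_size"] == len(bytecode)
        and manifest["bytecode_sha3_256"] == hashlib.sha3_256(bytecode).hexdigest()
    )


meta = {"name": "trace_sys_enter_read", "program_type": "kprobe"}
small = canonical_manifest(meta, b"\x00" * 64)
large = canonical_manifest(meta, b"\x00" * 1_000_000)
```

Here `small` and `large` differ in length only by the digits of the size field, which is why the signed payload stays compact however large the object is. A single flipped byte in the bytecode changes the recomputed digest, so `bytecode_matches` returns `False` for it.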

## Policy Model

A `LoadPolicy` is an ordered list of `PolicyRule`s. Each rule declares:

- **`program_types`** - which `BPFProgramType` values the rule covers (e.g., `(KPROBE, TRACING)`).
- **`allowed_signers`** - a `frozenset[str]` of DIDs permitted to sign programs for these types. An empty set means "any verified signer".
- **`require_signature`** - whether an invalid signature forces a deny (default `True`; turning this off is only for testing).
- **`max_bytecode_size`** - hard cap on bytecode size; default 2 MiB. Prevents signing-gate bypass via oversize or compressed programs.

Evaluation:

1. Iterate rules in order. The first rule whose `program_types` includes the program's type is the chosen rule.
2. If no rule matches, return `default_decision` (default `DENY`).
3. Otherwise apply the chosen rule's checks in order: size check, signature check, allow-list check.
4. The first failing check returns `DENY` with a human-readable reason; if every check passes, the program is allowed.

`policy.enforce(signed)` raises `UntrustedSignerError` if the signer is not in the allow-list, or `PolicyDeniedError` for any other denial.
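
The evaluation order described in this section can be sketched in plain Python. This is a simplified stand-in, not the library's implementation: `Rule`, `Candidate`, `Decision`, and `evaluate` are hypothetical names, and `Candidate` flattens the fields a real `SignedBPFProgram` would carry.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"


@dataclass(frozen=True)
class Rule:
    """Hypothetical stand-in for PolicyRule."""
    program_types: tuple
    allowed_signers: frozenset = frozenset()  # empty = any verified signer
    require_signature: bool = True
    max_bytecode_size: int = 2 * 1024 * 1024  # 2 MiB


@dataclass(frozen=True)
class Candidate:
    """Hypothetical flattened view of a signed program."""
    program_type: str
    signer_did: str
    bytecode_size: int
    signature_valid: bool


def evaluate(rules, candidate, default=Decision.DENY):
    # Steps 1-2: the first rule covering the program type wins, else the default.
    rule = next((r for r in rules if candidate.program_type in r.program_types), None)
    if rule is None:
        return default, "no rule matches program type"
    # Steps 3-4: size, signature, allow-list; the first failure denies.
    if candidate.bytecode_size > rule.max_bytecode_size:
        return Decision.DENY, "bytecode exceeds max_bytecode_size"
    if rule.require_signature and not candidate.signature_valid:
        return Decision.DENY, "invalid signature"
    if rule.allowed_signers and candidate.signer_did not in rule.allowed_signers:
        return Decision.DENY, "signer not in allow-list"
    return Decision.ALLOW, "all checks passed"


rules = [Rule(program_types=("kprobe", "tracing"),
              allowed_signers=frozenset({"did:pqaid:trusted"}))]
decision, reason = evaluate(rules, Candidate("kprobe", "did:pqaid:trusted", 4096, True))
```

Because the first matching rule is final, a narrow rule placed after a broader one covering the same program type is never consulted; ordering specific rules first avoids that.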

## CLI Reference

```bash
# Sign a compiled BPF object.
pqc-bpf sign trace.bpf.o --name trace-read --type kprobe --author ops-team

# Verify an envelope. Exit 0 if valid, 1 otherwise.
pqc-bpf verify trace.bpf.o.sig.json

# Pretty-print metadata without verifying.
pqc-bpf info trace.bpf.o.sig.json

# Show version.
pqc-bpf --version
```
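
Since `verify` reports its result through the exit status, a deployment script can gate on it without parsing any output. A minimal sketch, assuming `pqc-bpf` is on `PATH`; the wrapper function `envelope_is_valid` is hypothetical and only relies on the exit-code behaviour documented above.

```python
import subprocess
import sys


def envelope_is_valid(envelope_path: str, command=("pqc-bpf", "verify")) -> bool:
    """Run the verifier CLI and map its exit status to a boolean.

    `command` is overridable so the wrapper can be exercised without the CLI.
    """
    completed = subprocess.run([*command, envelope_path], check=False)
    return completed.returncode == 0
```

A caller would then write something like `if not envelope_is_valid("trace.bpf.o.sig.json"): sys.exit(1)` before handing the object to a loader.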

## Integration Notes

This library is intentionally **userspace-only and kernel-agnostic**. It does *not* hook `bpf()` syscalls, does not ship an LSM module, and does not link against `libbpf`. Real enforcement requires wiring one of:

- **A pre-load userspace verifier.** A small daemon that replaces direct `bpf_prog_load()` calls in your deployment: callers pass `SignedBPFProgram` envelopes, the daemon verifies and enforces, and only then calls into libbpf. Simplest model, easiest to audit.
- **An LSM hook.** Kernels with `CONFIG_BPF_LSM=y` can attach a BPF LSM program to `bpf_prog_load` that rejects unsigned programs - with the signature verifier itself running in userspace over a ring buffer. Defense-in-depth but more involved.
- **An admission controller.** For Kubernetes, gate CRDs that reference BPF programs (Cilium, Falco, Tetragon, bpfman) through a webhook that verifies and logs before the DaemonSet even schedules.

In every case, this library provides the envelope format, the verifier, the policy engine, and the audit log. The trust root is the set of signer DIDs you choose to put in your `LoadPolicy`.
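
The control flow of the first option, a pre-load userspace verifier, might look like the sketch below. The gate takes its collaborators as plain callables so that it stays independent of any particular loader. The function `gated_load` and its parameter names are hypothetical; in a real deployment the callables would presumably wrap `BPFVerifier.verify`, `policy.evaluate`, `log.log`, and a libbpf binding.

```python
from typing import Callable, Tuple


def gated_load(
    signed,
    verify: Callable[[object], bool],
    evaluate: Callable[[object], Tuple[str, str]],
    audit: Callable[[object, str, str], None],
    load: Callable[[object], int],
) -> int:
    """Verify, evaluate, audit, and only then load. All callables are injected."""
    if not verify(signed):
        decision, reason = "deny", "signature or hash check failed"
    else:
        decision, reason = evaluate(signed)
    # Record every attempt, accepted or not, before acting on the decision.
    audit(signed, decision, reason)
    if decision != "allow":
        raise PermissionError(f"blocked: {reason}")
    return load(signed)
```

Auditing before the decision is acted on means a denied attempt still leaves a record even though `load` is never reached.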

## Threat Model

| Threat                                                    | Mitigation                                                      |
|-----------------------------------------------------------|-----------------------------------------------------------------|
| Unsigned attacker-supplied `.bpf.o` loaded via `CAP_BPF`  | `require_signature=True`; no envelope -> no load                |
| Legit-looking program signed by a rogue insider           | Allow-list of DIDs; rogue DID not permitted                     |
| Bytecode swapped after signing (TOCTOU)                   | SHA3-256 hash in signed manifest; verifier recomputes           |
| Signature replayed on a different program                 | Canonical manifest binds signature to exact metadata + hash     |
| Oversize program smuggling raw weights back to attacker   | `max_bytecode_size` cap                                         |
| Future quantum adversary forges ECDSA-signed BPF object   | ML-DSA-65 is NIST FIPS 204, resistant to Shor's algorithm       |
| Tampering with audit record                               | In-memory append-only; sink to WORM store in production         |

Out of scope: anything the kernel verifier itself misses (bounds-checker bypasses, stack overflow via helpers, JIT spraying). The library trusts the kernel's BPF verifier to do its job once a program is loaded.
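
The last mitigation in the table above recommends sinking audit records to durable storage. As a minimal sketch of that step, the helper below appends one JSON line per decision to a local file opened with `O_APPEND`. The local file only stands in for a real WORM store, and `append_audit_record` and its record fields are hypothetical rather than part of `AttestationLog`.

```python
import json
import os
import time


def append_audit_record(path: str, record: dict) -> None:
    """Append one JSON line; O_APPEND makes each write land at end of file."""
    line = json.dumps({"ts": time.time(), **record}, sort_keys=True) + "\n"
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, line.encode("utf-8"))
        os.fsync(fd)  # push the record to disk before reporting success
    finally:
        os.close(fd)
```

An append-mode file does not by itself stop a privileged attacker from rewriting history; that guarantee has to come from the storage layer the records are shipped to.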

## Why PQC Matters for eBPF

eBPF programs are *long-lived trust artifacts*. A kernel probe shipped with an observability agent today can remain deployed for a decade across thousands of hosts. If the signature that authorizes it uses RSA-2048 or ECDSA over secp256r1, a cryptographically relevant quantum computer appearing within that window lets any attacker forge *new* programs that pass the same gate - with no warning, because the forgery is indistinguishable from a legitimate signature. Building that gate on ML-DSA-65 from day one keeps the trust boundary intact over the ten-year horizon on which eBPF-based infrastructure actually operates.

## Examples

- `examples/sign_and_verify.py` - sign, serialize, deserialize, verify a synthetic program.
- `examples/enforce_load_policy.py` - three signers (two trusted, one rogue) evaluated against a policy, audit log printed.
- `examples/tampered_bytecode_rejected.py` - mutate bytecode post-sign; hash-consistency check fires.

## License

Apache-2.0. See `LICENSE`.
