AEP · Agent Evidence Packet open standard · 2026

A better agentic file format.

AEP — Agent Evidence Packet — a structured, hash-verified, byte-roundtrip-preserving companion file format for AI agent workflows. The substrate replaces "throw it in a Markdown file" with a queryable, falsifiable, integrity-checked layer that survives across sessions and across LLMs.

License: Apache-2.0 (spec + reference impl) · CC-BY-4.0 (docs)
Status: v1.5 LTS production-hardened · v1.5.2-RC2 hardening · v1.2 immune-system spec staged
Latency: Doctor cached p95 8.3 ms · Doctor cold p95 5.07 ms · Viewer first-paint p95 80 ms
Hardening evidence: 0/5,000 prompt-injection · 0/500 hook bypass · 0/1,200 sandbox escape · 1.0000 mutation catch (2,700/2,700) · 0/8 fabrication across independent audits

§ I — The Problem

Why "just put it in a Markdown file" breaks at scale

Modern AI agents communicate, learn, and self-improve through files on disk: prompts in .md, system documents in .html, automation in .ps1 or .sh. This works fine at single-author scale. It collapses the moment you try to compound learning across agents, sessions, or organizations.

Five specific failure modes show up by the time a corpus reaches a few hundred files:

No integrity surface. A Markdown file's content is the content. You cannot prove it hasn't been tampered with, drifted from its canonical source, or silently rewritten by a downstream agent.
No structured query layer. "Find every claim tagged as experimental across these 200 lessons" requires either regex archaeology or pulling everything into a database that immediately falls out of sync with the source files.
No hash-chained provenance. When agent A's output becomes agent B's input becomes agent C's verdict, there is no cryptographic chain proving any step happened, let alone in the claimed order.
No claim-level confidence. Every sentence in a Markdown document carries the same epistemic weight to a downstream LLM: zero. There's no signal distinguishing "I verified this against a test" from "I'm guessing."
No combine-decompose discipline. You can cluster 50 related lesson files into a single umbrella document, but you cannot reliably round-trip back to byte-identical originals. The consolidation is lossy.

The Markdown file is a great place to write a thought. It is a terrible place to compound a thousand thoughts across a hundred agents over a year. — operator framing, captured during the cascade

§ II — The Format

What AEP actually is

An AEP — Agent Evidence Packet — is a directory next to your canonical source file. If you have my-doctrine.md, AEP adds my-doctrine.aepkg/ as a sibling. The original file is unchanged. Bit-for-bit identical, on every read.

For a per-file companion, the directory contains four files:

my-doctrine.md                       ← canonical source (untouched)
my-doctrine.aepkg/
  ├── meta.json                       schema, source SHA-256, file class
  ├── data/
  │    └── claims.jsonl               one queryable claim per line
  ├── views/
  │    └── source.md                  byte-identical projection of canonical
  └── integrity.json                  hash committing to the full tree

The meta.json declares schema version and file class. The claims.jsonl is the queryable substrate — each line is a structured claim with optional truth-tag, evidence pointer, and cluster tag. The views/source.md is a byte-identical projection used to verify the canonical file hasn't drifted. The integrity.json hash-commits to the full directory tree.
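The drift check described above can be sketched in a few lines. This is a minimal illustration, not the reference implementation: the helper names `sha256_file` and `has_drifted` are hypothetical, and the sketch assumes the companion sits beside the source with the extension swapped for .aepkg, as in the tree above.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Return the hex SHA-256 of a file's raw bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def has_drifted(canonical: Path) -> bool:
    """True if the canonical file no longer matches views/source.md in its companion."""
    # Assumed layout: my-doctrine.md -> my-doctrine.aepkg/views/source.md
    companion = canonical.with_suffix(".aepkg")
    projection = companion / "views" / "source.md"
    return sha256_file(canonical) != sha256_file(projection)
```

Comparing digests rather than raw bytes means the check gives the same answer whether the projection is read locally or its hash is taken from a stored record.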

Four file-class handlers

Per-file companion — for .md / .html / .py / .js / .json / .yaml. Full 4-file companion per source.
Hash-attest — for binaries (.png / .pdf / .zip / .ttf). No body-copy; just views/source_hash.txt committing SHA-256 of the canonical bytes. ~390× space reduction vs body-copy.
Aggregate companion — one .aepkg/ per parent directory for high-volume telemetry/archive files (.jsonl, .gz). Hot files (runtime-written) are excluded via the aggregate_excludes.json exclusion list.
Cluster combine — N related packets combine into one umbrella .aepkg/ with byte-roundtrip-verified decompose. Empirically tested at N = 3, 5, 10, 20, 100 packets.
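To make the hash-attest handler concrete, here is a hedged sketch of what "no body-copy, just a hash" could look like. The function name `hash_attest`, the one-line hex format of source_hash.txt, and the companion path derivation are all assumptions for illustration; only the extension list and the views/source_hash.txt location come from the text above.

```python
import hashlib
from pathlib import Path

# Extensions the text lists for the hash-attest handler.
HASH_ATTEST_EXTS = {".png", ".pdf", ".zip", ".ttf"}


def hash_attest(canonical: Path) -> Path:
    """Write the SHA-256 of a binary's bytes into <stem>.aepkg/views/source_hash.txt."""
    if canonical.suffix.lower() not in HASH_ATTEST_EXTS:
        raise ValueError(f"{canonical.name} is not a hash-attest file class")
    digest = hashlib.sha256()
    with canonical.open("rb") as fh:
        # Stream in 1 MiB chunks so large binaries are never fully loaded.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    # Assumed companion path and file format (one hex digest per file).
    out = canonical.with_suffix(".aepkg") / "views" / "source_hash.txt"
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(digest.hexdigest() + "\n", encoding="ascii")
    return out
```

The canonical binary is only ever opened for reading, which is the property the space saving depends on: the companion stores 64 hex characters instead of a copy of the body.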

§ III — Evidence

What's been empirically demonstrated

AEP v1.5 LTS shipped through a production-hardening cascade. Numbers below are measured on real corpus operations, not synthetic benchmarks, except where a row is explicitly marked synthetic. Every claim carries a truth-tag and chains into a hash-verified receipt ledger.

14 cohorts at 100%
1,749 new conversions
~2,890 effective coverage
100% mass-conversion rate
15 durable lessons
0 fabrication detected
0 canonical mutations
0 OS-level incidents

Performance scoreboard

Gate	Metric	Target	Measured	Status
Prompt-injection resistance	0 / N weakened	≥ 99%	0 / 5,000 weakened	PASS
Hook bypass (v1.5.1 RC1 patch)	0 / N bypasses	0 / 500	0 / 500	PASS
Sandbox escape (post-patch)	0 / N bypasses	0 / 1,200	0 / 1,200	PASS
Read latency p95 (cached)	milliseconds	≤ 300 ms	8.3 ms	PASS · 36× under
Read latency p95 (cold)	milliseconds	≤ 1,500 ms	5.07 ms	PASS · 295× under
Viewer first-paint p95	milliseconds	≤ 2,000 ms	80 ms	PASS · 25× under
Validator catch rate (mutation suite)	0–1.0	≥ 0.95	1.0000	PASS · 2,700 / 2,700
False-positive on clean fixtures	per 900	0	0 / 900	PASS
Cross-runtime byte parity	Python + Node + Perl	10 / 10	10 / 10	PASS
Accessibility (WCAG 2.1 AA viewer)	required + bonus	10 / 10	10 / 10	PASS
Token efficiency vs raw .md	reduction %	≥ 60%	88.7%	PASS
Independent audit fabrication rate	across 8 audits	0	0 / 8	PASS

Combine-decompose bijection at scale

The hardest property — losslessly combining N packets into a cluster and decomposing back to byte-identical originals — was verified at five escalating scales:

Scale	Topology	Byte-roundtrip	Walltime	Memory
N = 3	linear version history	3 / 3	— pilot	— pilot
N = 5	sibling derivation chain	5 / 5	— pilot	— pilot
N = 10	homogeneous cohort	10 / 10	0.45 s	588 KB
N = 20	cross-cohort heterogeneous	20 / 20	1.13 s	727 KB
N = 100	5-class broad mix	100 / 100	4.18 s	1.78 MB
DAG re-anchor	multi-parent claim graph	15 / 15 across 5 variants	— synthetic	— synthetic

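Across the measured range, walltime grows roughly linearly in N: 10× the packets (N = 10 to N = 100) cost about 9.3× the walltime and about 3× the memory. The one measured doubling (N = 10 to N = 20) costs about 2.5× the walltime, so the data do not support a strict sub-2× bound per doubling. A linear projection from the N = 100 row gives about 42 seconds and roughly 17 to 18 MB at N = 1,000. This is a projection, not a measurement, and it is falsifier-named: super-linear scaling at N ≥ 2,000 would force a re-design.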
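The ratios and the N = 1,000 figure can be re-derived from the table with a few lines of arithmetic. The sketch below uses only the three rows that report walltime; the variable names are illustrative, and the extrapolation is a plain linear scaling from the largest measured point, not a fitted model.

```python
# (N, walltime in seconds) from the measured rows of the table above.
measured = [(10, 0.45), (20, 1.13), (100, 4.18)]


def growth(a, b):
    """Return (N ratio, walltime ratio) between two measured rows."""
    return b[0] / a[0], b[1] / a[1]


for a, b in zip(measured, measured[1:]):
    n_ratio, t_ratio = growth(a, b)
    print(f"N x{n_ratio:.0f}: walltime x{t_ratio:.2f}")

# Linear extrapolation from the largest measured point to N = 1,000.
n, secs = measured[-1]
projected_secs = secs * (1000 / n)
print(f"projected walltime at N = 1,000: {projected_secs:.1f} s")
```

Keeping the projection as a computed value rather than a quoted one makes the falsifier easy to apply: a measured N = 1,000 run that lands far above this number is direct evidence against linear scaling.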

§ IV — Comparison

Raw .md / .html / shell scripts vs AEP companion

Raw .md / .html / .ps1 files

Content is bytes; no metadata layer.

"Has this been modified?" — unknowable without git context that often isn't loaded.

"Find all experimental claims" — full-text regex; brittle, expensive, false-positive-prone.

"How confident is this paragraph?" — invisible. Every sentence has the same epistemic weight.

"Did agent A's output reach agent C unmodified?" — unanswerable.

"Combine these 50 related lessons" — manual concatenation. No round-trip back to originals.

"Find what cites this lesson" — grep across the whole tree, every time.

Shell scripts (.ps1 / .sh) introduce a third class — executable text — that mixes content with side-effects. Encoding bugs, injection surfaces, OS-specific failure modes.

AEP companion (.aepkg/)

Canonical file untouched + integrity.json hash-commits to the tree.

"Has this drifted?" — single SHA-256 compare against views/source.md.

"Find all experimental claims" — jq over data/claims.jsonl. O(N) once, indexable.

"How confident is this paragraph?" — truth_tag on every non-trivial claim; 6 canonical tiers.

"Did agent A's output reach agent C unmodified?" — hash-chained receipt ledger. Every step has a sha that points to its predecessor.

"Combine these 50 related lessons" — cluster combine + decompose verified byte-roundtrip at N = 100. Lossless.

"Find what cites this lesson" — jq '.evidence_pointers[]' on the claim graph. Pre-indexed.

Executable surfaces stay in their own protected scope (PreToolUse hooks with airlock); content stays declarative. Side-effects require a manual gate.

The asymmetry compounds. With raw files, every new document adds linear discovery cost. With AEP, every new document adds queryable claims to a substrate that gets sharper, not heavier, as it grows.
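As an illustration of the structured query side of this comparison, here is a small Python equivalent of a jq filter over data/claims.jsonl. It assumes each claim row exposes a top-level `truth_tag` string, as the receipt rows do; the function name `claims_with_tag` and the example tag value are hypothetical.

```python
import json
from pathlib import Path
from typing import Iterator


def claims_with_tag(packet_dir: Path, tag: str) -> Iterator[dict]:
    """Yield claims in data/claims.jsonl whose truth_tag equals `tag`."""
    claims_path = packet_dir / "data" / "claims.jsonl"
    with claims_path.open(encoding="utf-8") as fh:
        for line in fh:
            if not line.strip():
                continue  # tolerate blank lines
            claim = json.loads(line)
            if claim.get("truth_tag") == tag:
                yield claim
```

A call such as `list(claims_with_tag(Path("my-doctrine.aepkg"), "EXPERIMENTAL"))` reads one file line by line, so the cost is a single pass over the claims rather than a regex sweep over prose.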

§ V — Discipline

The five hooks that make it actually work

The file format is necessary but not sufficient. AEP ships with five PreToolUse hooks that enforce discipline at the point of writing:

Hook 1 — Defender alert stops burn

Any OS-level security event (Defender / AV) interrupts the autonomous loop. Receipt-logged. No silent retries.

Hook 2 — Secret-pattern airlock (K3)

Mass-read operations cannot exfiltrate secret-shaped content via Bash, language-runtime one-liners, path-traversal, benign-wrapper smuggling, or symlink indirection. 0 / 500 bypass rate at production-N.

Hook 3 — Canonical doctrine write protection (LC-05)

Writes to load-bearing canonical doctrine files require an explicit operator approval token. Implements the "single-writer / append-only / reviewer" discipline that closes the LLM-self-modification attack surface.

Hook 4 — Truth-tag required (LC-09)

Substantive artifacts (> 200 LOC or heading-bearing) must declare a truth-tag — or explicitly tag "unknown." Reflexive enforcement: the hook itself self-tags. 18 / 18 tests pass; FP rate < 5%.

Hook 5 — Codex-first burn law (§45)

Non-trivial drafting fires an external model verification call before the canonical write. Burns operator quota deliberately to keep verification cheap and per-task.

Each hook composes additively; an Edit/Write tool call traverses the full chain before the write lands. Chain regression test: 5 / 5 hooks fire correctly on benign + adversarial test inputs.
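The additive composition can be pictured as a list of small predicates that every write request must pass. The sketch below is a toy model under stated assumptions: `WriteRequest`, `run_chain`, the "doctrine/" path prefix, and the rule that a line starting with "#" counts as a heading are all hypothetical stand-ins, not the shipped hook interface.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class WriteRequest:
    path: str
    content: str
    truth_tag: Optional[str] = None


# A hook returns None to allow the write, or a reason string to block it.
Hook = Callable[[WriteRequest], Optional[str]]


def truth_tag_required(req: WriteRequest) -> Optional[str]:
    """Toy version of Hook 4: substantive artifacts must declare a truth-tag."""
    lines = req.content.splitlines()
    substantive = len(lines) > 200 or any(line.startswith("#") for line in lines)
    if substantive and not req.truth_tag:
        return "truth-tag required (or tag it 'unknown')"
    return None


def canonical_write_protection(req: WriteRequest) -> Optional[str]:
    """Toy version of Hook 3: block writes under a protected prefix (hypothetical path)."""
    if req.path.startswith("doctrine/"):
        return "operator approval token required"
    return None


def run_chain(req: WriteRequest, hooks: list[Hook]) -> list[str]:
    """Run every hook; the write lands only if the returned list is empty."""
    return [reason for hook in hooks if (reason := hook(req)) is not None]
```

Running every hook instead of stopping at the first block is a deliberate choice in this sketch: the caller sees all the reasons a write was refused in one pass.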

§ VI — Receipts

The hash-chained receipt ledger (HCRL)

Every agent action that produces an artifact emits a receipt row to a per-agent ledger. Each row carries a SHA-256 that hash-commits to its predecessor row's SHA. The whole chain is a DAG — branches allowed for parallel agent invocations, but every row's parent SHA is verifiable.

{
  "ts":               "2026-05-18T09:32:14Z",
  "agent":            "implementer",
  "action":           "shape-migrator-v1.5.3",
  "artifacts":        ["path/to/migrated-asset.js"],
  "chain_from_sha":   "656300f991786fff…",
  "this_row_sha":     "7d5154fa13b74a4c…",
  "truth_tag":        "STRONGLY PLAUSIBLE",
  "claim":            "803 / 803 packets byte-roundtrip PASS"
}

"Did this run actually happen?" reduces to "is the SHA in the chain?" "Did agent B see the output of agent A unmodified?" reduces to "does B's chain_from_sha match A's this_row_sha?" The receipts survive context wipes, account changes, and surface migrations. The substrate compounds across the discontinuity.

Why this matters for users

Without a hash-chained ledger, "what did the agent actually do last week?" is unanswerable except by trusting the agent's own self-report. With HCRL, every artifact has cryptographic lineage back to genesis. Auditors get mechanical proof of provenance. Multi-agent handoffs become verifiable. Independent re-validation requires zero re-running of expensive workloads — the receipts ARE the proof.
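A linkage check over a ledger is short enough to sketch. It uses only the two field names shown in the example row, `chain_from_sha` and `this_row_sha`; the function names are hypothetical, and treating a row without a parent as genesis is an assumption. The sketch checks that every parent pointer resolves to an earlier row. It does not recompute `this_row_sha`, because the hashing recipe for a row is not given here.

```python
import json
from typing import Iterable


def broken_links(rows: Iterable[dict]) -> list[str]:
    """Return this_row_sha values whose chain_from_sha is not a known earlier row.

    Rows are read in file order. A row with no chain_from_sha is treated as a
    genesis row (an assumption; the source does not show one).
    """
    seen: set[str] = set()
    broken: list[str] = []
    for row in rows:
        parent = row.get("chain_from_sha")
        if parent is not None and parent not in seen:
            broken.append(row["this_row_sha"])
        seen.add(row["this_row_sha"])
    return broken


def load_ledger(path: str) -> list[dict]:
    """Parse a JSONL ledger, one receipt row per non-empty line."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Because the check only requires that a parent was seen earlier, two rows may share the same parent, which matches the DAG shape described for parallel agent invocations.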

§ VII — Capabilities

What AEP actually enables

Cross-agent claim recall. Query "every claim any agent ever made about creativity benchmarks across 12 months of sessions" with jq. Returns in milliseconds against a 2,890-file substrate.
Lossless corpus consolidation. Combine 50 related sibling lessons into one umbrella packet. Decompose back to 50 byte-identical originals on demand. Verified at N = 100.
Substrate-as-handoff. Push the repo. A different LLM, different account, different surface clones it and inherits the full claim graph + receipt ledger + truth-tag confidence layer. No retraining; no context replay required.
Independent audit. A second agent — same or different model family — can re-derive any claim's evidence chain from the receipts alone. Across 8 independent audits to date: 0 fabrication detected.
Mechanical falsifier surface. Each claim's truth-tag carries an explicit falsifier predicate. "Promote this to PROVEN/RELIABLE if the falsifier doesn't fire within 30 days" is a queryable, automatable rule.
Storage-efficient archival. 215 MB of canonical binary content (PNGs, PDFs, archives) hash-attested in 541 KB of companion metadata. The index layer is ~390× smaller than a body-copy, with 0 binary mutations.
Operator-machine portability. Cross-cutting mirror to operator-owned configuration spaces (e.g., agent installation directories outside the repo). 366 files mirrored to a parallel staging path; 0 canonical mutations.
Token-efficient agent reads. Companion claims.jsonl is 88.7% smaller than the raw .md equivalent for the same information. Agents query the structured layer; humans read the prose layer.
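The 30-day promotion rule mentioned in the list above is the kind of check that fits in one function. The field names `tagged_on` and `falsifier_fired` are hypothetical, as is the function name; the sketch only decides eligibility and leaves the promotion itself, and any external validation it requires, to the operator.

```python
from datetime import date, timedelta


def eligible_for_promotion(claim: dict, today: date, window_days: int = 30) -> bool:
    """True if the claim's falsifier has stayed silent for the full window."""
    if claim.get("falsifier_fired"):
        return False
    # Assumed field: ISO date on which the claim received its current truth-tag.
    tagged_on = date.fromisoformat(claim["tagged_on"])
    return today - tagged_on >= timedelta(days=window_days)
```

Passing `today` in as an argument rather than reading the clock inside the function keeps the rule deterministic and easy to test.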

§ VIII — What Ships

Everything that ships with v1.5 LTS — and why each part matters

AEP isn't just a file format. It's a substrate: spec layers, reference implementations, a runtime constitution, five enforcement hooks, a multi-language doctor, a viewer surface, and a test corpus. Each component closes a specific failure mode that raw .md / .html / shell scripts leave open.

The spec ladder — 6 progressive layers

Layer	What it is	Why it matters to you
v0.4	Schema baseline	The minimum bar: a packet that parses, hashes, and validates. Stop here and you already have integrity.
v0.5	JSONL + canonicalization	NFC-normalized, BOM-rejected, line-stable JSON. Two machines produce identical bytes from the same logical content.
v0.6	JSON-LD bridge + signing	Claims become machine-queryable across systems. Optional Ed25519 attests authorship without trust-the-server.
v0.8	8 frontier-break primitives (F1-F8)	Reproduction + falsifier sandbox + counterexample replay + cross-runtime preflight. The substrate becomes self-verifying.
v1.0.3	Regexical Memory (AEP-native spaced repetition)	Lessons aren't just stored; they're recalled at the right time with measurable decay.
v1.1 / v1.2	F12-F19 + A1-A8 research grade + immune-system layer	Coverage witness + provenance graph + attack registry + four-stage immune system (prevent · detect · repair · translate).
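To show why canonicalization lets two machines produce identical bytes, here is an approximate sketch of the v0.5 row: reject a BOM, reject NaN/Infinity, normalize to NFC, and serialize with a fixed key order and fixed separators. The function name `canonical_line` is hypothetical and the exact rules live in the spec. In particular, Python sorts keys by code point, which can differ from a UTF-16 code-unit ordering for characters outside the Basic Multilingual Plane, and this sketch does not detect duplicate keys.

```python
import json
import unicodedata

BOM = b"\xef\xbb\xbf"


def canonical_line(raw: bytes) -> bytes:
    """Return one canonical JSONL line for a UTF-8 encoded JSON value."""
    if raw.startswith(BOM):
        raise ValueError("BOM rejected")
    obj = json.loads(raw.decode("utf-8"))
    # Sorted keys and fixed separators remove key-order and whitespace variation;
    # allow_nan=False makes NaN and Infinity raise ValueError.
    text = json.dumps(
        obj, sort_keys=True, separators=(",", ":"),
        ensure_ascii=False, allow_nan=False,
    )
    # Normalize after serializing so escaped input such as "e\u0301" is covered too.
    return unicodedata.normalize("NFC", text).encode("utf-8") + b"\n"
```

Two inputs that differ only in key order, whitespace, or composed versus decomposed accents come out as the same bytes, which is what makes a hash over the line meaningful across runtimes.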

v1.5 LTS operational constitution

constitution/aep_constitution_v1_5_lts.json (~12 KB) — the single source of truth for runtime policy. Declares: policy precedence, forbidden actions, secret-airlock rules, 4 trust tiers, safety-floor categories, 4 proof budgets, sandbox requirements, extension ABI rules (kernel-frozen), 30+ performance gates, 7 release-freeze invariants.

Why it matters: the constitution is what makes "v1.5 LTS" a meaningful label rather than a marketing tag. Every claim about the system is testable against this file. If the runtime can't honor the constitution, that's a release-blocking regression, not an unhappy corner case.

5 enforcement hooks (the discipline layer)

Hook	What it does	Why it matters
defender_guard	Halts the autonomous loop on OS-level security alerts	The day Defender flags one of your scripts is the day you stop and look — never the day you click "Allow" without reading.
aep_pre_tool_guard (K3 airlock)	Blocks mass-read operations that would exfiltrate secret-shaped content	0/500 bypass attempts at production-N. Secrets stay in the user's home, not in agent context.
aep_post_tool_ledger (K6 receipts)	Writes a hash-chained receipt on every tool call	You can prove what happened in any session, weeks later, without re-running anything.
aep_prompt_contract	Enforces first-turn agent-evidence-packet contracts (≤101 tokens)	88.7% token reduction vs raw .md. Agents read the structured layer at a fraction of the cost.
aep_stop_doctor	Runs the doctor at session-stop; emits a verdict + lesson-capture trigger	Sessions end with a receipt, not an "I think it worked." 8.3 ms cached / 5.07 ms cold — invisible cost.

The AEP Doctor — instant verdict in three runtimes

scripts/aep_doctor_supreme.py — Python reference (7 verdict states: PASS / WARN / FAIL / UNKNOWN / EXPIRED / CONTESTED / QUARANTINED)
scripts/aep_doctor_node.cjs — Node.js port (independent re-derivation of every hash)
scripts/aep_doctor_perl.pl — Perl port (third independent runtime for byte-parity quorum)

Why it matters: cross-runtime byte parity — Python + Node + Perl all compute the same SHA-256 on every packet in the conformance corpus — is the strongest portability statement a file format can make. If three languages agree, the canonicalization is real, not an implementation artifact.

Universal converters — 11 file classes

tools/universal_aepify.py (831 LOC) — per-file companion converter; auto-detects file class, emits .aepkg/ alongside the canonical.
tools/universal_aepify_v2.py — adds aggregate-mode (one .aepkg/ per parent directory for high-volume telemetry).
tools/aep_cluster_combine.py — combine N packets into one umbrella; decompose back byte-identically. Verified at N = 100.
tools/aep_shape_migrator.py — schema-shape evolution with backwards-compat preservation.

Why it matters: the converter is the on-ramp. If turning your existing 500-file corpus into AEP packets isn't a single command, the format isn't useful. 100% mass-conversion rate across 1,749 new conversions in the v1.5 LTS hardening cascade.

The Viewer — zero-CDN civilian surface

viewer/index.html — a drag-and-drop browser viewer that renders any AEP packet without external dependencies. Verdict-first design: the user sees PASS/WARN/FAIL before they see the structure. Accessibility: WCAG 2.1 AA (10/10 required + bonus). First-paint p95: 80 ms.

Why it matters: agents read JSONL, humans don't. The viewer is the bridge — drag a .aepkg/ onto it and you see the substrate the way a reviewer does, not the way a parser does.

Independent reference implementations

src/aep/ — ~15,000 LOC Python reference (validate, sign, derive views, canonicalize, JSONL-compact, build index, falsifier sandbox, counterexample replay).
verifiers/node/verify.cjs — Node.js verifier. Byte-parity proven on the 13-packet conformance corpus.
verifiers/rust/ — Rust verifier scaffolding. Frontier; not yet feature-complete.

Why it matters: a spec without independent implementations is a wish. Two languages computing the same hashes from the same bytes is the spec being true.

Test corpus — 41 vectors + 11 attack fixtures

test_vectors/v0_5/A.10-numeric-canonicalization/ — 41 vectors covering NaN/Inf rejection, integer precision boundaries, normalization edge cases.
test_vectors/v0_7/A.11-canonical-surface/ — duplicate-keys, UTF-16 sort order, NFC/NFD normalization, Unicode lookalikes, BOM rejection, escape canonicalization, JSON5 comment rejection.
11 Lane B attack fixtures: context hijack, dual-manifest divergence, reviewer-collapse, supersession self-loop, body/envelope leak, content-hash mismatch, and five more. Each is rejected with its specific reason code.

Why it matters: these aren't synthetic micro-benchmarks. Each fixture corresponds to a real-world attack that broke an earlier release. Permanent regression coverage means the same attack can't ship again silently.

Compounding-discipline scaffolding

scripts/v15_lts_25_test_matrix.py — 25-test release-gate matrix; the doctor against itself.
scripts/build_v15_independent_mutation_suite.py — 30 mutation classes × 10 seeds = 300 mutations × 9 validators = 2,700 evaluations. Final mean catch: 1.0000.
scripts/v15_validators_common.py — shared validator core that closed the F23 mutation finding (9 validators repaired to 1.0000 catch rate, 0/900 clean-fixture false positives).
scripts/build_v15_falsifier_dsl.py — falsifier DSL with 8 forbidden tokens (subprocess / socket / os.environ / eval / exec / __import__ / popen / shell=true) blocked at compile.
scripts/build_v15_lts_extension_abi.py — extension ABI for backwards-compat: 20 synthetic extensions installed+uninstalled with zero core schema changes.
scripts/build_v15_human_outcome.py — outcome linter that catches "missing safe_next_action" + "jargon in block_reason" before the receipt ships.

Why it matters: if you adopt AEP, you inherit the discipline cascade — a validated mutation suite, a frozen extension ABI, an outcome linter, and a release-gate matrix. Compounding isn't a hope, it's a CI step.
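The forbidden-token gate in the falsifier DSL can be approximated with a plain scan. The eight tokens below are the ones listed above; the function name and the case-insensitive substring matching are assumptions, and the real compiler may tokenize rather than scan.

```python
FORBIDDEN_TOKENS = (
    "subprocess", "socket", "os.environ", "eval",
    "exec", "__import__", "popen", "shell=true",
)


def forbidden_tokens_in(source: str) -> list[str]:
    """Return the forbidden tokens found in a falsifier source string.

    Matching is a case-insensitive substring scan, which is an assumption;
    it is deliberately over-strict (for example, "evaluate" also matches "eval").
    """
    lowered = source.lower()
    return [tok for tok in FORBIDDEN_TOKENS if tok in lowered]
```

An over-strict scan errs toward rejecting harmless text, which is the safer failure direction for a sandbox gate: a false block costs a rename, a false pass costs an escape.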

Documentation — the prose layer

spec/AEP_v0_8_SPEC.md through v1_2_SPEC.md — the canonical specs (4,000+ lines total).
CHANGELOG.md — every release documents what shipped, what was verified, and what trade-offs were named.
docs/index.html — this showcase, the public face.
reports/v15_lts_final_release_report.md — the v1.5 LTS PASS verdict with all 31 gate measurements.

Why it matters: the substrate isn't useful until adopters can read it. The prose layer documents the why; the code is the what; the test corpus is the proof.

§ IX — Try It

Four ways to try AEP

Path A — Read the spec

The full spec lives at spec/ in this repo. Versions:

AEP_v0_8_SPEC.md — STABLE baseline (8 frontier-break primitives F1-F8)
AEP_v1_0_3_SPEC.md — Regexical Memory as AEP-native spaced repetition
AEP_v1_1_SPEC.md — LANDED research-grade primitives (F12-F19 + A1-A8)
AEP_v1_2_SPEC.md — PROPOSED immune-system layer (prevent · detect · repair · translate)
constitution/aep_constitution_v1_5_lts.json — v1.5 LTS operational constitution (policy precedence + airlock rules + trust tiers + performance gates)

Path B — Convert your own files

The universal converter is tools/universal_aepify.py (831 LOC Python; 18 / 18 tests pass; 11 file classes covered).

python tools/universal_aepify.py path/to/your/file.md
# produces path/to/your/file.aepkg/ alongside the canonical

# verify:
python tools/universal_aepify.py --verify-only path/to/your/file.md

For directory-scope aggregate companions (high-volume .jsonl / .gz):

python tools/universal_aepify_v2.py path/to/dir/*.jsonl \
    --aggregate-mode \
    --timestamp-stripped

For lossless cluster combine + decompose (N related packets → one umbrella → byte-identical originals):

python tools/aep_cluster_combine.py path/to/cluster/*.aepkg \
    --out path/to/umbrella.aepkg

python tools/aep_cluster_combine.py --decompose path/to/umbrella.aepkg \
    --out path/to/restored/
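The round-trip property behind these two commands can be shown with an in-memory toy. This is not the on-disk format used by aep_cluster_combine.py; the JSON envelope, the field names, and the `combine` / `decompose` functions are hypothetical and exist only to show the invariant that decomposing a combined cluster returns byte-identical members.

```python
import hashlib
import json
from base64 import b64decode, b64encode


def combine(members: dict[str, bytes]) -> bytes:
    """Pack {relative_path: bytes} into one umbrella blob with per-member hashes."""
    entries = [
        {
            "path": path,
            "sha256": hashlib.sha256(data).hexdigest(),
            "body_b64": b64encode(data).decode("ascii"),
        }
        for path, data in sorted(members.items())
    ]
    return json.dumps({"members": entries}, sort_keys=True).encode("utf-8")


def decompose(umbrella: bytes) -> dict[str, bytes]:
    """Unpack the umbrella and refuse any member whose bytes fail its hash."""
    restored = {}
    for entry in json.loads(umbrella)["members"]:
        data = b64decode(entry["body_b64"])
        if hashlib.sha256(data).hexdigest() != entry["sha256"]:
            raise ValueError(f"byte-roundtrip failed for {entry['path']}")
        restored[entry["path"]] = data
    return restored
```

Storing each body as opaque bytes, rather than re-parsing and re-serializing it, is what keeps the round trip lossless: the combiner never gets a chance to normalize line endings or encodings.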

Path C — Run the doctor

The doctor produces an instant verdict on any packet's integrity, byte-roundtrip safety, and conformance level. Cached verdicts return in ~8 ms; cold in ~5 ms.

python scripts/aep_doctor_supreme.py path/to/your-file.aepkg

# cross-runtime byte-parity (Python + Node + Perl):
node   scripts/aep_doctor_node.cjs path/to/your-file.aepkg
perl   scripts/aep_doctor_perl.pl  path/to/your-file.aepkg

Path D — Read the receipt ledger

Every agent action's receipt lives in the per-agent HCRL JSONL. Each row chains to its predecessor via SHA-256. Walk the chain backwards from any row to verify provenance back to genesis.

# -r prints raw strings instead of JSON-quoted ones
jq -r '.this_row_sha + " ← " + .chain_from_sha' \
    receipts/agent-name.jsonl | tail -10

§ X — Limits

What AEP is not

Honest framing matters. AEP is a substrate, not a magic spell. These limits are named explicitly so adopters know what's on roadmap and what's structural.

Not a model. AEP doesn't make a 7B model think like a 1T model. It makes whatever model you have produce verifiable, queryable, compounding output instead of one-shot prose.
Not a database. The claims.jsonl layer is queryable but file-native — you'll out-scale jq somewhere between 10K and 1M packets. FRONTIER — MCP-server projection is staged.
Not free-lunch idempotency. Default state_hash embeds timestamps; identical re-conversions produce identical content but different state_hash. The --timestamp-stripped flag closes this for deterministic-build use cases.
Not a substitute for tests. Truth-tags are claims about confidence; they don't run your code. Tests still need to exist. AEP is the layer that records that they ran and what they returned.
Not yet at 1,000+ packet combine scale. N = 100 cluster combine verified; N = 1,000 is linear-projection. Falsifier named: super-linear scaling at N ≥ 2,000 forces re-design.
Not an external-validator substitute. Self-audit (the substrate's own agents auditing the substrate's own output) is circular at the limit. External independent validators (different model family, different operator) remain required for full PROVEN/RELIABLE promotion.
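The idempotency limit above is easy to reproduce in miniature. The sketch below is an assumption-laden model, not the converter's actual state_hash recipe: the record layout, the `converted_at` field, and the `strip_timestamp` parameter are hypothetical stand-ins for the behavior that --timestamp-stripped is described as providing.

```python
import hashlib
import json


def state_hash(content: bytes, converted_at: str, strip_timestamp: bool = False) -> str:
    """Hash a small state record; optionally leave the timestamp out of the hash input."""
    record = {"content_sha256": hashlib.sha256(content).hexdigest()}
    if not strip_timestamp:
        record["converted_at"] = converted_at
    encoded = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()
```

With the timestamp included, two conversions of the same bytes at different times hash differently even though the content hash inside the record is identical; with it stripped, the two hashes match, which is the property deterministic builds need.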

§ XI — Ladder

Where you are on the agentic-file-system ladder

Most teams sit on rung 0 or 1 and don't realize there's a ladder. The compounding starts at rung 3.

Rung 0 — Prompts in chat. Nothing persists. Every session restarts from zero.
Rung 1 — Prompts in .md files. Saved on disk; loaded into context. No structure, no integrity, no query.
Rung 2 — Prompts in repo with light convention. Folder hierarchy, naming conventions. Grep-able but unverifiable.
Rung 3 — Structured claim layer (AEP basic). Per-file companions with claim graph. Queryable, hash-verified. The substrate begins to compound.
Rung 4 — Receipt ledger + truth-tag canon. Hash-chained provenance + claim-level confidence. Independent audit becomes mechanical.
Rung 5 — Combine-decompose discipline (current production state). Lossless corpus consolidation. Cross-agent recall. VERIFIED at N = 100, projecting linear to N = 1,000.
Rung 6 — Substrate-as-API. The AEP layer exposed as an MCP server queryable from any compliant agent. FRONTIER — projected 60-90 days.

§ XII — Stakes

Why this matters beyond one repo

Every team building with LLM agents is building, implicitly, an agentic file system. Most are doing it accidentally — Markdown files thrown into folders, prompts kept in Slack, lessons learned that evaporate when the laptop reboots. The compounding never starts.

AEP names the format and ships the discipline. Adopting it means: your team's output gets sharper over time even when the underlying models don't change. Your audits become mechanical instead of social. Your handoffs between sessions, accounts, and surfaces survive context wipes. The substrate accretes value the way good code accretes value: not by being clever, but by being structured and verifiable.

The model providers will keep making models smarter. The teams that win will be the ones whose substrate compounds the smartness across every session.

Capability is what the model gives you. Compounding is what you build on top of it. AEP is the file format for compounding. — captured during the v1.5 LTS hardening cascade

Markdown is a great place to write a thought.
AEP is the format for thousands of thoughts
across hundreds of agents, over years,
surviving every context wipe and every account change.

