Omega — Autonomous Engineering Operations
A whitepaper on multi-agent orchestration with verifiable autonomy
Version 2 · Public release · 2026-05-15
Executive summary
Omega is a multi-agent operating system for software engineering work. It turns a single human intent — "fix this bug", "ship this feature", "audit this codebase" — into a chain of planned, executed, audited, and deployed work, without continuous human supervision.
The system is organized as four orchestration levels: the human operator, a routing bot, project oracles, and short-lived worker sessions. Each level has one job and one exit condition. Completion is signaled by an atomic file (.done.json) and acknowledged by three independent layers (worker, oracle, supervisor) before a session is closed.
What makes Omega different from other agent frameworks is its operational discipline:
- Three Laws that override every prompt: runtime truth over code intent, researcher posture over sycophancy, autonomous decision over idle waiting.
- A 12-step ship pipeline with deploy verification, freeze-don't-rollback default, and per-project locks.
- A 17-audit Quality Arsenal covering code, runtime, design, performance, security, accessibility, SEO, data, API, copy, DX, motion, automation, logic, and product retention. Each audit uses Gestalt clarity gating + Popper falsification + hinge-point 10× scrutiny.
- A supervision mesh of cron-driven patrols and daemons that detect six categorized failure modes (M1–M6) and nudge stalled sessions back to progress.
This whitepaper describes the architecture, guarantees, operational flow, reliability model, security model, and supporting evidence. It includes the honest gaps: Omega's production telemetry is young (the live system has been running for weeks, not years), and the published metrics are bounded by that fact.
1 · The problem — Why autonomous agents fail
The promise of autonomous coding agents — "describe what you want, get working software back" — has been pitched many times. In practice, four failure modes recur:
Loss of context. An agent solves the first sub-task, then forgets why it was solving it. Single-context-window approaches collapse when the task exceeds the window or branches into parallel work.
Sycophancy. Most LLMs are RLHF-tuned to agree. When a user proposes a flawed approach, the agent codes it instead of challenging it. The result is fast garbage.
Silent failure. The agent reports success, the operator believes it, and only later discovers the function never compiled, the test was disabled, or the deploy was skipped. There is no independent verifier.
Stalls without escalation. The agent encounters ambiguity, asks the user a question, and waits indefinitely. If the user is not watching the tmux session, the system hangs forever.
Omega is built around these four failure modes. Each is named, attacked, and verifiable.
| Problem | Omega's response |
|---|---|
| Loss of context | 4-level chain; workers are short-lived; oracle context survives across workers |
| Sycophancy | Second Law — challenge the premise before coding, with evidence |
| Silent failure | 3-tier close-gate (worker .done.json, oracle ack, supervisor close decision) |
| Idle stalls | Third Law — never wait, always decide; legal stops are .done.json or blocked.json with the fallback action already executed |
2 · Omega's answer — A 4-level architecture
Every Omega operation flows through four levels. Each has one job, one input contract, one output contract.
┌─────────────────────────────────────────────┐
│ LEVEL 0 — Human operator │
│ Sends an intent (one Telegram message) │
└────────────────────┬────────────────────────┘
│
▼
┌─────────────────────────────────────────────┐
│ LEVEL 1 — Routing bot │
│ Classifies (Simple / Medium / Complex / │
│ Epic), resolves the project, builds a │
│ brief, dispatches an oracle │
└────────────────────┬────────────────────────┘
│
▼
┌─────────────────────────────────────────────┐
│ LEVEL 2 — Project oracle │
│ Plans, dispatches workers, verifies done, │
│ optionally ships, signals supervisor │
└────────────────────┬────────────────────────┘
│
▼
┌─────────────────────────────────────────────┐
│ LEVEL 3 — Workers │
│ Read PLAN, execute steps, verify, write │
│ .done.json, self-kill │
└─────────────────────────────────────────────┘
Why four levels and not three or five
Level 0 ↔ 1 separation. A noisy human channel (natural language Telegram) is converted into a structured contract (project, scope, brief, ship flag). The bot does the messy text-to-intent work so the oracle never has to.
Level 1 ↔ 2 separation. The bot does not need to know project internals. The oracle owns project context (CLAUDE.md, codebase layout, file ownership rules). The bot just routes.
Level 2 ↔ 3 separation. Each worker has its own context window and dies after one mission. The oracle's context survives across many workers, accumulating decisions and audit findings without ever overflowing.
Three levels would force the oracle to do per-task execution, blowing its context. Five levels would add ceremony without separation of concerns.
Multi-oracle parallelism
A single project can have multiple oracles running concurrently. The oracle assignment is atomic (file lock per project). Each oracle declares the files it owns; the assigner refuses overlapping ownership. Idle oracles are reused before spawning new ones.
Project X
│
├── oracle-X owns app/**, components/**
├── oracle-X-2 owns api/**, db/**
└── oracle-X-3 owns docs/**, tests/**
(assigned only if file sets disjoint)
This pattern handles the case where a single human intent ("ship a feature plus update the docs plus add tests") naturally splits across non-overlapping areas of the codebase.
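Below is a minimal sketch of the atomic assignment step, using flock on a per-project lock file; the $OMEGA_STATE directory, file names, and registry format are illustrative assumptions, not Omega's actual layout.

```bash
# Hypothetical sketch: atomically register an oracle for a project.
assign_oracle() {
  local project="$1" oracle="$2" owned_globs="$3"
  local lock="$OMEGA_STATE/assign-$project.lock"
  (
    flock -x 9 || exit 1          # one assigner at a time per project
    # A real implementation would compare $owned_globs against every
    # already-registered oracle here and refuse any overlap.
    printf '%s\t%s\n' "$oracle" "$owned_globs" \
      >> "$OMEGA_STATE/oracles-$project.tsv"
  ) 9>"$lock"
}
```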
3 · Core guarantees
Four guarantees define Omega's contract with the operator. Each is enforced mechanically, not by goodwill.
Guarantee 1 — Autonomy
Once dispatched, a worker never asks the operator a question. The legal exits are:
- .done.json written, status done_clean — work verified complete.
- .done.json written, status pending — partial, with pending_actions[] listing what remains.
- .done.json written, status failed — genuinely blocked, with evidence.
- worker-blocked-<session>.json written + fallback action executed — truly ambiguous, but the worker proceeded with its best guess while signaling the supervisor.
The AskUserQuestion tool is forbidden in dispatched sessions. Workers that pause at a question mark are by definition broken.
Guarantee 2 — Verification
Workers do not self-certify. Three layers acknowledge completion:
Worker writes .done.json ─── Tier 1: "I think I finished"
│
▼
Oracle reads, runs VERIFY ─── Tier 2: "Confirmed, work meets spec"
COMMAND, calls
close-gate ack-worker
│
▼
Supervisor reads ledger, ─── Tier 3: "Safe to close, operator informed"
decides close window,
notifies the operator
Each tier is independent. A failure at any tier keeps the session alive and surfaces the discrepancy.
Guarantee 3 — Isolation
Workers cannot harm each other:
- Each worker has its own context window (no shared memory between workers).
- Each worker has its own state directory (worker-<session>.* files, namespaced).
- Atomic writes everywhere (tmp + mv -f) prevent half-written state files.
- Optional git worktrees per oracle for cross-cutting changes that would otherwise conflict.
The worktree subsystem is chaos-tested: 40 of 40 cases pass, including process kills mid-operation, disk-full simulation, and concurrent worktree creation on the same project.
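For reference, the per-oracle worktree pattern reduces to a standard git command; the path and branch names below are hypothetical.

```bash
# Each oracle gets its own checkout and branch, so concurrent
# cross-cutting edits never share a working directory.
git -C "$PROJECT_DIR" worktree add "../project-x-oracle-2" -b oracle-x-2
```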
Guarantee 4 — Close-gate
The supervisor never auto-closes a session if:
- Status is not done_clean.
- Ship result is failed or frozen.
- pending_actions[] is non-empty.
- The operator has interacted with the bot during the grace window.
- A new oracle for the same project was dispatched during the grace window.
Auto-close happens only when all conditions point to "the work is genuinely finished, the operator has been notified, and the resources can be freed".
4 · Operational flow
This section walks one complete intent from operator to ship.
Step 1 — Intent
The operator sends a message to the routing bot. The message is in natural language, English or French, optionally with attachments (screenshots, Linear links, audit keywords).
Step 2 — Classification and routing
The bot classifies the intent:
Simple ─ one read-only check ─ done in-band
Medium ─ one specialist, single area ─ spawn 1 worker
Complex ─ multiple specialists, multi-domain ─ /team in tmux
Epic ─ cross-department, hours+ ─ /aisb full chain
It also detects forensic-audit keywords (code, flow, UX, perf, sec, ...) and routes them to the right audit skill. Audit keywords are never paraphrased into freeform prose — the literal skill command is invoked.
Step 3 — Brief construction
The bot builds a brief for the oracle. The brief includes:
{
"project": "Project name",
"mission": "One-line summary",
"ship": true | false,
"files_owned": ["glob patterns the oracle may touch"],
"deploy_timeout_min": 10,
"lifecycle": "persistent | ephemeral"
}
ship is set true only when the operator explicitly asks (keywords: ship, deploy, push, merge, livre, "envoie en prod"). Audits and research never ship.
Step 4 — Oracle planning
The oracle reads the brief and project CLAUDE.md, classifies the work, and writes its plan to .orchestrator/decisions.md (one line per decision: task, classification, choice, rationale). It then designs the worker dispatches.
Crucially, the oracle never writes project code directly. Even a one-line typo fix goes through a worker session.
Step 5 — Worker dispatch with the PLAN protocol
Each worker receives a structured prompt:
== MISSION ==
<one-line mission>
== PLAN ==
1. <step 1, concrete, verifiable>
2. <step 2>
3. <step 3>
...
== FILES IN SCOPE ==
- <glob or path list>
== DONE CRITERIA ==
- <criterion 1, observable in <60s>
- <criterion 2>
== VERIFY COMMAND ==
<single shell command that returns 0 when done>
== HANDOFF ==
When PLAN complete AND VERIFY COMMAND passes, call:
bash <path>/worker-mark-done.sh done_clean '<summary>'
The worker reads the PLAN, materializes it as a TodoWrite list (each step becomes a todo item), and executes step-by-step.
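The VERIFY COMMAND is deliberately a single shell command with a plain exit-code contract: 0 means done. Two illustrative shapes (examples, not taken from a real dispatch):

```bash
# Deterministic file check: exit 0 iff notes.txt has exactly 2 lines.
test "$(wc -l < notes.txt)" -eq 2

# Test-suite check: exit 0 iff the suite passes.
npx vitest run
```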
Why PLAN and not the native /goal primitive
Claude Code v2.1.141 ships a native /goal <condition> primitive — the engine auto-loops until the condition is met. We integrated this in two phases:
- Phase 1: opt-in via GOAL_NATIVE=true for solo workers with short deterministic conditions.
- Phase 2: default-on for all solo workers.
Phase 2 was reverted within a day. /goal has a hard 4000-character limit. Real worker prompts (mission + pre-boot knowledge pack + DONE + VERIFY + autonomy banner) routinely exceed 5000 characters. Default-on injection caused truncation. The PLAN protocol replaces it: no length limit, every step is visible in TodoWrite, the worker is a transparent state machine.
/goal remains available as Phase 1 opt-in for short deterministic conditions (e.g. npx vitest passes).
Step 6 — Audit (forensic)
If the mission is a forensic audit, the worker runs the matching protocol (e.g. /codeaudit, /uiuxaudit, /secaudit). Each audit has 16–23 phases, a domain-specific raw-score maximum (280–420), and normalizes to /100 for comparison. All audits share:
- Gestalt clarity gate. First pass: is the artifact comprehensible at all? If not, the audit stops and reports the clarity failure first. There is no point measuring detail on something incoherent.
- Popper falsification. Every claim is paired with a falsification check. "This component is accessible" requires "What would prove it isn't?" — and that check is executed.
- Hinge-point 10× scrutiny. The audit identifies the one or two phases that, if wrong, invalidate everything downstream. Those phases get 10× the rigor of others.
Step 7 — Ship (optional)
If brief.ship is true, the oracle runs the 12-step ship pipeline:
1. Build (npm run build or project-specific)
2. Stage (whitelist files; refuse extras)
3. Secret scan staged (gitleaks)
4. Whitespace check (git diff --cached --check)
5. Commit (conventional message)
6. Acquire flock per-project (serializes oracles)
7. Check freeze flag (if frozen, abort + alert)
8. Pull --rebase (auto-abort on conflict, keep local commit)
9. Push (retry once after re-rebase)
10. Deploy (whitelisted command; default Vercel + token)
11. Poll deploy status (max deploy_timeout_min, default 10 min)
12. Write .done.json with commit, push URL, deploy URL, duration
On deploy failure, the default behavior is freeze, don't rollback. A ship-<project>.frozen flag is set; subsequent oracles cannot push until the operator decides to revert or fix-forward. Auto-rollback is opt-in per project — auto-rollback can hide root causes (missing env var, provider outage, etc.).
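A condensed sketch of steps 6 through 11 shows how the lock, the freeze check, and the freeze-on-failure default compose; poll_deploy, alert, and the state paths are hypothetical stand-ins, not Omega's real interfaces.

```bash
ship_guarded() {
  local project="$1"
  exec 9>"$STATE_DIR/ship-$project.lock"
  flock 9                                         # step 6: serialize oracles
  if [ -f "$STATE_DIR/ship-$project.frozen" ]; then
    alert "ship frozen for $project"; return 1    # step 7: frozen -> abort
  fi
  git pull --rebase || { git rebase --abort; return 1; }     # step 8: keep local commit
  git push || { git pull --rebase && git push; } || return 1 # step 9: retry once
  if ! poll_deploy "$project"; then               # steps 10-11: verify the deploy
    touch "$STATE_DIR/ship-$project.frozen"       # freeze, don't rollback
    alert "deploy failed, $project frozen"; return 1
  fi
}
```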
Step 8 — Worker handoff
The worker calls worker-mark-done.sh <status> '<one-line summary>'. This atomically writes worker-<session>.done.json (tmp + mv). The script has a guard: it refuses to run from an oracle session (rc=3 + redirect message). This prevents the common bug where an oracle accidentally marks itself done as if it were a worker.
The worker's tmux session schedules a self-kill 5 seconds after the handoff — freeing the slot for the next dispatch.
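In sketch form, the helper's two key behaviors look like this; the session-name convention and $STATE_DIR are assumptions for illustration, not the script's actual contents.

```bash
# Guard: refuse to run from an oracle session (rc=3 + redirect message).
case "${TMUX_SESSION:?}" in
  oracle-*) echo "oracle sessions must use oracle-mark-done.sh" >&2; exit 3 ;;
esac

# Atomic handoff: write to a tmp file, then mv -f into place.
tmp="$STATE_DIR/.worker-$TMUX_SESSION.done.json.tmp"
printf '{"session":"%s","status":"%s","summary":"%s"}\n' \
  "$TMUX_SESSION" "$1" "$2" > "$tmp"
mv -f "$tmp" "$STATE_DIR/worker-$TMUX_SESSION.done.json"
```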
Step 9 — Oracle ack
The oracle reads the worker's done.json, executes the VERIFY COMMAND, and calls close-gate.sh ack-worker <worker-session>. Without this ack, the supervisor treats the worker as un-acknowledged and nudges the oracle.
Step 10 — Supervisor close decision
The supervisor (cron-driven, every minute) reads all oracle done.json files and applies the close decision tree:
done_clean + ship.result in {ok, skipped} → notify + close after grace
done_clean + ship.result in {failed, frozen} → notify + keep alive
pending → notify + inline "continue" button
failed → send logs + keep alive
The grace window resets if the operator interacts with the bot or a new oracle is dispatched on the same project.
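The same tree reads naturally as shell logic; the helper names below are hypothetical stand-ins for the supervisor's real notification calls.

```bash
case "$status/$ship_result" in
  done_clean/ok|done_clean/skipped)    notify_operator; close_after_grace ;;
  done_clean/failed|done_clean/frozen) notify_operator; keep_alive ;;
  pending/*)                           notify_with_continue_button; keep_alive ;;
  failed/*)                            send_logs_to_operator; keep_alive ;;
esac
```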
5 · Reliability model
The supervisor is one of three cron loops. There are also three long-lived daemons. Together they form a recovery mesh.
╔══════════════════════════════════════════════════════════════╗
║ Cron */1 min : supervisor (close decisions, alerts, reaper) ║
║ Cron */2 min : event-driven oracle wake on worker done.json ║
║ Cron */3 min : observer (6 categorized failure modes M1-M6) ║
║ ║
║ Daemon : oracle process death detector ║
║ Daemon : abandoned-oracle reaper (TTL-bound) ║
║ Daemon : worker idle supervisor (no-tool-call timeout)║
╚══════════════════════════════════════════════════════════════╝
The six observer failure modes:
| Code | Symptom | Recovery action |
|---|---|---|
| M1 | Worker .done.json un-acked, siblings still alive | Nudge oracle via tmux send-keys |
| M2 | All workers done, oracle idle > 5 min | Send report or close oracle |
| M3 | Worker failed, oracle has not surfaced an alert | Alert via bot directly |
| M4 | worker-blocked-<session>.json exists | Surface question to operator |
| M5 | Worker has not emitted a tool event for X minutes | Send /team retry via tmux |
| M6 | Oracle TodoWrite has not changed for N observer ticks | Wake oracle with stand-up prompt |
Nudges are throttled (one per 5 min per oracle) to avoid spam.
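The throttle reduces to a timestamp file per oracle; a minimal sketch, with illustrative paths and prompt text:

```bash
# Skip the nudge if one was sent to this oracle in the last 5 minutes.
stamp="$STATE_DIR/nudge-$ORACLE.ts"
now=$(date +%s)
if [ -f "$stamp" ] && [ $(( now - $(cat "$stamp") )) -lt 300 ]; then
  exit 0
fi
tmux send-keys -t "$ORACLE" "observer: report status and resume work" Enter
echo "$now" > "$stamp"
```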
The incident that triggered the mesh (2026-04-15)
A Linear-resolution worker correctly identified that 25 of 36 tickets were already fixed and in "In Review" state. Instead of deciding the best path and executing, it posted "Three paths — which path?" and waited idle for 10+ minutes. The operator found it by accident.
Root cause: the prior Second Law ("challenge the premise") was being interpreted as "ask before coding". It needed to be "challenge, decide, proceed". The fix became the Third Law: in dispatched sessions, AskUserQuestion is forbidden, idle prompts are forbidden, the only legal stops are .done.json or worker-blocked-<session>.json with the fallback action already executed.
This single incident drove the entire mesh of observer + wake-on-done + the Third Law specification. A wrong decision that produces evidence is 100× more valuable than a correct pause that produces nothing.
6 · Security model
Omega is built for an operator who runs the system on their own machine. The security model is therefore:
Protected scopes (the operator may forbid automation entirely)
- Billing endpoints.
- Account-management APIs.
- Authentication / OAuth flows.
- .env* files (any project).
- The OAuth login script.
These are sacred. Workers never touch them, oracles never touch them, the supervisor never touches them. Removing a guard rail requires a manual code edit by the operator.
Defense scan layer
Every incoming prompt (and any text the operator wants to scan ad-hoc) can be passed through a defense scanner:
| Category | Examples |
|---|---|
| Prompt injection | ignore previous instructions, role hijack, DAN, jailbreak, mode-switch, prompt-reveal |
| Secrets | Stripe keys, AWS access keys, GitHub PAT, Slack tokens, private keys, GitLab PAT |
| PII | US SSN-like, credit-card-like, phone numbers |
| Suspicious URLs | URL shorteners, IP-as-URL, .onion, free TLDs |
Verdicts: clean, warning, block. Critical matches (live Stripe key, .onion URL) block. Optional quarantine appends the verdict to a defense-alerts log.
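A toy verdict function illustrates the shape of the scanner; the regexes below are examples of the listed categories, not Omega's actual rule set.

```bash
scan() {
  local text="$1"
  # Prompt injection and live credentials block outright.
  grep -qiE 'ignore (all )?previous instructions|jailbreak' <<<"$text" && { echo block; return; }
  grep -qE  'sk_live_[0-9A-Za-z]{24}|AKIA[0-9A-Z]{16}'      <<<"$text" && { echo block; return; }
  grep -qE  '\.onion'                                       <<<"$text" && { echo block; return; }
  # PII-like patterns warn rather than block.
  grep -qE  '\b[0-9]{3}-[0-9]{2}-[0-9]{4}\b'                <<<"$text" && { echo warning; return; }
  echo clean
}
```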
No destructive autonomy
The system actively refuses certain shortcuts:
- Workers never force-push.
- Oracles never close themselves (only the supervisor closes).
- Auto-rollback on deploy failure is opt-in per project, not default.
- Sacred files (the supervisor, the death detector, the reaper, the idle supervisor) are version-locked — any drift triggers an alert.
Sacred files
Four files at the core of the recovery mesh are sha256-locked. The validation runs on every test sweep, and any drift surfaces immediately. The list and hashes are kept in the operator's local installation, not published, but the integrity contract is part of the install.
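The check itself is a one-liner over that locally kept manifest; the manifest path below is illustrative.

```bash
# Verify the sacred files against their pinned hashes; any drift alerts.
sha256sum --check --quiet "$HOME/.omega/sacred-files.sha256" \
  || echo "ALERT: sacred file drift detected" >&2
```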
7 · Evidence
This section reports what is measurable today. It does not report numbers we do not have. Omega's production telemetry is young, and that fact constrains the evidence base.
What was measured today (chaos + smoke tests, 2026-05-15)
| Test | Result | What it proves |
|---|---|---|
| Worktree E2E (5 scenarios) | 5/5 | Happy path, conflict, main moved, parallel, ship failure |
| Worktree chaos v1 (18 cases) | 18/18 | Process kills mid-operation, disk-full, race conditions |
| Worktree chaos v2 (8 cases) | 8/8 | Concurrent worktree-create on same project |
| Worktree chaos v3 (9 cases) | 9/9 | Interrupted ship + recovery |
| /goal Phase 1 opt-in smoke | 5/5 | Opt-in injection via GOAL_NATIVE=true works |
| /goal Phase 2 revert smoke | 8/8 | Default-on block is removed; PLAN protocol contracts in place |
| Worker-mark-done oracle guard | Pass | Refuses oracle session names with rc=3 + redirect |
| PLAN protocol runtime test | 1/1 | End-to-end worker dispatch, plan execution, done.json |
| Sacred files sha256 stability | 4/4 | Patrol, watchdog, reaper, idle-supervisor unchanged |
| Defense scan (5 categories) | 5/5 | clean / injection / secret / URL / PII verdicts correct |
The PLAN protocol runtime test deserves a quick note: a worker received a trivial 3-step plan ("create file, append line, verify 2 lines"), materialized it as 3 TodoWrite items, executed all 3, ran the VERIFY COMMAND, wrote .done.json with status=done_clean and todos_completed=3, and self-killed cleanly. Total elapsed: under 70 seconds, no human interaction.
What is live in operation right now
| Quantity | Value |
|---|---|
| Outcomes-database mission rows | 2 (small N — system is young) |
| Worker .done.json files on disk (recent) | 5 |
| Tool-call events captured by the tracking hook | 2,571 across 61 session files |
| Cron entries active | 28 (supervisor + observer + ...) |
| Sacred files unchanged since | 4–6 days (last verified today) |
Honest gaps
- Production mission count is small. The outcomes database has 2 rows. A claim like "10,000 missions executed at 99% success" would be a fabrication. Honest framing: the system is in early operation; chaos tests validate the structural properties (race conditions, recovery, isolation) that production data cannot yet validate at scale.
- Mean time intent → ship. Not yet computed across a statistically meaningful sample. Single observed examples are in the tens of minutes for narrow Linear-style fixes, hours for cross-cutting features. These are operator anecdotes, not telemetry.
- Cost per mission. Token consumption is captured per tool call (the tracking hook) but not yet aggregated into a per-mission cost report. A dashboard for this is planned.
- Incident-avoidance count. The observer fires nudges, but the proportion of nudges that prevented a stall (vs nudges sent into already-recovering sessions) is not yet computed.
Two short case studies (concrete, verifiable today)
Case A — The 4000-character /goal pivot. The native /goal primitive was integrated, evaluated under load, and found to have a hard 4000-character limit incompatible with real worker prompts (mission + pre-boot knowledge pack + criteria + verify + autonomy banner). Phase 2 default-on was reverted within 24 hours; the PLAN protocol was introduced as a replacement. The revert was end-to-end tested the same day with a runtime worker dispatch (described above). Evidence: a smoke test suite of 8 assertions validates that the revert is applied and the PLAN protocol artifacts are in place.
Case B — The worker-mark-done oracle guard. A debug session revealed that an oracle had accidentally called worker-mark-done.sh instead of oracle-mark-done.sh, writing its done-signal to the wrong namespace. A guard was added that refuses oracle session names (regex-matched) with rc=3 and a redirect message. The fix is small (10 lines of bash) but eliminates a class of cross-tier confusion errors. Smoke-tested: oracle session → rejected; worker session → accepted.
What chaos tests cannot prove
Chaos tests prove that the structural properties hold under hostile conditions. They do not prove that the system makes good engineering decisions. That is the job of the audit pipeline (the Quality Arsenal) and the Second Law (challenge the premise). The audit pipeline catches "shipped working code with bad architecture"; the Second Law catches "shipped working code for a request that should have been refused".
8 · Roadmap
Short-term (active)
- Automate bot restart after handler code changes so progress-card features activate without operator intervention.
- Exercise the PLAN protocol's sub-agent pattern (Agent(team_name=...)) on a real client mission, not just a smoke test.
- Port the 28 cron entries to a native scheduling primitive so they become inspectable and version-controlled from inside a session.
Medium-term
- A live dashboard for mission timelines, cost, and outcome distribution.
- Dual-run a /loop-based supervisor against the legacy supervisor for 30 days, compare outputs, then switch over when convergence is proven.
- A learning agent that watches accepted vs rejected proposals and feeds the rejection rate back into proposal quality estimates.
Open architecture questions
- Workers as sub-agents vs sub-sessions? Current design isolates workers in their own tmux sessions and their own Claude Code instances. Alternative: workers as sub-agents inside the oracle, sharing the oracle's context. Tradeoff: sub-agents save tmux slots and dispatcher overhead but lose context-isolation benefit and complicate the close-gate.
- A richer goal primitive? If the platform raises the 4000-character limit on /goal (or introduces a plan-bound primitive), revisit the Phase 2 default-on revert.
- Cross-project memory? The memory layer is currently scoped per system. Should client projects share a common lessons-learned corpus, or stay isolated?
- Ship pipeline for non-Vercel hosts. The deploy-verify step is currently Vercel-specific via API polling. Generalize to Fly.io, Render, Cloudflare Pages.
The judging standard
Every iteration of Omega is evaluated against four questions:
- Did the operator have to babysit?
- Did the system challenge a bad premise before coding it?
- Did runtime evidence drive every conclusion?
- Was the change surgical?
If any answer is "no", the iteration is incomplete — regardless of how much code shipped.
9 · Appendix — Technical reference
Session lifecycle (worker)
Dispatch ──▶ PRE-BOOT PACK injected
│
▼
Read PLAN ──▶ TodoWrite materialization (N items)
│
▼
Execute step 1 ──▶ update TodoWrite + progress.json
│
▼
Execute step 2
│
⋮
│
▼
Run VERIFY COMMAND (must exit 0)
│
▼
worker-mark-done.sh done_clean '<summary>'
│ (atomic tmp + mv to .done.json)
▼
Schedule self-kill (5s)
│
▼
tmux session terminated
Failure recovery mesh (visual)
┌────────────────────────────────────────────────────────────┐
│ │
│ Supervisor (1 min) │
│ ├── reads oracle-*.done.json │
│ ├── reads worker-*.done.json │
│ ├── decides close / keep / alert │
│ └── triggers notifications │
│ │
│ Wake-on-worker-done (2 min) │
│ └── nudges oracle when worker .done.json un-acked │
│ │
│ Observer (3 min) │
│ └── 6 failure modes M1–M6 │
│ │
│ Oracle-watchdog daemon │
│ └── detects oracle process death │
│ │
│ Oracle-reaper daemon │
│ └── kills abandoned oracles past TTL │
│ │
│ Worker-idle-supervisor daemon │
│ └── workers with no tool calls past threshold │
│ │
└────────────────────────────────────────────────────────────┘
State files (atomic write contract)
All state files in the system follow the same write pattern:
Write : tmp file in same directory, then mv -f to final
Read : open + lock-free read; staleness via mtime
Update : never in-place; always tmp + mv
Cleanup : grace window before deletion
Naming : namespaced by session for collision safety
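The same contract expressed as shell idioms (GNU stat shown; $target is an illustrative placeholder):

```bash
# Write: tmp file in the same directory, then an atomic rename.
tmp="$(mktemp "$(dirname "$target")/.tmp.XXXXXX")"
cat > "$tmp" && mv -f "$tmp" "$target"

# Read: no lock needed; staleness is judged from mtime.
age_sec=$(( $(date +%s) - $(stat -c %Y "$target") ))
```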
Done.json schema (worker)
{
"session": "string",
"status": "done_clean | pending | failed",
"summary": "one-line description",
"commit": "git sha or empty",
"finished_at": "ISO 8601",
"todos_total": "int",
"todos_completed": "int",
"pending_actions": ["list of strings"],
"written_by": "string (helper name)"
}
Done.json schema (oracle)
{
"oracle": "string",
"project": "string",
"status": "done_clean | pending | failed",
"started_at": "ISO 8601",
"finished_at": "ISO 8601",
"duration_sec":"int",
"mission": "string",
"ship": {
"requested": "bool",
"result": "ok | failed | skipped | frozen",
"commit": "git sha or empty",
"push_url": "string or empty",
"deploy_url": "string or empty",
"deploy_status": "string"
},
"pending_actions": ["list of strings"],
"report_path": "string or empty",
"lifecycle": "persistent | ephemeral"
}
The 17 forensic audits — quick reference
| Audit | Domain | Raw scale | Question |
|---|---|---|---|
| code | Code quality | /420 | Is the code SOLID? |
| flow | User flows | /400 | Does the experience WORK? |
| uiux | Design system | /420 | Is the interface BEAUTIFUL? |
| debug | Runtime bugs | /360 | What is BROKEN right now? |
| feature | Completeness | /320 | Is the product COMPLETE? |
| perf | Performance | /360 | Is it FAST? |
| sec | Security | /400 | Is it SECURE? |
| a11y | Accessibility | /320 | Is it ACCESSIBLE? |
| seo | Search optim. | /400 | Is it DISCOVERABLE? |
| data | Data integrity | /320 | Is the data INTACT? |
| api | API contracts | /360 | Is the API SOLID? |
| copy | Messaging | /280 | Is the copy CLEAR? |
| dx | Dev experience | /320 | Is the DX SMOOTH? |
| motion | Animation | /360 | Is the motion PURPOSEFUL? |
| automation | Scheduling | /330 | Are automations RELIABLE? |
| logic | System logic | /360 | Is the logic OPTIMAL? |
| retention | Product/CPO | /400 | What features are MISSING? (read-only) |
All scores normalize to /100 for comparison across domains.
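Normalization is a straight linear rescale. For example, a raw 312 on the /420 code audit maps to roughly 74/100:

```bash
# Normalize a raw audit score to /100 given its domain maximum.
normalize() { awk -v raw="$1" -v max="$2" 'BEGIN { printf "%.1f\n", raw * 100 / max }'; }
normalize 312 420   # -> 74.3
```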
A note on extraction
This document is generated through a render-to-PDF pipeline with Unicode font embedding. The text layer is preserved (verified with pdftotext from Poppler 23.x; all body content extracts cleanly to UTF-8). Some PDF readers and third-party extractors handle complex layouts (multi-column, drop caps, box-drawing characters) less robustly than Poppler — if you observe text artifacts, try a Poppler-based extractor or a PDF-to-Markdown converter.
End of document — version 2 · 2026-05-15