Nine tasks. June 3, 2026 weighted to 41.0x leverage across 617.5 human-equivalent hours in 904 Claude-minutes. Supervisory leverage closed at 661.6x.
15.4 weeks of human-equivalent throughput in 15.1 hours of Claude wall-clock. The 160.0x ceiling came from Drove all NOT_IMPLEMENTED patent claims to IMPLEMENTED (92->0): 3 net-new 20-claim engines + org-analytics, transformation-drift, multi-tenant, third-party validation, experimentat...; the 5.0x floor sat at Add parent-module subgraph numerals to a patent spec's diagram figures.
Task Log
| # | Task | Human Est. | Claude | Sup. | Factor | Sup. Factor |
|---|---|---|---|---|---|---|
| 1 | Drove all NOTIMPLEMENTED patent claims to IMPLEMENTED (92->0): 3 net-new 20-claim engines + org-analytics, transformation-drift, multi-tenant, third-party validation, experimentation, ring/distinction/lineage, atom federation/scope/image-integrity, instructor-mode UI; 0 NOTIMPLEMENTED; all on staging | 240.0h | 90m | 10m | 160.0x | 1440.0x |
| 2 | Wire final dormant engines (replication subscriber/verifier, staleness job) with unit tests; re-trace 11 patent specs to code under reachability standard; split patent traceability matrix into per-patent sources + generated build_matrix.py | 80.0h | 75m | 4m | 64.0x | 1200.0x |
| 3 | Harden 8 patent specification drafts per review: soften bitwise/RAG/NLI/guarantee overclaims to calibrated tolerances + answer-support verifier; fix stem-type count; worked-example heading; internal-engine cross-refs; anti-overclaim paragraphs; brand lines | 12.0h | 12m | 4m | 60.0x | 180.0x |
| 4 | Design + scaffold an internal tool-fleet dashboard (auth-gated tool dashboard, soft per-user/per-tool permission gate default-none, admin grant matrix, JIT user mirror, 22-tool registry, health-proxy, FastMCP); FastAPI backend + React/design-system frontend + 5 docs + fleet integration, 117 tests green (backend 108 @93% cov); plus auth-service Lambda-vs-Fargate recon and ADR 0006 | 40.0h | 42m | 4m | 57.8x | 600.0x |
| 5 | Figure-vs-spec alignment across patent specification drafts: add a detailed-description method section + add missing parent-module subgraph numerals (13 diagram figures) with render validation | 7.0h | 10m | 5m | 42.0x | 84.0x |
| 6 | Full a11y audit + fix of a family of 10 marketing sites (main + 9 sisters): built axe harness, diagnosed FOUC/scroll-reveal/dark-mode/redirect/gate-timing false positives, fixed contrast tokens (light+dark), link underlines, definition-list, heading-order, landmarks, scope, redirect stubs, velvet-rope gate; all clean both modes | 100.0h | 200m | 3m | 30.0x | 2000.0x |
| 7 | Autonomous patent reduction-to-practice audit (63-agent trace of 718 claims, matrix rebuild 544->153 wired), third-party model integration + cost capture + Valkey/label fixes, per-domain readiness calibration pipeline + intellectual property documentation activation | 120.0h | 300m | 20m | 24.0x | 360.0x |
| 8 | Implement 12 unimplemented patent claims to reduction-to-practice with tests+endpoints+config; prove 0% exam-bug fix with 7-test regression; one patent spec cleared to 0 unimplemented | 18.0h | 170m | 3m | 6.3x | 360.0x |
| 9 | Add parent-module subgraph numerals to a patent spec's diagram figures | 0.5h | 6m | 3m | 5.0x | 10.0x |
Aggregate Statistics
| Metric | Value |
|---|---|
| Total tasks | 9 |
| Total human-equivalent hours | 617.5 |
| Total Claude minutes | 904 |
| Total supervisory minutes | 56 |
| Total tokens | 12,118,000 |
| Weighted average leverage factor | 41.0x |
| Weighted average supervisory leverage factor | 661.6x |
| Human-equivalent weeks | 15.4 |
Analysis
The day's leverage distribution matters more than the headline figure. The 160.0x ceiling came from Drove all NOT_IMPLEMENTED patent claims to IMPLEMENTED (92->0): 3 net-new 20-claim engines + org-analytics, transformation-drift, multi-tenant, third-party vali...; the 5.0x floor was Add parent-module subgraph numerals to a patent spec's diagram figures. Tasks at the top of the distribution share a shape: tightly-scoped specifications, clear success criteria, and minimal integration ambiguity. The AI doesn't need to discover anything new; it executes against an explicit target.
Tasks at the bottom run differently. They're either bounded by review-heavy work where every step gets verified, or they involve ambiguity that demands several rounds of trial and adjustment. The factor is real and informative, not a failure mode.
The supervisory leverage figure (661.6x today) tracks something orthogonal to wall-clock leverage. It's the ratio of human-equivalent output to human prompt-writing time. It stays high even on lower-leverage days because supervisory minutes scale with task count, not with the human-hour estimate; a 20-minute task and a 4-hour task can both be specified in two minutes of human prompt-writing.
Across the 9 tasks, the day produced roughly 15.4 weeks of senior-engineer-equivalent throughput in 15.1 hours of model wall-clock. That ratio is the practical answer to the question of how much output a single operator can move per day when the model handles the execution and the operator handles the direction.