Ten tasks on May 8, 2026, weighted to 22.4x leverage: 108.5 human-equivalent hours delivered in 291 Claude minutes. The day was dominated by an internal cross-domain warm-start architecture rolled out across engine, web, desktop, and mobile clients in five phases, plus a deep data-integrity audit and an IP working-draft amendment. Supervisory leverage closed at 323.9x.
Compared to the prior day, this one ran tighter: about a third of the human-equivalent hours but a higher weighted factor, because most tasks were tightly scoped engine or client wiring with explicit success criteria. The 53.3x ceiling came from a five-phase routing implementation; the 4.7x floor was a session-recovery commit-bundling task in which the human reviewed each step.
## Task Log
| # | Task | Human Est. | Claude | Sup. | Factor | Sup. Factor |
|---|---|---|---|---|---|---|
| 1 | Browse-before-auth web client implementation: all 5 phases (router public/gated split, pendingIntent + resumeAfterAuth + AuthCallback dispatcher, anonymous CourseDetail with auth-aware Enroll, AppShell anonymous chrome with sign-in CTA, deep-link returnTo verified). | 40.0h | 45m | 1m | 53.3x | 2400.0x |
| 2 | An internal ADR, Phase 1 (engine): a Bayesian warm-starter module, a posterior-model trustflagged field, a mastery trust gate, autopilot creationRequest/Response field expansion, 5 CrossDomainConfig fields, a cloud.toml section, a createautopilot handler hook, and 24 new unit tests across 3 files; all 3,473 fast tests pass. | 7.0h | 16m | 0m | 26.2x | 840.0x |
| 3 | Pair-to-node ref repair across 247 broken domains via embedding cosine match (146,762 pairs re-anchored, mean cosine 0.91). Bulk readiness-gate stamp across 178 manifests derived from exam metadata. Post-audit shows 319 of 320 viable domains HEALTHY (was 73). An internal ADR decision log updated. | 12.0h | 30m | 1m | 24.0x | 720.0x |
| 4 | Amend an IP working draft (several new claims, a spec subsection, an alt embodiment, related-inventions paragraphs for E and H), draft an internal ADR (cross-domain posterior warm-starting), update canonical claim totals 633→637 across 11 portfolio docs, regenerate Application_BB.pdf. | 7.0h | 18m | 2m | 23.3x | 280.0x |
| 5 | An internal ADR, Phase 2 client wiring (web + Electron): API types, env flag, autopilot store extensions, CrossDomain fast-track buttons, CourseDetail savings callouts, SkillsCarryingOverPanel warm-start data, i18n keys, and Electron screen-state-machine transferContext threading. | 5.0h | 14m | 0m | 21.4x | 1000.0x |
| 6 | iOS cross-domain fast-track parity (EngineClient types, AppState TransferContext, CrossDomainView fast-track button, AutopilotView pre/post-activation callouts, env flag); invite-code gate removal (SiteKeyService/SiteKeyGateView delete + pbxproj cleanup + Localizable.xcstrings auto-clean). | 5.5h | 18m | 0m | 18.3x | 825.0x |
| 7 | Domain pair-to-node integrity audit (323 domains, 76% degraded); EB leaf catastrophic-regression fix (gate on domainobstotal instead of raw pair_stats; acc92 crashed 1.0→0.001 on broken-pair domains); per-domain readiness gates on CLF/SAA/a professional cert/ANS manifests; 12 new regression tests. | 18.0h | 65m | 4m | 16.6x | 270.0x |
| 8 | An internal ADR, Phase 3 artifacts: 5 reference profile YAMLs (CLF→SAA, SAA→SAP, a professional cert→a professional cert, a professional cert→a professional cert, a professional cert→a professional cert) and a runwarmstartvalidation.py synthetic A/B harness (~500 lines, parses clean). | 4.0h | 15m | 0m | 16.0x | 600.0x |
| 9 | Built a shared NLI server (FastAPI/MPS) plus an LM Studio embeddings client and engine wiring, so the synthesis pipeline can run 10-way concurrent without OOM. | 6.5h | 25m | 4m | 15.6x | 97.5x |
| 10 | Resume an internal ADR cross-domain warm-start work after a crash: bundle drift into 4 focused engine commits plus 1 web a11y commit; add a Phase 11 content audit to an audit harness (md spec + py implementation) catching missing decoy-validation prerequisites and 26 pre-existing duplicate exam_codes. | 3.5h | 45m | 7m | 4.7x | 30.0x |
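Task 3's repair strategy, re-anchoring each broken pair reference to the candidate node with the highest embedding cosine similarity, can be sketched as follows. This is a hypothetical reconstruction, not the engine's actual code: the `reanchor` function, the `nodes` mapping, and the 0.85 threshold are all illustrative assumptions.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def reanchor(pair_embedding, nodes, threshold=0.85):
    """Return (node_id, score) for the best-matching node,
    or None if no candidate clears the similarity threshold."""
    best_id, best_score = None, -1.0
    for node_id, node_embedding in nodes.items():
        score = cosine(pair_embedding, node_embedding)
        if score > best_score:
            best_id, best_score = node_id, score
    return (best_id, best_score) if best_score >= threshold else None

# Tiny worked example with 2-d stand-in "embeddings".
nodes = {"node_a": [1.0, 0.0], "node_b": [0.6, 0.8]}
match = reanchor([0.5, 0.9], nodes)
print(match)  # node_b wins with cosine ≈ 0.99
```

At repair scale (146,762 pairs), a real implementation would batch this with a vectorized similarity search rather than a per-pair Python loop, but the selection rule is the same.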
## Aggregate Statistics
| Metric | Value |
|---|---|
| Total tasks | 10 |
| Total human-equivalent hours | 108.5 |
| Total Claude minutes | 291 |
| Total supervisory minutes | 20 |
| Total tokens | 1,425,000 |
| Weighted average leverage factor | 22.4x |
| Weighted average supervisory leverage factor | 323.9x |
| Human-equivalent weeks | 2.7 |
## Analysis
The day's leverage distribution matters more than the headline figure. The 53.3x ceiling came from the browse-before-auth web client implementation, all five phases shipped against an explicit spec; the 4.7x floor was the post-crash session-recovery and commit-bundling task. Tasks at the top of the distribution share a shape: tightly scoped specifications, clear success criteria, and minimal integration ambiguity. The AI doesn't need to discover anything new; it executes against an explicit target.
Tasks at the bottom run differently. They're either bounded by review-heavy work where every step gets verified, or they involve ambiguity that demands several rounds of trial and adjustment. The lower factor is real and informative, not a failure mode.
The supervisory leverage figure (323.9x today) tracks something orthogonal to wall-clock leverage. It's the ratio of human-equivalent output to human prompt-writing time. It stays high even on lower-leverage days because supervisory minutes scale with task count, not with the human-hour estimate; a 20-minute task and a 4-hour task can both be specified in two minutes of human prompt-writing.
Across the 10 tasks, the day produced roughly 2.7 weeks of senior-engineer-equivalent throughput in 4.8 hours of model wall-clock. That ratio is the practical answer to the question of how much output a single operator can move per day when the model handles the execution and the operator handles the direction.