Charles Sieg's Latest Posts

Leverage Record: May 25, 2026

Mon, 25 May 2026 23:59:00 GMT

Nine tasks. May 25, 2026 weighted to 25.3x leverage across 59.5 human-equivalent hours in 141 Claude-minutes. Supervisory leverage closed at 142.8x.

1.5 weeks of human-equivalent throughput in 2.4 hours of Claude wall-clock. The 33.6x ceiling came from Synthesis pipeline: prompt caching + Anthropic Batches API integration across synthesis scripts in core/an inference engine (1649 LOC, 5 files); the 12.0x floor sat at core/an inference engine: autopilotservice legacy coverage-damping ceiling lifted + bulkamplifyfleet and bulkbackfillrecall customid format fix with error logging.

About These Records

These time records capture personal project work done with Claude Code (Anthropic) only. They do not include work done with ChatGPT (OpenAI), Gemini (Google), Grok (xAI), or other models, all of which I use extensively. Client work is also excluded, despite being primarily Claude Code. The actual total AI-assisted output for any given day is substantially higher than what appears here.

Task Log

#	Task	Human Est.	Claude	Sup.	Factor	Sup. Factor
1	Synthesis pipeline: prompt caching + Anthropic Batches API integration across synthesis scripts in core/an inference engine (1649 LOC, 5 files)	28.0h	50m	5m	33.6x	336.0x
2	Investigate and backfill missing a metrics tracker leverage records since May 22: surveyed git logs across 80+ an inference engine repos, identified 7 commit-clusters across May 23-25, diagnosed root cause (process discipline gap, not infra), reconstructed and POSTed 7 records to a metrics tracker-api	4.0h	8m	4m	30.0x	60.0x
3	Audit 29 ~/.claude/skills/ manifests for missing leverage-POST step on free-form task path; identify 16 tool-loader skills (a marketing platform, a calendar platform, a knowledge base, an email platform, a defect tracker, a portfolio browser, a metrics tracker, a newsletter platform, a time-tracking app, a CM...	3.0h	6m	2m	30.0x	90.0x
4	Invert leverage tracking policy: CSV first then cloud second both mandatory; patched global CLAUDE.md Rules block, /fix skill Step 6k, 16 tool-loader Step 4 blocks, and /leverage-post Phase 2 reconciliation	4.0h	8m	2m	30.0x	120.0x
5	/leverage-post reconciliation Phase 1+2: backfilled 139 CSV rows across 12 days (5/14-5/25), verified all in sync, 0 stragglers remaining	1.5h	4m	1m	22.5x	90.0x
6	core/a simulation harness rebuild: brain answerer switched to direct Anthropic SDK with prompt caching (257 LOC), all zero/pmp sweep profiles flipped to omniscient:false (46 files), headless runner surfaces non-200 from /next-pair-mcq with context	10.0h	30m	4m	20.0x	150.0x
7	Cross-fleet prompt-caching micro-sweep: cachecontrol on a relationship CRM sonnet system block, an API gateway a recruiter product/llmnormalizer streaming call, automation-resume-refinement SYSTEMPROMPT, an origin service atoms/generator system + toolschema	3.0h	10m	2m	18.0x	90.0x
8	an audit toolchain: content-audit P4.1 pair-density check with PGWA-class detection (61 LOC), a configuration file headline counts bumped to 2026-05-25 audit snapshot, per-activity-format trackers added (scenarios, flashcards, etc.)	4.0h	15m	3m	16.0x	80.0x
9	core/an inference engine: autopilotservice legacy coverage-damping ceiling lifted + bulkamplifyfleet and bulkbackfillrecall customid format fix with error logging	2.0h	10m	2m	12.0x	60.0x

Aggregate Statistics

Metric	Value
Total tasks	9
Total human-equivalent hours	59.5
Total Claude minutes	141
Total supervisory minutes	25
Total tokens	928,000
Weighted average leverage factor	25.3x
Weighted average supervisory leverage factor	142.8x
Human-equivalent weeks	1.5
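
The aggregate figures above can be reproduced from the Task Log. The post does not state its formulas, so the sketch below assumes that each weighted average is total human-equivalent minutes divided by total Claude minutes (or total supervisory minutes), and that a human-equivalent week is 40 hours. The name `aggregate` and the tuple layout are illustrative, not taken from the tracking tooling.

```python
# Rows from the May 25 Task Log:
# (human_est_hours, claude_minutes, supervisory_minutes)
TASKS_MAY_25 = [
    (28.0, 50, 5),
    (4.0, 8, 4),
    (3.0, 6, 2),
    (4.0, 8, 2),
    (1.5, 4, 1),
    (10.0, 30, 4),
    (3.0, 10, 2),
    (4.0, 15, 3),
    (2.0, 10, 2),
]

HOURS_PER_WEEK = 40  # assumption: one human-equivalent week is 40 hours


def aggregate(tasks):
    """Recompute a day's totals and time-weighted leverage factors."""
    human_hours = sum(h for h, _, _ in tasks)
    claude_min = sum(c for _, c, _ in tasks)
    sup_min = sum(s for _, _, s in tasks)
    human_min = human_hours * 60
    return {
        "tasks": len(tasks),
        "human_hours": human_hours,
        "claude_minutes": claude_min,
        "supervisory_minutes": sup_min,
        # Assumed formula: total human minutes / total model minutes.
        "leverage": round(human_min / claude_min, 1),
        # Assumed formula: total human minutes / total prompt-writing minutes.
        "supervisory_leverage": round(human_min / sup_min, 1),
        "weeks": round(human_hours / HOURS_PER_WEEK, 1),
    }


print(aggregate(TASKS_MAY_25))
```

Under those assumptions the result matches the table: 59.5 hours, 141 Claude minutes, 25 supervisory minutes, 25.3x, 142.8x, and 1.5 weeks.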

Analysis

The day's leverage distribution matters more than the headline figure. The 33.6x ceiling came from Synthesis pipeline: prompt caching + Anthropic Batches API integration across synthesis scripts in core/an inference engine (1649 LOC, 5 files); the 12.0x floor was core/an inference engine: autopilotservice legacy coverage-damping ceiling lifted + bulkamplifyfleet and bulkbackfillrecall customid format fix with error.... Tasks at the top of the distribution share a shape: tightly-scoped specifications, clear success criteria, and minimal integration ambiguity. The AI doesn't need to discover anything new; it executes against an explicit target.

Tasks at the bottom run differently. They're either bounded by review-heavy work where every step gets verified, or they involve ambiguity that demands several rounds of trial and adjustment. A low factor on those tasks is real and informative, not a failure mode.

The supervisory leverage figure (142.8x today) tracks something orthogonal to wall-clock leverage. It's the ratio of human-equivalent output to human prompt-writing time. It stays high even on lower-leverage days because supervisory minutes scale with task count, not with the human-hour estimate; a 20-minute task and a 4-hour task can both be specified in two minutes of human prompt-writing.
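
As a worked illustration of that last point, the sketch below uses the paragraph's own hypothetical pair: a 20-minute task and a 4-hour task, each specified in two minutes. It assumes supervisory leverage is human-equivalent minutes divided by supervisory minutes. `supervisory_factor` is an illustrative name, not something from the tracking tooling.

```python
def supervisory_factor(human_minutes, supervisory_minutes):
    """Human-equivalent minutes produced per minute of prompt-writing."""
    return human_minutes / supervisory_minutes


small = supervisory_factor(20, 2)                   # 20-minute task
large = supervisory_factor(4 * 60, 2)               # 4-hour task
combined = supervisory_factor(20 + 4 * 60, 2 + 2)   # both tasks together

print(small, large, combined)  # 10.0 120.0 65.0
```

The prompt-writing cost is identical for both tasks, so the larger estimate dominates the combined ratio.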

Across the 9 tasks, the day produced roughly 1.5 weeks of senior-engineer-equivalent throughput in 2.4 hours of model wall-clock. That ratio is the practical answer to the question of how much output a single operator can move per day when the model handles the execution and the operator handles the direction.

Leverage Record: May 24, 2026

Sun, 24 May 2026 23:59:00 GMT

One task. May 24, 2026 weighted to 14.4x leverage across 6.0 human-equivalent hours in 25 Claude-minutes. Supervisory leverage closed at 120.0x.

0.1 weeks of human-equivalent throughput in 0.4 hours of Claude wall-clock. With only one task logged, the ceiling and the floor are the same 14.4x record: Anthropic cache-token surfacing in an LLM client library calllog (cachecreate/cacheread tokens) + an origin service spend-tracking fix flushing calllog from math runners and tr....

Task Log

#	Task	Human Est.	Claude	Sup.	Factor	Sup. Factor
1	Anthropic cache-token surfacing in an LLM client library calllog (cachecreate/cacheread tokens) + an origin service spend-tracking fix flushing calllog from math runners and tribunal	6.0h	25m	3m	14.4x	120.0x

Aggregate Statistics

Metric	Value
Total tasks	1
Total human-equivalent hours	6.0
Total Claude minutes	25
Total supervisory minutes	3
Total tokens	80,000
Weighted average leverage factor	14.4x
Weighted average supervisory leverage factor	120.0x
Human-equivalent weeks	0.1
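
The ceiling and floor quoted in each summary appear to be simply the largest and smallest values in the Factor column. A minimal sketch under that assumption (the helper name `ceiling_and_floor` is hypothetical) shows why the two coincide on a single-task day:

```python
def ceiling_and_floor(factors):
    """Return (highest, lowest) per-task leverage factor for a day."""
    return max(factors), min(factors)


# Factor column values copied from the Task Logs.
may_24 = [14.4]  # single task
may_25 = [33.6, 30.0, 30.0, 30.0, 22.5, 20.0, 18.0, 16.0, 12.0]

print(ceiling_and_floor(may_24))  # (14.4, 14.4)
print(ceiling_and_floor(may_25))  # (33.6, 12.0)
```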

Analysis

With a single task there is no distribution to read: the ceiling and the floor are both the 14.4x record for Anthropic cache-token surfacing in an LLM client library calllog (cachecreate/cacheread tokens) + an origin service spend-tracking fix flushing calllog from.... In general, tasks at the top of a day's distribution share a shape: tightly-scoped specifications, clear success criteria, and minimal integration ambiguity. The AI doesn't need to discover anything new; it executes against an explicit target.

The supervisory leverage figure (120.0x today) tracks something orthogonal to wall-clock leverage. It's the ratio of human-equivalent output to human prompt-writing time. It stays high even on lower-leverage days because supervisory minutes scale with task count, not with the human-hour estimate; a 20-minute task and a 4-hour task can both be specified in two minutes of human prompt-writing.

With its single task, the day produced roughly 0.1 weeks of senior-engineer-equivalent throughput in 0.4 hours of model wall-clock. That ratio is the practical answer to the question of how much output a single operator can move per day when the model handles the execution and the operator handles the direction.

Leverage Record: May 23, 2026

Sat, 23 May 2026 23:59:00 GMT

One task. May 23, 2026 weighted to 33.6x leverage across 28.0 human-equivalent hours in 50 Claude-minutes. Supervisory leverage closed at 336.0x.

0.7 weeks of human-equivalent throughput in 0.8 hours of Claude wall-clock. With only one task logged, the ceiling and the floor are the same 33.6x record: Math content shapes for an origin service synthesis: three new content shapes (symbolic problems, modeling problems) + math tribunal verdict schema. libs/an origin runtime library....

Task Log

#	Task	Human Est.	Claude	Sup.	Factor	Sup. Factor
1	Math content shapes for an origin service synthesis: three new content shapes (symbolic problems, modeling problems) + math tribunal verdict schema. libs/an origin runtime library + services/an origin service, 1653 LOC across 16 files	28.0h	50m	5m	33.6x	336.0x

Aggregate Statistics

Metric	Value
Total tasks	1
Total human-equivalent hours	28.0
Total Claude minutes	50
Total supervisory minutes	5
Total tokens	350,000
Weighted average leverage factor	33.6x
Weighted average supervisory leverage factor	336.0x
Human-equivalent weeks	0.7

Analysis

With a single task there is no distribution to read: the ceiling and the floor are both the 33.6x record for Math content shapes for an origin service synthesis: three new content shapes (symbolic problems, modeling problems) + math tribunal verdict schema. libs/an ori.... In general, tasks at the top of a day's distribution share a shape: tightly-scoped specifications, clear success criteria, and minimal integration ambiguity. The AI doesn't need to discover anything new; it executes against an explicit target.

The supervisory leverage figure (336.0x today) tracks something orthogonal to wall-clock leverage. It's the ratio of human-equivalent output to human prompt-writing time. It stays high even on lower-leverage days because supervisory minutes scale with task count, not with the human-hour estimate; a 20-minute task and a 4-hour task can both be specified in two minutes of human prompt-writing.

With its single task, the day produced roughly 0.7 weeks of senior-engineer-equivalent throughput in 0.8 hours of model wall-clock. That ratio is the practical answer to the question of how much output a single operator can move per day when the model handles the execution and the operator handles the direction.

Leverage Record: May 22, 2026

Fri, 22 May 2026 23:59:00 GMT

22 tasks. May 22, 2026 weighted to 27.3x leverage across 425.2 human-equivalent hours in 935 Claude-minutes. Supervisory leverage closed at 447.6x.

10.6 weeks of human-equivalent throughput in 15.6 hours of Claude wall-clock. The 161.5x ceiling came from Full an inference engine accessibility audit (50 repos, deterministic Phase 0 + 4 parallel LLM agents, ~288 findings) followed by full compliance audit (12 sections, 4 parallel age...; the 2.5x floor sat at a marketing site PMP card+banner: set available_at, update category/course-page templates to render Available May 25th.

Task Log

#	Task	Human Est.	Claude	Sup.	Factor	Sup. Factor
1	Full an inference engine accessibility audit (50 repos, deterministic Phase 0 + 4 parallel LLM agents, ~288 findings) followed by full compliance audit (12 sections, 4 parallel agents, 1 CRITICAL + 5 HIGH gaps, consolidated SOC 2/GDPR/CCPA report)	70.0h	26m	2m	161.5x	2100.0x
2	Accessibility zero-disruption HIGH sweep: 4 parallel agents fixed ~135 HIGH findings across 30+ repos — Phase 0 went from 60 to 0 verified by deterministic checker; 488 scope=col + 15 aria-modal added across 21 tools; canvas/SVG/input aria-label additions; outline:none replacements; aria-grabbed deprecated to...	80.0h	35m	1m	137.1x	4800.0x
3	Full an inference engine readiness audit: Phase 0 canonical + 4 parallel agents across 60 repos (core+services, clients+libs, 21 tools, docs+sites+infra), consolidated report at audit-report-2026-05-22.md with 10 HIGH + 13 MEDIUM (4 self-fixed in-flight) + 10 LOW; 9,157+ tests verified green	40.0h	25m	1m	96.0x	2400.0x
4	Readiness audit rerun: 4 parallel agents verified today HIGH fixes landed clean (admin, electron, infra) + audited 42 previously-uncovered repos; consolidated to audit-report-2026-05-22-rerun.md with 2 new systemic findings (6/8 automation Lambdas + 9/10 study product sub-sites are local-only with no GitHub r...	12.0h	14m	1m	51.4x	720.0x
5	Readiness rerun3 + security audit (5 parallel agents): verified today HIGH fixes clean, agents auto-fixed 9 test failures + 1 real h1->h3 heading-skip a11y bug, surfaced 1 CRITICAL (ElevenLabs key) + 3 HIGH (RDS 3306 open — user fixed; engine pipeline fail; IAM wildcards), 6 inline fixes shipped across an API...	35.0h	50m	3m	42.0x	700.0x
6	4 parallel readiness remediation agents: pushed 6 automation Lambdas + an origin service to GitHub, cleaned an infrastructure repo (6 commits — CLAUDE.md, lock files, plan.bin removal, 5 new marketing stacks, tfvars examples), registered an origin service port 8005 + real CodeBuild buildspec, fixed a web clie...	14.0h	22m	1m	38.2x	840.0x
7	Playwright a payment processor-free subscription lifecycle e2e: incomplete→invoice.paid→active w/ entitlement granted, cancel-at-period-end + reactivate, immediate cancel revokes entitlement, service-token gate. Uses test-ops /subscriptions + /webhooks/a payment processor/simulate; full live suite now 10/10 g...	10.0h	18m	1m	33.3x	600.0x
8	Security audit HIGH fix: revoked 0.0.0.0/0 + ::/0 tcp/3306 ingress on prod-ascloud-rds-sg (sg-07e500306cd69710e) — Aurora MySQL no longer reachable from public internet; verified internal app/admin paths still intact via VPC CIDR 10.10.0.0/16 + admin IP 66.182.197.254/32 + self-reference	1.0h	2m	1m	30.0x	60.0x
9	Playwright live-stack e2e suite for AuthModal + enrollment: register-verify-signin, signin happy path, forgot-reset-signin, dup-email error, enrollment + DB-verify, unverified-blocked. Captures emails via a notification service log API, verifies DB user records via an API gateway test-ops, uses Gmail+UUID ali...	16.0h	34m	2m	28.2x	480.0x
10	Phase 5 origin-extraction wiring: discover stub-runner gap, build runtime-to-service DomainSpecification adapter, real synthesis runner + three math content runners (workedexamples, misconceptions, representationpacks), env-gated registration to keep tests green, start an origin service with an inference en...	16.0h	35m	3m	27.4x	320.0x
11	Consolidate auth+purchase under an API gateway gateway and build in-modal auth UI (sign in, register, forgot/reset, MFA TOTP, verify email, Apple/Google social) replacing the hosted OIDC SPA; strip 12 legacy env vars and 14 per-service gateway argument call sites	18.0h	42m	5m	25.7x	216.0x
12	domain-difficulty-factor engine work: 4 decoy fixes (headless default, composite circuit-breaker, maxdays terminal event, catalog status from spec) + foundation-phase + alpha-saturation tuning landed in autopilotranker/orchestrator/autopilot_service; PMP+CAPM spec/manifest patches; boot cache rebuild ×2; en...	32.0h	90m	6m	21.3x	320.0x
13	GAP-06 fix: per-email rate limit on /forgot-password (Redis ZSET sliding window) and per-IP rate limit on /reset-password in an authentication service, with regression tests; 459 tests pass	4.0h	12m	1m	20.0x	240.0x
14	post-PMP-fleet morning session: AZ-500 root-cause (snapshot serializer dropped goalweights/goalsimilarity for entire v3 schema lifetime; engine fell into legacy 0.85 clamp); fixed serialize+deserialize+tensor-dispatch + bumped schema to v4; 6 new round-trip unit tests; audit guardrails for difficulty-on-spe...	24.0h	90m	8m	16.0x	180.0x
15	decoy daily proficiency snapshots — DailySnapshot model, alembic migration, dialect-agnostic upsert in workerpool, EOD autopilot fetch + daycompleted payload expansion in headlessrunner, GET /students/{id}/proficiencyseries endpoint, 9 new tests across workerpool/studentmanager/api	8.0h	30m	1m	16.0x	480.0x
16	Release-test a web client stack: fixed purchase-route DB binding (9 files hitting wrong DB) + rewrote purchase JWT verifier to use local public key (self-JWKS deadlock under single-worker uvicorn); verified end-to-end register/login/entitlements/subscriptions/plans	3.0h	14m	3m	12.9x	60.0x
17	Cloud-wide regression sweep: 44 students across AWS/Azure/GCP/PMP (3-batch parallel via decoy CLI). 43/44 passed; mean predicted 89.8%, mean actual 99.7%, mean gap +9.9pt. PGWA flagged with same empty-goal_weights bug as AZ-500.	18.0h	90m	4m	12.0x	270.0x
18	Compliance L1 (admin-service role check), L2 (audit-log profile updates), M16 (Dependabot for an inference engine + a notification service); 888 tests pass across an authentication service + admin-service; readiness audit dispatched (4 parallel agents); accessibility remediation plan (7 waves, 14-17 eng-days)	6.0h	35m	2m	10.3x	180.0x
19	Readiness blockers H5 (self-assign bug), H6 (eslint-plugin-react-hooks load + Sparkline conditional useEffect fix), H9 (commit infra VPC doc comments); ESLint 13 errors -> 0 errors across an admin client + a desktop client, 3 commits pushed	3.0h	18m	1m	10.0x	180.0x
20	Autonomous blueprint-anchor diagnosis + content-aware re-anchor script (139 domains fixed); full 47-profile confirmation sweep (43/44 passed); PGWA deep-dive identified borderline 74.7% reserved-pool accuracy with weak-goal-biased practice exam as root cause beyond blueprint fix.	12.0h	180m	3m	4.0x	240.0x
21	a marketing site: PMP nav entry with Coming Monday badge, catalog search box with JSON index + JS filter, PMI June dates, refactor templates, build+deploy staging+prod	2.5h	55m	4m	2.7x	37.5x
22	a marketing site PMP card+banner: set available_at, update category/course-page templates to render Available May 25th	0.8h	18m	3m	2.5x	15.0x

Aggregate Statistics

Metric	Value
Total tasks	22
Total human-equivalent hours	425.2
Total Claude minutes	935
Total supervisory minutes	57
Total tokens	8,145,000
Weighted average leverage factor	27.3x
Weighted average supervisory leverage factor	447.6x
Human-equivalent weeks	10.6
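
One small wrinkle when checking this table by hand: the Human Est. column as displayed sums to 425.3, not 425.2. The sketch below shows that the gap is consistent with task 22's 0.8h being a rounded 0.75h (45 minutes), which is also what its 2.5x and 15.0x factors imply. The 0.75h value is an inference from the displayed numbers, not something the post states.

```python
# Human Est. column (hours) from the May 22 Task Log, as displayed.
DISPLAYED_HOURS = [
    70.0, 80.0, 40.0, 12.0, 35.0, 14.0, 10.0, 1.0, 16.0, 16.0, 18.0,
    32.0, 4.0, 24.0, 8.0, 3.0, 18.0, 6.0, 3.0, 12.0, 2.5, 0.8,
]

# Assumption: the last task's 0.8h is 45 minutes rounded for display.
ASSUMED_LAST_TASK_HOURS = 0.75

displayed_total = round(sum(DISPLAYED_HOURS), 1)
assumed_total = sum(DISPLAYED_HOURS[:-1]) + ASSUMED_LAST_TASK_HOURS

# Task 22 took 18 Claude-minutes and 3 supervisory minutes.
task_22_factor = ASSUMED_LAST_TASK_HOURS * 60 / 18
task_22_sup_factor = ASSUMED_LAST_TASK_HOURS * 60 / 3

# Day-level factor against the 935 Claude-minutes in the table.
day_factor = round(assumed_total * 60 / 935, 1)

print(displayed_total, assumed_total)      # 425.3 425.25
print(task_22_factor, task_22_sup_factor)  # 2.5 15.0
print(day_factor)                          # 27.3
```

With 0.75h the total is 425.25, which would display as 425.2 under truncation or round-half-to-even, and every other figure in the table is unchanged.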

Analysis

The day's leverage distribution matters more than the headline figure. The 161.5x ceiling came from Full an inference engine accessibility audit (50 repos, deterministic Phase 0 + 4 parallel LLM agents, ~288 findings) followed by full compliance audit (12 sect...; the 2.5x floor was a marketing site PMP card+banner: set available_at, update category/course-page templates to render Available May 25th. Tasks at the top of the distribution share a shape: tightly-scoped specifications, clear success criteria, and minimal integration ambiguity. The AI doesn't need to discover anything new; it executes against an explicit target.

The supervisory leverage figure (447.6x today) tracks something orthogonal to wall-clock leverage. It's the ratio of human-equivalent output to human prompt-writing time. It stays high even on lower-leverage days because supervisory minutes scale with task count, not with the human-hour estimate; a 20-minute task and a 4-hour task can both be specified in two minutes of human prompt-writing.

Across the 22 tasks, the day produced roughly 10.6 weeks of senior-engineer-equivalent throughput in 15.6 hours of model wall-clock. That ratio is the practical answer to the question of how much output a single operator can move per day when the model handles the execution and the operator handles the direction.

Leverage Record: May 21, 2026

Thu, 21 May 2026 23:59:00 GMT

Three tasks. May 21, 2026 weighted to 36.3x leverage across 69.0 human-equivalent hours in 114 Claude-minutes. Supervisory leverage closed at 318.5x.

1.7 weeks of human-equivalent throughput in 1.9 hours of Claude wall-clock. The 55.4x ceiling came from Math Content Rollout Phases 0-4: v2 pipeline verification, AP Precalc spec fixes (61->69 leaves, CED practices, broken topicsandobjectives), math content schemas (workedexample/...; the 8.6x floor sat at an API gateway native-mode wiring (bcrypt pin, settings hardening, certs->certifications fix, native entitlement path, commit-on-exit deps, eventtype kwarg drift, secure-cookie to....

Task Log

#	Task	Human Est.	Claude	Sup.	Factor	Sup. Factor
1	Math Content Rollout Phases 0-4: v2 pipeline verification, AP Precalc spec fixes (61->69 leaves, CED practices, broken topicsandobjectives), math content schemas (workedexample/misconception/representationpack pydantic + 3 LLM generators), 2 hand-curated static files (52 formulas + 17 function families),...	60.0h	65m	6m	55.4x	600.0x
2	AP Precalc spec audit + math content rollout plan (spec issue identification, activity catalog inventory, 8-phase plan covering spec fixes, math-specific content shapes, 5 new Tier A activities, v2 atom synthesis, full math family rollout)	4.0h	14m	3m	17.1x	80.0x
3	an API gateway native-mode wiring (bcrypt pin, settings hardening, certs->certifications fix, native entitlement path, commit-on-exit deps, eventtype kwarg drift, secure-cookie toggle) + seedtestuser.py + PlaywrightDriver.primeauthsession + JourneyOrchestrator.seedauthenticatedsession	5.0h	35m	4m	8.6x	75.0x

Aggregate Statistics

Metric	Value
Total tasks	3
Total human-equivalent hours	69.0
Total Claude minutes	114
Total supervisory minutes	13
Total tokens	360,500
Weighted average leverage factor	36.3x
Weighted average supervisory leverage factor	318.5x
Human-equivalent weeks	1.7
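
The "weighted" in weighted average matters on a day like this, where one large task dominates. The sketch below compares a plain mean of the Factor column with a time-weighted figure. The weighting formula (total human minutes over total Claude minutes) is an assumption that happens to reproduce the 36.3x in the table.

```python
# (human_est_hours, claude_minutes) for the three May 21 tasks
TASKS_MAY_21 = [(60.0, 65), (4.0, 14), (5.0, 35)]

# Per-task factor: human-equivalent minutes per Claude-minute.
per_task = [h * 60 / c for h, c in TASKS_MAY_21]

# Plain mean treats every task equally, regardless of size.
simple_mean = sum(per_task) / len(per_task)

# Assumed weighted form: pool the minutes first, then divide.
weighted = sum(h for h, _ in TASKS_MAY_21) * 60 / sum(c for _, c in TASKS_MAY_21)

print([round(f, 1) for f in per_task])  # [55.4, 17.1, 8.6]
print(round(simple_mean, 1))            # 27.0
print(round(weighted, 1))               # 36.3
```

The 60-hour task pulls the pooled figure well above the plain mean because it accounts for most of the day's human-equivalent hours.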

Analysis

The day's leverage distribution matters more than the headline figure. The 55.4x ceiling came from Math Content Rollout Phases 0-4: v2 pipeline verification, AP Precalc spec fixes (61->69 leaves, CED practices, broken topicsandobjectives), math content sche...; the 8.6x floor was an API gateway native-mode wiring (bcrypt pin, settings hardening, certs->certifications fix, native entitlement path, commit-on-exit deps, event_type kwarg dri.... Tasks at the top of the distribution share a shape: tightly-scoped specifications, clear success criteria, and minimal integration ambiguity. The AI doesn't need to discover anything new; it executes against an explicit target.

The supervisory leverage figure (318.5x today) tracks something orthogonal to wall-clock leverage. It's the ratio of human-equivalent output to human prompt-writing time. It stays high even on lower-leverage days because supervisory minutes scale with task count, not with the human-hour estimate; a 20-minute task and a 4-hour task can both be specified in two minutes of human prompt-writing.

Across the 3 tasks, the day produced roughly 1.7 weeks of senior-engineer-equivalent throughput in 1.9 hours of model wall-clock. That ratio is the practical answer to the question of how much output a single operator can move per day when the model handles the execution and the operator handles the direction.

Leverage Record: May 20, 2026

Wed, 20 May 2026 23:59:00 GMT

11 tasks. May 20, 2026 weighted to 54.5x leverage across 550.0 human-equivalent hours in 605 Claude-minutes. Supervisory leverage closed at 1269.2x.

13.8 weeks of human-equivalent throughput in 10.1 hours of Claude wall-clock. The 202.1x ceiling came from a knowledge graph Phases 4-31 complete — REST route table + 20 Act-II inventions (heartbeat, lens, focus, predictor, capture, topography, resonance, prefetch, flame, oscilloscope,...; the 14.4x floor sat at Full an inference engine content audit + generate v2 lesson atoms for AWS Solutions Architect Pro (893/894 atoms, diagnosed and fixed max_tokens truncation bug).

Task Log

#	Task	Human Est.	Claude	Sup.	Factor	Sup. Factor
1	a knowledge graph Phases 4-31 complete — REST route table + 20 Act-II inventions (heartbeat, lens, focus, predictor, capture, topography, resonance, prefetch, flame, oscilloscope, commitment, sentinel, topology, SLO tattoo, genome, spatial briefing, a CMS publisher, audience mirror, war room, ticker) + Act-II...	320.0h	95m	1m	202.1x	19200.0x
2	Fleet round 4: a newsletter platform multi-tenant newsletter ownership + migration, an accounting tool cash flow investing/financing wiring + invoice taxcode per-line calculation, a relationship CRM networkgraph_task end-to-end persistence + health endpoint fix, an analytics platform scheduled report dispat...	42.0h	75m	2m	33.6x	1260.0x
3	Fleet round 5: a knowledge base wiki-link resolution to real pages in same space, an infrastructure tool tag-keys-canonical custom in-process evaluator registry, a marketing platform lead scoring service writing to Contact.lead_score	30.0h	55m	2m	32.7x	900.0x
4	Fleet round 7: a CMS FR-013 content export ZIP endpoint, full doc-vs-implementation alignment audit across 21 backend tools (an observability platform/a defect tracker/an AI tool/an email platform/a portfolio browser/a gateway/a metrics tracker/a CMS/a calendar platform/a relationship CRM/a monitoring tool/a...	20.0h	40m	2m	30.0x	600.0x
5	Fleet round 3: an observability platform SLO budget action dispatch, a calendar platform in-process scheduler, a defect tracker activity WS broadcast, a marketing platform campaign step PATCH + EmailEditor save wiring, an audio tool @mention email dispatch	40.0h	80m	2m	30.0x	1200.0x
6	Fleet round 6: a task tracker FR-SHARE-020 notification service worker, an audio tool FR §3.12 built-in slash commands (/me /shrug /status /away /dnd /topic /archive /leave /remind)	16.0h	35m	2m	27.4x	480.0x
7	Fleet round 5c: an infrastructure tool governance.enforcementpolicies + .enforcementviolations ops + frontend rewire, an infrastructure tool expires-on-not-passed + expires-on-required-in-dev custom evaluators, an infrastructure tool IpSpacePage and AdvisorPage stale TODOs cleared	18.0h	40m	2m	27.0x	540.0x
8	Fleet feature implementation round 2: a marketing platform landingpage prompt builder, a calendar platform event attachments end-to-end, an observability platform alert.firing -> a notification service dispatch with publishafter_commit, a defect tracker @mention notifications	28.0h	65m	2m	25.9x	840.0x
9	Fleet round 5b: a CMS frontmatter TODO cleared, a marketing platform site list/settings frontend error display, an infrastructure tool StackDetailPage costs.by_stack wiring	12.0h	35m	2m	20.6x	360.0x
10	Phase 1 recommender starvation fix (lesson-first + goal-scoped saturation + weakgoalids surfacing) + SAP-C02 baseline scenario family (vacation, recert, convoy) + lessons-learned doc + validation sweep runner. 5 new commits (2 engine, 3 decoy). 9 new regression tests; full 5999-test suite green.	18.0h	60m	4m	18.0x	270.0x
11	Full an inference engine content audit + generate v2 lesson atoms for AWS Solutions Architect Pro (893/894 atoms, diagnosed and fixed max_tokens truncation bug)	6.0h	25m	5m	14.4x	72.0x

Aggregate Statistics

Metric	Value
Total tasks	11
Total human-equivalent hours	550.0
Total Claude minutes	605
Total supervisory minutes	26
Total tokens	3,700,000
Weighted average leverage factor	54.5x
Weighted average supervisory leverage factor	1269.2x
Human-equivalent weeks	13.8

Analysis

The day's leverage distribution matters more than the headline figure. The 202.1x ceiling came from a knowledge graph Phases 4-31 complete — REST route table + 20 Act-II inventions (heartbeat, lens, focus, predictor, capture, topography, resonance, prefetch, f...; the 14.4x floor was Full an inference engine content audit + generate v2 lesson atoms for AWS Solutions Architect Pro (893/894 atoms, diagnosed and fixed max_tokens truncation bug). Tasks at the top of the distribution share a shape: tightly-scoped specifications, clear success criteria, and minimal integration ambiguity. The AI doesn't need to discover anything new; it executes against an explicit target.

The supervisory leverage figure (1269.2x today) tracks something orthogonal to wall-clock leverage. It's the ratio of human-equivalent output to human prompt-writing time. It stays high even on lower-leverage days because supervisory minutes scale with task count, not with the human-hour estimate; a 20-minute task and a 4-hour task can both be specified in two minutes of human prompt-writing.

Across the 11 tasks, the day produced roughly 13.8 weeks of senior-engineer-equivalent throughput in 10.1 hours of model wall-clock. That ratio is the practical answer to the question of how much output a single operator can move per day when the model handles the execution and the operator handles the direction.

Leverage Record: May 19, 2026

Tue, 19 May 2026 23:59:00 GMT

Seven tasks. May 19, 2026 weighted to 47.1x leverage across 182.0 human-equivalent hours in 232 Claude-minutes. Supervisory leverage closed at 574.7x.

4.5 weeks of human-equivalent throughput in 3.9 hours of Claude wall-clock. The 166.2x ceiling came from a knowledge graph Act I Phase 0 — Python orchestrator daemon (IPC, Opus agent, MCP bus, briefing, diagnostics, test) + Swift Command Bar app (NSPanel, Carbon hotkey, NWConnection I...; the 4.8x floor sat at Resume autopilot cascade: diagnose+fix start_student commit-order bug, fix runs.json parallel-sweep race, fix 5 pre-existing tests, run Azure+AWS+GCP+retry sweeps; final 35/39 clou....

Task Log

#	Task	Human Est.	Claude	Sup.	Factor	Sup. Factor
1	a knowledge graph Act I Phase 0 — Python orchestrator daemon (IPC, Opus agent, MCP bus, briefing, diagnostics, test) + Swift Command Bar app (NSPanel, Carbon hotkey, NWConnection IPC client, view-model, design tokens) + build scripts + Launch Agent plist; swift build green, pytest green	36.0h	13m	1m	166.2x	2160.0x
2	a knowledge graph design rewrite — Metal+Rive visual stack (§21), Fleet Integration Matrix (§22), 32-phase plan (foundation + one feature per phase), 20 invented features mapped to phases and persisted to innovation log	30.0h	14m	2m	128.6x	900.0x
3	a knowledge graph Act I Phase 2 — 21-peer fleet registry + httpx-probing MCP bus + Haiku/rule-based classifier + PermissionGuard with TTL + fast/slow/confirm router + 8 slash commands + IPC fleet. + confirm. + Mac settings window with fleet panel + inline confirmation card + route breadcrumb + ⌘⇧, hotkey; s...	36.0h	17m	1m	127.1x	2160.0x
4	a knowledge graph Act I Phase 3 — Visual Stack Foundation: AtlasRenderEnvironment singleton (Metal device + queue + library + DisplayLink + Rive factory + energy monitor + FramePacer), MetalLayerView NSViewRepresentable, AtlasRenderer protocol, 7 .metal shader sources + compute kernels, 7 Swift pass wrappers...	32.0h	16m	1m	120.0x	1920.0x
5	a knowledge graph Act I Phase 1 — Ledger & Self-Instrumentation: SQLite statedb (0600 mode, WAL, commandhistory/costrecords/settings), OTel ledger emitter with OTLP to an observability platform, CostAccountant with daily cap + Opus->Haiku hard-cap fallback, agent instrumentation, IPC ledger.listtoday + co...	24.0h	14m	1m	102.9x	1440.0x
6	an analytics platform audit + Statcounter feature gap analysis + remediation plan (6 phases)	12.0h	8m	3m	90.0x	240.0x
7	Resume autopilot cascade: diagnose+fix start_student commit-order bug, fix runs.json parallel-sweep race, fix 5 pre-existing tests, run Azure+AWS+GCP+retry sweeps; final 35/39 cloud certs passed (AWS 13/13, GCP 9/11, Azure 13/15) up from 16/38 baseline	12.0h	150m	10m	4.8x	72.0x

Aggregate Statistics

Metric	Value
Total tasks	7
Total human-equivalent hours	182.0
Total Claude minutes	232
Total supervisory minutes	19
Total tokens	1,433,000
Weighted average leverage factor	47.1x
Weighted average supervisory leverage factor	574.7x
Human-equivalent weeks	4.5

Analysis

The day's leverage distribution matters more than the headline figure. The 166.2x ceiling came from a knowledge graph Act I Phase 0 — Python orchestrator daemon (IPC, Opus agent, MCP bus, briefing, diagnostics, test) + Swift Command Bar app (NSPanel, Carbon ho...; the 4.8x floor was Resume autopilot cascade: diagnose+fix start_student commit-order bug, fix runs.json parallel-sweep race, fix 5 pre-existing tests, run Azure+AWS+GCP+retry swee.... Tasks at the top of the distribution share a shape: tightly-scoped specifications, clear success criteria, and minimal integration ambiguity. The AI doesn't need to discover anything new; it executes against an explicit target.

The supervisory leverage figure (574.7x today) tracks something orthogonal to wall-clock leverage. It's the ratio of human-equivalent output to human prompt-writing time. It stays high even on lower-leverage days because supervisory minutes scale with task count, not with the human-hour estimate; a 20-minute task and a 4-hour task can both be specified in two minutes of human prompt-writing.
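As a rough check on how the two figures relate, the sketch below recomputes both from the aggregate totals in this record (182.0 human-equivalent hours, 232 Claude minutes, 19 supervisory minutes). It assumes each weighted figure is total human-equivalent minutes divided by total minutes of the relevant kind. That formula is inferred from the published numbers rather than stated in the source, and the function name is illustrative.

```python
def leverage(human_hours: float, minutes: float) -> float:
    """Human-equivalent minutes divided by minutes actually spent."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return human_hours * 60 / minutes


# Aggregate totals from this record's statistics table.
HUMAN_HOURS = 182.0
CLAUDE_MINUTES = 232
SUPERVISORY_MINUTES = 19

# Assumed formula (not stated in the source): totals over totals.
wall_clock = leverage(HUMAN_HOURS, CLAUDE_MINUTES)
supervisory = leverage(HUMAN_HOURS, SUPERVISORY_MINUTES)

print(f"wall-clock leverage:  {wall_clock:.1f}x")
print(f"supervisory leverage: {supervisory:.1f}x")
```

Under that assumption the script reproduces the 47.1x and 574.7x figures reported for the day, which suggests the two metrics differ only in the denominator.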

Across the 7 tasks, the day produced roughly 4.5 weeks of senior-engineer-equivalent throughput in 3.9 hours of model wall-clock. That ratio is the practical answer to the question of how much output a single operator can move per day when the model handles the execution and the operator handles the direction.

Leverage Record: May 18, 2026

Mon, 18 May 2026 23:59:00 GMT

Five tasks. May 18, 2026 weighted to 30.4x leverage across 190.0 human-equivalent hours in 375 Claude-minutes. Supervisory leverage closed at 518.2x.

4.8 weeks of human-equivalent throughput in 6.2 hours of Claude wall-clock. The 120.0x ceiling came from "Review an admin client and author full Stitch prompt for Westworld Delos-themed WebGL/Rive redesign covering all 24 pages, design tokens, component vocabulary, motion language, aud..."; the 13.6x floor sat at "Docstring audit Phase 7 (Protocol contract enforcement): new audit script (scripts/auditprotocolcontracts.py, 857 LoC) with AST-based one-hop expansion through same-class helpers...".

Task Log

#	Task	Human Est.	Claude	Sup.	Factor	Sup. Factor
1	Review an admin client and author full Stitch prompt for Westworld Delos-themed WebGL/Rive redesign covering all 24 pages, design tokens, component vocabulary, motion language, audio design, and fidelity grading rubric	24.0h	12m	3m	120.0x	480.0x
2	Aperture V2 viewer Phases 1-3: Three.js stage layer (paper-grain + page-turn shaders, mastery candle, postprocessing), Rive Living Diagrams integration (validator update in engine), layer-registry slot system, Settings UI toggle, 19 vitest cases, Storybook themes/density stories	45.0h	35m	1m	77.1x	2700.0x
3	Resume + commit cleanup across engine/audits/domains, then scaffold Phase 0 Aperture V2 lesson viewer (4-layer V1↔V2 toggle, theme + motion + density registries, ApertureShell, AdaptiveDensityLayer idea #9, ViewerErrorBoundary, an analytics platform telemetry; ~950 LOC; vite build clean)	16.0h	28m	6m	34.3x	160.0x
4	an inference engine autopilot Fix A: coverage damping + hard ceiling on readiness. SOA-C02 baseline 36/73 KG goals at exampassed → 73/73 covered + passed; 36/38 cloud certs hit full per-goal coverage across AWS/GCP/Azure cascade. computedomainreadiness (helpers.py:712+) and computenext_actions (autopilot...	100.0h	278m	10m	21.6x	600.0x
5	Docstring audit Phase 7 (Protocol contract enforcement): new audit script (scripts/auditprotocolcontracts.py, 857 LoC) with AST-based one-hop expansion through same-class helpers AND field-attribute delegates; audited 12 raise contracts across 7 Protocol abc.py files against 7 canonical implementer classes;...	5.0h	22m	2m	13.6x	150.0x

Aggregate Statistics

Metric	Value
Total tasks	5
Total human-equivalent hours	190.0
Total Claude minutes	375
Total supervisory minutes	22
Total tokens	1,416,500
Weighted average leverage factor	30.4x
Weighted average supervisory leverage factor	518.2x
Human-equivalent weeks	4.8

Analysis

The day's leverage distribution matters more than the headline figure. The 120.0x ceiling came from "Review an admin client and author full Stitch prompt for Westworld Delos-themed WebGL/Rive redesign covering all 24 pages, design tokens, component vocabulary,..."; the 13.6x floor was "Docstring audit Phase 7 (Protocol contract enforcement): new audit script (scripts/auditprotocolcontracts.py, 857 LoC) with AST-based one-hop expansion throug...". Tasks at the top of the distribution share a shape: tightly scoped specifications, clear success criteria, and minimal integration ambiguity. The AI doesn't need to discover anything new; it executes against an explicit target.

The supervisory leverage figure (518.2x today) tracks something orthogonal to wall-clock leverage. It's the ratio of human-equivalent output to human prompt-writing time. It stays high even on lower-leverage days because supervisory minutes scale with task count, not with the human-hour estimate; a 20-minute task and a 4-hour task can both be specified in two minutes of human prompt-writing.

Across the 5 tasks, the day produced roughly 4.8 weeks of senior-engineer-equivalent throughput in 6.2 hours of model wall-clock. That ratio is the practical answer to the question of how much output a single operator can move per day when the model handles the execution and the operator handles the direction.
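The unit conversions behind that sentence can be sketched as follows. The 40-hour work week is an assumption inferred from the published figures (190.0 hours reported as 4.8 weeks), not something the source states, and the helper names are illustrative.

```python
HOURS_PER_WEEK = 40  # assumption: one human-equivalent week is 40 hours


def to_weeks(human_hours: float) -> float:
    """Convert human-equivalent hours to work weeks."""
    return human_hours / HOURS_PER_WEEK


def to_hours(minutes: float) -> float:
    """Convert model wall-clock minutes to hours."""
    return minutes / 60


weeks = to_weeks(190.0)     # total human-equivalent hours for May 18
wall_clock = to_hours(375)  # total Claude minutes for May 18

print(weeks, wall_clock)
```

This prints 4.75 and 6.25. Both values sit exactly on a rounding boundary at one decimal place, so the published 4.8 weeks and 6.2 hours depend on the rounding rule the record generator applies.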

Leverage Record: May 17, 2026

Sun, 17 May 2026 23:59:00 GMT

17 tasks. May 17, 2026 weighted to 10.8x leverage across 309.0 human-equivalent hours in 1723 Claude-minutes. Supervisory leverage closed at 228.9x.

7.7 weeks of human-equivalent throughput in 28.7 hours of Claude wall-clock. The 96.0x ceiling came from "Origin-extract Phase 3 — populate services/an origin service with synthesis code, merged backend, /jobs API + structlog observability, aoctl CLI, and relocated test surface (522 pa..."; the 1.0x floor sat at "Decoy zero-sweep on reclassified cloud cert packages: engine restart, fixed autopilot_service NameError (missing import os), ran sweep, 2 real terminals (AZ-120 crossed 0.5 readine...".

Task Log

#	Task	Human Est.	Claude	Sup.	Factor	Sup. Factor
1	Origin-extract Phase 3 — populate services/an origin service with synthesis code, merged backend, /jobs API + structlog observability, aoctl CLI, and relocated test surface (522 passing service-wide, was 7). 10 commits.	80.0h	50m	5m	96.0x	960.0x
2	Audit other Claude's outstanding-work report against an inference engine engine codebase; corrected stale claims and re-estimated effort	8.0h	11m	6m	43.6x	80.0x
3	Cloud deployment plan for an origin service: distilled Phases 5-7 (SQS+Fargate+Bedrock wiring, frontend refactor, deploy+cutover) + Phase 8 hygiene into a single 194-line plan doc with Mermaid flow diagram, decision matrix, open questions	3.0h	5m	1m	36.0x	180.0x
4	Persistence audit follow-through: all 4 fixes shipped. (1) DeltaReplicationPublisher fail-loud in cloud profile. (2) HIGH-severity in-flight exam persistence — Alembic 007activeexams + ActiveExamRow + ActiveExamRepository + write-through on createexam + cache-miss fallback on submitexam + boot-time hydrat...	24.0h	55m	1m	26.2x	1440.0x
5	Refresh patent valuations and content counts across 25 a planning repo docs (business, marketing, research, README, CHANGELOG); rebuild patent-portfolio valuation framework ($60-230M floor); scrub Android-via-PWA and late-July language from funding plans and JDs to reflect native iOS+Android both launching Ju...	14.0h	40m	4m	21.0x	210.0x
6	Origin-extract Phase 4: delete src/an inference engine/origin + dying preflight subdirs + origin_router + 100+ scripts; slim OriginConfig; collapse regression guard; ratchet coverage 81→82; recover 7 over-deleted test files; finalize docs across CLAUDE.md, CHANGELOG.md, plan doc — 4 commits, ~73k LOC deletion...	16.0h	47m	1m	20.4x	960.0x
7	Docstring audit Phase 3 (DOCOVERSELLS rewrite): F1 fix in adminevents.py (module + livesessionpayload docstrings) for asymmetric user fallback (username->entityid; useremail->""); audit re-run verified doc-likely 1->0; PHASESTART + DOCREWRITE + PHASE_END log entries; resolution arc CLOSED	3.0h	10m	1m	18.0x	180.0x
8	Deterministic docstring-vs-code audit for engine: AST-driven scripts/auditdocstrings.py with 12 categories (structural + intent-vs-impl), per-finding likelytruth heuristic (fix doc / fix code / review). 65 findings across 292 files / 3012 symbols. Surfaced reloaddomains dedupe-vs-raise pattern + 5 durable...	12.0h	40m	2m	18.0x	360.0x
9	Docstring audit Phase 2 (FP bookkeeping): added EXCLUDEDFINDINGS set + AuditReport.addfinding() to scripts/auditdocstrings.py with 28 exact-tuple exclusions (file, line, symbol, category) retiring the 29 FALSEPOSITIVE dispositions from Phase 1. F26+F27 collapse to one tuple. Doc-likely count drops 30->1 (...	3.0h	10m	1m	18.0x	180.0x
10	CI hardening (fixed silently-dead nightly leak gate in engine nightly.yml — wrong import path; dropped continue-on-error from memray steps; mirrored nightly to an origin service with 500MB import baseline) + full persistence audit across engine+service. Found 4 issues, 1 HIGH (in-flight exams not persisted; e...	6.0h	20m	1m	18.0x	360.0x
11	Origin-extract Phases 6+8: an origin client frontend retargeted at an origin service via VITEORIGINAPIURL/VITEORIGINWSURL; swapped local 300-LOC bug-reporter for @an inference engine/bug-reporter on new a defect tracker Origin board; updated CLAUDE.md/README/CHANGELOG. Phase 8: contract-changes.md entry...	8.0h	35m	1m	13.7x	480.0x
12	an inference engine: fix domain reload manifold dupe (if_exists policy) + 8 unit tests + endpoint regression test; live-validated by reloading 38 AWS/GCP/Azure cert packages into running engine	5.0h	25m	6m	12.0x	50.0x
13	Round content metrics to nnn,nnn+ notation, fix per-domain cost from $0.17 to ~$20 end-to-end (Mercury 2 + question bank + adversarial + tribunal + lessons + scenarios + labs), strip Apple Vision Pro from all marketing/business docs (no plans to ship), bump LaTeX template needspace values to keep section head...	4.0h	25m	3m	9.6x	80.0x
14	recall-tier regeneration sweep — 274 domains across cloud + non-cloud buckets, +36462 nodes, +72924 contrastive pairs, fresh audit shows 0 in-scope CRITICAL+HIGH remaining; also tribunal pass on 8 orphan-fix packages, S3 backup of 295 domains (6.94 GB), AZ-140 synthesis resume + embedder lifecycle fix, autopi...	80.0h	540m	12m	8.9x	400.0x
15	Decoy zero-sweep diagnosis: fixed current_day/elo DB sync + zombie 'running' reaper + content-density auditor, traced 365-day exam-plateau to 74% of goals lacking recall foundation	15.0h	210m	8m	4.3x	112.5x
16	Docstring audit Phase 1: deterministic 9-step disposition pass for 30 doc-likely findings (3 batches of 10), with verbatim docstring/code citations, call-site enumeration, and per-finding justification. Output: append-only disposition table (1716 lines, 30 finding rows + 1 correction) and append-only resoluti...	24.0h	360m	20m	4.0x	72.0x
17	Decoy zero-sweep on reclassified cloud cert packages: engine restart, fixed autopilot_service NameError (missing import os), ran sweep, 2 real terminals (AZ-120 crossed 0.5 readiness=0.509 day 44 confirming recall lift; ANS-C01 partial climb to 0.204 day 48). 13 profiles untested at user request to stop.	4.0h	240m	8m	1.0x	30.0x

Aggregate Statistics

Metric	Value
Total tasks	17
Total human-equivalent hours	309.0
Total Claude minutes	1723
Total supervisory minutes	81
Total tokens	10,907,000
Weighted average leverage factor	10.8x
Weighted average supervisory leverage factor	228.9x
Human-equivalent weeks	7.7

Analysis

The day's leverage distribution matters more than the headline figure. The 96.0x ceiling came from "Origin-extract Phase 3 — populate services/an origin service with synthesis code, merged backend, /jobs API + structlog observability, aoctl CLI, and relocated..."; the 1.0x floor was "Decoy zero-sweep on reclassified cloud cert packages: engine restart, fixed autopilot_service NameError (missing import os), ran sweep, 2 real terminals (AZ-120...". Tasks at the top of the distribution share a shape: tightly scoped specifications, clear success criteria, and minimal integration ambiguity. The AI doesn't need to discover anything new; it executes against an explicit target.
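To make the point about distribution concrete, the sketch below compares the time-weighted figure with a plain average of the per-task factors, using the (human hours, Claude minutes) pairs from the May 17 task log. It assumes the weighted figure is total human-equivalent minutes over total Claude minutes, which is inferred from the totals rather than stated in the source. Variable names are illustrative.

```python
# (human-equivalent hours, Claude minutes) for the 17 tasks in the May 17 log.
TASKS = [
    (80.0, 50), (8.0, 11), (3.0, 5), (24.0, 55), (14.0, 40), (16.0, 47),
    (3.0, 10), (12.0, 40), (3.0, 10), (6.0, 20), (8.0, 35), (5.0, 25),
    (4.0, 25), (80.0, 540), (15.0, 210), (24.0, 360), (4.0, 240),
]

total_hours = sum(hours for hours, _ in TASKS)
total_minutes = sum(minutes for _, minutes in TASKS)

# Time-weighted leverage: long tasks count in proportion to their duration.
weighted = total_hours * 60 / total_minutes

# Unweighted mean of per-task factors: every task counts equally.
factors = [hours * 60 / minutes for hours, minutes in TASKS]
unweighted = sum(factors) / len(factors)

print(f"weighted:   {weighted:.1f}x")
print(f"unweighted: {unweighted:.1f}x")
```

The weighted figure lands at 10.8x while the plain average is about 21.7x. The four tasks that ran 210 minutes or longer account for most of the Claude minutes and pull the headline down, even though most tasks individually scored 12x or better.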

The supervisory leverage figure (228.9x today) tracks something orthogonal to wall-clock leverage. It's the ratio of human-equivalent output to human prompt-writing time. It stays high even on lower-leverage days because supervisory minutes scale with task count, not with the human-hour estimate; a 20-minute task and a 4-hour task can both be specified in two minutes of human prompt-writing.

Across the 17 tasks, the day produced roughly 7.7 weeks of senior-engineer-equivalent throughput in 28.7 hours of model wall-clock. That ratio is the practical answer to the question of how much output a single operator can move per day when the model handles the execution and the operator handles the direction.

Leverage Record: May 16, 2026

Sat, 16 May 2026 23:59:00 GMT

38 tasks. May 16, 2026 weighted to 23.3x leverage across 393.5 human-equivalent hours in 1012 Claude-minutes. Supervisory leverage closed at 373.3x.

9.8 weeks of human-equivalent throughput in 16.9 hours of Claude wall-clock. The 57.8x ceiling came from "an Android client Phase 15 Wear OS companion: WatchPhase + WatchActivityMode + WatchAppState + WatchAppViewModel (HiltViewModel with SavedStateHandle + PhoneSync collection), Phone..."; the 4.4x floor sat at "Diagnosed + fixed stale engine domain-cache bug (engine in-memory pairs/KG drift from disk after resynth), added /api/v1/admin/domains/reload bulk endpoint, wired decoy zero-sweep...".

Task Log

#	Task	Human Est.	Claude	Sup.	Factor	Sup. Factor
1	an Android client Phase 15 Wear OS companion: WatchPhase + WatchActivityMode + WatchAppState + WatchAppViewModel (HiltViewModel with SavedStateHandle + PhoneSync collection), PhoneSyncClient over Wearable Data Layer (callbackFlow DataClient listener + decode pure helper), PhoneSyncModule, 5 screens (Welcome /...	26.0h	27m	1m	57.8x	1560.0x
2	an Android client Phase 11 five patent screens: 4 new EngineApi endpoints (governance/trajectory/cross-domain/scenario+submit) + 4 DTO files, PatentRepository, MockEngineDispatcher Contains match mode + 5 new fixtures, PatentScreenScaffold shared chrome, AnalyticsScreen (style axes + drift alerts + recommenda...	26.0h	28m	1m	55.7x	1560.0x
3	an Android client Phase 10 course mode + TTS: ElevenLabsTts (Media3 ExoPlayer wrapper with callbackFlow Player.Listener bridge), PlaybackUpdate, TtsCacheStore (SHA-256-keyed disk cache + resolve/enrollFile/clear/sizeBytes), VoiceModule, CourseViewModel (taxonomy → tree with depth-cap cycle short-circuit), bui...	22.0h	24m	1m	55.0x	1320.0x
4	an Android client Phase 9 active session: ActiveSessionViewModel (engine session lifecycle + wall-clock-anchored timing + DailyRingsStore mutation), ActiveSessionState sealed class, SessionHeader, ActiveSessionScreen with ActivityRouter, SessionResultsScreen with ELO delta tile, 6 activity composables (Contra...	28.0h	31m	1m	54.2x	1680.0x
5	an Android client Phase 13 competitive multiplayer: 2 new lobby endpoints + CompetitiveDto + CompetitiveRepository + 2 fixtures, ReconnectingEngineEventClient (exponential backoff 1/2/4/8/16s cap with ConnectionState StateFlow + healthy-reconnect counter reset), CompetitiveLobbyViewModel/Screen (create + join...	22.0h	25m	1m	52.8x	1320.0x
6	an Android client Phase 16 billing + i18n + finishing: Plus Jakarta Sans via Compose downloadable fonts + GoogleFont.Provider (5 weights, transparent SansSerif fallback), font_certs.xml documented stub, PlayBillingClient (suspending BillingClient wrapper + SharedFlow purchase updates + acknowledge auto-flow),...	24.0h	28m	1m	51.4x	1440.0x
7	an Android client Phase 12 Autopilot + WorkManager: AutopilotStore (encrypted prefs) + InMemoryAutopilotStore, NotificationChannels (autopilot.reminders + streak.milestones), AutopilotReminderScheduler (nextOccurrence pure helper + OneTimeWorkRequest sized delay), AutopilotReminderNotifier (Android 13+ permis...	22.0h	26m	1m	50.8x	1320.0x
8	an Android client Phase 14 knowledge cosmos: CosmosLayoutEngine in :domain (pure-Kotlin Fruchterman-Reingold with deterministic seed and 7 unit tests), LayoutNode/Edge/PositionedNode framework-free records, KnowledgeGraphDto + new EngineApi endpoint + KnowledgeGraphRepository + 9-node fixture, KnowledgeMapVie...	18.0h	22m	1m	49.1x	1080.0x
9	an Android client Phase 17 macrobenchmark + baseline profile: :macrobenchmark Gradle module (com.android.test + androidx.baselineprofile + self-instrumenting + variant gating), StartupBenchmark (cold + warm × None/Partial-BaselineProfileMode-Require/Full × 10 iterations targeting .benchmark variant), Baseline...	14.0h	18m	1m	46.7x	840.0x
10	Phase 6A: extract examservice from restgateway (createexam+submitexam+getstudyplan, 800 LOC removed, 22 new unit tests)	12.0h	23m	1m	31.3x	720.0x
11	Phase 7B: autopilotservice composite-path unit tests (computecompositereadiness aggregation + computecompositenextactions cluster-dedup + diversity guard)	5.0h	12m	0m	25.0x	6000.0x
12	Phase 7D: manifold + strategy gRPC servicer tests (fixed manifold.proto deprecated option, unblocked proto codegen, 14 new tests; api 75.3->79.3%, origin 78.2->80.5%)	5.0h	13m	0m	23.1x	3000.0x
13	Phase 6H: extract composite autopilot routes + cross-domain cluster helpers to autopilot_service (359 LOC, collocates the full autopilot brain in one service)	9.0h	24m	0m	22.5x	5400.0x
14	Phase 6F: extract insightsservice (computeinsights + cognitive-state classifier; 402 LOC out of rest_gateway, 16 new tests covering each card heuristic)	7.0h	19m	0m	22.1x	2100.0x
15	Phase 6C: extract questionservice (getnextpairmcq + getnextquestion) + generatemicrochallenge into autopilotservice (350 LOC, 21 new tests, fixes Phase 6B computenext_actions regression)	8.0h	22m	0m	21.8x	1920.0x
16	LLM-IT 8: controllerloop integration tests (3 tests covering constructor wiring + runsynthesis_stage + token usage rollup; $0.04/run)	4.0h	11m	0m	21.8x	2400.0x
17	an inference engine Phase 3 heavyweight extractions: deleteentity (127 LOC) + submitanswer (313 LOC) + submitquestionanswer (258 LOC) + assessreadiness (225 LOC) + getfingerprint (85 LOC) into sessionanswerservice + strategy_service. Includes ~100 new comprehensive unit tests covering every contract p...	18.0h	50m	2m	21.6x	540.0x
18	Phase 6B: extract submitactivitycredit + getcrossdomain_transfer into existing service modules (311 LOC, 12 new tests, 3 pre-existing tests updated)	6.0h	17m	0m	21.2x	720.0x
19	Phase 6I: extract catalog_service (catalog-projections + catalog-proficiency routes plus shared cache state + invalidation; 370 LOC)	6.0h	17m	0m	21.2x	1800.0x
20	an inference engine Phase 3 final heavyweight push: getdailystats + getentityreadinesshistory + getlesson + recordautopilotactivity + diagnoserootcause + createremediationsession (6 endpoints; ~750 LOC consolidated into strategyservice/lessonservice/autopilotservice/entityservice). ~80 new uni...	14.0h	40m	2m	21.0x	420.0x
21	Phase 7C: snapshotcache pure-logic unit tests (17 tests: msgpack coercion, SnapshotMeta round-trip, tensor markers, url resolution, loadsnapshot error paths)	3.0h	9m	0m	20.0x	3600.0x
22	an inference engine final autopilot brain extraction: getnextactionsinner (660 LOC) moved to autopilotservice.computenext_actions. Late-imports for 7 gateway-local helpers keep helpers + brain on separate sides without forcing helper migration. Audit-regression test updated to track the safety read at t...	6.0h	18m	2m	20.0x	180.0x
23	an inference engine Phase 5 ratchet + client update plan: bumped failunder 79->80 (actual 81.46%), wrote 200-line client-update-plan.md with endpoint-by-endpoint compatibility table, per-client impact assessment, behavior corrections (epsilon seeding, contenttype passthrough, exception ordering), pre-merge...	4.0h	12m	2m	20.0x	120.0x
24	LLM-IT 9: ValidationPipeline integration tests (3 tests covering 3-pass validation through real embedder+NLI+LLM; happy/empty/wrong-fragment paths)	3.0h	9m	0m	20.0x	3600.0x
25	LLM integration test harness: 17 tests across 5 origin modules (client, synthesizer, amplifier, validator tribunal, flashcard tribunal) with cost guard + auto-skip; first run cost $0.0255	12.0h	38m	2m	18.9x	360.0x
26	Origin extract Phase 2: 7 grouped commits cutting engine off an inference engine.origin. (LLM-client/embedder rewires in 9 files, composer relocation to an inference engine.runtime, PERSONALIZATION_ relocation to an inference engine.api.prompts, ScenarioConfig carve-off, AtomBundle/Collection lib path swaps...	8.0h	26m	1m	18.5x	480.0x
27	Phase 6G: move computedomainreadiness from restgateway to services/helpers (zero late-imports from services to restgateway anymore; 227 LOC, 5 new readiness-math tests)	4.0h	13m	0m	18.5x	2400.0x
28	an inference engine Phase 5 coverage backfill: 85 new tests across snapshotcache (msgpack default, tensor markers strip/restore, URL resolver, SnapshotPayload), scenarioseeds (normalizedifficulty, filter, tokens, coverage, grade keyword fallback, composecontext, buildscenarioresponse), computenextacti...	6.0h	20m	2m	18.0x	180.0x
29	Phase 6D: extract shared math+taxonomy helpers into services/helpers (eliminates late-import dance; 328 LOC out of restgateway, 25 new helper tests)	5.0h	17m	0m	17.6x	1200.0x
30	Phase 7A: catalogservice unit tests (15 tests covering cache helpers, projection bundle, invalidation, both routes; lifts catalogservice from 24% to ~95%)	4.0h	14m	0m	17.1x	2400.0x
31	Phase 7E: engine_context singleton + lab-index unit tests (6 tests; api 79.3->79.4%)	2.0h	7m	0m	17.1x	1200.0x
32	an inference engine Phase 5 final coverage backfill: 25 new tests for restgateway math helpers (poissonbinomialpassprobability, targetperquestionprobability inverse with round-trip verification, entityrollingcorrectnessrate, requiredobservationsper_node). Round-trip property test between forward +...	2.0h	8m	2m	15.0x	60.0x
33	Phase 6E: move 15 inline Pydantic models from rest_gateway to api/models.py (197 LOC, 0 regressions)	2.0h	9m	0m	13.3x	1200.0x
34	Origin extraction Phase 0: full inventory + dependency map + 9-phase plan + 3 new lib repos + new service repo with CLI/observability skeleton + 4 existing repos updated + 7 commits	14.0h	95m	15m	8.8x	56.0x
35	Audit-orphanfix batch complete: 9 fresh re-syntheses + 9 question banks landed at 100% graph∩pair overlap, VPR 0.87-0.98. Engine bug fix (regeneratenodes pair-orphan) verified end-to-end across all 9 packages. Monitored via 10-min cron with custom monitororphanfix.sh script that ran ~85 checks across 14h. A...	2.5h	20m	2m	7.5x	75.0x
36	Origin extract Phase 1: populate 3 new libs from an inference engine.origin (llm/embeddings/runtime types + schemas + parser + validator), full coverage suites, 197 tests green at ≥92% per lib, all 4 docs and commits per lib	9.0h	75m	3m	7.2x	180.0x
37	Created 4 new zero-sweep profiles, ran 9-domain a simulation harness calibration sweep, diagnosed portfolio-wide synthesis bug: contrastive pairs reference missing knowledge_graph nodes (33%-100% broken refs), starving engine readiness signal	3.0h	35m	4m	5.1x	45.0x
38	Diagnosed + fixed stale engine domain-cache bug (engine in-memory pairs/KG drift from disk after resynth), added /api/v1/admin/domains/reload bulk endpoint, wired decoy zero-sweep preflight to auto-reload, fixed PCA profile resolver bug, identified FinOps-for-AI content bug (2 recall nodes vs 200+ baseline),...	8.0h	110m	12m	4.4x	40.0x

Aggregate Statistics

Metric	Value
Total tasks	38
Total human-equivalent hours	393.5
Total Claude minutes	1012
Total supervisory minutes	63
Total tokens	5,552,000
Weighted average leverage factor	23.3x
Weighted average supervisory leverage factor	373.3x
Human-equivalent weeks	9.8

Analysis

The day's leverage distribution matters more than the headline figure. The 57.8x ceiling came from "an Android client Phase 15 Wear OS companion: WatchPhase + WatchActivityMode + WatchAppState + WatchAppViewModel (HiltViewModel with SavedStateHandle + PhoneSyn..."; the 4.4x floor was "Diagnosed + fixed stale engine domain-cache bug (engine in-memory pairs/KG drift from disk after resynth), added /api/v1/admin/domains/reload bulk endpoint, wir...". Tasks at the top of the distribution share a shape: tightly scoped specifications, clear success criteria, and minimal integration ambiguity. The AI doesn't need to discover anything new; it executes against an explicit target.

The supervisory leverage figure (373.3x today) tracks something orthogonal to wall-clock leverage. It's the ratio of human-equivalent output to human prompt-writing time. It stays high even on lower-leverage days because supervisory minutes scale with task count, not with the human-hour estimate; a 20-minute task and a 4-hour task can both be specified in two minutes of human prompt-writing.
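Several rows in the May 16 log show 0m of supervision next to a finite Sup. Factor. One reading, which is an assumption and not confirmed by the source, is that supervision is recorded at sub-minute precision and rounded for display. Under that reading the sketch below backs out the implied supervisory time from the human estimate and the published factor; the function name is hypothetical.

```python
def implied_supervisory_seconds(human_hours: float, sup_factor: float) -> float:
    """Invert sup_factor = human_minutes / supervisory_minutes."""
    if sup_factor <= 0:
        raise ValueError("sup_factor must be positive")
    supervisory_minutes = human_hours * 60 / sup_factor
    return supervisory_minutes * 60


# (task number, human hours, Sup. Factor) for three rows displayed as 0m.
ROWS = [(11, 5.0, 6000.0), (13, 9.0, 5400.0), (21, 3.0, 3600.0)]

for number, hours, factor in ROWS:
    seconds = implied_supervisory_seconds(hours, factor)
    print(f"task {number}: about {seconds:.0f} s of supervision")
```

For these rows the implied supervision comes out to a few seconds each (roughly 3 s, 6 s, and 3 s), which would round to the 0m shown in the table.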

Across the 38 tasks, the day produced roughly 9.8 weeks of senior-engineer-equivalent throughput in 16.9 hours of model wall-clock. That ratio is the practical answer to the question of how much output a single operator can move per day when the model handles the execution and the operator handles the direction.

Leverage Record: May 15, 2026

Fri, 15 May 2026 23:59:00 GMT

19 tasks. May 15, 2026 weighted to 21.1x leverage across 378.0 human-equivalent hours in 1075 Claude-minutes. Supervisory leverage closed at 238.7x.

9.4 weeks of human-equivalent throughput in 17.9 hours of Claude wall-clock. The 73.8x ceiling came from "an Android client repo skeleton: README, CLAUDE.md, and four parity docs (requirements, design, design-system, testing-strategy) translating the iOS Swift/SwiftUI client to Kotlin/..."; the 7.2x floor sat at "Recovered 5 misdirected re-synth packages (scripts/data/domains -> data/domains); diagnosed and fixed engine bug at loop.py:460 (prevalidate_nodes string-not-dict crash) mirrorin...".

Task Log

#	Task	Human Est.	Claude	Sup.	Factor	Sup. Factor
1	an Android client repo skeleton: README, CLAUDE.md, and four parity docs (requirements, design, design-system, testing-strategy) translating the iOS Swift/SwiftUI client to Kotlin/Compose/AppAuth/Wear OS	16.0h	13m	2m	73.8x	480.0x
2	an Android client Phase 3 data layer: EngineApi (single Retrofit interface, all endpoint groups), 7 DTO files, EngineClient facade with HttpException/SerializationException/IOException → EngineError mapping, EngineError sealed class, AuthInterceptor + TokenProvider, EngineEventClient (OkHttp WebSocket → Flow<...	30.0h	28m	1m	64.3x	1800.0x
3	an Android client Phase 8 dashboard + rings: DailyRingsState + RingTargets, DailyRingsRollover (pure-function rules with 7 test cases), DailyRingsStore (SharedPreferences-backed with StateFlow + recordAnswer/recordActivity/rolloverIfNeeded), DailyRingsModule, DailyRingResetWorker (HiltWorker periodic 1-day fl...	22.0h	25m	1m	52.8x	1320.0x
4	an Android client Phase 2 domain logic: LcsDiff (with iOS-bug fix), DeterministicShuffle (DJB2+Mulberry32+Fisher-Yates), BehavioralRingsComputation + RingArc + RingConstants + BehavioralRings, ProficiencyColor + TimedRecallTimer, Base64Url with PKCE helpers (RFC 7636 verified), full AppModels with kotlinx-ser...	14.0h	16m	1m	52.5x	840.0x
5	an Android client Phase 7 onboarding + initialization: OnboardingViewModel (3-step state machine with DeterministicShuffle-seeded calibration quiz, 8-question SAMPLE_BANK, SavedStateHandle restoration), OnboardingScreen (tier cards + progress-tracked quiz + completion), KnowledgeTier 5-tier enum, Initializati...	14.0h	16m	1m	52.5x	840.0x
6	an Android client Phase 6 catalog + exam info: CatalogViewModel (StateFlow combine + EngineError-to-message mapping), DomainCatalogScreen (adaptive LazyVerticalGrid 1/2/3 cols, badges, top app bar with refresh + sign-in/profile, loading/empty/error states), ExamInfoViewModel (SavedStateHandle for domainId), E...	18.0h	21m	1m	51.4x	1080.0x
7	an Android client Phase 4 authentication: TokenStore + EncryptedTokenStore (AES-256-GCM Keystore), PendingEnrollmentStore + Encrypted impl, PkceVerifierStore + Encrypted impl with 5-min TTL, OidcConfig, OidcAuthService (AppAuth Custom Tabs orchestration with suspend code-exchange), AuthResult sealed class, Au...	16.0h	19m	1m	50.5x	960.0x
8	an Android client Phase 5 app shell + state machine: 28-state Phase sealed interface (all @Parcelize), ActivityModeKey + WatchPhase, AppState, AppStateHolder (StateFlow Singleton), AppViewModel (HiltViewModel with SavedStateHandle restoration + auth bootstrap + startStudying decision + handleBackPressed), Pha...	14.0h	17m	1m	49.4x	840.0x
9	an Android client Phase 1 design system: HslColor + an inference engineColorScheme (light + dark, 1:1 parity with web tokens.css), an inference engineBrand runtime accent override, an inference engineTokens public surface (Composable getters + Spacing/Radius/Motion/TapTarget/Elevation/FontSize), an inference...	18.0h	22m	1m	49.1x	1080.0x
10	an Android client Phase 0: phased build plan (18 phases) + Gradle multi-module skeleton (app/wear/design-system/domain/data/testing), Kotlin 2.0 + AGP 8.5 + Compose BOM, Hilt+KSP, version catalog, Hilt Application + Compose MainActivity for phone+Wear, manifest with OIDC + App Link intent-filters, network-sec...	12.0h	18m	1m	40.0x	720.0x
11	Two funding-strategy documents (pre-revenue SAFE path and growth-bridge + priced-seed path) covering consumer + a recruiter product + enterprise markets with branded PDFs	16.0h	28m	8m	34.3x	120.0x
12	an inference engine: retire @pytest.mark.slow tests, add 30s default timeout + pristine RNG seeding, lift 14 of 16 packages to >=85% unit-test coverage with 1,342 new fast tests across 20 files (5,010 pass / 0 fail / 75s wall-clock, pristine across 3 back-to-back runs). Built per-module coverage gate, fixed L...	80.0h	240m	15m	20.0x	320.0x
13	Third funding-plan variant (SAFE + 2 equity-comp founding hires + native Android September 2026); PDF tooling improvements (DOC_DATE override, H2 page-break removal)	6.0h	22m	6m	16.4x	60.0x
14	an inference engine Phase 3 service-layer extraction: 16 endpoints across 12 service modules (sequencing, interaction, atom_service compose v1+v2, autopilot lifecycle/create/composite/list-due, operations+telemetry batch, entity self-report + seed-from-mastery, session audio/hint/end/upload-resume/next-challe...	32.0h	130m	4m	14.8x	480.0x
15	a simulation harness: audited UI vs post-April app-web rebuild, fixed Postgres auth + 22 stuck workers, added 4 frontend polish fixes (SSE wiring, sidebar grouping, cloud filter, per-provider calibration facet), remapped 8 Playwright page objects (onboarding, dashboard, exam, mcq, library, session_config, aut...	22.0h	95m	6m	13.9x	220.0x
16	Diagnosed pair-orphan engine bug (regenerate_nodes returned only new nodes, caller looked up stale pairs by NEW id; pairs hold OLD id so intersection always empty); fixed signature + caller, added regression test #19; archived 9 affected packages; relaunched orphan-fix batch with 3-way parallel concurrency; s...	5.0h	35m	5m	8.6x	60.0x
17	Consolidated advisor-ready funding plan 02c (5-person team, $6.5M SAFE, profit-sharing, patent-adjusted valuations, 5-year comp tables) plus HoRO/CFO + Marketing Director job description PDFs	32.0h	240m	35m	8.0x	54.9x
18	Patent + diagram audit clean-up bundle for 7 follow-on filing working drafts: reverted regression edges to deterministic-classifier verdicts (a follow-on FIG 2, a follow-on FIG 7), fixed EE 710 cross-fig conflict (subgraph carries numeral, PROV unnumbered, spec corrected), added success-path...	8.0h	65m	2m	7.4x	240.0x
19	Recovered 5 misdirected re-synth packages (scripts/data/domains -> data/domains); diagnosed and fixed engine bug at loop.py:460 (prevalidate_nodes string-not-dict crash) mirroring synthesizer/engine.py defensive coercion; added regression test #18; relaunched resume batch (3 syntheses + 5 QBs)	3.0h	25m	3m	7.2x	60.0x

Aggregate Statistics

Metric	Value
Total tasks	19
Total human-equivalent hours	378.0
Total Claude minutes	1075
Total supervisory minutes	95
Total tokens	5,130,000
Weighted average leverage factor	21.1x
Weighted average supervisory leverage factor	238.7x
Human-equivalent weeks	9.4

Analysis

The day's leverage distribution matters more than the headline figure. The 73.8x ceiling came from "an Android client repo skeleton: README, CLAUDE.md, and four parity docs (requirements, design, design-system, testing-strategy) translating the iOS Swift/Swift..."; the 7.2x floor was "Recovered 5 misdirected re-synth packages (scripts/data/domains -> data/domains); diagnosed and fixed engine bug at loop.py:460 (prevalidate_nodes string-not-...". Tasks at the top of the distribution share a shape: tightly scoped specifications, clear success criteria, and minimal integration ambiguity. The AI doesn't need to discover anything new; it executes against an explicit target.

The supervisory leverage figure (238.7x today) tracks something orthogonal to wall-clock leverage. It's the ratio of human-equivalent output to human prompt-writing time. It stays high even on lower-leverage days because supervisory minutes scale with task count, not with the human-hour estimate; a 20-minute task and a 4-hour task can both be specified in two minutes of human prompt-writing.
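
A small sketch of the two ratios, using rows 16 through 19 of the task log above. It assumes each factor is the human estimate in minutes divided by the Claude minutes or the supervisory minutes, which reproduces the published figures for these rows. The names `Task`, `leverage`, and `supervisory_leverage` are illustrative, not part of the tracker itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    number: int
    human_hours: float   # "Human Est." column
    claude_minutes: int  # "Claude" column
    sup_minutes: int     # "Sup." column


def leverage(task: Task) -> float:
    """Human-equivalent minutes per Claude minute (assumed formula)."""
    return round(task.human_hours * 60 / task.claude_minutes, 1)


def supervisory_leverage(task: Task) -> float:
    """Human-equivalent minutes per minute of prompt-writing (assumed formula)."""
    return round(task.human_hours * 60 / task.sup_minutes, 1)


# Rows 16-19 copied from the task log.
TASKS = [
    Task(16, 5.0, 35, 5),
    Task(17, 32.0, 240, 35),
    Task(18, 8.0, 65, 2),
    Task(19, 3.0, 25, 3),
]

for t in TASKS:
    print(f"#{t.number}: {leverage(t)}x wall-clock, "
          f"{supervisory_leverage(t)}x supervisory")
```

Task 18 shows the two measures diverging: its wall-clock factor is among the lowest of the four (7.4x), but its supervisory factor is the highest (240.0x) because it took only two minutes to specify.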

Across the 19 tasks, the day produced roughly 9.4 weeks of senior-engineer-equivalent throughput in 17.9 hours of model wall-clock. That ratio is the practical answer to the question of how much output a single operator can move per day when the model handles the execution and the operator handles the direction.

Leverage Record: May 14, 2026

Thu, 14 May 2026 23:59:00 GMT

Eight tasks. May 14, 2026 weighted to 31.1x leverage across 189.0 human-equivalent hours in 365 Claude-minutes. Supervisory leverage closed at 290.8x.

4.7 weeks of human-equivalent throughput in 6.1 hours of Claude wall-clock. The 64.0x ceiling came from Merge an authentication service + a purchase service + an onboarding service + an inference engine-a recruiter product-web backend into an API gateway as separate logical DBs (auth...; the 4.4x floor sat at Content audit run; identified next 10 priority domains; SOA-C02 pair_id linkage repair (12.2% -> 100%); diagnosed cross-domain prereq validator bug; built and launched audit-batch....

Task Log

#	Task	Human Est.	Claude	Sup.	Factor	Sup. Factor
1	Merge an authentication service + a purchase service + an onboarding service + an inference engine-a recruiter product-web backend into an API gateway as separate logical DBs (authdb, purchasedb, an inference engine_aces). Phases 0-5: SQLAlchemy + Alembic multi-DB foundation, JWT signing + JWKS, feature fla...	80.0h	75m	8m	64.0x	600.0x
2	Full implementation pass: delete redundant worker code (an authentication service workers/, a recruiter product Celery workers); build masterskills (188 rows) + mastercerts (101 industry + 10 recruiter product = 111 rows) with real seed from a content specification system/certifications and curated taxonomy...	60.0h	90m	6m	40.0x	600.0x
3	Implement 7-day free-trial epic: a purchase service comp endpoints + auto-revoke, an authentication service signup hook + trial-started email, a notification service templates, EventBridge Lambda for T-1d + T0 sweep, a web client trial badge, marketing copy on a marketing site + auth signup page	24.0h	65m	6m	22.1x	240.0x
4	Full patent and diagram audits across 7 follow-on filing working drafts (working drafts): Phase 0 deterministic edge classifier (0 findings), Phases 1-7 of full-patent-audit.md, and 7 per-app diagram agents per full-diagram-audit.md. Identified 2 new regressions (a follow-on FIG 2 CFU->REC dotted-forward, a f...	6.0h	18m	1m	20.0x	360.0x
5	Backfill 4 days of leverage posts (May 10-13) on a personal site: synced 69 missing records from cloud API to CSV, wrote sanitization pipeline (regex-based) and generated 4 markdown posts with intro/task-table/aggregates/analysis structure covering 73 total tasks across the 4 days, scrubbed all proprietary re...	7.0h	22m	4m	19.1x	105.0x
6	Merge Anthropic email content into How I Built a study product launch story essay (3200 words, 11 sections): added By the Numbers leverage table, Built with Claude observations section, a metrics tracker + The Deferral side quests, Accessibility section, and rewrote Giving Back to correctly distinguish free K...	5.0h	18m	6m	16.7x	50.0x
7	Bring full an inference engine local stack up (11 services) and fix unauth /entitlements/me + /auth/refresh 401 cascade on public dashboard	3.0h	22m	4m	8.2x	45.0x
8	Content audit run; identified next 10 priority domains; SOA-C02 pair_id linkage repair (12.2% -> 100%); diagnosed cross-domain prereq validator bug; built and launched audit-batch (5 re-syntheses + 4 question regens) with an inference engine_SKIP_CROSS_DOMAIN_PREREQ_CHECK env var	4.0h	55m	4m	4.4x	60.0x

Aggregate Statistics

Metric	Value
Total tasks	8
Total human-equivalent hours	189.0
Total Claude minutes	365
Total supervisory minutes	39
Total tokens	2,131,000
Weighted average leverage factor	31.1x
Weighted average supervisory leverage factor	290.8x
Human-equivalent weeks	4.7

Analysis

The day's leverage distribution matters more than the headline figure. The 64.0x ceiling came from Merge an authentication service + a purchase service + an onboarding service + an inference engine-a recruiter product-web backend into an API gateway as separa...; the 4.4x floor was Content audit run; identified next 10 priority domains; SOA-C02 pair_id linkage repair (12.2% -> 100%); diagnosed cross-domain prereq validator bug; built and l.... Tasks at the top of the distribution share a shape: tightly-scoped specifications, clear success criteria, and minimal integration ambiguity. The AI doesn't need to discover anything new; it executes against an explicit target.

The supervisory leverage figure (290.8x today) tracks something orthogonal to wall-clock leverage. It's the ratio of human-equivalent output to human prompt-writing time. It stays high even on lower-leverage days because supervisory minutes scale with task count, not with the human-hour estimate; a 20-minute task and a 4-hour task can both be specified in two minutes of human prompt-writing.

Across the 8 tasks, the day produced roughly 4.7 weeks of senior-engineer-equivalent throughput in 6.1 hours of model wall-clock. That ratio is the practical answer to the question of how much output a single operator can move per day when the model handles the execution and the operator handles the direction.

Leverage Record: May 13, 2026

Wed, 13 May 2026 23:59:00 GMT

Three tasks. May 13, 2026 weighted to 54.5x leverage across 80.0 human-equivalent hours in 88 Claude-minutes. A quieter day: closing the design-to-implementation gap on an observability platform, a deterministic diagram-edge audit pass, and a single flagship-course buildout with curriculum mapping, study plan, and interaction tagging. Supervisory leverage closed at 480.0x.

2.0 weeks of human-equivalent throughput in 1.5 hours of Claude wall-clock. The 130.0x ceiling came from an observability platform: closed design-vs-implementation gap — 14 models + migration 0012, RBAC + API keys + audit, 30+ REST routes, 12 Celery workers, in-process MCP mount, 3...; the 15.0x floor sat at an AP course: CED mapping + 10-day study plan + V2 atom interaction tagger + goal_id bug fix + repair tooling + 354 atoms tagged with 708 interactions.

Task Log

#	Task	Human Est.	Claude	Sup.	Factor	Sup. Factor
1	an observability platform: closed design-vs-implementation gap — 14 models + migration 0012, RBAC + API keys + audit, 30+ REST routes, 12 Celery workers, in-process MCP mount, 3 ingest protocols (Prom remote_write/StatsD/syslog), 6 new frontend pages, real LLM wiring (a mid-tier model RCA + an embedding model embedd...	65.0h	30m	3m	130.0x	1300.0x
2	Deterministic diagram edge audit: Python classifier, 6 .mmd fixes, 12 per-edge exceptions, audit doc update	5.0h	18m	2m	16.7x	150.0x
3	an AP course: CED mapping + 10-day study plan + V2 atom interaction tagger + goal_id bug fix + repair tooling + 354 atoms tagged with 708 interactions	10.0h	40m	5m	15.0x	120.0x

Aggregate Statistics

Metric	Value
Total tasks	3
Total human-equivalent hours	80.0
Total Claude minutes	88
Total supervisory minutes	10
Total tokens	490,000
Weighted average leverage factor	54.5x
Weighted average supervisory leverage factor	480.0x
Human-equivalent weeks	2.0

Analysis

The day's leverage distribution matters more than the headline figure. The 130.0x ceiling came from an observability platform: closed design-vs-implementation gap — 14 models + migration 0012, RBAC + API keys + audit, 30+ REST routes, 12...; the 15.0x floor was an AP course: CED mapping + 10-day study plan + V2 atom interaction tagger + goal_id bug fix + repair tooling + 354 atoms tagged with 708.... Tasks at the top of the distribution share a shape: tightly-scoped specifications, clear success criteria, and minimal integration ambiguity. The AI doesn't need to discover anything new; it executes against an explicit target.

The supervisory leverage figure (480.0x today) tracks something orthogonal to wall-clock leverage. It's the ratio of human-equivalent output to human prompt-writing time. It stays high even on lower-leverage days because supervisory minutes scale with task count, not with the human-hour estimate; a 20-minute task and a 4-hour task can both be specified in two minutes of human prompt-writing.

May 13 was a low-task-count day with one large, high-leverage build (the observability platform). When a single agent gets handed a coherent implementation spec covering 14 models, 30+ routes, RBAC, audit logging, and Celery workers, the ratio of AI output to human prompt-writing time reaches its highest reasonable bound. Days like this produce big numbers from small task counts.

Across the 3 tasks, the day produced roughly 2.0 weeks of senior-engineer-equivalent throughput in 1.5 hours of model wall-clock. That ratio is the practical answer to the question of how much output a single operator can move per day when the model handles the execution and the operator handles the direction.
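
The aggregate figures for this day can be reproduced from the three task rows. This is a minimal sketch under two assumptions not stated in the record: that the weighted average is total human-equivalent minutes divided by total Claude (or supervisory) minutes, and that a human-equivalent week is 40 hours. `ROWS`, `HOURS_PER_WEEK`, and `aggregate` are illustrative names.

```python
# (human-equivalent hours, Claude minutes, supervisory minutes), May 13 task log
ROWS = [(65.0, 30, 3), (5.0, 18, 2), (10.0, 40, 5)]

HOURS_PER_WEEK = 40  # assumption: one human-equivalent week is 40 hours


def aggregate(rows):
    """Sum the columns, then divide totals (weights each task by its minutes)."""
    human_hours = sum(h for h, _, _ in rows)
    claude_min = sum(c for _, c, _ in rows)
    sup_min = sum(s for _, _, s in rows)
    return {
        "tasks": len(rows),
        "human_hours": human_hours,
        "claude_minutes": claude_min,
        "sup_minutes": sup_min,
        "weighted_leverage": round(human_hours * 60 / claude_min, 1),
        "weighted_sup_leverage": round(human_hours * 60 / sup_min, 1),
        "weeks": round(human_hours / HOURS_PER_WEEK, 1),
        "claude_hours": round(claude_min / 60, 1),
    }


stats = aggregate(ROWS)
for key, value in stats.items():
    print(f"{key}: {value}")
```

Dividing totals is not the same as averaging the per-task factors: the plain mean of 130.0x, 16.7x, and 15.0x is about 53.9x, while the totals give the 54.5x shown in the table.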

Leverage Record: May 12, 2026

Tue, 12 May 2026 23:59:00 GMT

Twenty-four tasks. May 12, 2026 weighted to 65.7x leverage across 877.0 human-equivalent hours in 801 Claude-minutes. The day shifted into post-launch consolidation: porting the web client's full feature set to the desktop client, authoring four follow-on IP filings end-to-end, and running deterministic patent-and-diagram audits four consecutive times until the recurrence cycle broke. A typed-atom authoring subsystem and a continuous-density rendering subsystem both had patent drafts completed and audited. Supervisory leverage closed at 506.0x.

21.9 weeks of human-equivalent throughput in 13.4 hours of Claude wall-clock. The 213.3x ceiling came from Author 4 new follow-on filing patent applications (4 follow-on subsystems) — each ~100KB markdown with 20 claims and 8 Mermaid figures, plus full cross-document consistency upda...; the 5.0x floor sat at Fix 8 pre-existing test failures in an inference engine API endpoint suite (route mismatches, wrong status codes, inverted diminishing_note logic).

Task Log

#	Task	Human Est.	Claude	Sup.	Factor	Sup. Factor
1	Author 4 new follow-on filing patent applications (4 follow-on subsystems) — each ~100KB markdown with 20 claims and 8 Mermaid figures, plus full cross-document consistency updates (canonical numbers, gen scripts, audit JSON, CHANGELOG, 14 portfolio docs)	160.0h	45m	5m	213.3x	1920.0x
2	a desktop client full web feature parity — foundation deps + 16 IPC handlers + 8 charts + 15 components + 24 data stores + 22 i18n namespaces + readiness module + session machine + voice/TTS + sync/telemetry + app-services + 4 big-rock screens (Session 1244 LOC, CourseDetail full, Exam 420 LOC, LessonView 570 LOC) +...	240.0h	95m	8m	151.6x	1800.0x
3	Build remaining ~57 Tier 3-4 interaction components across 12 domains; FullComponentCatalog browse page; registry wire-up; build green	160.0h	85m	3m	112.9x	3200.0x
4	Build all 10 Tier-2 interaction components (graphingcalc, compoundinterest, punnettsquare, timeline, conjugationdrill, piano, mapquiz, orbitalsim, physicssim, circuitbuilder) plus shared utilities; gallery + registry + build green	80.0h	50m	3m	96.0x	1600.0x
5	a desktop client: wire every local-only stub to real IPC — getDailyStats, postCognitiveState, patchEnrollment/archiveEnrollment, userState get/put/delete, testimonial get/upsert/delete/streaming-suggest (NDJSON per-chunk fan-out), plus dailyStats/userPrefs/activityPreferences/enrollment store rewrites to use real an...	12.0h	12m	2m	60.0x	360.0x
6	an iOS client: web parity sweep (9 of 12 deltas closed) — auto bug reporter, native Autopilot settings, Credential Mapping, Insights/Forecast/KnowledgeMap promotions, Offline mode, Calibrate, KaTeX math, Accept Invite flow; docs + build green	32.0h	35m	4m	54.9x	480.0x
7	Fix all rerun-2 patent + diagram audit findings (16 FAILs + 3 WARNs across 7 follow-on filing apps): refresh canonical.json (a follow-on range added, a follow-on app to 26 claims); replace learner with entity in several follow-on apps; rename days_to_exam to days_to_assessment in a follow-on app; expand Invention_Li...	14.0h	19m	1m	42.9x	840.0x
8	Port 10 screens + KnowledgeMap chart from a web client to a desktop client (ExamResultsScreen, ReadinessForecast, CredentialMapping, Courses, FlashcardsScreen, CertificationsScreen, KnowledgeMapScreen, OfflineScreen, PageNotFound, AcceptInvite)	8.0h	12m	3m	40.0x	160.0x
9	Run full patent and diagram audits for an IP portfolio repo: 7 follow-on filing apps (7 follow-on apps), 56 diagrams, 7 phases of patent checks plus per-app semantic agents. Produced timestamped report and updated diagram baseline.	6.0h	9m	1m	36.7x	360.0x
10	Full patent and diagram audit (rerun-4) in an IP portfolio repo: 7 follow-on filing apps, 56 diagrams, ~30 supporting docs, 7 parallel per-app diagram agents. Found 7 FAIL + 8 WARN against rerun-3 0/0 claim; diagnosed structural recurrence (uncommitted fixes, prose-mirror drift, stale audit-doc expectations).	8.0h	14m	2m	34.3x	240.0x
11	Seed four Entity Collections for an inference engine adaptive learning platform (periodic_elements 118, us_states 50, countries 50, historical_figures 44)	20.0h	35m	5m	34.3x	240.0x
12	Port CourseDetail.tsx (2930 LOC, 5 tabs) from a web client to CourseStructure.tsx in a desktop client — full feature parity including Autopilot, Study Plan, Curriculum, Activities, Labs tabs	24.0h	45m	8m	32.0x	180.0x
13	Full an inference engine patent + diagram audit (7 follow-on filing apps, 56 diagrams, 27 docs)	6.0h	12m	1m	30.0x	360.0x
14	Audit, optimize, and ship all 58 CLAUDE.md files across the an inference engine monorepo: 6 parallel audit agents, 5 parallel editing agents, 50 repos committed and pushed. Net -3500 lines, 6 new docs files extracted, internal contradictions resolved (a CMS CodePipeline, websites parallel-build), version staleness f...	35.0h	75m	12m	28.0x	175.0x
15	a desktop client Wave 5 parity: Help Center (10 screens), full Insights rewrite (AnalyticsPanel), Dashboard polish (DriftActionCard + ConvoyCard + DashboardAcesSection), Settings polish (tabbed layout + ScheduleTab + account deletion with react-hook-form/zod)	24.0h	55m	10m	26.2x	144.0x
16	Break the patent-audit recurrence cycle: commit 49 rerun-3 fixes; fix 5 real diagram FAILs (FIG 1 arrows, FIG 7 label, FIG 8 (740), FIG 8 (720)/(730)); identify 2 BB findings as agent errors via cycle test and add exceptions; migrate CLAUDE.md/AGENTS.md exception-list prose to canonical pointers; refactor full-paten...	8.0h	25m	1m	19.2x	480.0x
17	Port active-session screen from a web client to a desktop client - full state machine with countdown/active/feedback/paused/summary phases, ActivityFrame, cognitive state, TTS narration, plan session	8.0h	28m	5m	17.1x	96.0x
18	Build deterministic a11y audit toolchain (axe-core CLI + Playwright sweep + jsx-a11y + Python source checker, unified through stable-hash triage ledger) to eliminate cross-run finding nondeterminism. New scripts: a11y_ledger.py with adopt/list/mark/filter; run-a11y-static.sh axe-core/cli wrapper. ESLint jsx-a11y wir...	6.0h	22m	5m	16.4x	72.0x
19	Run full deterministic accessibility audit via new 3-engine toolchain (Python source + Playwright axe + static-site axe via Playwright .mjs replacing broken @axe-core/cli). Ledger bootstrapped with 185 unique findings. Critical infra bug surfaced: existing a web client npm run test:axe has been silently scanning an...	8.0h	30m	4m	16.0x	120.0x
20	Cascade 717->733 claim total across patent portfolio docs, audits canonical, architecture README, canonical-values.yaml	1.5h	6m	2m	15.0x	45.0x
21	Port 22 utility modules (hooks, voice, sync, telemetry, app-services, a11y) from a web client to a desktop client with IPC adaptations	8.0h	35m	8m	13.7x	60.0x
22	Port LessonView from a web client to a desktop client LessonScreen — full markdown/math/code rendering, collapsible sidebar taxonomy, TTS IPC audio, adaptive toggle, section pagination, completion credit, confetti	4.0h	18m	4m	13.3x	60.0x
23	Port readiness and session modules (16 files) from a web client to a desktop client with API import adaptation	3.0h	20m	5m	9.0x	36.0x
24	Fix 8 pre-existing test failures in an inference engine API endpoint suite (route mismatches, wrong status codes, inverted diminishing_note logic)	1.5h	18m	2m	5.0x	45.0x

Aggregate Statistics

Metric	Value
Total tasks	24
Total human-equivalent hours	877.0
Total Claude minutes	801
Total supervisory minutes	104
Total tokens	5,146,500
Weighted average leverage factor	65.7x
Weighted average supervisory leverage factor	506.0x
Human-equivalent weeks	21.9

Analysis

The day's leverage distribution matters more than the headline figure. The 213.3x ceiling came from Author 4 new follow-on filing patent applications (4 follow-on subsystems) — each ~100KB markdown with 20 claims and 8 Mermaid figures, p...; the 5.0x floor was Fix 8 pre-existing test failures in an inference engine API endpoint suite (route mismatches, wrong status codes, inverted diminishing_no.... Tasks at the top of the distribution share a shape: tightly-scoped specifications, clear success criteria, and minimal integration ambiguity. The AI doesn't need to discover anything new; it executes against an explicit target.

The supervisory leverage figure (506.0x today) tracks something orthogonal to wall-clock leverage. It's the ratio of human-equivalent output to human prompt-writing time. It stays high even on lower-leverage days because supervisory minutes scale with task count, not with the human-hour estimate; a 20-minute task and a 4-hour task can both be specified in two minutes of human prompt-writing.

May 12 was the highest-volume day in the four-day window. The 213x ceiling on the four-IP-filings task came from work that maps cleanly to a known authoring template; the model fills the slot, the audit catches issues, the loop closes in minutes. Cross-platform feature-parity ports also scored high because the source-of-truth implementation already existed in another codebase.

Across the 24 tasks, the day produced roughly 21.9 weeks of senior-engineer-equivalent throughput in 13.4 hours of model wall-clock. That ratio is the practical answer to the question of how much output a single operator can move per day when the model handles the execution and the operator handles the direction.

Leverage Record: May 11, 2026

Mon, 11 May 2026 23:59:00 GMT

Nineteen tasks. May 11, 2026 weighted to 37.2x leverage across 473.5 human-equivalent hours in 764 Claude-minutes. The day was launch-night itself plus a sustained accessibility-audit-and-remediation push across the customer product and 8 marketing-site fleet members. Late-night security audit, real-time fabric refactor, and the inevitable post-launch infrastructure fixes rounded it out. Supervisory leverage closed at 263.1x.

11.8 weeks of human-equivalent throughput in 12.7 hours of Claude wall-clock. The 240.0x ceiling came from WCAG 2.1 AA accessibility audit across 9 properties (a web client + 8 marketing sites) — ~120 concrete findings with file:line refs, severity grouping, cross-cutting themes, and...; the 7.6x floor sat at Launch-night batch: fix admin delete lockup (a cache layer purge timeout), unblock an API service CI build (ruff lint), kill a frontend library 401 retry storm, rebuild + upload....

Task Log

#	Task	Human Est.	Claude	Sup.	Factor	Sup. Factor
1	WCAG 2.1 AA accessibility audit across 9 properties (a web client + 8 marketing sites) — ~120 concrete findings with file:line refs, severity grouping, cross-cutting themes, and 6-8 dev-day remediation roadmap	60.0h	15m	2m	240.0x	1800.0x
2	Full WCAG 2.1 AA accessibility audit on a web client + 8 sister sites — deterministic checker + parallel LLM judgment phase, 56 findings (7 CRITICAL, 17 HIGH, 24 MEDIUM, 8 LOW) with sequenced remediation plan	30.0h	17m	2m	105.9x	900.0x
3	WCAG 2.1 AA remediation across 11 repos (a web client + design-system + activities + a marketing site flagship + 6 sister sites + shared template + enterprise accessibility-statement rewrite). 8 parallel fix agents, design-system fixes propagate via roving tabindex/aria-controls/FocusScope traps; shared Jinja partia...	70.0h	40m	1m	105.0x	4200.0x
4	Full WCAG 2.1 AA accessibility audit across a web client and 10 sister sites (123 findings; 13 P0 blockers identified). Consolidated report written to the monorepo audits/reports/accessibility-audit-report-2026-05-11-deep.md.	24.0h	18m	3m	80.0x	480.0x
5	Fix all 56 WCAG 2.1 AA accessibility findings (7 CRITICAL + 17 HIGH + 24 MEDIUM + 8 LOW) across a web client and the 8 sister sites — token contrast, focus management, ARIA wiring, keyboard nav, focus traps, animation guards, touch targets, document titles, modal labelling, custom tablists, FAQ semantic structure, e...	60.0h	50m	1m	72.0x	3600.0x
6	Pre-launch security & crash audit + fix sweep across auth/purchase/onboarding/notification services: 21 issues fixed (4 CRITICAL admin gaps + IDOR, MFA bypass, webhook bypass, IDOR/spam, plus 17 HIGH), 2 alembic migrations, 109 new tests, all 4 services deployed and smoke-tested in prod, plus a notification service...	48.0h	55m	8m	52.4x	360.0x
7	a newsletter platform: refactor real-time fabric from WebSocket to REST + SSE (a cache layer pub/sub + ring buffer, new /events/stream + /events/recent endpoints, EventStreamContext, cross-newsletter ActivityPage, full test rewrite)	14.0h	18m	4m	46.7x	210.0x
8	a project management cert demo + Adaptive Lesson Generation 2.0: plan + patentability (8 claims), atom schema+validator+composer+generator end-to-end, 6 project management cert item generators producing +671 new items (multiselect/dragmatch/sequence/roleplay/constructedresponse), 8x throughput refactor via map_s...	50.0h	90m	5m	33.3x	600.0x
9	WCAG 2.1 AA accessibility audit of shared a learning platform Jinja templates (30 templates + main.js, 23 issues found)	12.0h	28m	5m	25.7x	144.0x
10	WCAG 2.1 AA accessibility audit of a marketing site and a marketing site — all templates, content pages, built HTML	8.0h	22m	5m	21.8x	96.0x
11	Build isolated E2E Playwright harness (auth, stubs, page objects, firehose + journey runners) + fix 6 production bugs surfaced by harness (legacy token scrub, RemoteBanners filter, proficiency entries, dailyStats NaN, ResumeReviewSection length, offlineQueue indexedDB); 10 commits across an inference engine/an API s...	30.0h	90m	8m	20.0x	225.0x
12	Launch-night DB pool sweep across 19 repos + a cache layer-backed user/refresh-token cache (refresh tokens moved to a cache layer-only, Postgres no longer system of record) + cross-service cascade delete (auth → purchase) + entitlements queryKey user-scoping	24.0h	90m	8m	16.0x	180.0x
13	Deploy a newsletter platform SSE refactor + fix an assets CDN CORS (S3 bucket policy + CloudFront invalidation)	1.5h	6m	1m	15.0x	90.0x
14	an admin tool: wire hard-delete customer flow to a billing service GDPR endpoint so subscriptions/payments/comps cascade-delete and a payment provider stops billing; receipt modal now shows purchase-side counts and a payment provider cancel errors	2.0h	8m	3m	15.0x	40.0x
15	Add system snapshot purge (archived + older-than modes) to an admin tool SnapshotsTab + RPC handler; fix banner save MissingGreenlet by setting eager_defaults=True on Banner model	3.0h	12m	4m	15.0x	45.0x
16	CSS accessibility audit: color contrast, focus styles, motion preferences across sister sites and a web client	6.0h	25m	10m	14.4x	36.0x
17	WCAG 2.1 AA accessibility audit of a web client React SPA	8.0h	35m	10m	13.7x	48.0x
18	Launch-day recovery: rewrote launch schedule for post-PH-flop reality (struck dead email-blast rows, added wire spend, fixed LinkedIn post date), audited homepage email-capture gap, wrote 08solofounderpressplan.md (~430 lines: Anthropic-first/newsletter/exclusive/HN-inbound/aggregator strategy with per-outlet pe...	16.0h	90m	22m	10.7x	43.6x
19	Launch-night batch: fix admin delete lockup (a cache layer purge timeout), unblock an API service CI build (ruff lint), kill a frontend library 401 retry storm, rebuild + upload 4.3GB boot cache to S3, author SessionStart voice hook with compaction-safe persistence	7.0h	55m	6m	7.6x	70.0x

Aggregate Statistics

Metric	Value
Total tasks	19
Total human-equivalent hours	473.5
Total Claude minutes	764
Total supervisory minutes	108
Total tokens	5,185,500
Weighted average leverage factor	37.2x
Weighted average supervisory leverage factor	263.1x
Human-equivalent weeks	11.8

Analysis

The day's leverage distribution matters more than the headline figure. The 240.0x ceiling came from WCAG 2.1 AA accessibility audit across 9 properties (a web client + 8 marketing sites) — ~120 concrete findings with file:line refs, seve...; the 7.6x floor was Launch-night batch: fix admin delete lockup (a cache layer purge timeout), unblock an API service CI build (ruff lint), kill a frontend l.... Tasks at the top of the distribution share a shape: tightly-scoped specifications, clear success criteria, and minimal integration ambiguity. The AI doesn't need to discover anything new; it executes against an explicit target.

The supervisory leverage figure (263.1x today) tracks something orthogonal to wall-clock leverage. It's the ratio of human-equivalent output to human prompt-writing time. It stays high even on lower-leverage days because supervisory minutes scale with task count, not with the human-hour estimate; a 20-minute task and a 4-hour task can both be specified in two minutes of human prompt-writing.

May 11 was the actual launch day. The 240x ceiling on the WCAG audit task is a useful data point: deterministic audit work against a defined standard is where AI leverage maxes out, because the specification is external and the checker is mechanical. Launch-night fixes ran lower-leverage because every change needed live-system verification.

Across the 19 tasks, the day produced roughly 11.8 weeks of senior-engineer-equivalent throughput in 12.7 hours of model wall-clock. That ratio is the practical answer to the question of how much output a single operator can move per day when the model handles the execution and the operator handles the direction.

How I Built AccelaStudy AI

Mon, 11 May 2026 12:00:00 GMT

Today I launched AccelaStudy AI: what I believe is the most advanced, most capable adaptive learning platform ever created. That's a bold claim but one I believe will quickly be proven as people start using it to study.

The technology behind AccelaStudy AI is called AVIAN (Adaptive Vector Intelligence and Network) and is protected by 33 patent filings describing 192 distinct inventions. The filings run nearly 1,000 pages of documentation, with 263 technical figures and 733 claims, grouped into 36 branded platform clusters spanning a 13-tier pipeline architecture. No competitor has anything remotely like it.

I built all of this in 80 days. Solo. Bootstrapped. $0 raised, no team, no co-founders. My only collaborator was Anthropic's Claude.

This post is the story of how that happened.

The Problem

I've worn many hats in my career but the one I wear most often these days is "Solution Architect," which is a somewhat generic term that means I build infrastructure in the cloud, usually the Amazon Web Services (AWS) cloud. I have passed most of the AWS certification exams, some multiple times, but in September 2025 I was preparing to study for the Advanced Networking Specialty (ANS) exam. ANS is widely considered the most difficult of the AWS certifications to pass.

For other certifications in the past, I've used A Cloud Guru (acquired by Pluralsight), Udemy, and other sites that are supposed to help you prepare for the exam. I hate these sites. They are all the same. An exam has a syllabus and most of the topics have videos and transcripts of the videos and simple, static quizzes at the end of each topic. After slogging through all of this, there are usually 1–3 practice exams that, assuming you pass, indicate you are ready for the real exam.

Garbage.

The first issue I have is the "one size fits all" curriculum model. Every class treats every student the same. And since they have to teach to the lowest common denominator, they assume you are coming at the exam with minimal prior knowledge. So they all start with refreshers on prerequisite material. You can skip these usually, but maybe I want a refresher and just don't need the WHOLE thing — just some of the more esoteric details. No way to get a refresher on just the details you need refreshed.

The primary course material is grouped into fairly broad topics. This means the course itself is largely like the refreshers: new material coupled with basic material many students already know. So you end up watching a 30-minute video to get 2 minutes of new knowledge that you need for the exam. It's not possible to skip around or you might miss the new material. To help with this, the video can often be watched at 1.5x or 2x speed. That's an awesome experience: having to focus intently on someone speaking super fast to make sure you don't miss the new material. Exhausting. The transcripts aren't much better. They are usually just blobs of text dumped out by a speech-to-text utility with zero formatting, no headers, nothing.

Some topics have practice "quizzes" which are essentially a handful of multiple choice questions to answer. There is only one practice quiz and it never changes, so once you've taken it, that's it. You can take it again but it's the same questions with, maybe, the answers sorted into a different order than the first attempt. Woo!

Some topics have "labs" which is where they give you some instructions and then you go log into your own live cloud account and muck around following the instructions and hope you don't mess anything up or accidentally run up a bunch of charges. I've never done a lab. I understand the value of doing things for real, but I'm not messing around in my own cloud account. Forget it.

And the practice exams — these are arguably the most useful feature of these online courses. A good one simulates the format of the exam and its duration. I thought the A Cloud Guru (Pluralsight) ones were pretty good until I passed all three available exams with near-perfect scores and then went on to fail the real exam. $300 down the drain and a serious shot to my confidence. The main problem is that these exams use a fixed battery of questions and you end up learning their practice exam and not the real material being tested.

I was not looking forward to studying for ANS with any of these sites.

The Idea

I had been thinking about building my own certification prep site for a while. I figured if I was frustrated with the existing options, others were too. I was using Sonnet 4.5 regularly to write code and was able to have it put together a basic site in a few hours. There were two major obstacles to launching a real site, though.

One, how do I make mine better and truly useful? It wouldn't be sufficient to just put out a site that was the same as the competition. It had to be measurably better. Really, it had to be revolutionary.

Two, how do I create all of that content for users to study? Even one exam required a massive amount of content, and while I like writing, no way I had the free time to write the code AND write the content. And I didn't know all of it, either. I needed content for exams I hadn't passed yet.

Fortunately, I already knew all about creating educational software. The original AccelaStudy was the first flashcard app in the App Store when it opened in July 2008. That AccelaStudy was basically just foreign-language vocabulary flashcards: "Hello" on one side, "Hola" on the other. But I didn't know all of the languages (Spanish, French, German, Italian, and Turkish on opening day), so how did I generate the translations? I didn't. I hired professors at the premier foreign-language university in the world — Brigham Young University in Utah — to do the translations. Then I simply imported them into the app. For the native speaker audio files, I hired professional voiceover artists who spoke each language natively. That was a lot of fun, actually. The voice for Japanese was done by the same actor who does voiceovers in TV commercials for Mercedes-Benz.

But this content was on a different scale. Pluralsight has over 2,500 expert authors creating their technical courses. Of course, keeping 2,500 authors around is very expensive, and probably part of the reason Pluralsight is struggling financially. I had no money for content authors, so I needed a different solution.

Content Galore

For quite a while, my professional colleagues and I had been using ChatGPT for infrastructure questions. For example: "What are the options for encrypting an S3 bucket?" or "I'm getting a 502 error on a new web service I'm running in Fargate. What could be the problem?" I realized that the LLM's training data included every possible detail about every resource, every service that you could use in the AWS cloud.

Or be tested on in an AWS certification exam.

A few test prompts later — "Tell me everything I need to know about S3 buckets to pass the Solutions Architect Professional exam" — and I knew that AI had all the knowledge I needed to generate content for the site.

But how to handle hallucinations? How to make sure the content is accurate? These are tough problems with LLMs today. The solution is complicated but achievable, and the one that evolved became part of the AVIAN Origin and AVIAN Preflight patents, two of the 33 AVIAN patent filings, in the Content Creation architectural tier. AVIAN can generate the entire content of an AWS certification course in about 8 hours for around $100. And if the exam changes? A new version can be ready in 30 minutes.

But I'm getting ahead of myself.

Adaptive Learning, Solved

For over 10 years, I had been working on an adaptive learning patent. It started out as an idea to improve on the Leitner spaced-repetition algorithm. That improvement proved unpatentable but it was a real improvement, and it shipped in AccelaStudy years ago. So I kept working on it. By 2020 or so, I had a draft of The AccelaStudy Method, which captured most of the ideas I had around adaptive learning. Alas, that document was heavy on the concepts and light on the technical implementation. Not patentable.
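For readers who haven't met it, here is a minimal sketch of the classic Leitner scheme that the work started from. It shows only the textbook baseline, not the AccelaStudy improvement, which isn't described here. The `Card` class, the five boxes, and the day intervals are illustrative assumptions.

```python
from dataclasses import dataclass

# Review interval in days for each box. These values are illustrative assumptions.
INTERVALS = {1: 1, 2: 2, 3: 4, 4: 8, 5: 16}
MAX_BOX = max(INTERVALS)


@dataclass
class Card:
    prompt: str
    answer: str
    box: int = 1       # every new card starts in box 1
    due_day: int = 0   # day number on which the card is next reviewed


def review(card: Card, correct: bool, today: int) -> Card:
    """Apply the classic Leitner rule to one card after a review."""
    if correct:
        card.box = min(card.box + 1, MAX_BOX)  # promote, capped at the last box
    else:
        card.box = 1                           # any miss sends the card back to box 1
    card.due_day = today + INTERVALS[card.box]
    return card


card = Card("Hello", "Hola")
review(card, correct=True, today=0)    # now in box 2, due on day 2
review(card, correct=True, today=2)    # now in box 3, due on day 6
review(card, correct=False, today=6)   # back in box 1, due on day 7
```

The fixed intervals are the part an adaptive method would replace: every learner and every card gets the same schedule regardless of how the card was missed.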

Then, last September, when I was getting started on a proof of concept for what would eventually become AccelaStudy AI, I entered a fateful prompt:

I'm working on an educational site and I've got some ideas in this document, accelastudy_method.md. What would it take to make this a real patent?

And so it began. What started off as a single Markdown file describing an array of ideas for making online learning adaptive and personalized became 33 separate patents, not just the one I thought I had. The first patent was filed in October 2025, another 25 in March and April 2026, and 7 more in early May.

One of the key aspects of the patent portfolio is that it applies to ANYTHING that can be learned. As long as the AI has a deep knowledge of the subject, curriculum can be created. And given that the training data for OpenAI and Anthropic models (and Grok and Gemini and others) includes essentially every document ever written by humans, the AI has far deeper knowledge than even the most experienced content author.

Code Warrior

On February 16, 2026, it was time to build it. The patents were mostly done, but I wanted to ensure they worked before I went to all the trouble and expense of filing them.

The first task was to build the AVIAN engine itself. This meant taking all of that patent documentation and extracting a system architecture, and then an implementation and testing plan. That work was done in an afternoon.

The next several weeks were a sustained sprint of building, in roughly this order: the engine, the content synthesis pipeline, the web application, the API, the admin tooling, the marketing site, the press kit, the iOS app, the desktop apps for macOS / Windows / Linux, and the entire supporting infrastructure to run all of it. Then, in parallel with the customer-facing product, I built out a fleet of internal tools to actually operate the company: a CMS, an email client, a CRM, an accounting system, a calendar, an analytics platform, a service-health monitor, a leverage-metrics tracker, and more than a dozen others. Each one is a real production application. Each one was 100% built with Claude Code.

I'll write a longer technical post about the architecture choices that made this pace possible. But the single biggest workflow unlock was something simple and structural: I used 57 nested CLAUDE.md constraint files as a per-repo knowledge graph that Claude Code walks before any edit. Plan mode and parallel sub-agents rode on top of that. It felt like handing Claude a map of the entire monorepo. Every constraint I would have wanted to enforce as a code reviewer — coding style, architectural rules, naming conventions, testing requirements, what NOT to touch — lives in those files. The agent reads them. The agent respects them.
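To make the "map of the monorepo" idea concrete, here is a rough sketch of how nested constraint files can be gathered for a single file, outermost first. It illustrates the lookup order only and is not a description of how Claude Code loads these files internally. The function name `constraint_files_for` and the example paths are hypothetical.

```python
from pathlib import Path


def constraint_files_for(target: Path, repo_root: Path, name: str = "CLAUDE.md") -> list[Path]:
    """Return every constraint file from the repo root down to the target's
    directory, outermost first, so the most specific rules are read last."""
    root = repo_root.resolve()
    directory = target.resolve().parent
    if directory != root and root not in directory.parents:
        raise ValueError(f"{target} is not inside {repo_root}")

    # Walk upward from the target's directory to the root, then reverse the chain.
    chain = [directory]
    while chain[-1] != root:
        chain.append(chain[-1].parent)
    return [d / name for d in reversed(chain) if (d / name).is_file()]
```

Called with a file such as `services/api/main.py`, it would return the root `CLAUDE.md` followed by any `CLAUDE.md` in `services/` and `services/api/`, which is the order a reviewer would apply general rules before local ones.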

I ran 2–3 concurrent Claude Max subscriptions for most of the build window so I could fan out work across multiple repos at once. I typically had 10-12 terminals up, each doing work in a different repo. Through the API, the content-synthesis pipeline ran independently — various Anthropic models orchestrated in sequence to yield the most accurate and comprehensive course material. That synthesis spend lives in a separate stack of credit-recharge invoices: 80+ at roughly $50 each, $4,000+ documented. The coding spend through Claude Code lives in Fulcrum, the leverage tracker, which is itself one of the 19 internal tools I built along the way.

By the Numbers

Eighty days. Solo. The tracker captured every non-trivial task as a row: estimated human-equivalent hours, actual Claude wall-clock minutes, tokens consumed, leverage factor, supervisory leverage. Here is what 80 days of compressed work looks like:

Metric	Value
Days of build	80 (Feb 23 → May 13, 2026)
Measured tasks	2,115
Human-equivalent work hours	~50,319
Human-equivalent work-years	24.2
Claude wall-clock	~1,061 hours
My supervisory time (writing prompts)	~148 hours
Average task leverage	51.5×
Average supervisory leverage (personal ROI)	432.4×
Maximum single-task leverage	240×
Claude Code tokens consumed	~360 million

The full record set has been published daily since early April at charlessieg.com/leverage/all. Every task, every estimate, every minute of Claude wall-clock. Nothing redacted. Each day's post also includes an analytical writeup of which task patterns produced the highest leverage and which were still gated by human review.

And here is what those 24 work-years of compressed effort produced:

AccelaStudy AI — the customer product. Over 900 certifications, standardized tests, and other courses covered, 1.4 million synthesized questions, sub-2-millisecond knowledge updates, root-cause prerequisite-gap detection, pass-probability forecasting before you spend hundreds of dollars on an exam voucher. Live on the web today at accelastudy.ai; native iOS / iPadOS / macOS / Windows / Linux apps follow on June 1.
AVIAN — the patent portfolio behind it. 33 USPTO filings, 192 distinct inventions, 733 claims (68 independent + 665 dependent), 263 technical figures, organized into 36 platform clusters across 13 pipeline tiers. avian.renkara.com, also built by Claude.
74 repositories, 1.27 million lines of code, 25,000+ automated tests.
19 production Renkara internal tools — listed publicly at renkara.com/tools, with each tool's page tagged "100% Built by Claude" alongside the commercial SaaS category it replaces: Narrative (static site generator), Courier (email client), Tribe (CRM), Trellis (cloud accounting), Vigil (uptime monitoring), Cadence (calendar), Pulse (web analytics), Fulcrum (leverage tracker), Docket (issue tracking), Chronicle (observability), Beacon (marketing automation), Herald (newsletter platform), and seven more. Together they expose 800+ MCP tools to any Claude session — so the entire fleet is agent-addressable through Anthropic's own protocol, not just human-addressable. That fleet is the operational backbone that lets one person run a 74-repo monorepo.
21 production websites — 16 AVIAN/Renkara properties plus four fictional in-world sites and the book's own site for the novel below, all generated by Narrative.
19,000+ pages of Markdown documentation — 3,513 files, 4.85 million words. Including the 57 nested CLAUDE.md constraint files.

Fulcrum, and Other Side Quests

Fulcrum, the leverage tracker, deserves its own paragraph. As I was starting the build I realized that nobody had ever produced a longitudinal dataset on a single solo developer's actual productivity with an AI coding agent. Most "AI productivity" claims are marketing. I wanted real data — task by task, hour by hour, dollar by dollar — and I wanted it public. So I built Fulcrum. It records every non-trivial task as a row, computes leverage factor and supervisory ROI per task, and publishes a daily blog post with analytical commentary. As of today: 2,115 records, 51.5× weighted leverage, 432.4× supervisory ROI, 24.2 work-years compressed into 80 calendar days. If anyone wants to challenge the numbers, the records are there.

The other side quest is a novel.

In parallel with the AVIAN build, I co-wrote a 67,000-word literary novel with Claude called The Deferral. As part of the world-building, Claude designed and built four in-world fictional company websites — Strataforge Robotics, Luthan Dynamics, Elysium Atelier, and MIDAS — each with its own brand identity and full marketing copy, plus the book's own site at the-deferral.com. We even wrote a fake patent to deepen the world. The novel announcement and a behind-the-scenes writeup live here. Total wall-clock cost: a side hobby on weekends. The point: this isn't just about code. Working with Claude expands what one person can attempt across every creative discipline at once.

Accessibility

Most software fails accessibility. I didn't want AccelaStudy AI to be most software.

In the final weeks before launch I ran a series of WCAG 2.1 AA audits across the web client and all 16 marketing-site properties, pairing a deterministic Python checker with a parallel LLM-judgment phase. The first deep audit surfaced 123 findings, 13 of them P0 blockers. I then dispatched eight parallel Claude Code sub-agents to fix them in the order an accessibility consultant would prioritize them: token contrast, focus management, ARIA wiring, keyboard navigation, focus traps, animation guards, touch targets, document titles, modal labelling, custom tablists, FAQ semantic structure, and the long tail of smaller issues. Across the fleet of 56 UI repos, the final sweep cleared 2,460 HIGH findings, 2,553 MEDIUM, and a long tail of LOW findings.
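As an illustration of what a deterministic check looks like, here is a minimal sketch of the WCAG 2.1 contrast-ratio test that sits behind a "token contrast" finding. The formula and the 4.5:1 and 3:1 AA thresholds come from the WCAG definition. The function names are my own, and this is not the checker from the monorepo.

```python
def _linearize(channel: int) -> float:
    """Convert one 8-bit sRGB channel to linear light, per the WCAG 2.1 definition."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a '#rrggbb' color."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(fg: str, bg: str) -> float:
    """Contrast ratio between two colors, always >= 1."""
    lighter, darker = sorted((relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(fg: str, bg: str, large_text: bool = False) -> bool:
    """AA requires 4.5:1 for normal text and 3:1 for large text."""
    return contrast_ratio(fg, bg) >= (3.0 if large_text else 4.5)
```

Because the check is pure arithmetic on design tokens, it gives the same answer on every run, which is what makes it suitable for CI. For example, `#767676` on white passes AA for body text while `#777777` narrowly fails.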

This work is invisible to most users. But it is the entire experience for users who depend on screen readers, who navigate by keyboard only, who need reduced motion, who use voice control. There is no chance I could have manually audited 16 marketing sites + a complex React SPA + a Swift iOS app + four desktop builds for full WCAG 2.1 AA compliance in a week. With Claude Code, it was tightly scoped, parallelizable, and verifiable. The deterministic checker is itself open-source, lives in the monorepo, and runs on every CI build.

That last detail matters. The audits are reproducible. Anyone can rerun them.

Built with Claude

I want to be honest about what this actually was.

I didn't write a single line of production code in 80 days. I wrote prompts, I wrote CLAUDE.md constraint files, I wrote architecture decision records, I reviewed pull requests, I made judgment calls about what to build next and what to defer. Claude wrote the code. Claude helped me turn my ideas into patents and did the grunt work of hardening the language, working examples, constructing diagrams, and checking the math. Claude wrote the marketing copy (with my voice). Claude wrote the documentation. Claude designed the UIs. Claude wrote the synthesis pipeline that wrote the learning content. Claude wrote the leverage tracker that documented Claude writing everything else.

A few specific observations from the 80 days, for anyone curious about what working at this scale with Claude is actually like:

Plan mode is the highest-leverage feature for any change touching more than three files. It surfaces dependency cycles and forces explicit reasoning about ordering. Twice it caught a circular import my own static analysis had missed.
CLAUDE.md constraint files are dramatically underused. 57 of them across 74 repos formed a knowledge graph the agent navigated before any edit. The agent's adherence to nuanced architectural rules tracked almost perfectly with whether those rules were written down. If a rule wasn't in a CLAUDE.md file, it might as well not have existed.
Parallel sub-agents change the work model. For the synthesis pipeline, three or four sub-agents could fan out across distinct learning domains and produce independent drafts in 10 minutes. The bottleneck moves from "writing the content" to "specifying what the content should be."
Hooks reduce approval-cycle friction more than any other optimization. A small settings.json hook that runs my test suite after every edit saved an enormous amount of manual cycling.
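The hook itself isn't shown above, so here is one possible shape for the script such a hook could call. It assumes the hook is registered in settings.json as a post-edit command hook, that the event arrives as JSON on stdin with `tool_name` and `tool_input.file_path` fields, and that a non-zero exit code of 2 surfaces the failure back to the agent. Treat all three as assumptions to check against the Claude Code hooks documentation. `pytest` stands in for whatever test runner the repo uses.

```python
import json
import subprocess
import sys

# Tool names assumed to represent file edits; adjust to match the hook matcher.
EDIT_TOOLS = {"Edit", "Write", "MultiEdit"}


def should_run_tests(event: dict) -> bool:
    """Run the suite only after an editing tool touched a Python source file."""
    if event.get("tool_name") not in EDIT_TOOLS:
        return False
    path = event.get("tool_input", {}).get("file_path", "")
    return path.endswith(".py")


def main() -> int:
    """Read the hook event from stdin and run the tests when it qualifies."""
    event = json.load(sys.stdin)
    if not should_run_tests(event):
        return 0
    result = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
    if result.returncode != 0:
        # Send the tail of the test output to stderr so the failure is visible.
        sys.stderr.write(result.stdout[-2000:])
        return 2
    return 0

# When installed as a hook script, end the file with: sys.exit(main())
```

Filtering on the tool name and file extension keeps the suite from running after reads or documentation edits, so the hook only costs time when code actually changed.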

AccelaStudy AI is, in the end, an incredible product, and I didn't write a single line of its code. It is Claude's masterpiece. I am the operator who pointed the model at the target.

"Create like a god; command like a king; work like a machine."

This philosophy comes from the famous Romanian sculptor Constantin Brâncuși and is what I now live by.

Claude Code has given me the power of creation, to transform world-changing ideas into stunning reality.

Claude followed command after command after command, over 2,000 of them, tirelessly working to execute my vision.

However, I did work like a machine.

In my favorite scene from Jurassic Park, John Hammond says memorably that "creation is an act of sheer will". Delivering AccelaStudy AI, even with the work being done almost entirely by Claude Code, required the mental resolve and determination to sit at my desk an average of 120+ hours a week for almost 12 weeks, prompting Claude along, reviewing the work. That left only a handful of hours a day for sleep, eating, exercising, and spending time with family and friends. I should mention that I also worked a full-time job during 8 of those daily hours.

It was my deadline, optimistically set early on when it seemed like I'd be done in no time at Claude Code pace. But, as with any project that has to go to production, the 80/20 rule applied, and it showed throughout this effort. It's the kind of ballooning that happens when the "user sign up" feature expands to include social media sign-ups, forgot password and MFA flows, and regulatory account closure requirements. In the end, even with all the hours, I still had to move the launch by 3 weeks. But it did launch.

Giving Back

Middle school and high school curriculum is free. For students. For schools. For homeschoolers. For anyone teaching kids who deserve adaptive, personalized learning without a paywall. The K-12 curriculum rolls out across summer and fall 2026, available to any student, school, or family at no cost. Pass-probability forecasting, root-cause gap detection, real adaptive sequencing — at no cost, ever, full stop.

Adaptive learning shouldn't be a luxury good. The kids whose families can afford $4,000 tutors have always had the edge over the kids whose families can't. AccelaStudy AI doesn't know what a family's bank balance looks like, and that's the point.

The paid products fund the free K-12 work. We are launching with professional certifications to kickstart revenue. The AP catalog, AccelaStudy AI Languages, AccelaStudy AI English (IELTS + TOEFL, coming this summer), and the graduate-and-professional tests (GRE, GMAT, MCAT, and LSAT, coming in October) are all paid products. The college-entrance tests (SAT, ACT, PSAT) may also go free — that call is still open.

A solo founder, working with Claude, can build all of this in 80 days. The implication for what the rest of us — teachers, students, families — can attempt is what I want people to take from this story.

The ceiling moved. Look up.

Charles Sieg is the founder of Renkara Media Group. AccelaStudy AI is live at accelastudy.ai. The full daily leverage dataset is public at charlessieg.com/leverage. The 19 internal Renkara tools, each tagged "100% Built by Claude," are listed at renkara.com/tools. The AVIAN patent portfolio summary lives at avian.renkara.com.

Leverage Record: May 10, 2026

Sun, 10 May 2026 23:59:00 GMT

Twenty-seven tasks. May 10, 2026 weighted to 21.3x leverage across 524.0 human-equivalent hours in 1,478 Claude-minutes. The day was a pre-launch sweep across compliance and security remediation, audit-driven cleanups, press-kit asset regeneration, transactional email template overhauls, sister-site internationalization, and launch-teaser polish. Supervisory leverage closed at 251.5x.

13.1 weeks of human-equivalent throughput in 24.6 hours of Claude wall-clock. The 68.6x ceiling came from Compliance HIGH remediation: bumped a cloud database cluster RDS retention 1d→7d, removed localhost from an admin service prod CORS, added auth to 7 unauth anomalies endpoints i...; the 2.2x floor sat at Pre-launch calibration iteration: diagnosed v11 inverse-formula regression, designed and tested asymmetric-sigma fixes (v12, v13) via 12-journey a professional cert sweeps, reve....

Task Log

#	Task	Human Est.	Claude	Sup.	Factor	Sup. Factor
1	Compliance HIGH remediation: bumped a cloud database cluster RDS retention 1d→7d, removed localhost from an admin service prod CORS, added auth to 7 unauth anomalies endpoints in an admin tool (421 tests pass), wrote 1066-line Incident Response Plan + 915-line Disaster Recovery Plan (12 sections each with Mermaid di...	32.0h	28m	4m	68.6x	480.0x
2	Audit findings remediation: BLOCKER fixes (an onboarding service test threshold + 21 orphan adjacency entries removed), CRITICAL #2 fix (HttpOnly refresh-cookie + in-memory tokenStore across an auth service + a web client + a desktop client, 540 backend tests + 212 frontend tests pass), an auth service coverage 71→7...	120.0h	110m	10m	65.5x	720.0x
3	Run all 9 an inference engine audits (canonical, ecosystem inventory, content, accessibility, health-check, security, documentation, compliance, full-readiness) — 7 reports written to the monorepo audits/reports/	80.0h	95m	1m	50.5x	4800.0x
4	a learning platform press-kit features 1/2/4/5/6: mastery seal, transfer-credit banner, root-cause diagnosis modal+endpoint, Monte Carlo distribution chart, past-readiness trend chart+endpoint — 5 UI components, 2 engine endpoints, 4 readiness helpers, 57 tests, 5 verified captures	50.0h	75m	3m	40.0x	1000.0x
5	Fix all HIGH/MEDIUM/LOW findings from an inference engine documentation audit (2026-05-10): README Features/Tech sections, stale CHANGELOGs, missing CI/CD sections, cross-reference links, missing docs for libs	20.0h	45m	3m	26.7x	400.0x
6	Post-practice-exam autopilot remediation: submit_exam auto-injects wrong-node IDs into sequencing remediation queue; new POST /entities/{id}/remediation-session endpoint; ExamResults rewritten with Start-targeted-study CTA + See-why diagnosis hook on weakest gap; 19 tests (11 BE + 8 FE) all passing, no regressions	14.0h	32m	2m	26.2x	420.0x
7	Roll the new email design across the remaining 22 transactional templates: welcome, invitation, comp-welcome, account-update/closed/deleted, daily-study-reminder, streak-at-risk, elo-decay-warning, elo-level-achieved, course-completed, exam-passed, weekly-progress, win-back, 5 exam-reminders (30d/14d/7d/3d/1d), recr...	11.0h	28m	2m	23.6x	330.0x
8	Generate full launch demo: lived-in Charles a professional cert dashboard via engine seeding + DEV auth bypass, 14 retina press-kit screenshots, 64 site feature-mock screenshots (32 labels × 2 themes), ElevenLabs narration, Ken Burns 90-sec demo video, brand-styled lower-thirds, press-kit zip wired with assets, webs...	14.0h	40m	5m	21.0x	168.0x
9	Rebuild shared feature page template Supernova-style: strip fake browser chrome (red/yellow/green dot row + URL chip), move hero shot below H1/subtitle/CTA at full container width, pair each how-it-works step crop inline with its paragraph. Add new feature-shot CSS class (rounded + soft elevation + theme-aware light...	7.0h	22m	2m	19.1x	210.0x
10	Press-kit full sweep: 124 PNGs regenerated (62 slugs × 2 themes), 4 new onboarding heroes (resume-dropzone with new drag handlers, credential-mapping preview route, calibration-quiz, dashboard-pre-credited), Beat-0 added to remediation video (exam-finishing → submit → results → breakdown → gaps → plan → session), 68...	30.0h	95m	5m	18.9x	360.0x
11	Remediation video + plan-preview modal + ExamReview fix + delete-entity completeness audit & fix (engine multi-layer purge + admin cascade) — RemediationPlanModal, Exam.tsx review payload, targetconcepts endpoint extension, ExamAttemptRepository.deletefor_entity, multi-repo commits + pushes, 22s remediation-loop.m...	22.0h	70m	4m	18.9x	330.0x
12	Launch-night polish batch: cross-domain field rename, resume dropzone drag handlers, trendline animation boost, ready-to-test button nowrap, lab cards line-clamp removal, micro-challenge goal cutoff, minimal-pair scoring + prompt rewrite, error-detection JSON pretty-print + hljs syntax highlighting, scenario rehype-...	18.0h	60m	6m	18.0x	180.0x
13	Brand pass on a sister marketing site (always a learning platform, never a learning platform alone), repricing to $29/$23 from $59/$47 across site.yml, content stubs, both templates, README, comparison tables, FAQs; hero copy centered with break before Adaptive, side-gradient rebalanced for centered text.	1.5h	6m	2m	15.0x	45.0x
14	a sister marketing site i18n full rollout (Phases 2-5 + 1B mechanism + a newsletter platform wire-up): 7 LLM-generated translations (hi, zh, es, ar, pt, ko, ja) of ~150 strings each across home + pricing; per-language content stubs; language picker in shared header gated on Custom.Languages; hreflang alternates with...	18.0h	75m	5m	14.4x	216.0x
15	Shared overlay i18n full rollout via tiered approach: Tier A (full conditional i18n on about/accessibility/platforms/faq with translations across 7 languages, ~400 string-language pairs), Tier B (chrome i18n on features/feature, features/activities, blog, post -- per-feature/per-post content stays English), Tier C (...	14.0h	60m	4m	14.0x	210.0x
16	a notification service email template overhaul: convert 4 an HTML design tool-generated HTML designs (Tailwind CDN + JS, won't render in mail clients) into email-safe table-based HTML with inline CSS, system-font fallbacks, dark-mode @media swaps, Outlook VML CTAs, mobile-responsive media query, plain-text alternati...	8.0h	35m	3m	13.7x	160.0x
17	Move Whats New release notes out of the SPA bundle: new GET /api/v1/whats-new route in an API service proxies markdown from an assets CDN/whats-new.md (engine content bucket) with 60s cache; new clients/a web client/src/api/whatsNew.ts client; rewrote WhatsNewPanel to use a frontend library Query (refetches on every...	4.0h	18m	2m	13.3x	120.0x
18	Fleet-wide nav + CSS + content sweep: (1) hide desktop CTA on	6.0h	28m	5m	12.9x	72.0x
19	Generate two missing daily leverage blog posts (May 8 + May 9): fetch records from Leverage Manager API, sanitize 48 task descriptions for public disclosure, write Python sanitization pass with ~80 replacement rules, build markdown posts with task tables + aggregate stats + analysis sections, update about-page post...	6.0h	30m	1m	12.0x	360.0x
20	Four a web client UI fixes: (1) AnalyticsPanel restack — Accuracy/Drift/Recs stacked left, wider Learning Style Fingerprint right with wrapping legend labels; (2) added productLabel slot to design-system Brand and wired Certs badge into AppShell matching marketing-site wordmark pattern; (3) fixed build-catalog doubl...	6.0h	32m	4m	11.2x	90.0x
21	a marketing site launch teaser: add 4-cell DD:HH:MM:SS countdown clock to midnight Pacific (2026-05-11T00:00:00-07:00) above the teaser video; deploy to production (clean rebuild + S3 sync + CloudFront invalidation), then restore staging to real home page; push websites repo	3.0h	18m	1m	10.0x	180.0x
22	Email template polish + a payment provider PDF invoice capture wired through a billing service. Templates: drop Manage Notifications link, swap billing email to a marketing site, rebuild receipt as edge-to-edge full-width band, add an inference engine bird mark to header. Backend: alembic migration 005 adds invoice_...	5.0h	30m	4m	10.0x	75.0x
23	Fleet sweep: disable pricing/subscribe CTAs across all 6 sister sites (a standardized test/a standardized test/ap/test-prep/english/languages) — pricing.jinja Start-Monthly/Annual/Product CTAs and home Get-Started buttons all swapped to /#signup Notify-Me-at-Launch; hide Platforms entry from footer Product column on...	5.0h	30m	3m	10.0x	100.0x
24	a sister marketing site i18n Phase 1A: extracted ~150 user-visible strings across home + pricing into i18n/en.jinja, refactored both templates to load via Jinja {% import %} (since {% include %} doesn't propagate set), renamed Jinja-conflicting items->entries, added bilingual draft-translation banner gated on non-Eng...	4.0h	28m	6m	8.6x	40.0x
25	Hide placeholder testimonials across all a learning platform sister sites — audit identified a standardized test/ap/test-prep with ungated TESTIMONIALS sections (a standardized test/english/aces/enterprise clean; a marketing site already had showsocialproof=false). Wrapped each section in {% if false %}, parallel-...	2.5h	18m	1m	8.3x	150.0x
26	9-beat launch press-kit capture: audited decoy playwright code (16 page objects + headless_runner against current app-web — 71% selectors stale), wrote smart engine seeder with peek-session correct-answer discovery (150 interactions, 69% accuracy), wrote 700-line Playwright capture script with localStorage planting...	14.0h	130m	25m	6.5x	33.6x
27	Pre-launch calibration iteration: diagnosed v11 inverse-formula regression, designed and tested asymmetric-sigma fixes (v12, v13) via 12-journey a professional cert sweeps, reverted v13 to v12, built + pushed cloud boot cache to S3, committed + deployed v12 to prod via CodePipeline, wrote post-launch entity-embeddin...	9.0h	240m	12m	2.2x	45.0x

Aggregate Statistics

Metric	Value
Total tasks	27
Total human-equivalent hours	524.0
Total Claude minutes	1478
Total supervisory minutes	125
Total tokens	6,963,000
Weighted average leverage factor	21.3x
Weighted average supervisory leverage factor	251.5x
Human-equivalent weeks	13.1

Analysis

The day's leverage distribution matters more than the headline figure. The 68.6x ceiling came from Compliance HIGH remediation: bumped a cloud database cluster RDS retention 1d→7d, removed localhost from an admin service prod CORS, adde...; the 2.2x floor was Pre-launch calibration iteration: diagnosed v11 inverse-formula regression, designed and tested asymmetric-sigma fixes (v12, v13) via 12-.... Tasks at the top of the distribution share a shape: tightly-scoped specifications, clear success criteria, and minimal integration ambiguity. The AI doesn't need to discover anything new; it executes against an explicit target.

The supervisory leverage figure (251.5x today) tracks something orthogonal to wall-clock leverage. It's the ratio of human-equivalent output to human prompt-writing time. It stays high even on lower-leverage days because supervisory minutes scale with task count, not with the human-hour estimate; a 20-minute task and a 4-hour task can both be specified in two minutes of human prompt-writing.
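The two ratios can be reproduced from any row of the task log. The sketch below infers the formulas from the published figures: leverage is human-equivalent minutes over Claude minutes, supervisory leverage is human-equivalent minutes over supervisory minutes, and the weighted day figure is the same ratio taken over the day's totals. The `TaskRecord` class and `combine` helper are illustrative names, not the tracker's actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskRecord:
    human_hours: float          # estimated human-equivalent effort
    claude_minutes: float       # Claude wall-clock time
    supervisory_minutes: float  # human prompt-writing time

    @property
    def leverage(self) -> float:
        return self.human_hours * 60 / self.claude_minutes

    @property
    def supervisory_leverage(self) -> float:
        return self.human_hours * 60 / self.supervisory_minutes


def combine(records: list[TaskRecord]) -> TaskRecord:
    """Sum the columns, so the ratios on the result are the weighted averages."""
    return TaskRecord(
        human_hours=sum(r.human_hours for r in records),
        claude_minutes=sum(r.claude_minutes for r in records),
        supervisory_minutes=sum(r.supervisory_minutes for r in records),
    )


# Task 13 from the May 10 log: 1.5h estimated, 6m Claude, 2m supervision.
row13 = TaskRecord(1.5, 6, 2)
print(f"{row13.leverage:.1f}x, {row13.supervisory_leverage:.1f}x")  # 15.0x, 45.0x

# The May 10 totals from the aggregate table.
day = TaskRecord(524.0, 1478, 125)
print(f"{day.leverage:.1f}x, {day.supervisory_leverage:.1f}x")      # 21.3x, 251.5x
```

Weighting by totals rather than averaging the per-task factors keeps one tiny high-leverage task from dominating the headline number.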

May 10 was the final-prep day before web GA. The work clustered tightly: half the tasks were either audit-driven compliance fixes or asset/visual polish for the launch surface, and the other half were i18n + brand-pass rolls across the marketing-site fleet. That bimodal shape produced steady mid-band leverage rather than runaway high or low extremes; the work was real, but well-bounded.

Across the 27 tasks, the day produced roughly 13.1 weeks of senior-engineer-equivalent throughput in 24.6 hours of model wall-clock. That ratio is the practical answer to the question of how much output a single operator can move per day when the model handles the execution and the operator handles the direction.

Leverage Record: May 9, 2026

Sat, 09 May 2026 23:59:00 GMT

Thirty-eight tasks. May 9, 2026 weighted to 26.9x leverage across 632.5 human-equivalent hours in 1,410 Claude-minutes. The day was a pre-launch sweep across iOS web parity, an end-to-end status site stand-up, a fleet-wide accessibility audit fix, an analytics platform overhaul, and a marketing-site canon-swap propagation. Supervisory leverage closed at 223.2x.

The volume reflects a launch deadline: 15.8 weeks of human-equivalent throughput in twenty-three and a half hours of Claude wall-clock. The 85.7x ceiling came from an 8-phase rebuild of the mobile client to match the web client, while the floor in the table sits at 6.7x on a four-tab settings restructure with extensive design-token migration. The middle of the distribution is dominated by accessibility audits, content-pipeline integrity work, and the ground infrastructure for the launch site.

Task Log

#	Task	Human Est.	Claude	Sup.	Factor	Sup. Factor
1	iOS web-parity rebuild in 8 phases: phase machine restructure, an app shell+a top-nav component, launch routing fix, HomeView (slim hub), multi-course Dashboard, CoursesView+CourseDetailView, SettingsView split, container/transitions/radius polish	50.0h	35m	8m	85.7x	375.0x
2	an analytics platform: date-range fix + SSE-driven realtime ticks + bounce/duration/GeoIP + funnel ordering, attribution models, webhook handlers, CSV export, IP exclusions, public dashboard share	60.0h	44m	6m	81.8x	600.0x
3	a status site: built and deployed a status site end-to-end. New clients/a status site SPA (React 19/Vite/TS), a monitoring tool schema + public read API + alembic migration + 12 sanitization tests, admin-service banner.channels JSONB + public banners endpoint,	80.0h	70m	8m	68.6x	600.0x
4	iOS Help fixes: tab strikethrough fix (overlay alignment), port 40 help guides verbatim from a help-doc source file → a help-doc target file (sidebar+content layout, iPhone sheet), embed 5 legal docs (privacy, terms, accessibility, trademarks,	24.0h	22m	4m	65.5x	360.0x
5	iOS app facelift: SwiftUI design system port (tokens, typography, 14 components), a brand sans font bundling, a design theme shim, migrate 6 high-traffic views (LoginView, ResultsView, WelcomeView, ProfileView, DashboardView, BugReportView), pbxproj patch, docs	40.0h	50m	6m	48.0x	400.0x
6	iOS Settings/Profile/Help web parity: fix sign-in button (a top-nav component overflow on iPhone), build new HelpView (5-tab Overview/FAQ/Guides/WhatsNew/Legal + .help phase + bug-report bridge), refactor ProfileView into hero+4-tab (Profile/Resume/Subscription/Account),	18.0h	24m	5m	45.0x	216.0x
7	Build reusable static-site Terraform module (S3+OAC+CloudFront+ACM+Route53) with edge-enforced CloudFront-Function an access gate gate, plus english-accelastudy-website root stack (prod imports existing E51I2L5WDXNNS via auto-discovering import.sh, staging fresh-provisions with the gate).	9.0h	14m	4m	38.6x	135.0x
8	Upgrade a language-exam product site (a language-proficiency exam product) to multi-page subscription product site: standalone /pricing/ page with comparison table & FAQ, switched nav to standalone routes, live header CTA, sister-site parity in Custom block, README + CHANGELOG updated. Verified clean build (26 pages,	5.0h	8m	3m	37.5x	100.0x
9	Major engine fix + audit expansion. (1) Built a backfill script - deterministic pairid linkage backfill across 234 synthesized domain packages. Drove pairid coverage from 32.3% to 54.1% across 1.29M questions, with the worst cert domains (a professional cert 0.1%->38.1%, a professional cert, a professional cert,	16.0h	28m	6m	34.3x	160.0x
10	a status site, round 2: closed remaining gaps from initial deploy. a monitoring tool frontend SiteSettingsForm gets publicstatusvisible/group/publicdisplayname/publicdescription fields; new IncidentDetailModal lets operators set severity/title/publicvisible and post markdown updates (investigating→identified→mon...	32.0h	65m	2m	29.5x	960.0x
11	MEDIUM cleanup wave: 9 reduced-motion guards + 8 sr-only utilities + 594 h1->h2 codemod demotions across 241 files + 50 input-adjacent-label codemod pairings + 5 hand-fixes (BillingPage h1, Blog.jsx h1s, purchase-service globals.css, a legacy product site SCSS sr-only, charlessieg-redesign exemption);	14.0h	30m	1m	28.0x	840.0x
12	Fleet-wide a11y fix sweep across 56 UI repos: 2,460 HIGH findings fixed (2,235 via a simulator suite label-pairing codemod + 17 manual + 5 wave-1 activities-react + 59 wave-3 client apps + 136 wave-4 tools fleet + 8 a simulator suite primitives);	60.0h	130m	5m	27.7x	720.0x
13	Final wave: clear remaining 192 HIGH a11y findings; patched 71 stale cloudops dist HTML files with lang=en (Python sed), dispatched focused subagent to fix 118 of 120 a simulator suite view-level inputs/svgs/clickable-divs (NetworkTopology+PolicyEditor+PacketInspector+ProjectBoard+a top-nav component+30 more dashboard...	16.0h	35m	1m	27.4x	960.0x
14	Port web Help Center guide articles to iOS (40 docs, 7 categories) and rebuild Guides tab with sidebar+content layout	8.0h	18m	5m	26.7x	96.0x
15	Drafted Making a learning platform Accessible to All across 3 sites: a personal site (~3500-word technical deep-dive with mermaid wave diagram + 6 reference tables + concrete codebase counts: 2185 TSX/JSX files, 2527 native buttons, 3486 form inputs, 3019 ARIA uses, 1530 aria-labels, 752 aria-hidden, 101 role=button,	12.0h	28m	2m	25.7x	360.0x
16	Drove the a structured-content spec catalog spec audit from 257 LOW (post-prior-pass) to absolute zero across all four severities. Tightened a spec auditor (broadened verb whitelist, fixed cross-domain prefix detection, normalized weight-sum auto-fix to handle any non-100 sum, relaxed cross-domain check to >=1,	14.0h	35m	4m	24.0x	210.0x
17	Refactor CLAUDE.md chain: extract patent checklist, repo map, ADR rules, SSM, domain inventory, synthesis pipeline into subtree files; relocate API keys to mode-600 env file outside prompt	2.5h	7m	3m	21.4x	50.0x
18	Built deterministic Python accessibility-audit checker (15 rules, 56-repo discovery, brace/quote-aware JSX tokeniser, JSON+MD output, mode-aware exit codes); updated accessibility-audit.md with Phase 0 spec citing the script; ran fleet-wide audit (414 HIGH + 2553 MEDIUM identified);	24.0h	70m	4m	20.6x	360.0x
19	Content audit reconciliation: dropped 30 of 40 findings (all 11 CRITICALs + all 14 catalog/canonical MEDIUMs + 5 LOWs). Wrote a dedup script to remap 33 collided exam_code values to vendor-correct codes (a professional cert Plus suite -> a professional cert/a professional cert/a professional cert/a professional cert/a ...	4.0h	12m	2m	20.0x	120.0x
20	Restructure iOS ProfileView to mirror web Profile 4-tab layout (hero + Profile/Resume/Subscription/Account)	6.0h	18m	5m	20.0x	72.0x
21	an analytics platform last-hour delta on MetricCards (backend + frontend), an admin tool SoundProvider + priority-aware notification/anomaly cues, and tool-specific cues across foundry/chronicle/trellis/herald/meridian/envoy/fulcrum/tribe (plus pre-existing tsc fixes)	18.0h	55m	5m	19.6x	216.0x
22	Sound effects follow-ups: nuked an inference engine root node_modules + made tools self-contained, Terraform stack lib-pipelines/ provisioning a build service + a CI/CD pipeline for all 6 publishable @avian/* libs (5 imported + new sound-effects), tool-specific cues wired into chirp/courier/vigil/slate/packed	16.0h	50m	4m	19.2x	240.0x
23	Build CoursesView and CourseDetailView for a learning platform iOS app (web parity)	6.0h	22m	8m	16.4x	45.0x
24	Port web legal docs into iOS HelpView: embed privacy policy, terms, accessibility, trademarks, and credits as a scrollable in-app legal-doc node tree; rebuild Legal tab with iPad sidebar and iPhone sheet flow	6.0h	22m	5m	16.4x	72.0x
25	Sound effects fleet rollout: created a shared sound-effects library v0.1.0 standalone package, uploaded 28 mp3s to an assets CDN with CORS, wired SoundProvider into 21 tools (3 needed manual handling, fixed pre-existing TS/JSX errors in cadence/courier/dossier along the way), committed + pushed each tool repo.	24.0h	90m	8m	16.0x	180.0x
26	an analytics platform: webhook hardening (require pulsesiteid, no default fallback) + dedicated 30 req/min webhook rate limit; deployed and verified live	4.0h	16m	2m	15.0x	120.0x
27	Mode-1 a11y audit on a web client + 6 React libs: 11 HIGH fixes (RemoteBanners aria-live, HelpCenter dialog focus, LabConsole tab pattern, ProgressBar/ExamScoreReport/Sidebar progress+log roles, ProceduralStepSequencing focus-visible, InteractiveMap :focus-visible, BugReportModal dialog semantics,	12.0h	55m	3m	13.1x	240.0x
28	a marketing site staging: redeploy with an access gate, restore real home page + /platforms/ nav link, remove /vote teaser; ship stage-isolated dist// build directories in narrative CMS so parallel Staging+Production never overwrite each other (3 unit tests, doc updates across 3 repos)	7.5h	36m	4m	12.5x	112.5x
29	Build real SettingsView for iOS app mirroring web settings page (appearance, language, study prefs, voice, accessibility, privacy, about sections)	3.0h	15m	5m	12.0x	36.0x
30	Migrate a SwiftUI view to an inference engine iOS design system (design tokens, design typography, a button component, a card component, an empty-state component)	1.5h	8m	3m	11.2x	30.0x
31	Rebuild a SwiftUI view to multi-course portfolio matching web Dashboard.tsx	4.0h	22m	5m	10.9x	48.0x
32	Migrate a SwiftUI view (1521 lines) to an inference engine iOS design system tokens, typography, and components	3.0h	18m	3m	10.0x	60.0x
33	a marketing site canon-swap propagation: replace hardcoded counts with [[canon:...]] placeholders across press, about, how-it-works, faq, accessibility, pricing, courses, free, 5 feature pages and shared pricing-card partial; fix stale patent counts (27→29 filings, 593/613→637 claims);	5.0h	35m	2m	8.6x	150.0x
34	canon-swap sweep across 7 sister sites: mcat/lsat/ap/test-prep/english (OtherProducts blocks), a corporate site corporate (10 files - patent counts on index/about/ip/timeline/products/dossier/etc), enterprise (activity-formats); fix stale 27/28 filings → 29 + 593/613 claims → 637 + 20→13 activity formats;	4.0h	30m	1m	8.0x	240.0x
35	Activities catalog reorg (default+5 addons across 62 categories) + 4 web bug fixes (data-driven Service Match applicability, Privacy footer link, Bio Profile→Resume, unenroll→autopilot cascade) + Settings prefs gray-out; three repos committed and pushed	14.0h	110m	18m	7.6x	46.7x
36	pre-launch staging audit + fixes: add og:image fallback in _metadata.jinja, strip /index.html from canonical URLs, add Exclude/ExcludeWhere collection filters in narrative CMS (4 unit tests), exclude a deferred category + a deferred category categories + 85 child course pages from rendering,	5.0h	40m	2m	7.5x	150.0x
37	Migrate a SwiftUI view (1108 lines) to an inference engine iOS design system: replace a design theme tokens with design tokens, update typography to design typography presets, replace ad-hoc cards/buttons with a card component/a button component/a badge component/an empty-state component/an inline-alert component,	3.0h	25m	3m	7.2x	60.0x
38	Restructure a SwiftUI view to 4-tab layout (General/Autopilot/Accessibility/Privacy) with Audio section, extended-time toggle, Privacy Policy link, and SoundManager integration	2.0h	18m	5m	6.7x	24.0x

Aggregate Statistics

Metric	Value
Total tasks	38
Total human-equivalent hours	632.5
Total Claude minutes	1410
Total supervisory minutes	170
Total tokens	6,445,000
Weighted average leverage factor	26.9x
Weighted average supervisory leverage factor	223.2x
Human-equivalent weeks	15.8
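
The two weighted averages in the table above are consistent with ratios of the day's totals rather than means of the per-task factors. A minimal Python sketch under that assumption follows; the function name `weighted_leverage` and the 40-hour week are illustrative assumptions, not taken from the original tooling.

```python
def weighted_leverage(human_hours: float, minutes: float) -> float:
    """Total human-equivalent minutes divided by total minutes spent."""
    return human_hours * 60 / minutes


# Totals copied from the Aggregate Statistics table above.
total_hours = 632.5
claude_minutes = 1410
supervisory_minutes = 170

leverage = weighted_leverage(total_hours, claude_minutes)
sup_leverage = weighted_leverage(total_hours, supervisory_minutes)
weeks = total_hours / 40  # assumption: one human-equivalent week is 40 hours

print(f"{leverage:.1f}x")      # 26.9x
print(f"{sup_leverage:.1f}x")  # 223.2x
print(f"{weeks:.1f} weeks")    # 15.8 weeks
```

Dividing totals weights each task by the time it consumed, so one long low-leverage task pulls the headline figure down further than a simple average of the Factor column would.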

Analysis

The day's leverage distribution matters more than the headline figure. The 85.7x ceiling came from the iOS web-parity rebuild (8 phases: phase machine restructure, an app shell + a top-nav component, ...); the 6.7x floor was the restructure of a SwiftUI view to a 4-tab layout (General/Autopilot/Accessibility/Privacy) with an Audio section. Tasks at the top of the distribution share a shape: tightly scoped specifications, clear success criteria, and minimal integration ambiguity. The AI doesn't need to discover anything new; it executes against an explicit target.

The supervisory leverage figure (223.2x today) tracks something orthogonal to wall-clock leverage. It's the ratio of human-equivalent output to human prompt-writing time. It stays high even on lower-leverage days because supervisory minutes scale with task count, not with the human-hour estimate; a 20-minute task and a 4-hour task can both be specified in two minutes of human prompt-writing.

Across the 38 tasks, the day produced roughly 15.8 weeks of senior-engineer-equivalent throughput in 23.5 hours of model wall-clock. That ratio is the practical answer to the question of how much output a single operator can move per day when the model handles the execution and the operator handles the direction.

Leverage Record: May 8, 2026

Fri, 08 May 2026 23:59:00 GMT

Ten tasks. May 8, 2026 weighted to 22.4x leverage across 108.5 human-equivalent hours in 291 Claude-minutes. The day was dominated by an internal cross-domain warm-start architecture rolled out across engine, web, desktop, and mobile clients in five phases, plus a deep data-integrity audit and an IP working-draft amendment. Supervisory leverage closed at 323.9x.

Compared to the prior day, this one ran tighter: about a third of the human-equivalent hours but a higher weighted factor, because most tasks were tightly scoped engine or client wiring with explicit success criteria. The 53.3x ceiling came from a 5-phase routing implementation; the 4.7x floor was a session-recovery commit-bundling task where the human reviewed each step.

Task Log

#	Task	Human Est.	Claude	Sup.	Factor	Sup. Factor
1	Browse-before-auth web client implementation: all 5 phases (router public/gated split, pendingIntent + resumeAfterAuth + AuthCallback dispatcher, anonymous CourseDetail with auth-aware Enroll, AppShell anonymous chrome with sign-in CTA, deep-link returnTo verified).	40.0h	45m	1m	53.3x	2400.0x
2	an internal ADR Phase 1 engine: a Bayesian warm-starter module, a posterior model trustflagged field, mastery trust gate, autopilot creationRequest/Response field expansion, 5 CrossDomainConfig fields, cloud.toml section, createautopilot handler hook, 24 new unit tests across 3 files; 3,473 fast tests pass	7.0h	16m	0m	26.2x	840.0x
3	Pair-to-node ref repair across 247 broken domains via embedding cosine match (146,762 pairs re-anchored, mean cosine 0.91). Bulk readiness-gate stamp across 178 manifests derived from exam metadata. Post-audit shows 319 of 320 viable domains HEALTHY (was 73). an internal ADR decision log updated.	12.0h	30m	1m	24.0x	720.0x
4	Amend an IP working draft (several new claims, a spec subsection, alt embodiment, related-inventions paragraphs for E and H), draft an internal ADR (cross-domain posterior warm-starting), update canonical claim totals 633->637 across 11 portfolio docs, regenerate Application_BB.pdf	7.0h	18m	2m	23.3x	280.0x
5	an internal ADR Phase 2 client wiring (web + Electron): API types, env flag, autopilot store extensions, CrossDomain fast-track buttons, CourseDetail savings callouts, SkillsCarryingOverPanel warm-start data, i18n keys, Electron screen state machine transferContext threading;	5.0h	14m	0m	21.4x	1000.0x
6	iOS cross-domain fast-track parity (EngineClient types, AppState TransferContext, CrossDomainView fast-track button, AutopilotView pre/post-activation callouts, env flag), invite-code gate removal (SiteKeyService/SiteKeyGateView delete + pbxproj cleanup + Localizable.xcstrings auto-clean),	5.5h	18m	0m	18.3x	825.0x
7	Domain pair-to-node integrity audit (323 domains, 76% degraded), EB leaf catastrophic-regression fix (gate on domainobstotal instead of raw pair_stats; acc92 crashed 1.0→0.001 on broken-pair domains), per-domain readiness gates on CLF/SAA/a professional cert/ANS manifests, 12 new regression tests,	18.0h	65m	4m	16.6x	270.0x
8	an internal ADR Phase 3 artifacts: 5 reference profile YAMLs (CLF→SAA, SAA→SAP, a professional cert→a professional cert, a professional cert→a professional cert, a professional cert→a professional cert), runwarmstartvalidation.py synthetic A/B harness (~500 lines, parses clean),	4.0h	15m	0m	16.0x	600.0x
9	Built shared NLI server (FastAPI/MPS) + LM Studio embeddings client + engine wiring so synthesis pipeline can run 10-way concurrent without OOM	6.5h	25m	4m	15.6x	97.5x
10	Resume an internal ADR cross-domain warmstart work after crash: bundle drift into 4 focused engine commits + 1 web a11y commit, add Phase 11 to an audit harness content audit (md spec + py implementation) catching missing decoy validation prerequisites and 26 pre-existing duplicate exam_codes,	3.5h	45m	7m	4.7x	30.0x

Aggregate Statistics

Metric	Value
Total tasks	10
Total human-equivalent hours	108.5
Total Claude minutes	291
Total supervisory minutes	20
Total tokens	1,425,000
Weighted average leverage factor	22.4x
Weighted average supervisory leverage factor	323.9x
Human-equivalent weeks	2.7

Analysis

The day's leverage distribution matters more than the headline figure. The 53.3x ceiling came from the browse-before-auth web client implementation (all 5 phases, starting with the router public/gated split); the 4.7x floor was resuming an internal ADR cross-domain warm-start work after a crash, which meant bundling drift into 4 focused engine commits plus 1 web a11y commit. Tasks at the top of the distribution share a shape: tightly scoped specifications, clear success criteria, and minimal integration ambiguity. The AI doesn't need to discover anything new; it executes against an explicit target.

The supervisory leverage figure (323.9x today) tracks something orthogonal to wall-clock leverage. It's the ratio of human-equivalent output to human prompt-writing time. It stays high even on lower-leverage days because supervisory minutes scale with task count, not with the human-hour estimate; a 20-minute task and a 4-hour task can both be specified in two minutes of human prompt-writing.
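
Four rows in the May 8 task log show 0m of supervision next to a finite Sup. Factor, which suggests the Sup. column is rounded for display. The sketch below backs out approximate supervisory minutes per row, assuming Sup. Factor is human-equivalent minutes divided by supervisory minutes. That formula and the helper name `implied_sup_minutes` are assumptions, and the recovered values are only as precise as the rounded factors in the table.

```python
# (human-equivalent hours, Sup. Factor) for the ten May 8 rows, in table order.
rows = [
    (40.0, 2400.0), (7.0, 840.0), (12.0, 720.0), (7.0, 280.0), (5.0, 1000.0),
    (5.5, 825.0), (18.0, 270.0), (4.0, 600.0), (6.5, 97.5), (3.5, 30.0),
]


def implied_sup_minutes(hours: float, sup_factor: float) -> float:
    """Assumed inverse of: sup_factor = hours * 60 / supervisory_minutes."""
    return hours * 60 / sup_factor


implied = [implied_sup_minutes(h, f) for h, f in rows]
total_hours = sum(h for h, _ in rows)
total_sup = sum(implied)

print([round(m, 1) for m in implied])
print(round(total_sup, 1))                     # 20.1
print(round(total_hours * 60 / total_sup, 1))  # 323.9
```

Under this reading the unrounded supervisory total is about 20.1 minutes, which reproduces the reported 323.9x; dividing by the displayed total of 20 would give 325.5x instead.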

Across the 10 tasks, the day produced roughly 2.7 weeks of senior-engineer-equivalent throughput in 4.8 hours of model wall-clock. That ratio is the practical answer to the question of how much output a single operator can move per day when the model handles the execution and the operator handles the direction.

Leverage Record: May 7, 2026

Thu, 07 May 2026 23:59:00 GMT

Twenty tasks. May 7, 2026 weighted to 10.9x leverage across 304.5 human-equivalent hours in 1676 Claude-minutes. Admin/ops dominated the day's volume. Supervisory leverage closed at 188.4x.

The day's ceiling was 68.6x (40h human in 35 Claude-minutes) on the pre-launch burndown: fixing 3 holdout partial labs (git-lab-02, a cloud cert exam-lab-16, a cloud cert exam-lab-14) and shipping Phase-2 polish for 5 simulators. The floor was 0.7x on the marketing site courses page: tightening the card cap from 20 to 15, stripping the Certified word from 99 course titles via a template filter (cards + course pages), and reordering VMware after Cisco in Networking and Salesforce/SAP/Oracle after IBM in Enterprise. Median Claude-minutes per task: 55; median human-equivalent hours per task: 6.

Task Log

#	Task	Human Est.	Claude	Sup.	Factor	Sup. Factor
1	Pre-launch burndown: fixed 3 holdout partial labs (git-lab-02, a cloud cert exam-lab-16, a cloud cert exam-lab-14), shipped Phase-2 polish for 5 simulators (notebook markdown preview, SQL chart panel, project-board drag-and-drop kanban, SIEM MITRE ATT&CK tagging, network topology SVG diagram), shipped 8 native-language syntax-validating resolvers (Java/Go/Rust/Swift/C#/PHP/Ruby/Kotlin) with 14 unit tests, documented vendor-console deferral until post-Monday-launch. 1 commit pushed.	40.0h	35m	1m	68.6x	2400.0x
2	Phase-2 round 2 across all 7 simulators: Project Board (visual Gantt + burndown SVGs), SQL Workbench (schema browser sidebar + describeSchema SDK), Policy Editor (SVG diagram canvas with arrows), Device Manager (Disks tab with partition bar + POST screen), SIEM Workbench (event detail with pivots + kill-chain investigations timeline), Network Topology (Cisco-style CLI panel with show ip interface brief / show ip route / configure terminal / ping), Notebook (matplotlib inline PNG capture + DataFrame HTML rendering). 7 tasks completed; 51 simulator unit tests pass.	56.0h	50m	1m	67.2x	3360.0x
3	Built Top-3 parity catch-up via parallel sub-agents: Electron SSE event-bus client (port from web), Electron embedded Stripe subscribe flow + useRequireSubscription gate (CSP allowlist, SSE-driven completion, deep-link 3DS return), iOS ExamReviewView (new SwiftUI view + data model + 13 localization keys + xcodeproj wiring)	10.0h	15m	1m	40.0x	600.0x
4	an internal service: generate 5 top-level hero images via an image model.1 Pro (home, about, applications, contact, portfolio), wire 7 heroes total into all top-level page templates including index.jinja behind particle canvas, WebP optimization, deploy prod+staging	10.0h	16m	1m	37.5x	600.0x
5	Audited web client vs electron + iOS; expanded parity script (+22 features, 2 false-positive fixes, console-sim reclassification), regenerated FEATUREPARITYMATRIX.md, wrote parity-drift-prioritization-2026-05-07.md sprint plan with two parallel tracks for catch-up	5.0h	15m	2m	20.0x	150.0x
6	Three audience-tailored 'Making What If?' blog posts: a personal site (first-person reflective, lessons-learned tone), renkara.com (engineering build voice with ffmpeg code blocks), _shared-the product/blog (product marketing, links to /vote/). All 3 set to draft:true and dated 2026-05-12. Plus comprehensive rewrite of tools/static site generator/CLAUDE.md and README.md deploy sections documenting the actual no-CI/CD reality for marketing sites, the safe sequential build pattern (rm -rf dist .static site generator-build between stages to prevent staging→production cross-contamination), draft handling, post-deploy verification, and common-mistake catalog. ~5000 words of new prose total.	16.0h	60m	5m	16.0x	192.0x
7	an internal service: homepage app-domain cards w/ heroes, footer text fix, replace hardcoded counts with [[canon:]] placeholders, renumber+reorder tiers (Foundational=1, Validation moved to 8, Transparency-Social swap), add 5 brand.bio.* canon keys, fix 27→canon on renkara.com, cascade tier reorder to IP portfolio docs (README, PlatformArchitectureTiers, FAQ, PatentFamilyGrouping), recursive resolver fix (static site generator+standalone), replace cdn.tailwindcss with built tailwind-compiled.css; deploy prod+staging	24.0h	95m	12m	15.2x	120.0x
8	Reordered Phase E queue to prioritize CompTIA after PMI for launch credibility. Wrote Phase E2 orchestrator (PMI→CompTIA→ScrumAlliance→ISACA→ISC2 at 4-way) and a race-free swap handler that polls for active python content jobs hitting zero (Phase E batch boundary), grants 15s grace for run_one post-processing, then SIGTERMs the Phase E parent and launches Phase E2 lossless — no in-flight specs interrupted. Chains forward to Phase F (Meta recovery)	5.0h	22m	4m	13.6x	75.0x
9	Press release rewrite (live vs shipping, Autopilot/behavioral, strip jargon, anchor originating patent + perf), add deferred-content launch placeholders, correct HQ city/dateline, build pre-commit canon validator + helper script	4.0h	18m	6m	13.3x	40.0x
10	Port embedded subscribe flow from web client to desktop client (SubscribeModal, SubscribeScreen, SubscribeCompleteScreen, useRequireSubscription, subscription API client, CSP update, TTS gate wiring)	8.0h	40m	5m	12.0x	96.0x
11	the product launch teaser end-to-end production pipeline: 5 protagonist refs (an image model.1 Pro Ultra), 16+ character-locked stills (an image model) with multiple iterations per shot, 16 video shots (a video model) animated from locked stills, 3 music tracks (a TTS service) with iterative prompts, narration recording + ffmpeg cleanup chain (highpass, FFT denoise, declick, deesser, compressor, limiter), ffmpeg assembly with timing-derived cuts, animated LAUNCHING/MONDAY title plate (PIL+ffmpeg fades), crossfade transitions, poster prepend for messaging-app preview, 60s trim, 4 compressed delivery variants	80.0h	540m	12m	8.9x	400.0x
12	Add ExamReviewView.swift to iOS client — per-question post-exam review screen with NavigationStack push from ExamResultsView	4.0h	28m	5m	8.6x	48.0x
13	Port SSE event-bus client from web client to desktop client	2.0h	14m	3m	8.6x	40.0x
14	the marketing site launch pages + newsletter platform integration: built /vote/ (A/B teaser comparison with bias-neutral Video 1/Video 2 labels, JS-driven radio selection, newsletter platform public subscribe form) and /product-hunt/ (launch CTA explainer with upvote walkthrough). Custom Jinja templates extending shared the product overlay. Created newsletter platform 'the product Launch Feedback' newsletter via MCP. Iterative bug-fix cycle: asset path resolution (/assets/ vs root), CORS-aware fetch with graceful fallback, B-version voice regeneration with George + audio level matching to A (-20dB attenuation), shot-1 poster cache-busting. Targeted S3 + CloudFront deploys via aws-cli (no CI/CD exists for marketing sites).	18.0h	180m	8m	6.0x	135.0x
15	the platform ADR-0002 follow-ups Thread 1+3+4: autopilot-driven harness mode in headlessrunner (StudentProfile.harnessmode + loadpairsbygoal helper + gradeonepair goalid parameter), clarifying comment block on gradeonepair documenting calibration vs optimizer validation paths, per-domain targetcompetence + competencefloor overrides from domain.exammetadata plumbed through restgateway → orchestrator → plansession.	4.0h	60m	2m	4.0x	120.0x
16	the platform multi-cohort calibration sweep proving predictor handles heterogeneous learners (Charles-style 10/10 pass at predicted 0.975 actual 0.824 ECE 0.025) — MoE design exploration deferred since single-model predictor is well-calibrated for novice/ready/heterogeneous regimes (overall Brier=0.003, ECE=0.034). Postgres recovery from Docker corruption.	4.0h	70m	3m	3.4x	80.0x
17	the platform predictor mixture-of-experts design exploration + Phase F (heterogeneous goaltargetaccuracies in StudentProfile + per-question lookup in headless_runner) + Charles Sieg resume-modeled a cloud cert exam profile generator (70 leaf goals classified into weak/moderate/strong by keyword rules from resume) + multi-cohort sweep script (novice CLF, ready CLF, Charles-style heterogeneous ANS).	5.0h	90m	8m	3.3x	37.5x
18	the platform ADR-0002 + ELIF (predictor calibration robustness + gap-focused optimizer): full ADR with 12-section MADR shape (decision drivers, considered options A-F, detailed design split into 5.1 predictor + 5.2 optimizer, 5-phase implementation plan, validation criteria, 4 documented risks, decision log including a correction entry). Implemented Fix 1 (gapfocus urgency function), Fix 2 (competence floor on readiness), Fix 3 (two-phase state machine) behind a feature flag in autopilotranker.py + restgateway.py. Five regression tests in testaudit_regressions.py. Validation testing surfaced that the original diagnosis was partially wrong: the legacy ranker already picks weak goals; the decoy harness was bypassing the optimizer. Honest correction logged in ADR decision log.	7.0h	130m	10m	3.2x	42.0x
19	the marketing site title cleanup: hide redundant total pill on provider pages, factor coursetitle macro into tmmacros, preserve Certified-in-X (ISC2/ISACA) carve-outs, strip trailing Certificate (ISACA Certificates), wire macro into 9 call sites across courses/course-page/category-page templates, deploy 3 prod + 1 staging cycles	1.5h	110m	5m	0.8x	18.0x
20	the marketing site courses page: tighten card cap from 20 to 15, strip Certified word from 99 course titles via template filter (cards + course pages), reorder VMware after Cisco in Networking and Salesforce/SAP/Oracle after IBM in Enterprise, deploy 1 prod + 1 staging build	1.0h	88m	3m	0.7x	20.0x

Aggregate Statistics

Metric	Value
Total tasks	20
Total human-equivalent hours	304.5
Total Claude minutes	1676
Total supervisory minutes	97
Total tokens	4,951,500
Weighted average leverage factor	10.9x
Weighted average supervisory leverage factor	188.4x

Analysis

The day's leverage distribution matters more than the headline figure. Four tasks cleared the 30x threshold; six ran below 5x. The 30x+ tier is what produces the impression that AI changes the time-cost curve; the sub-5x tier is a reminder that some work is still gated by human review and cannot speed up arbitrarily.
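
The tier counts can be checked directly against the Factor column of the May 7 task log. The following is a small sketch; the variable names are illustrative, and the thresholds are the 30x and 5x cut-offs quoted in the paragraph above.

```python
# Factor column from the May 7 task log, rows 1-20, in table order.
factors = [68.6, 67.2, 40.0, 37.5, 20.0, 16.0, 15.2, 13.6, 13.3, 12.0,
           8.9, 8.6, 8.6, 6.0, 4.0, 3.4, 3.3, 3.2, 0.8, 0.7]

HIGH = 30.0  # the "30x threshold" from the analysis
LOW = 5.0    # the "below 5x" cut-off from the analysis

high_tier = [f for f in factors if f >= HIGH]
low_tier = [f for f in factors if f < LOW]
middle = len(factors) - len(high_tier) - len(low_tier)

print(len(high_tier), middle, len(low_tier))  # 4 10 6
```

Half of the day's tasks sit in neither tier, which is part of why the weighted figure lands well below the ceiling.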

Top-of-distribution tasks tend to share a shape: tightly scoped, well specified, with no integration ambiguity. On May 7, 2026 the 68.6x ceiling came from the pre-launch burndown (3 holdout partial labs fixed, Phase-2 polish shipped for 5 simulators). The work fit cleanly into 35 Claude-minutes because the inputs and the success criterion were both explicit; the AI was not required to discover anything new. That shape is repeatable; tasks like it post 30x to 60x consistently across the recent log.

Bottom-of-distribution work runs differently. The 0.7x floor on the marketing site courses page (tightening the card cap from 20 to 15 and stripping the Certified word from 99 course titles via a template filter) is a sub-1:1 ratio: 88 Claude-minutes against a 1.0h human estimate. It reflects bounded, review-heavy work where the human watches each step. The supervisory ratio (188.4x weighted today) tracks differently: it captures how much human prompt-writing time the day's output consumed, and it stays high even on lower-leverage days because supervisory minutes scale roughly with task count, not with human-equivalent hours.