Leverage Record: May 16, 2026

38 tasks. May 16, 2026 came in at a weighted average leverage of 23.3x: 393.5 human-equivalent hours in 1012 Claude-minutes. Supervisory leverage closed at 373.3x.

That is 9.8 weeks of human-equivalent throughput in 16.9 hours of Claude wall-clock time. The 57.8x ceiling came from task 1, the Android client Phase 15 Wear OS companion; the 4.4x floor was task 38, the diagnosis and fix of a stale engine domain-cache bug (in-memory pairs/KG drifting from disk after resynth) together with the new /api/v1/admin/domains/reload bulk endpoint.

About These Records

These time records capture personal project work done with Claude Code (Anthropic) only. They do not include work done with ChatGPT (OpenAI), Gemini (Google), Grok (xAI), or other models, all of which I use extensively. Client work is also excluded, despite being primarily Claude Code. The actual total AI-assisted output for any given day is substantially higher than what appears here.

Task Log

#	Task	Human Est.	Claude Time	Supervisory Time	Leverage Factor	Supervisory Factor
1	an Android client Phase 15 Wear OS companion: WatchPhase + WatchActivityMode + WatchAppState + WatchAppViewModel (HiltViewModel with SavedStateHandle + PhoneSync collection), PhoneSyncClient over Wearable Data Layer (callbackFlow DataClient listener + decode pure helper), PhoneSyncModule, 5 screens (Welcome /...	26.0h	27m	1m	57.8x	1560.0x
2	an Android client Phase 11 five patent screens: 4 new EngineApi endpoints (governance/trajectory/cross-domain/scenario+submit) + 4 DTO files, PatentRepository, MockEngineDispatcher Contains match mode + 5 new fixtures, PatentScreenScaffold shared chrome, AnalyticsScreen (style axes + drift alerts + recommenda...	26.0h	28m	1m	55.7x	1560.0x
3	an Android client Phase 10 course mode + TTS: ElevenLabsTts (Media3 ExoPlayer wrapper with callbackFlow Player.Listener bridge), PlaybackUpdate, TtsCacheStore (SHA-256-keyed disk cache + resolve/enrollFile/clear/sizeBytes), VoiceModule, CourseViewModel (taxonomy → tree with depth-cap cycle short-circuit), bui...	22.0h	24m	1m	55.0x	1320.0x
4	an Android client Phase 9 active session: ActiveSessionViewModel (engine session lifecycle + wall-clock-anchored timing + DailyRingsStore mutation), ActiveSessionState sealed class, SessionHeader, ActiveSessionScreen with ActivityRouter, SessionResultsScreen with ELO delta tile, 6 activity composables (Contra...	28.0h	31m	1m	54.2x	1680.0x
5	an Android client Phase 13 competitive multiplayer: 2 new lobby endpoints + CompetitiveDto + CompetitiveRepository + 2 fixtures, ReconnectingEngineEventClient (exponential backoff 1/2/4/8/16s cap with ConnectionState StateFlow + healthy-reconnect counter reset), CompetitiveLobbyViewModel/Screen (create + join...	22.0h	25m	1m	52.8x	1320.0x
6	an Android client Phase 16 billing + i18n + finishing: Plus Jakarta Sans via Compose downloadable fonts + GoogleFont.Provider (5 weights, transparent SansSerif fallback), font_certs.xml documented stub, PlayBillingClient (suspending BillingClient wrapper + SharedFlow purchase updates + acknowledge auto-flow),...	24.0h	28m	1m	51.4x	1440.0x
7	an Android client Phase 12 Autopilot + WorkManager: AutopilotStore (encrypted prefs) + InMemoryAutopilotStore, NotificationChannels (autopilot.reminders + streak.milestones), AutopilotReminderScheduler (nextOccurrence pure helper + OneTimeWorkRequest sized delay), AutopilotReminderNotifier (Android 13+ permis...	22.0h	26m	1m	50.8x	1320.0x
8	an Android client Phase 14 knowledge cosmos: CosmosLayoutEngine in :domain (pure-Kotlin Fruchterman-Reingold with deterministic seed and 7 unit tests), LayoutNode/Edge/PositionedNode framework-free records, KnowledgeGraphDto + new EngineApi endpoint + KnowledgeGraphRepository + 9-node fixture, KnowledgeMapVie...	18.0h	22m	1m	49.1x	1080.0x
9	an Android client Phase 17 macrobenchmark + baseline profile: :macrobenchmark Gradle module (com.android.test + androidx.baselineprofile + self-instrumenting + variant gating), StartupBenchmark (cold + warm × None/Partial-BaselineProfileMode-Require/Full × 10 iterations targeting .benchmark variant), Baseline...	14.0h	18m	1m	46.7x	840.0x
10	Phase 6A: extract examservice from restgateway (createexam+submitexam+getstudyplan, 800 LOC removed, 22 new unit tests)	12.0h	23m	1m	31.3x	720.0x
11	Phase 7B: autopilotservice composite-path unit tests (computecompositereadiness aggregation + computecompositenextactions cluster-dedup + diversity guard)	5.0h	12m	0m	25.0x	6000.0x
12	Phase 7D: manifold + strategy gRPC servicer tests (fixed manifold.proto deprecated option, unblocked proto codegen, 14 new tests; api 75.3->79.3%, origin 78.2->80.5%)	5.0h	13m	0m	23.1x	3000.0x
13	Phase 6H: extract composite autopilot routes + cross-domain cluster helpers to autopilot_service (359 LOC, collocates the full autopilot brain in one service)	9.0h	24m	0m	22.5x	5400.0x
14	Phase 6F: extract insightsservice (computeinsights + cognitive-state classifier; 402 LOC out of rest_gateway, 16 new tests covering each card heuristic)	7.0h	19m	0m	22.1x	2100.0x
15	Phase 6C: extract questionservice (getnextpairmcq + getnextquestion) + generatemicrochallenge into autopilotservice (350 LOC, 21 new tests, fixes Phase 6B computenext_actions regression)	8.0h	22m	0m	21.8x	1920.0x
16	LLM-IT 8: controllerloop integration tests (3 tests covering constructor wiring + runsynthesis_stage + token usage rollup; $0.04/run)	4.0h	11m	0m	21.8x	2400.0x
17	an inference engine Phase 3 heavyweight extractions: deleteentity (127 LOC) + submitanswer (313 LOC) + submitquestionanswer (258 LOC) + assessreadiness (225 LOC) + getfingerprint (85 LOC) into sessionanswerservice + strategy_service. Includes ~100 new comprehensive unit tests covering every contract p...	18.0h	50m	2m	21.6x	540.0x
18	Phase 6B: extract submitactivitycredit + getcrossdomain_transfer into existing service modules (311 LOC, 12 new tests, 3 pre-existing tests updated)	6.0h	17m	0m	21.2x	720.0x
19	Phase 6I: extract catalog_service (catalog-projections + catalog-proficiency routes plus shared cache state + invalidation; 370 LOC)	6.0h	17m	0m	21.2x	1800.0x
20	an inference engine Phase 3 final heavyweight push: getdailystats + getentityreadinesshistory + getlesson + recordautopilotactivity + diagnoserootcause + createremediationsession (6 endpoints; ~750 LOC consolidated into strategyservice/lessonservice/autopilotservice/entityservice). ~80 new uni...	14.0h	40m	2m	21.0x	420.0x
21	Phase 7C: snapshotcache pure-logic unit tests (17 tests: msgpack coercion, SnapshotMeta round-trip, tensor markers, url resolution, loadsnapshot error paths)	3.0h	9m	0m	20.0x	3600.0x
22	an inference engine final autopilot brain extraction: getnextactionsinner (660 LOC) moved to autopilotservice.computenext_actions. Late-imports for 7 gateway-local helpers keep helpers + brain on separate sides without forcing helper migration. Audit-regression test updated to track the safety read at t...	6.0h	18m	2m	20.0x	180.0x
23	an inference engine Phase 5 ratchet + client update plan: bumped failunder 79->80 (actual 81.46%), wrote 200-line client-update-plan.md with endpoint-by-endpoint compatibility table, per-client impact assessment, behavior corrections (epsilon seeding, contenttype passthrough, exception ordering), pre-merge...	4.0h	12m	2m	20.0x	120.0x
24	LLM-IT 9: ValidationPipeline integration tests (3 tests covering 3-pass validation through real embedder+NLI+LLM; happy/empty/wrong-fragment paths)	3.0h	9m	0m	20.0x	3600.0x
25	LLM integration test harness: 17 tests across 5 origin modules (client, synthesizer, amplifier, validator tribunal, flashcard tribunal) with cost guard + auto-skip; first run cost $0.0255	12.0h	38m	2m	18.9x	360.0x
26	Origin extract Phase 2: 7 grouped commits cutting engine off an inference engine.origin. (LLM-client/embedder rewires in 9 files, composer relocation to an inference engine.runtime, PERSONALIZATION_ relocation to an inference engine.api.prompts, ScenarioConfig carve-off, AtomBundle/Collection lib path swaps...	8.0h	26m	1m	18.5x	480.0x
27	Phase 6G: move computedomainreadiness from restgateway to services/helpers (zero late-imports from services to restgateway anymore; 227 LOC, 5 new readiness-math tests)	4.0h	13m	0m	18.5x	2400.0x
28	an inference engine Phase 5 coverage backfill: 85 new tests across snapshotcache (msgpack default, tensor markers strip/restore, URL resolver, SnapshotPayload), scenarioseeds (normalizedifficulty, filter, tokens, coverage, grade keyword fallback, composecontext, buildscenarioresponse), computenextacti...	6.0h	20m	2m	18.0x	180.0x
29	Phase 6D: extract shared math+taxonomy helpers into services/helpers (eliminates late-import dance; 328 LOC out of restgateway, 25 new helper tests)	5.0h	17m	0m	17.6x	1200.0x
30	Phase 7A: catalogservice unit tests (15 tests covering cache helpers, projection bundle, invalidation, both routes; lifts catalogservice from 24% to ~95%)	4.0h	14m	0m	17.1x	2400.0x
31	Phase 7E: engine_context singleton + lab-index unit tests (6 tests; api 79.3->79.4%)	2.0h	7m	0m	17.1x	1200.0x
32	an inference engine Phase 5 final coverage backfill: 25 new tests for restgateway math helpers (poissonbinomialpassprobability, targetperquestionprobability inverse with round-trip verification, entityrollingcorrectnessrate, requiredobservationsper_node). Round-trip property test between forward +...	2.0h	8m	2m	15.0x	60.0x
33	Phase 6E: move 15 inline Pydantic models from rest_gateway to api/models.py (197 LOC, 0 regressions)	2.0h	9m	0m	13.3x	1200.0x
34	Origin extraction Phase 0: full inventory + dependency map + 9-phase plan + 3 new lib repos + new service repo with CLI/observability skeleton + 4 existing repos updated + 7 commits	14.0h	95m	15m	8.8x	56.0x
35	Audit-orphanfix batch complete: 9 fresh re-syntheses + 9 question banks landed at 100% graph∩pair overlap, VPR 0.87-0.98. Engine bug fix (regeneratenodes pair-orphan) verified end-to-end across all 9 packages. Monitored via 10-min cron with custom monitororphanfix.sh script that ran ~85 checks across 14h. A...	2.5h	20m	2m	7.5x	75.0x
36	Origin extract Phase 1: populate 3 new libs from an inference engine.origin (llm/embeddings/runtime types + schemas + parser + validator), full coverage suites, 197 tests green at ≥92% per lib, all 4 docs and commits per lib	9.0h	75m	3m	7.2x	180.0x
37	Created 4 new zero-sweep profiles, ran a 9-domain simulation harness calibration sweep, diagnosed portfolio-wide synthesis bug: contrastive pairs reference missing knowledge_graph nodes (33%-100% broken refs), starving engine readiness signal	3.0h	35m	4m	5.1x	45.0x
38	Diagnosed + fixed stale engine domain-cache bug (engine in-memory pairs/KG drift from disk after resynth), added /api/v1/admin/domains/reload bulk endpoint, wired decoy zero-sweep preflight to auto-reload, fixed PCA profile resolver bug, identified FinOps-for-AI content bug (2 recall nodes vs 200+ baseline),...	8.0h	110m	12m	4.4x	40.0x
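
The Factor column can be reproduced from the two time columns. Below is a minimal sketch in Python, assuming the factor is the human estimate in minutes divided by Claude minutes, rounded to one decimal place. The function name is illustrative and not taken from whatever tooling produces these records.

```python
def leverage_factor(human_hours: float, claude_minutes: float) -> float:
    """Human-equivalent minutes per Claude minute, rounded as in the table."""
    return round(human_hours * 60 / claude_minutes, 1)


# (task number, human estimate in hours, Claude minutes), copied from the log
rows = [(1, 26.0, 27), (34, 14.0, 95), (38, 8.0, 110)]

factors = {task: leverage_factor(hours, minutes) for task, hours, minutes in rows}
print(factors)  # {1: 57.8, 34: 8.8, 38: 4.4}
```

The three values match the Factor entries for tasks 1, 34, and 38, which suggests the assumed formula is the one in use.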

Aggregate Statistics

Metric	Value
Total tasks	38
Total human-equivalent hours	393.5
Total Claude minutes	1012
Total supervisory minutes	63
Total tokens	5,552,000
Weighted average leverage factor	23.3x
Weighted average supervisory leverage factor	373.3x
Human-equivalent weeks	9.8
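
The headline figures follow from the totals in this table. Below is a rough sketch in Python, assuming the weighted average is total human-equivalent minutes divided by total Claude minutes, and that a human week is 40 hours. Both are assumptions that happen to reproduce the reported values; the variable names are illustrative.

```python
# Totals copied from the Aggregate Statistics table.
total_human_hours = 393.5
total_claude_minutes = 1012

HOURS_PER_WEEK = 40  # assumption: a 40-hour human work week

# Dividing the totals is the same as averaging the per-task factors
# with each task weighted by its Claude minutes.
weighted_leverage = total_human_hours * 60 / total_claude_minutes
claude_wall_clock_hours = total_claude_minutes / 60
human_weeks = total_human_hours / HOURS_PER_WEEK

print(f"{weighted_leverage:.1f}x")          # 23.3x
print(f"{claude_wall_clock_hours:.1f} h")   # 16.9 h
print(f"{human_weeks:.1f} weeks")           # 9.8 weeks
```

Weighting by Claude minutes means long tasks such as task 38 (110 minutes at 4.4x) pull the average down more than short high-factor tasks pull it up.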

Analysis

The day's leverage distribution matters more than the headline figure. Per-task factors ranged from 57.8x (task 1, the Android client Phase 15 Wear OS companion) down to 4.4x (task 38, the stale engine domain-cache diagnosis and fix). Tasks at the top of the distribution share a shape: tightly scoped specifications, clear success criteria, and minimal integration ambiguity. The AI doesn't need to discover anything new; it executes against an explicit target.

Tasks at the bottom run differently. They're either bounded by review-heavy work where every step gets verified, or they involve ambiguity that demands several rounds of trial and adjustment. Tasks 37 and 38, both bug diagnoses, are of the second kind. A low factor on these tasks is real and informative, not a failure mode.

The supervisory leverage figure (373.3x today) tracks something orthogonal to wall-clock leverage. It's the ratio of human-equivalent output to human prompt-writing time. It stays high even on lower-leverage days because supervisory minutes scale with task count, not with the human-hour estimate; a 20-minute task and a 4-hour task can both be specified in two minutes of human prompt-writing.
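
The scaling argument can be checked against rows from the task log. Below is a small sketch in Python, assuming the supervisory factor is the human estimate in minutes divided by supervisory minutes; the function name is illustrative.

```python
def supervisory_factor(human_hours: float, supervisory_minutes: float) -> float:
    """Human-equivalent minutes produced per minute of human prompt-writing."""
    return human_hours * 60 / supervisory_minutes


# Tasks 32 and 23 from the log: both took 2 supervisory minutes,
# but their human estimates are 2.0h and 4.0h.
print(supervisory_factor(2.0, 2))  # 60.0
print(supervisory_factor(4.0, 2))  # 120.0

# Day level, using the rounded 63-minute total from the aggregate table.
print(round(supervisory_factor(393.5, 63), 1))  # 374.8
```

Doubling the human estimate at fixed supervisory time doubles the factor, which is why the figure tracks task size rather than effort spent prompting. The day-level result from the rounded total lands near the reported 373.3x, not exactly on it. The likely reason, stated here as an assumption, is that the report uses unrounded supervisory times while the table shows whole minutes.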

Across the 38 tasks, the day produced roughly 9.8 weeks of senior-engineer-equivalent throughput in 16.9 hours of model wall-clock. That ratio is the practical answer to the question of how much output a single operator can move per day when the model handles the execution and the operator handles the direction.