AI MAY 16, 2026

Leverage Record: May 16, 2026

38 tasks. May 16, 2026 weighted to 23.3x leverage across 393.5 human-equivalent hours in 1012 Claude-minutes. Supervisory leverage closed at 373.3x.

9.8 weeks of human-equivalent throughput in 16.9 hours of Claude wall-clock. The 57.8x ceiling came from the Android client Phase 15 Wear OS companion (task 1 in the log below); the 4.4x floor was the diagnosis and fix of a stale engine domain-cache bug (task 38).

About These Records
These time records capture personal project work done with Claude Code (Anthropic) only. They do not include work done with ChatGPT (OpenAI), Gemini (Google), Grok (xAI), or other models, all of which I use extensively. Client work is also excluded, despite being primarily Claude Code. The actual total AI-assisted output for any given day is substantially higher than what appears here.

Task Log

| # | Task | Human Est. | Claude | Sup. | Factor | Sup. Factor |
|---|------|------------|--------|------|--------|-------------|
| 1 | an Android client Phase 15 Wear OS companion: WatchPhase + WatchActivityMode + WatchAppState + WatchAppViewModel (HiltViewModel with SavedStateHandle + PhoneSync collection), PhoneSyncClient over Wearable Data Layer (callbackFlow DataClient listener + decode pure helper), PhoneSyncModule, 5 screens (Welcome /... | 26.0h | 27m | 1m | 57.8x | 1560.0x |
| 2 | an Android client Phase 11 five patent screens: 4 new EngineApi endpoints (governance/trajectory/cross-domain/scenario+submit) + 4 DTO files, PatentRepository, MockEngineDispatcher Contains match mode + 5 new fixtures, PatentScreenScaffold shared chrome, AnalyticsScreen (style axes + drift alerts + recommenda... | 26.0h | 28m | 1m | 55.7x | 1560.0x |
| 3 | an Android client Phase 10 course mode + TTS: ElevenLabsTts (Media3 ExoPlayer wrapper with callbackFlow Player.Listener bridge), PlaybackUpdate, TtsCacheStore (SHA-256-keyed disk cache + resolve/enrollFile/clear/sizeBytes), VoiceModule, CourseViewModel (taxonomy → tree with depth-cap cycle short-circuit), bui... | 22.0h | 24m | 1m | 55.0x | 1320.0x |
| 4 | an Android client Phase 9 active session: ActiveSessionViewModel (engine session lifecycle + wall-clock-anchored timing + DailyRingsStore mutation), ActiveSessionState sealed class, SessionHeader, ActiveSessionScreen with ActivityRouter, SessionResultsScreen with ELO delta tile, 6 activity composables (Contra... | 28.0h | 31m | 1m | 54.2x | 1680.0x |
| 5 | an Android client Phase 13 competitive multiplayer: 2 new lobby endpoints + CompetitiveDto + CompetitiveRepository + 2 fixtures, ReconnectingEngineEventClient (exponential backoff 1/2/4/8/16s cap with ConnectionState StateFlow + healthy-reconnect counter reset), CompetitiveLobbyViewModel/Screen (create + join... | 22.0h | 25m | 1m | 52.8x | 1320.0x |
| 6 | an Android client Phase 16 billing + i18n + finishing: Plus Jakarta Sans via Compose downloadable fonts + GoogleFont.Provider (5 weights, transparent SansSerif fallback), font_certs.xml documented stub, PlayBillingClient (suspending BillingClient wrapper + SharedFlow purchase updates + acknowledge auto-flow),... | 24.0h | 28m | 1m | 51.4x | 1440.0x |
| 7 | an Android client Phase 12 Autopilot + WorkManager: AutopilotStore (encrypted prefs) + InMemoryAutopilotStore, NotificationChannels (autopilot.reminders + streak.milestones), AutopilotReminderScheduler (nextOccurrence pure helper + OneTimeWorkRequest sized delay), AutopilotReminderNotifier (Android 13+ permis... | 22.0h | 26m | 1m | 50.8x | 1320.0x |
| 8 | an Android client Phase 14 knowledge cosmos: CosmosLayoutEngine in :domain (pure-Kotlin Fruchterman-Reingold with deterministic seed and 7 unit tests), LayoutNode/Edge/PositionedNode framework-free records, KnowledgeGraphDto + new EngineApi endpoint + KnowledgeGraphRepository + 9-node fixture, KnowledgeMapVie... | 18.0h | 22m | 1m | 49.1x | 1080.0x |
| 9 | an Android client Phase 17 macrobenchmark + baseline profile: :macrobenchmark Gradle module (com.android.test + androidx.baselineprofile + self-instrumenting + variant gating), StartupBenchmark (cold + warm × None/Partial-BaselineProfileMode-Require/Full × 10 iterations targeting .benchmark variant), Baseline... | 14.0h | 18m | 1m | 46.7x | 840.0x |
| 10 | Phase 6A: extract examservice from restgateway (createexam+submitexam+getstudyplan, 800 LOC removed, 22 new unit tests) | 12.0h | 23m | 1m | 31.3x | 720.0x |
| 11 | Phase 7B: autopilotservice composite-path unit tests (computecompositereadiness aggregation + computecompositenextactions cluster-dedup + diversity guard) | 5.0h | 12m | 0m | 25.0x | 6000.0x |
| 12 | Phase 7D: manifold + strategy gRPC servicer tests (fixed manifold.proto deprecated option, unblocked proto codegen, 14 new tests; api 75.3->79.3%, origin 78.2->80.5%) | 5.0h | 13m | 0m | 23.1x | 3000.0x |
| 13 | Phase 6H: extract composite autopilot routes + cross-domain cluster helpers to autopilot_service (359 LOC, collocates the full autopilot brain in one service) | 9.0h | 24m | 0m | 22.5x | 5400.0x |
| 14 | Phase 6F: extract insightsservice (computeinsights + cognitive-state classifier; 402 LOC out of rest_gateway, 16 new tests covering each card heuristic) | 7.0h | 19m | 0m | 22.1x | 2100.0x |
| 15 | Phase 6C: extract questionservice (getnextpairmcq + getnextquestion) + generatemicrochallenge into autopilotservice (350 LOC, 21 new tests, fixes Phase 6B computenext_actions regression) | 8.0h | 22m | 0m | 21.8x | 1920.0x |
| 16 | LLM-IT 8: controllerloop integration tests (3 tests covering constructor wiring + runsynthesis_stage + token usage rollup; $0.04/run) | 4.0h | 11m | 0m | 21.8x | 2400.0x |
| 17 | an inference engine Phase 3 heavyweight extractions: deleteentity (127 LOC) + submitanswer (313 LOC) + submitquestionanswer (258 LOC) + assessreadiness (225 LOC) + getfingerprint (85 LOC) into sessionanswerservice + strategy_service. Includes ~100 new comprehensive unit tests covering every contract p... | 18.0h | 50m | 2m | 21.6x | 540.0x |
| 18 | Phase 6B: extract submitactivitycredit + getcrossdomain_transfer into existing service modules (311 LOC, 12 new tests, 3 pre-existing tests updated) | 6.0h | 17m | 0m | 21.2x | 720.0x |
| 19 | Phase 6I: extract catalog_service (catalog-projections + catalog-proficiency routes plus shared cache state + invalidation; 370 LOC) | 6.0h | 17m | 0m | 21.2x | 1800.0x |
| 20 | an inference engine Phase 3 final heavyweight push: getdailystats + getentityreadinesshistory + getlesson + recordautopilotactivity + diagnoserootcause + createremediationsession (6 endpoints; ~750 LOC consolidated into strategyservice/lessonservice/autopilotservice/entityservice). ~80 new uni... | 14.0h | 40m | 2m | 21.0x | 420.0x |
| 21 | Phase 7C: snapshotcache pure-logic unit tests (17 tests: msgpack coercion, SnapshotMeta round-trip, tensor markers, url resolution, loadsnapshot error paths) | 3.0h | 9m | 0m | 20.0x | 3600.0x |
| 22 | an inference engine final autopilot brain extraction: getnextactionsinner (660 LOC) moved to autopilotservice.computenext_actions. Late-imports for 7 gateway-local helpers keep helpers + brain on separate sides without forcing helper migration. Audit-regression test updated to track the safety read at t... | 6.0h | 18m | 2m | 20.0x | 180.0x |
| 23 | an inference engine Phase 5 ratchet + client update plan: bumped failunder 79->80 (actual 81.46%), wrote 200-line client-update-plan.md with endpoint-by-endpoint compatibility table, per-client impact assessment, behavior corrections (epsilon seeding, contenttype passthrough, exception ordering), pre-merge... | 4.0h | 12m | 2m | 20.0x | 120.0x |
| 24 | LLM-IT 9: ValidationPipeline integration tests (3 tests covering 3-pass validation through real embedder+NLI+LLM; happy/empty/wrong-fragment paths) | 3.0h | 9m | 0m | 20.0x | 3600.0x |
| 25 | LLM integration test harness: 17 tests across 5 origin modules (client, synthesizer, amplifier, validator tribunal, flashcard tribunal) with cost guard + auto-skip; first run cost $0.0255 | 12.0h | 38m | 2m | 18.9x | 360.0x |
| 26 | Origin extract Phase 2: 7 grouped commits cutting engine off an inference engine.origin. (LLM-client/embedder rewires in 9 files, composer relocation to an inference engine.runtime, PERSONALIZATION_ relocation to an inference engine.api.prompts, ScenarioConfig carve-off, AtomBundle/Collection lib path swaps... | 8.0h | 26m | 1m | 18.5x | 480.0x |
| 27 | Phase 6G: move computedomainreadiness from restgateway to services/helpers (zero late-imports from services to restgateway anymore; 227 LOC, 5 new readiness-math tests) | 4.0h | 13m | 0m | 18.5x | 2400.0x |
| 28 | an inference engine Phase 5 coverage backfill: 85 new tests across snapshotcache (msgpack default, tensor markers strip/restore, URL resolver, SnapshotPayload), scenarioseeds (normalizedifficulty, filter, tokens, coverage, grade keyword fallback, composecontext, buildscenarioresponse), computenextacti... | 6.0h | 20m | 2m | 18.0x | 180.0x |
| 29 | Phase 6D: extract shared math+taxonomy helpers into services/helpers (eliminates late-import dance; 328 LOC out of restgateway, 25 new helper tests) | 5.0h | 17m | 0m | 17.6x | 1200.0x |
| 30 | Phase 7A: catalogservice unit tests (15 tests covering cache helpers, projection bundle, invalidation, both routes; lifts catalogservice from 24% to ~95%) | 4.0h | 14m | 0m | 17.1x | 2400.0x |
| 31 | Phase 7E: engine_context singleton + lab-index unit tests (6 tests; api 79.3->79.4%) | 2.0h | 7m | 0m | 17.1x | 1200.0x |
| 32 | an inference engine Phase 5 final coverage backfill: 25 new tests for restgateway math helpers (poissonbinomialpassprobability, targetperquestionprobability inverse with round-trip verification, entityrollingcorrectnessrate, requiredobservationsper_node). Round-trip property test between forward +... | 2.0h | 8m | 2m | 15.0x | 60.0x |
| 33 | Phase 6E: move 15 inline Pydantic models from rest_gateway to api/models.py (197 LOC, 0 regressions) | 2.0h | 9m | 0m | 13.3x | 1200.0x |
| 34 | Origin extraction Phase 0: full inventory + dependency map + 9-phase plan + 3 new lib repos + new service repo with CLI/observability skeleton + 4 existing repos updated + 7 commits | 14.0h | 95m | 15m | 8.8x | 56.0x |
| 35 | Audit-orphanfix batch complete: 9 fresh re-syntheses + 9 question banks landed at 100% graph∩pair overlap, VPR 0.87-0.98. Engine bug fix (regeneratenodes pair-orphan) verified end-to-end across all 9 packages. Monitored via 10-min cron with custom monitororphanfix.sh script that ran ~85 checks across 14h. A... | 2.5h | 20m | 2m | 7.5x | 75.0x |
| 36 | Origin extract Phase 1: populate 3 new libs from an inference engine.origin (llm/embeddings/runtime types + schemas + parser + validator), full coverage suites, 197 tests green at ≥92% per lib, all 4 docs and commits per lib | 9.0h | 75m | 3m | 7.2x | 180.0x |
| 37 | Created 4 new zero-sweep profiles, ran 9-domain a simulation harness calibration sweep, diagnosed portfolio-wide synthesis bug: contrastive pairs reference missing knowledge_graph nodes (33%-100% broken refs), starving engine readiness signal | 3.0h | 35m | 4m | 5.1x | 45.0x |
| 38 | Diagnosed + fixed stale engine domain-cache bug (engine in-memory pairs/KG drift from disk after resynth), added /api/v1/admin/domains/reload bulk endpoint, wired decoy zero-sweep preflight to auto-reload, fixed PCA profile resolver bug, identified FinOps-for-AI content bug (2 recall nodes vs 200+ baseline),... | 8.0h | 110m | 12m | 4.4x | 40.0x |

Aggregate Statistics

| Metric | Value |
|--------|-------|
| Total tasks | 38 |
| Total human-equivalent hours | 393.5 |
| Total Claude minutes | 1012 |
| Total supervisory minutes | 63 |
| Total tokens | 5,552,000 |
| Weighted average leverage factor | 23.3x |
| Weighted average supervisory leverage factor | 373.3x |
| Human-equivalent weeks | 9.8 |
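
The headline figures can be reproduced from the totals above. The sketch below is a minimal check, assuming that "weighted average" means total human-equivalent minutes divided by total Claude minutes and that a human-equivalent week is 40 hours; neither definition is stated in the record, and the function and constant names are illustrative only.

```python
def leverage(human_hours: float, minutes: float) -> float:
    """Ratio of human-equivalent minutes to minutes actually spent."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return human_hours * 60 / minutes


# Totals taken from the aggregate statistics for May 16, 2026.
TOTAL_HUMAN_HOURS = 393.5
TOTAL_CLAUDE_MINUTES = 1012

# Assumption: one human-equivalent week is 40 working hours.
HOURS_PER_WEEK = 40

weighted_leverage = leverage(TOTAL_HUMAN_HOURS, TOTAL_CLAUDE_MINUTES)
human_weeks = TOTAL_HUMAN_HOURS / HOURS_PER_WEEK
claude_hours = TOTAL_CLAUDE_MINUTES / 60

print(f"{weighted_leverage:.1f}x")   # 23.3x
print(f"{human_weeks:.1f} weeks")    # 9.8 weeks
print(f"{claude_hours:.1f} hours")   # 16.9 hours
```

Dividing the totals is equivalent to averaging the per-task factors weighted by Claude minutes, which is why long, low-leverage tasks pull the headline number down more than short ones.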

Analysis

The day's leverage distribution matters more than the headline figure. The 57.8x ceiling came from task 1, the Android client Phase 15 Wear OS companion; the 4.4x floor was task 38, the diagnosis and fix of a stale engine domain-cache bug. Tasks at the top of the distribution share a shape: tightly scoped specifications, clear success criteria, and minimal integration ambiguity. The AI doesn't need to discover anything new; it executes against an explicit target.

Tasks at the bottom run differently. They're either bounded by review-heavy work where every step gets verified, or they involve ambiguity that demands several rounds of trial and adjustment. The lower factor on these tasks is real and informative, not a failure mode.

The supervisory leverage figure (373.3x today) tracks something orthogonal to wall-clock leverage. It's the ratio of human-equivalent output to human prompt-writing time. It stays high even on lower-leverage days because supervisory minutes scale with task count, not with the human-hour estimate; a 20-minute task and a 4-hour task can both be specified in two minutes of human prompt-writing.
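
The ratio described above can be sketched in a few lines. This is an illustration under the paragraph's own definition (human-equivalent output divided by human prompt-writing time); the function name is hypothetical, and the two-minute prompt time for the 20-minute and 4-hour cases is the example from the text, not a logged value.

```python
def supervisory_leverage(human_minutes: float, supervisory_minutes: float) -> float:
    """Human-equivalent minutes produced per minute of human prompt-writing."""
    if supervisory_minutes <= 0:
        raise ValueError("supervisory_minutes must be positive")
    return human_minutes / supervisory_minutes


# Two tasks of very different size, each specified in two minutes.
short_task = supervisory_leverage(human_minutes=20, supervisory_minutes=2)
long_task = supervisory_leverage(human_minutes=4 * 60, supervisory_minutes=2)

# Two rows from the task log: task 34 (14.0h, 15m) and task 38 (8.0h, 12m).
task_34 = supervisory_leverage(human_minutes=14.0 * 60, supervisory_minutes=15)
task_38 = supervisory_leverage(human_minutes=8.0 * 60, supervisory_minutes=12)

print(short_task, long_task)  # 10.0 120.0
print(task_34, task_38)       # 56.0 40.0
```

Rows that show 0m of supervision cannot be reproduced this way; the logged minutes there are presumably rounded down from a sub-minute value, which would also explain why dividing the rounded totals does not land exactly on 373.3x.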

Across the 38 tasks, the day produced roughly 9.8 weeks of senior-engineer-equivalent throughput in 16.9 hours of model wall-clock. That ratio is the practical answer to the question of how much output a single operator can move per day when the model handles the execution and the operator handles the direction.