Leverage Record: May 25, 2026

Nine tasks. May 25, 2026 weighted to 25.3x leverage across 59.5 human-equivalent hours in 141 Claude-minutes. Supervisory leverage closed at 142.8x.

1.5 weeks of human-equivalent throughput in 2.4 hours of Claude wall-clock. The 33.6x ceiling came from Synthesis pipeline: prompt caching + Anthropic Batches API integration across synthesis scripts in core/an inference engine (1649 LOC, 5 files); the 12.0x floor sat at core/an inference engine: autopilotservice legacy coverage-damping ceiling lifted + bulkamplifyfleet and bulkbackfillrecall customid format fix with error logging.

About These Records

These time records capture personal project work done with Claude Code (Anthropic) only. They do not include work done with ChatGPT (OpenAI), Gemini (Google), Grok (xAI), or other models, all of which I use extensively. Client work is also excluded, despite being primarily Claude Code. The actual total AI-assisted output for any given day is substantially higher than what appears here.

Task Log

#	Task	Human Est.	Claude	Sup.	Factor	Sup. Factor
1	Synthesis pipeline: prompt caching + Anthropic Batches API integration across synthesis scripts in core/an inference engine (1649 LOC, 5 files)	28.0h	50m	5m	33.6x	336.0x
2	Investigate and backfill missing a metrics tracker leverage records since May 22: surveyed git logs across 80+ an inference engine repos, identified 7 commit-clusters across May 23-25, diagnosed root cause (process discipline gap, not infra), reconstructed and POSTed 7 records to a metrics tracker-api	4.0h	8m	4m	30.0x	60.0x
3	Audit 29 ~/.claude/skills/ manifests for missing leverage-POST step on free-form task path; identify 16 tool-loader skills (a marketing platform, a calendar platform, a knowledge base, an email platform, a defect tracker, a portfolio browser, a metrics tracker, a newsletter platform, a time-tracking app, a CM...	3.0h	6m	2m	30.0x	90.0x
4	Invert leverage tracking policy: CSV first then cloud second both mandatory; patched global CLAUDE.md Rules block, /fix skill Step 6k, 16 tool-loader Step 4 blocks, and /leverage-post Phase 2 reconciliation	4.0h	8m	2m	30.0x	120.0x
5	/leverage-post reconciliation Phase 1+2: backfilled 139 CSV rows across 12 days (5/14-5/25), verified all in sync, 0 stragglers remaining	1.5h	4m	1m	22.5x	90.0x
6	core/a simulation harness rebuild: brain answerer switched to direct Anthropic SDK with prompt caching (257 LOC), all zero/pmp sweep profiles flipped to omniscient:false (46 files), headless runner surfaces non-200 from /next-pair-mcq with context	10.0h	30m	4m	20.0x	150.0x
7	Cross-fleet prompt-caching micro-sweep: cachecontrol on a relationship CRM sonnet system block, an API gateway a recruiter product/llmnormalizer streaming call, automation-resume-refinement SYSTEMPROMPT, an origin service atoms/generator system + toolschema	3.0h	10m	2m	18.0x	90.0x
8	an audit toolchain: content-audit P4.1 pair-density check with PGWA-class detection (61 LOC), a configuration file headline counts bumped to 2026-05-25 audit snapshot, per-activity-format trackers added (scenarios, flashcards, etc.)	4.0h	15m	3m	16.0x	80.0x
9	core/an inference engine: autopilotservice legacy coverage-damping ceiling lifted + bulkamplifyfleet and bulkbackfillrecall customid format fix with error logging	2.0h	10m	2m	12.0x	60.0x

Aggregate Statistics

Metric	Value
Total tasks	9
Total human-equivalent hours	59.5
Total Claude minutes	141
Total supervisory minutes	25
Total tokens	928,000
Weighted average leverage factor	25.3x
Weighted average supervisory leverage factor	142.8x
Human-equivalent weeks	1.5

Analysis

The day's leverage distribution matters more than the headline figure. The 33.6x ceiling came from Synthesis pipeline: prompt caching + Anthropic Batches API integration across synthesis scripts in core/an inference engine (1649 LOC, 5 files); the 12.0x floor was core/an inference engine: autopilotservice legacy coverage-damping ceiling lifted + bulkamplifyfleet and bulkbackfillrecall customid format fix with error.... Tasks at the top of the distribution share a shape: tightly-scoped specifications, clear success criteria, and minimal integration ambiguity. The AI doesn't need to discover anything new; it executes against an explicit target.

Tasks at the bottom run differently. They're either bounded by review-heavy work where every step gets verified, or they involve ambiguity that demands several rounds of trial and adjustment. The factor is real and informative, not a failure mode.

The supervisory leverage figure (142.8x today) tracks something orthogonal to wall-clock leverage. It's the ratio of human-equivalent output to human prompt-writing time. It stays high even on lower-leverage days because supervisory minutes scale with task count, not with the human-hour estimate; a 20-minute task and a 4-hour task can both be specified in two minutes of human prompt-writing.

Across the 9 tasks, the day produced roughly 1.5 weeks of senior-engineer-equivalent throughput in 2.4 hours of model wall-clock. That ratio is the practical answer to the question of how much output a single operator can move per day when the model handles the execution and the operator handles the direction.