Leverage Record: March 3, 2026
Daily accounting of what Claude Opus 4.6 built today, measured against how long a senior engineer familiar with each codebase would need for the same work. This was a day dominated by education platform development, engineering metrics tooling, and build infrastructure improvements.
Automated TDD with Claude Code: Testing Strategy for AI-Assisted Engineering
Every project I hand to Claude Code starts the same way: I write the testing strategy before the first line of application code exists. Not because I am a TDD purist (I have skipped tests on personal projects like anyone else), but because I learned the hard way that an AI agent without test constraints will produce code that works today and breaks tomorrow. The agent is fast, confident, and has zero memory of what it built yesterday. Tests are the only thing that survives between sessions and keeps the codebase honest.
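The constraint is easy to make concrete. A minimal sketch of the pattern, assuming a pytest-style workflow (the `slugify` function and its behavior are hypothetical illustrations, not from the article): the tests are written first and become the contract the agent must satisfy across sessions.

```python
# tests/test_slugify.py - written BEFORE any application code exists.
# The tests are the contract; they persist between agent sessions even
# though the agent's context does not. (slugify is a hypothetical
# example function, not the article's actual code.)

def slugify(title: str) -> str:
    # Implementation the agent later produced against the tests below.
    cleaned = "".join(c if c.isalnum() else " " for c in title.lower())
    return "-".join(cleaned.split())

def test_slugify_basic():
    assert slugify("Hello, World!") == "hello-world"

def test_slugify_collapses_whitespace():
    assert slugify("  Many   Spaces  ") == "many-spaces"
```

Run with `pytest` at the start of every agent session; a red test is the one signal that survives the agent's amnesia.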
Leverage Record: March 2, 2026
Daily accounting of what Claude Opus 4.6 built today, measured against how long a senior engineer familiar with each codebase would need for the same work. This was a marathon day dominated by structured document authoring at scale: 95 domain specification documents across seven certification families, plus CMS infrastructure work, content moderation, and article writing.
The Leverage Factor, Part 2: Defending the Numbers
The Leverage Factor: Measuring AI-Assisted Engineering Output generated more direct messages than anything else I have published. Some of the feedback was enthusiastic. A significant portion was hostile. "Exaggerated." "False." "No way those numbers are real." Fair enough. I published extraordinary claims with data but without enough context for readers to evaluate the methodology. This article fills that gap. I am going to take specific time records, break them apart, defend the human estimates with engineering detail, and then show that the original leverage calculation actually understates the real multiplier.
Agentic Coding, FOMO, and Flow State Addiction
Last Monday I went into my office at 7 AM to kick off a few Claude sessions before taking the trash cans to the street. I sat down, wrote three prompts, and started reviewing the first batch of output. At noon I looked up and realized the garbage truck had come and gone four hours ago. I had not eaten breakfast. I had not taken the trash out. I had been sitting in the same chair for five hours without standing up, and the only reason I noticed was that someone texted to ask if I was still alive. The work was going so well that stopping felt physically wrong.
Leverage Record: March 1, 2026
Twelve tasks today across five workstreams: a desktop Electron application from architecture document through Phase 1 implementation, reference data compilation and matching pipelines, patent portfolio documentation, ML pipeline evaluation and architecture work, and static site tooling improvements including unit test backfills.
Leverage Record: February 28, 2026
Eighteen tasks today across five workstreams: a resume generator built from scratch and iterated through three major revisions, knowledge synthesis tooling enhancements, reference architecture documentation, an ML validation pipeline, and a technical article on decision fatigue in agentic coding workflows.
Agentic Coding and Decision Fatigue: The Cognitive Cost of Supervising AI
Recently during heavy Claude Code usage, I started noticing an uncomfortable trend. At 8 AM I could run three agent sessions at once, spot a bad abstraction in a 200-line diff, and push back on architectural shortcuts without hesitation. By 3 PM the same work felt like wading through concrete. My prompts got sloppy. I started approving diffs I would have questioned six hours earlier. Twice I caught myself closing a session just to avoid making a decision about it. Once I even prompted the following: "I know you can do better than this. Be thorough and just get it done, bro." The work had not gotten harder. My interest had not faded. I wanted to understand what had changed between 8 AM and 3 PM inside my skull.
Leverage Record: February 27, 2026
Nineteen tasks today across three distinct workstreams: patent figure generation, cloud certification tooling, and technical writing. The patent work dominated in volume (11 tasks), while the infrastructure design document dominated in leverage factor.
Building a Cloud Knowledge Benchmark: Testing What LLMs Actually Know About AWS
I spend most of my time building production systems on AWS. I also spend a growing fraction of my time working with LLMs to design and implement those systems. That combination raises a question I kept coming back to: how much does the model actually know about AWS? Not "can it write a CloudFormation template" or "can it debug a Lambda timeout." Those are execution tests. I wanted something more fundamental. If I ask a model about VPC peering limits, ElastiCache shard maximums, or the four-step Secrets Manager rotation lifecycle, does it know the answer? Does it know the current answer, or is it stuck on a value from two years ago?
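A recall benchmark of this kind reduces to a simple shape: a question, a set of accepted answers, and a grader. A minimal sketch (the question and accepted values below are illustrative placeholders, not verified AWS limits; drift in exactly these values is the article's point):

```python
# Sketch of a recall-style benchmark item and a substring grader.
# Question content and accepted answers are illustrative, NOT
# authoritative AWS data - record the verification date precisely
# because these values change over time.

from dataclasses import dataclass

@dataclass
class BenchmarkItem:
    question: str
    accepted_answers: list  # normalized strings that count as correct
    as_of: str              # date the ground truth was last verified

def grade(item: BenchmarkItem, model_answer: str) -> bool:
    # Naive substring match; a real grader would guard against
    # false positives like "4" matching inside "2024".
    normalized = model_answer.strip().lower()
    return any(a in normalized for a in item.accepted_answers)

item = BenchmarkItem(
    question="How many steps are in the Secrets Manager rotation lifecycle?",
    accepted_answers=["four", "4"],
    as_of="2026-02-01",
)
```

The `as_of` field matters more than it looks: scoring a model's answer as wrong is only meaningful if you can prove your ground truth was current when you checked it.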
Leverage Record: February 26, 2026
Daily accounting of what Claude Opus 4.6 built today, measured against how long a senior engineer familiar with each codebase would need for the same work. Twenty tasks across six projects. The day was split between building a custom patent diagram renderer from scratch, standing up an interactive learning frontend with multiple activity modes, implementing a server-side scoring engine, writing three architecture articles, and iterating on layout engine improvements. The patent diagrammer hit the session's highest leverage at 200x.
Leverage Record: February 25, 2026
Daily accounting of what Claude Opus 4.6 built today, measured against how long a senior engineer familiar with each codebase would need for the same work. Nine tasks across five projects. The production API implementation dominated the day in both scope and wall-clock time. Three architecture articles were written and deployed in parallel.
Giving Claude Code a Voice with ElevenLabs
I spend hours in Claude Code every day. Long sessions where I am reading, thinking, switching contexts, and occasionally glancing at the terminal to see if the agent finished a task. The problem: Claude Code is silent. It finishes a 10-minute build-and-deploy pipeline and just sits there, cursor blinking, waiting for me to notice. The whole concept here was inspired by J.A.R.V.I.S. from the Iron Man films, voiced by Paul Bettany. Tony Stark's AI assistant announces status, flags problems, and delivers dry commentary while Stark works on something else entirely. I wanted that. An AI assistant that speaks. That announces when it starts a task and summarizes what it accomplished when it finishes. Like a competent colleague who taps you on the shoulder and says "that deployment is done, here's what happened."
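The plumbing for this is a small notification hook. A sketch under stated assumptions: it assumes a hook mechanism that invokes the script with a JSON event on stdin when a task finishes, and a local `speak` command standing in for the ElevenLabs call; the event shape, the `summary` field, and the tool name are all assumptions, not the article's actual implementation.

```python
# Hypothetical completion hook: read a task-finished event from stdin,
# build a spoken announcement, and hand it to a TTS command.
# The "speak" command is a placeholder for an ElevenLabs integration.

import json
import subprocess
import sys

def announce(event: dict) -> str:
    summary = event.get("summary", "Task complete.")
    message = f"Claude finished: {summary}"
    try:
        # Fire-and-forget: do not block the agent on audio playback.
        subprocess.run(["speak", message], check=False)
    except FileNotFoundError:
        pass  # No TTS tool installed; stay silent rather than crash.
    return message

if __name__ == "__main__":
    announce(json.load(sys.stdin))
```

The design choice worth copying is the silent fallback: a notification hook that can crash the session it is supposed to narrate is worse than no hook at all.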
Leverage Record: February 24, 2026
Daily accounting of what Claude Opus 4.6 built today. Twenty-eight tasks across seven projects. The day was dominated by standing up four product vertical websites with full AWS deployments, fixing diagram rendering issues across a large document set, and building out cloud infrastructure and backend services. This was the highest-volume day so far.
The Leverage Factor: Measuring AI-Assisted Engineering Output
In finance, leverage is the use of borrowed capital to amplify returns. A trader with 10x leverage controls ten dollars of assets for every dollar of equity. The principle is straightforward: a small input controls a disproportionately large output. The same principle now applies to software engineering, and the ratios are significantly higher than anything a margin account offers.
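The calculation itself is a single division, which is worth stating explicitly because everything else in the series hangs on it. A minimal sketch, with made-up task names and hours for illustration:

```python
# Leverage factor as the article series defines it: estimated
# senior-engineer hours for the task divided by actual wall-clock
# hours spent supervising the agent. All values below are
# hypothetical, chosen only to show the shape of the numbers.

def leverage(human_estimate_hours: float, actual_hours: float) -> float:
    return human_estimate_hours / actual_hours

tasks = [
    ("patent diagram renderer", 40.0, 0.2),  # 40h estimate, 12 min actual
    ("API endpoint backfill",    8.0, 0.5),  # 8h estimate, 30 min actual
]

for name, est, actual in tasks:
    print(f"{name}: {leverage(est, actual):.0f}x")
```

Note that both inputs are estimates of different quality: actual hours are measured, while the human estimate is a judgment call, which is exactly where the methodology criticism concentrates.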
Leverage Record: February 23, 2026
Daily accounting of what Claude Opus 4.6 built today, measured against how long a senior engineer familiar with each codebase would need for the same work. These are leverage factors, not time savings. Most of these projects are ones I would not have started without AI. The leverage factor measures how much more I can ship, not how much faster I finish.
Cloning GitHub in 49 Minutes
I cloned GitHub. The result is a full-featured, single-user Git hosting platform with repository management, code browsing with syntax highlighting, pull requests with three merge strategies, issues with labels and comments, releases, search, activity feeds, insights, dark mode, and 50+ API endpoints. 111 files. 18,343 lines of code. 155 passing tests. The whole thing took 49 minutes, entirely within the scope of a Claude subscription.
Using Claude to Clone Confluence in 16 Minutes
Day three. Another SaaS subscription, another Single Serving Application. I've now replaced Harvest (time tracking) and Trello (project management) with AI-generated clones. Today's target: Confluence, Atlassian's knowledge management and wiki platform. Claude Opus 4.6 built a fully functional Confluence clone in 16 minutes, consuming 106,000 tokens. That's the fastest build yet, down from 18 minutes for Harvest and 19 for Trello. The pattern holds: requirements in, working application out, no human intervention needed.
Single Serving Applications - The Clones
I'm systematically replacing my SaaS subscriptions with Single Serving Applications. These are purpose-built, AI-generated apps designed for an audience of one. Each clone is built by Claude Opus 4.6 from a requirements document, runs via Docker Compose, and costs essentially nothing to operate.
Using Claude to Clone Trello in 20 Minutes
Last week I had Claude Opus 4.6 and GPT-5.3-Codex race to build a Harvest clone. Claude won decisively. That experiment killed a $180/year SaaS subscription. Naturally, I started looking at my other subscriptions. Trello was next on the list. I've used it for years to manage personal projects, product roadmaps, and random ideas. Trello is a solid product, but it is also a multi-tenant, collaboration-heavy platform where I use maybe 20% of the features. A perfect candidate for a Single Serving Application. So I wrote a requirements document, handed it to Claude Opus 4.6, and walked away. 19 minutes and 137,000 tokens later, I had a fully functional Kanban board running on localhost.
Ephemeral Apps Are Almost Here
I recently built a Harvest clone in 18 minutes, a Trello clone in 19 minutes, and a Confluence clone in 16 minutes. All three were generated entirely by Claude Opus 4.6 from requirements documents. All run in Docker. All work.
The Single Serving Application
I recently had two AI models build a complete Harvest clone in under 20 minutes. The winning version covered 97% of Harvest's features. I'm seriously considering canceling my $180/year subscription and using it instead. That experiment got me thinking about something bigger than one app replacement. We're entering an era where a competent engineer with an AI coding assistant can generate a fully functional web application from a requirements document in the time it takes to eat lunch. That changes the economics of software in a fundamental way.
Claude Opus 4.6 vs. GPT-5.3-Codex: Building a Full Web App From Scratch
Last week was a big week for Anthropic and OpenAI. Both released new versions of their flagship coding models: Claude Opus 4.6 from Anthropic and GPT-5.3-Codex (Medium) from OpenAI. Any time new coding models are released, it's like an extra Christmas for me. There was also talk of a Sonnet 5.0 release, but so far, nothing. I suspect that has something to do with the most recent agentic coding benchmarks.
Rebuilding My Site with Narrative CMS
Twenty years ago, I built a blogging platform called Narrative. It was an ASP.NET-based CMS with advanced features like automatic page rebuilding, a sophisticated tagging system, and comment spam prevention. I used it to power this site from 2003 until 2008, when I abandoned it in favor of WordPress, saying "I am much more interested in blogging than the building of blogging software." That code sat shelved for nearly two decades. Then, in 2024, I discovered Claude Code and realized that with AI assistance, I could finally bring Narrative back to life as exactly the system I'd always envisioned. This post tells that story.
Overlooked Productivity Boosts with Claude Code
Most engineers who adopt Claude Code start with the obvious: "write me a function," "fix this bug," "add a test." Those are fine. They also miss at least half the value. The largest productivity gains come from activities engineers either do poorly, skip entirely, or never consider delegating. After months of tracking leverage factors across every task I give Claude Code, the data reveals where the real multipliers hide. Surprisingly few involve writing application code.