Tag: AI Posts

Building renkara.com: A Corporate Site in the Age of AI

On March 28, 2026, I committed the first file for renkara.com. Twelve days later, the site had 54 pages, a WebGL particle animation system, a complete 18-year company timeline with 60+ milestones, 14 tool showcase pages with lightbox screenshot galleries, legacy product pages with original artwork from 2008, a blog, dark mode, and a build pipeline that minifies everything to a deployable dist/ directory. No framework. No SSR. No CMS. Pure HTML, CSS, and JavaScript.
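The pipeline amounts to "walk the source tree, shrink what can be shrunk, copy the rest." A naive sketch under assumed directory names, with simple whitespace-stripping standing in for a real minifier (not the site's actual build script):

```python
import re
import shutil
from pathlib import Path

def naive_minify(text: str, ext: str) -> str:
    """Very naive minification: strip block comments from CSS/JS and drop
    blank lines and indentation. Real pipelines use dedicated minifiers;
    this is only an illustration (it would, e.g., mangle <pre> blocks)."""
    if ext in {".css", ".js"}:
        text = re.sub(r"/\*.*?\*/", "", text, flags=re.S)  # block comments
    lines = [line.strip() for line in text.splitlines()]
    return "\n".join(line for line in lines if line)

def build(src: Path, dist: Path) -> int:
    """Rebuild dist/ from src/, minifying HTML/CSS/JS and copying the rest.
    Returns the number of files written."""
    if dist.exists():
        shutil.rmtree(dist)
    count = 0
    for path in src.rglob("*"):
        if not path.is_file():
            continue
        out = dist / path.relative_to(src)
        out.parent.mkdir(parents=True, exist_ok=True)
        if path.suffix in {".html", ".css", ".js"}:
            out.write_text(naive_minify(path.read_text(), path.suffix))
        else:
            shutil.copy2(path, out)  # images, fonts, etc. pass through
        count += 1
    return count
```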

Read more

Building a Private Tool Fleet: 14 Internal Applications in 45 Days

Between late February and early April 2026, I built 14 internal applications from scratch. Not prototypes. Not demos. Production systems with PostgreSQL backends, React frontends, full test suites, CI/CD pipelines, and MCP servers that let Claude Code operate them directly. The total codebase across all tools exceeds 60,000 lines of code.

Read more

Announcing The Deferral — My First Novel

I just finished writing a technothriller about artificial intelligence. Now I know what a lot of you are thinking: "AI wrote it." After all, most of my writing is generated by AI these days. My primary LLM, Claude Opus 4.6 from Anthropic, is a very capable writer. And Claude participated extensively in writing this novel. But not in the way you think.

Read more

Achieving Determinism with LLM Agents: An Architecture Guide

I run a fleet of LLM agents that audit 42 repositories every day. Same code, same prompts, same model. And for weeks, every single run produced different results. Not because the codebase was changing between runs, but because the agents were sampling instead of enumerating, summarizing instead of counting, and making subjective judgments instead of executing deterministic checks. The audit results were non-reproducible, which meant they were useless as an audit.
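This is not the actual audit fleet, but the enumerate-and-count principle can be sketched in a few lines: a check that scans every file in a stable sorted order and emits byte-identical output on every run over the same tree (the TODO check is a made-up example):

```python
import hashlib
from pathlib import Path

def audit_todos(repo: Path) -> dict:
    """Deterministic audit step: enumerate every Python file in sorted
    order, count TODO markers, and report exact locations. Two runs over
    the same tree always produce identical output -- no sampling, no
    summarizing, no judgment calls."""
    findings = []
    for path in sorted(repo.rglob("*.py")):
        for lineno, line in enumerate(path.read_text().splitlines(), start=1):
            if "TODO" in line:
                findings.append(f"{path.relative_to(repo)}:{lineno}")
    report = {"total": len(findings), "locations": findings}
    # A digest of the findings makes run-to-run drift immediately visible.
    report["digest"] = hashlib.sha256(str(findings).encode()).hexdigest()[:12]
    return report
```

The agent's job shrinks to executing checks like this and reporting their output, rather than forming an impression of the codebase.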

Read more

Day Off: Spring Break

No computer time today. Spent the day renting skis, buying lift tickets, stocking up on groceries, and getting everything sorted for spring break. Zero tasks, zero tokens, zero leverage.

Read more

Vector Database Architecture: How Vector Search Powers RAG Systems

I built my first vector search system with a flat numpy array and brute-force cosine similarity. Three hundred fifty chunks, 1024 dimensions, under 2MB. Search completed in microseconds. That works fine for a few hundred documents. It stops working when you hit millions of vectors, need sub-10ms latency at thousands of queries per second, and your index no longer fits in memory on a single node. That is where vector databases earn their place: they solve the hard problem of approximate nearest neighbor search at scale, and they form the retrieval backbone of every serious RAG (Retrieval-Augmented Generation) system in production today.
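The flat-array version really is that small. A minimal sketch at roughly the scale described (synthetic vectors standing in for real embeddings):

```python
import numpy as np

# 350 chunks x 1024 dimensions in float32 is about 1.4 MB -- under 2 MB.
rng = np.random.default_rng(0)
chunks = rng.standard_normal((350, 1024)).astype(np.float32)

def top_k(query: np.ndarray, index: np.ndarray, k: int = 5) -> np.ndarray:
    """Exact brute-force cosine similarity: normalize, dot product, argsort.
    O(n * d) per query -- fine for hundreds of chunks, hopeless at millions."""
    q = query / np.linalg.norm(query)
    unit = index / np.linalg.norm(index, axis=1, keepdims=True)
    scores = unit @ q                      # one cosine score per chunk
    return np.argsort(scores)[::-1][:k]    # indices of the best matches

# A slightly perturbed copy of chunk 42 should retrieve chunk 42 first.
noisy_query = chunks[42] + 0.01 * rng.standard_normal(1024).astype(np.float32)
hits = top_k(noisy_query, chunks)
```

Everything a vector database does -- HNSW graphs, quantization, sharding -- exists to approximate this exact computation once `n` gets too big for it.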

Read more

Building an Enterprise Chatbot: React, FastAPI, and WebSocket Architecture

Every enterprise wants an AI chatbot now. Most of the tutorials out there will get you a working prototype in an afternoon. Deploying that prototype to production for a Fortune 500 client with 10,000 concurrent users, strict data isolation requirements, and a CFO watching the LLM API bill? That is a different engineering problem entirely. I have built and operated chatbot systems at this scale across regulated industries (healthcare, financial services, government), and the gap between "it works on my laptop" and "it handles production load without bleeding money" is enormous. This article covers the architecture I have landed on after iterating through several generations of enterprise chatbot deployments: React on the frontend, FastAPI on the backend, WebSockets for real-time communication, and a layered storage and caching strategy that keeps costs sane.
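The prototype-to-production gap shows up first on the bill. One layer of the kind of cost control described above can be sketched as a response cache keyed by a hash of (model, prompt) with a TTL -- an illustrative sketch, not the production implementation:

```python
import hashlib
import time
from typing import Callable

class ResponseCache:
    """Cache LLM responses keyed by a hash of (model, prompt), with a TTL.
    Identical questions -- common in enterprise chatbots -- skip the API
    call entirely, which directly cuts token spend."""

    def __init__(self, ttl_seconds: float = 3600.0):
        self.ttl = ttl_seconds
        self._store: dict[str, tuple[float, str]] = {}

    def _key(self, model: str, prompt: str) -> str:
        return hashlib.sha256(f"{model}\x00{prompt}".encode()).hexdigest()

    def get_or_call(self, model: str, prompt: str,
                    call_llm: Callable[[str, str], str]) -> str:
        key = self._key(model, prompt)
        hit = self._store.get(key)
        if hit and time.monotonic() - hit[0] < self.ttl:
            return hit[1]                    # cache hit: zero API cost
        answer = call_llm(model, prompt)     # cache miss: pay for the call
        self._store[key] = (time.monotonic(), answer)
        return answer
```

In a real deployment this sits in front of a shared store like Redis rather than a per-process dict, and the key must also include any per-tenant context to preserve data isolation.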

Read more

Automated TDD with Claude Code: Testing Strategy for AI-Assisted Engineering

Every project I hand to Claude Code starts the same way: I write the testing strategy before the first line of application code exists. Not because I am a TDD purist (I have skipped tests on personal projects like anyone else), but because I learned the hard way that an AI agent without test constraints will produce code that works today and breaks tomorrow. The agent is fast, confident, and has zero memory of what it built yesterday. Tests are the only thing that survives between sessions and keeps the codebase honest.
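The test-first handoff can be as small as this sketch -- a spec for a hypothetical `slugify` helper written before its implementation, so the constraint exists before the agent touches application code:

```python
import unittest

# Step 1: the spec exists before the implementation. It pins down behavior
# that the agent must satisfy in this session and every future one.
class TestSlugify(unittest.TestCase):
    def test_lowercases_and_hyphenates(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_strips_punctuation(self):
        self.assertEqual(slugify("AI, Fast & Cheap!"), "ai-fast-cheap")

# Step 2: the implementation, written (or generated) to make the spec pass.
def slugify(title: str) -> str:
    """Lowercase, replace non-alphanumerics with spaces, join with hyphens."""
    cleaned = "".join(c if c.isalnum() or c.isspace() else " "
                      for c in title.lower())
    return "-".join(cleaned.split())
```

The spec, not the implementation, is the artifact that survives between sessions.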

Read more

The Leverage Factor, Part 2: Defending the Numbers

"The Leverage Factor: Measuring AI-Assisted Engineering Output" generated more direct messages than anything else I have published. Some of the feedback was enthusiastic. A significant portion was hostile. "Exaggerated." "False." "No way those numbers are real." Fair enough. I published extraordinary claims with data but without enough context for readers to evaluate the methodology. This article fills that gap. I am going to take specific time records, break them apart, defend the human estimates with engineering detail, and then show that the original leverage calculation actually understates the real multiplier.

Read more

Agentic Coding, FOMO, and Flow State Addiction

Last Monday I went into my office at 7 AM to kick off a few Claude sessions before taking the trash cans to the street. I sat down, wrote three prompts, and started reviewing the first batch of output. At noon I looked up and realized the garbage truck had come and gone four hours ago. I had not eaten breakfast. I had not taken the trash out. I had been sitting in the same chair for five hours without standing up, and the only reason I noticed was that someone texted to ask if I was still alive. The work was going so well that stopping felt physically wrong.

Read more

Agentic Coding and Decision Fatigue: The Cognitive Cost of Supervising AI

Recently during heavy Claude Code usage, I started noticing an uncomfortable trend. At 8 AM I could run three agent sessions at once, spot a bad abstraction in a 200-line diff, and push back on architectural shortcuts without hesitation. By 3 PM the same work felt like wading through concrete. My prompts got sloppy. I started approving diffs I would have questioned six hours earlier. Twice I caught myself closing a session just to avoid making a decision about it. Once I even prompted the following: "I know you can do better than this. Be thorough and just get it done, bro." The work had not gotten harder. My interest had not faded. I wanted to understand what had changed between 8 AM and 3 PM inside my skull.

Read more

Building a Cloud Knowledge Benchmark: Testing What LLMs Actually Know About AWS

I spend most of my time building production systems on AWS. I also spend a growing fraction of my time working with LLMs to design and implement those systems. That combination raises a question I kept coming back to: how much does the model actually know about AWS? Not "can it write a CloudFormation template" or "can it debug a Lambda timeout." Those are execution tests. I wanted something more fundamental. If I ask a model about VPC peering limits, ElastiCache shard maximums, or the four-step Secrets Manager rotation lifecycle, does it know the answer? Does it know the current answer, or is it stuck on a value from two years ago?
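The benchmark format can be sketched as a fixed answer key graded by normalized exact match. The questions and values below are illustrative placeholders, not the benchmark's real data:

```python
def grade(answer_key: dict[str, str], responses: dict[str, str]) -> float:
    """Score factual-recall responses against a fixed answer key using
    normalized exact match. Returns the fraction answered correctly."""
    correct = sum(
        1 for q, expected in answer_key.items()
        if responses.get(q, "").strip().lower() == expected.strip().lower()
    )
    return correct / len(answer_key)

# Hypothetical question set -- illustrative format only.
answer_key = {
    "Default VPCs per region per account?": "5",
    "Max Lambda timeout in seconds?": "900",
}
model_responses = {
    "Default VPCs per region per account?": "5",
    "Max Lambda timeout in seconds?": "300",  # stale value -> marked wrong
}
score = grade(answer_key, model_responses)
```

Exact-match grading is deliberately unforgiving: a model stuck on a value from two years ago scores the same as one that never knew it.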

Read more

Giving Claude Code a Voice with ElevenLabs

I spend hours in Claude Code every day. Long sessions where I am reading, thinking, switching contexts, and occasionally glancing at the terminal to see if the agent finished a task. The problem: Claude Code is silent. It finishes a 10-minute build-and-deploy pipeline and just sits there, cursor blinking, waiting for me to notice. The whole concept here was inspired by J.A.R.V.I.S. from the Iron Man films, voiced by Paul Bettany. Tony Stark's AI assistant announces status, flags problems, and delivers dry commentary while Stark works on something else entirely. I wanted that. An AI assistant that speaks. That announces when it starts a task and summarizes what it accomplished when it finishes. Like a competent colleague who taps you on the shoulder and says "that deployment is done, here's what happened."

Read more

The Leverage Factor: Measuring AI-Assisted Engineering Output

In finance, leverage is the use of borrowed capital to amplify returns. A trader with 10x leverage controls ten dollars of assets for every dollar of equity. The principle is straightforward: a small input controls a disproportionately large output. The same principle now applies to software engineering, and the ratios are significantly higher than anything a margin account offers.
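The ratio itself is trivial to compute. A hypothetical worked example (my numbers for illustration, not figures from the article):

```python
def leverage_factor(human_estimate_hours: float, actual_hours: float) -> float:
    """Leverage factor per the analogy: estimated unassisted human effort
    divided by actual hands-on time, mirroring assets / equity in finance."""
    return human_estimate_hours / actual_hours

# Finance: $10 of assets controlled per $1 of equity -> 10x leverage.
finance_leverage = leverage_factor(10, 1)
# Hypothetical engineering task: estimated at 40 engineer-hours unassisted,
# delivered with 2 hours of prompting and review -> 20x leverage.
engineering_leverage = leverage_factor(40, 2)
```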

Read more

Cloning GitHub in 49 Minutes

I cloned GitHub. The result is a full-featured, single-user Git hosting platform with repository management, code browsing with syntax highlighting, pull requests with three merge strategies, issues with labels and comments, releases, search, activity feeds, insights, dark mode, and 50+ API endpoints. 111 files. 18,343 lines of code. 155 passing tests. The whole thing took 49 minutes, entirely within the scope of a Claude subscription.

Read more

Using Claude to Clone Confluence in 16 Minutes

Day three. Another SaaS subscription, another Single Serving Application. I've now replaced Harvest (time tracking) and Trello (project management) with AI-generated clones. Today's target: Confluence, Atlassian's knowledge management and wiki platform. Claude Opus 4.6 built a fully functional Confluence clone in 16 minutes, consuming 106,000 tokens. That's the fastest build yet, down from 18 minutes for Harvest and 19 for Trello. The pattern holds: requirements in, working application out, no human intervention needed.

Read more

Using Claude to Clone Trello in 20 Minutes

Last week I had Claude Opus 4.6 and GPT-5.3-Codex race to build a Harvest clone. Claude won decisively. That experiment killed a $180/year SaaS subscription. Naturally, I started looking at my other subscriptions. Trello was next on the list. I've used it for years to manage personal projects, product roadmaps, and random ideas. Trello is a solid product, but it is also a multi-tenant, collaboration-heavy platform where I use maybe 20% of the features. A perfect candidate for a Single Serving Application. So I wrote a requirements document, handed it to Claude Opus 4.6, and walked away. 19 minutes and 137,000 tokens later, I had a fully functional Kanban board running on localhost.

Read more

The Single Serving Application

I recently had two AI models build a complete Harvest clone in under 20 minutes. The winning version covered 97% of Harvest's features. I'm seriously considering canceling my $180/year subscription and using it instead. That experiment got me thinking about something bigger than one app replacement. We're entering an era where a competent engineer with an AI coding assistant can generate a fully functional web application from a requirements document in the time it takes to eat lunch. That changes the economics of software in a fundamental way.

Read more

Claude Opus 4.6 vs. GPT-5.3-Codex: Building a Full Web App From Scratch

Last week was a big week for Anthropic and OpenAI. Both released new versions of their flagship coding models: Claude Opus 4.6 from Anthropic and GPT-5.3-Codex (Medium) from OpenAI. Any time new coding models are released, it's like an extra Christmas for me. There was also talk of a Sonnet 5.0 release, but so far, nothing. I suspect that has something to do with the most recent agentic coding benchmarks.

Read more

Rebuilding My Site with Narrative CMS

Twenty years ago, I built a blogging platform called Narrative. It was an ASP.NET-based CMS with advanced features like automatic page rebuilding, a sophisticated tagging system, and comment spam prevention. I used it to power this site from 2003 until 2008, when I abandoned it in favor of WordPress, saying "I am much more interested in blogging than the building of blogging software." That code sat shelved for nearly two decades. Then, in 2024, I discovered Claude Code and realized that with AI assistance, I could finally bring Narrative back to life as exactly the system I'd always envisioned. This post tells that story.

Read more

Overlooked Productivity Boosts with Claude Code

Most engineers who adopt Claude Code start with the obvious: "write me a function," "fix this bug," "add a test." Those are fine. They also miss at least half the value. The largest productivity gains come from activities engineers either do poorly, skip entirely, or never consider delegating. After months of tracking leverage factors across every task I give Claude Code, the data reveals where the real multipliers hide. Surprisingly few involve writing application code.

Read more