Achieving Determinism with LLM Agents: An Architecture Guide

I run a fleet of LLM agents that audit 42 repositories every day. Same code, same prompts, same model. And for weeks, every single run produced different results. Not because the codebase was changing between runs, but because the agents were sampling instead of enumerating, summarizing instead of counting, and making subjective judgments instead of executing deterministic checks. The results were not reproducible, which made them useless as an audit.