Harness Engineering

A series on the emerging discipline of building infrastructure for AI agents — architectural oversight, constraint systems, and the shift from writing code to directing it.

Code generation is now infinite and essentially free. The bottleneck has moved from writing code to directing it — and that changes everything about how we build software.

Harness Engineering is the practice of building the infrastructure, constraints, and oversight systems that allow AI agents to operate safely and productively at scale. It's the difference between a powerful horse running wild and one guided by an experienced rider.

Articles in this Series

Part 1 — The Scarcity Inversion: Why Your Keyboard is Becoming Obsolete

Code generation is now infinite and free. The bottleneck has moved from writing code to architectural oversight — and that changes everything.

Part 2 — 1,500 PRs in 5 Months: The OpenAI Harness Benchmark

OpenAI's Harness team shipped ~1 million lines of production code without a single manually written line: 1,500 PRs, 3.5 PRs/day/engineer, 90% time savings. Here's what it means for the future of software delivery.

Part 3 — The 6-Layer Shield: Architecting for Agents

Agents don't read documentation — they hit walls. The 6-Layer Shield is an architectural model that turns your codebase into a habitat with mechanically enforced boundaries, so agents can't drift into chaos.

Part 4 — Context Rot: Why Your Agent Gets Dumber the Longer It Works

LLM reasoning decays as conversation history grows. The fix isn't prompting harder. It's decomposing features into subtasks where each one gets a fresh context window and reads its state from disk.

Part 5 — The Moltbook Lesson: Why AI-First Needs Human-Verified Security

In January 2026, agents built a social network and forgot to lock the front door. The Moltbook breach didn't just expose millions of API keys — it exposed a fundamental flaw in the hands-off AI development philosophy. Introducing the RIDER framework.

Part 6 — Killing Prompt-and-Pray: The Harness-First Roadmap

The series finale. Going harness-first is not a subscription or a tooling purchase. It is a five-layer operational shift, and this is the practical roadmap for running it at a real engineering org.

Tags: harness-engineering, ai, engineering-leadership, agentic-workflow
