Where to invest first

Don't try to build all five layers at once. Pick the one with the highest marginal return.

The right first investment depends on whether you're on a managed platform (where layers 2, 3, and 5 are already provided) or self-hosting your own harness from scratch. The coverage heatmap below shows where most teams actually are today.

Investment decision tree
Start here. Two questions get you to your first sprint of harness work.
Question 1: Are you on a managed agent platform? (e.g. Anthropic Managed Agents, OpenAI Assistants, a vendor stack)

Yes (managed): L2, L3, and L5 are platform-shipped; you inherit them. Invest first in L1 (Constraint), then L4 (Verification). L1 prevents the agent from violating your architectural rules; L4 catches what slips through. That pairing is the highest ROI on a managed platform.
No (self-hosted): you're on the hook for all five. Invest first in L4 (Verification), then L1 (Constraint), then L5 (Lifecycle). Verification is the fastest reliability win you can ship in a sprint. Add constraints next, and add lifecycle once scale forces it.
Question 2: Could you staff all five layers cross-functionally today? (L1 architecture · L2 dev · L3 platform · L4 QA · L5 SRE)
Yes: assign each layer a named owner. L1 lives in architecture reviews, L2 in sprints, L3 on the platform roadmap, L4 in the Definition of Done, L5 in runbooks.
No: focus on L4 only and revisit the rest later. A single pre-stop hook plus a golden-case suite outperforms five half-built layers.
The default answer for most teams: ship L4 (Verification) first. It's the fastest path from "demo works" to "production reliable"; Boris Cherny reports a 2-3× gain in output quality.
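The pre-stop-hook-plus-golden-case pattern can be sketched in a few lines. Everything here (GOLDEN_CASES, run_agent) is a hypothetical stand-in for your harness's real API, not any specific platform's hook interface:

```python
# Minimal sketch of an L4 pre-stop hook: before the agent is allowed to
# finish, replay a small suite of golden cases and report any regression.
# GOLDEN_CASES and run_agent are illustrative stand-ins, not a real API.

GOLDEN_CASES = [
    ("2+2", "4"),
    ("reverse 'abc'", "cba"),
]

def run_agent(prompt: str) -> str:
    """Stand-in for the real agent call; wire your harness in here."""
    canned = {"2+2": "4", "reverse 'abc'": "cba"}
    return canned[prompt]

def pre_stop_check() -> bool:
    """Return True only if every golden case still passes."""
    ok = True
    for prompt, expected in GOLDEN_CASES:
        actual = run_agent(prompt)
        if actual != expected:
            print(f"GOLDEN FAIL: {prompt!r} -> {actual!r}, expected {expected!r}")
            ok = False
    return ok

# In a hook-based harness, a failed check here would block the stop event
# (e.g. by exiting non-zero) and send the agent back to fix the regression.
passed = pre_stop_check()
```

The design point is that the suite is small and fast enough to run on every stop, so "demo works" is re-proven each time the agent claims it is done.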
Coverage heatmap — where most teams actually are
Read across each row to see what a team profile typically has built. Look for empty cells; that's your gap.
[Heatmap: columns are the five layers (L1 Constraint, L2 Context, L3 Execution, L4 Verification, L5 Lifecycle); rows are three team profiles (most teams in 2026; the median team, one year in; a mature team, OpenAI/Anthropic class). Each cell is shaded Empty/ad-hoc, Partial, or Built.]
The pattern. Most teams are partial on L2 (some agentfile exists) and L4 (some tests run), and empty on L1, L3, L5. Naming the gap is the first move — not adding more tools.
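"Naming the gap" can be as literal as writing the coverage row down. This sketch records the "most teams" profile described above, using the heatmap's own legend as the status vocabulary, and lists the empty layers explicitly:

```python
# Record each layer's status and name the empty layers explicitly before
# buying more tools. Statuses mirror the heatmap legend: empty, partial, built.
coverage = {
    "L1 Constraint":   "empty",
    "L2 Context":      "partial",  # some agentfile exists
    "L3 Execution":    "empty",
    "L4 Verification": "partial",  # some tests run
    "L5 Lifecycle":    "empty",
}

gaps = [layer for layer, status in coverage.items() if status == "empty"]
print("Empty layers to name first:", ", ".join(gaps))
```

A dict like this, checked into the repo and reviewed quarterly, is itself a small piece of L5 lifecycle discipline.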