Your AI Agent Makes Four Bad Decisions a Smarter Model Won't Fix

June 25, 2026 · 3 min read · ai, agents, llm

Your AI agent makes bad decisions for the same four reasons a bad manager does. A bigger model fixes none of them.

Not because the model is dumb. Because nothing in its loop forces it to widen its options, look for evidence it is wrong, or check itself before it reports “done.” It takes the first reading of your prompt and runs.

You know the shape. The agent confidently ships a plan, the plan was wrong three steps back, and the only signal you got was a fluent summary saying it worked. A reliability study this June put a number on it: the strongest models melt down most in long task chains, with failure rates up to 19%, precisely because they chase the most ambitious strategies.

These four failures are not new. Chip and Dan Heath named them in Decisive, a 2013 book about human decisions. They call them the four villains.

Narrow framing. The agent treats a task as one path and never generates a second. No “what else could this mean.”

Confirmation bias. It defends its own first plan instead of testing it. It collects reasons it is right, not reasons it is wrong.

Short-term pull. For a human it is emotion. For an agent it is the cheapest token path: the answer fastest to produce, not the one that holds.

Overconfidence. The dangerous one. It marks work complete without verifying, then writes you a convincing story about it.

The Heaths’ answer is a process you can encode. Four steps, and all four fit in a system prompt as a gate every non-trivial decision passes through. The acronym is WRAP.

W, widen. Force at least two real options before committing. The cheap trigger: “if the obvious approach were banned, what would I do?” Put it in the prompt as a required step, not a suggestion.
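A minimal sketch of that required step as a gate in code rather than in the prompt. The `Proposal` structure and the `widen_gate` name are hypothetical, and it assumes the agent returns its candidate options as a list of strings.

```python
from dataclasses import dataclass, field


@dataclass
class Proposal:
    """A decision the agent wants to commit to (hypothetical structure)."""
    task: str
    options: list[str] = field(default_factory=list)
    chosen: str = ""


def widen_gate(proposal: Proposal, min_options: int = 2) -> tuple[bool, str]:
    """Reject a proposal that does not carry at least `min_options` distinct options."""
    # Normalize so "Retry" and " retry " do not count as two options.
    distinct = {o.strip().lower() for o in proposal.options if o.strip()}
    if len(distinct) < min_options:
        # Send the cheap trigger back to the agent instead of proceeding.
        return False, "If the obvious approach were banned, what would you do?"
    if proposal.chosen.strip().lower() not in distinct:
        return False, "The chosen option must be one of the listed options."
    return True, "ok"
```

The check is deliberately dumb. It cannot tell whether the second option is a real one, only that the agent was forced to write it down before committing.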

R, reality-test. Ooch before you commit. That is the Heaths' word for a small experiment: run the change against fake data or a dry-run, not the whole thing live. And make the agent hunt for the disconfirming fact, not the confirming one.
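One way to sketch the dry-run, assuming the change can be expressed as a function over a list of records. Every name here (`reality_test`, `apply_discount`, the fixture, the checks) is made up for illustration. The checks are written as disconfirmers: each returns True when it finds evidence the change is wrong.

```python
import copy
from typing import Callable

Record = dict
Change = Callable[[list[Record]], list[Record]]
Check = Callable[[list[Record]], bool]


def reality_test(change: Change, fixture: list[Record],
                 disconfirmers: dict[str, Check]) -> list[str]:
    """Run the change on a copy of fake data; return the checks that found a problem."""
    # Deep copy so the dry-run can never touch the data it was given.
    result = change(copy.deepcopy(fixture))
    return [name for name, found_problem in disconfirmers.items()
            if found_problem(result)]


def apply_discount(rows: list[Record]) -> list[Record]:
    """Hypothetical change under test: take 10 off every price."""
    for row in rows:
        row["price"] = row["price"] - 10
    return rows


FIXTURE = [{"sku": "a", "price": 25}, {"sku": "b", "price": 5}]
DISCONFIRMERS = {
    "negative price": lambda rows: any(r["price"] < 0 for r in rows),
    "rows lost": lambda rows: len(rows) != len(FIXTURE),
}

failures = reality_test(apply_discount, FIXTURE, DISCONFIRMERS)
print(failures)  # ['negative price']
```

An empty list is the only result that lets the change go live. The fixture includes a cheap item on purpose, because that is the row most likely to prove the plan wrong.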

A, attain distance. Tag the decision: reversible, or one-way door? Reversible runs autonomously. One-way doors stop and ask. That single line of policy buys back most of your blast radius.
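That line of policy is small enough to sketch. The tool names below are hypothetical, and the sketch assumes an allowlist: anything not explicitly tagged reversible is treated as a one-way door.

```python
from enum import Enum


class Door(Enum):
    REVERSIBLE = "reversible"
    ONE_WAY = "one_way"


# Hypothetical tool names. Unknown tools are NOT on this list,
# so they default to one-way and need a human.
REVERSIBLE_TOOLS = {"read_file", "create_branch", "draft_reply"}


def classify(tool_name: str) -> Door:
    """Tag a tool call as reversible or as a one-way door."""
    return Door.REVERSIBLE if tool_name in REVERSIBLE_TOOLS else Door.ONE_WAY


def may_run(tool_name: str, human_approved: bool = False) -> bool:
    """Reversible calls run autonomously; one-way doors stop and ask."""
    if classify(tool_name) is Door.REVERSIBLE:
        return True
    return human_approved
```

Defaulting unknown tools to one-way is a design choice, not a requirement. It costs a few extra approvals and means a newly added tool cannot slip past the gate untagged.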

P, prepare to be wrong. The step everyone skips. A premortem (“it is a week later and this broke, why?”) plus a tripwire: a concrete signal that triggers a halt. Call it a circuit breaker if that lands better. Without it, “autonomous” just means “fails silently for longer.”
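A tripwire can be as plain as a counter. The sketch below assumes the signal is consecutive failed steps, which is only one possible choice; the class names and the threshold are hypothetical.

```python
class TripwireTripped(RuntimeError):
    """Raised when the halt signal fires; the agent loop should stop and escalate."""


class Tripwire:
    """Halts the run after too many consecutive failed steps."""

    def __init__(self, max_consecutive_failures: int = 3):
        self.max_consecutive_failures = max_consecutive_failures
        self.consecutive_failures = 0

    def record(self, step_ok: bool) -> None:
        """Record one step result; raise once the failure streak hits the limit."""
        if step_ok:
            # A success resets the streak.
            self.consecutive_failures = 0
            return
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.max_consecutive_failures:
            raise TripwireTripped(
                f"{self.consecutive_failures} consecutive failed steps"
            )
```

Call `record` after every tool call and let the exception end the loop. The premortem is where the signal comes from: whatever answer "why did this break?" produces is a candidate for the thing you count.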

This is not a book riff. In June 2026 Google DeepMind shipped its AI Control Roadmap, which treats internal agents as potentially misaligned and has a second trusted system watch the working one. That is reality-test and prepare-to-be-wrong, in production, at one of the labs building the models. The same week’s reliability research says the same thing from the other side: more capability, more meltdown.

So the lever is not the next model. The Heaths cite research showing that a disciplined process contributes more to decision quality than added analysis. For agents that means the four steps belong in the prompt, not the model card.

Pull up your agent’s system prompt. Which of the four villains does it actually gate, and which one is it one bad tool call away from?

Before you pull up that system prompt: the agent playbook is the field guide for gating those four villains in production, with the patterns and tripwires worked out.
