Question 1

What is a production AI agent?

Accepted Answer

A production AI agent is one that serves real users with a measurable service-level objective, has a cost ceiling owned by a specific person, has a defined blast radius for errors, and emits enough telemetry that an outage at 3 AM would be noticed. Most demos hit zero of these four thresholds.

Question 2

What is the Router-Planner-Executor pattern?

Accepted Answer

Router-Planner-Executor splits agent work across three model tiers: a fast Haiku-class Router classifies the request, a Sonnet-class Planner decides the approach, and an Executor runs the tool-use loop. The split wins on cost (routing on a Sonnet-class model wastes money), reliability (failure modes are isolated per stage), and observability (three boundaries at which to assert correctness).
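A minimal sketch of the three-stage split may make the boundaries easier to see. Everything here is an assumption for illustration: the route labels, the `Step` type, the tool names, and the keyword matching that stands in for real model calls. The point is only the shape: each stage has a narrow output that can be checked before the next stage runs.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical route labels; a real router would use your product's own classes.
ROUTES = {"refund", "order_status", "other"}


@dataclass
class Step:
    tool: str   # name of the tool the executor should call
    args: dict  # keyword arguments for that tool


def route(request: str) -> str:
    """Stand-in for a fast, cheap model call that only classifies."""
    text = request.lower()
    if "refund" in text:
        label = "refund"
    elif "order" in text:
        label = "order_status"
    else:
        label = "other"
    assert label in ROUTES  # boundary 1: router emits a known label
    return label


def plan(label: str, request: str) -> list[Step]:
    """Stand-in for a stronger model call that decides the approach."""
    if label == "order_status":
        return [Step("lookup_order", {"query": request})]
    if label == "refund":
        return [
            Step("lookup_order", {"query": request}),
            Step("draft_refund", {"query": request}),
        ]
    return []  # nothing to do for unrecognised requests


def execute(steps: list[Step], tools: dict[str, Callable[..., str]],
            max_steps: int = 5) -> list[str]:
    """Runs the tool loop; rejects plans that are too long or use unknown tools."""
    if len(steps) > max_steps:
        raise ValueError("plan exceeds step budget")  # boundary 2: plan size
    results = []
    for step in steps:
        if step.tool not in tools:
            raise KeyError(f"unknown tool: {step.tool}")  # boundary 3: tool names
        results.append(tools[step.tool](**step.args))
    return results


def handle(request: str, tools: dict[str, Callable[..., str]]) -> tuple[str, list[str]]:
    label = route(request)
    steps = plan(label, request)
    return label, execute(steps, tools)
```

In a real system `route` and `plan` would each wrap a model call on the appropriate tier; keeping their return types this narrow is what lets each boundary be asserted and logged separately.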

Question 3

What is the difference between human-in-the-loop and human-on-the-loop?

Accepted Answer

Human-in-the-loop (HITL) means a human approves each agent action before it runs; it suits irreversible, regulated, or low-volume work. Human-on-the-loop (HOTL) means the agent acts autonomously while a human monitors and intervenes asynchronously; it suits reversible, high-volume work where auditing after the fact is acceptable. Most production agents migrate from HITL to HOTL one action class at a time as trust is earned.
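One way to picture the per-action-class migration is a small policy gate. This is a sketch under stated assumptions, not a prescribed design: the `Gate` class, the action-class names, and the choice to default unknown actions to HITL are all hypothetical.

```python
from enum import Enum
from typing import Callable, Optional


class Mode(Enum):
    HITL = "human_in_the_loop"  # a human approves before the action runs
    HOTL = "human_on_the_loop"  # the action runs now; a human audits later


# Hypothetical starting policy, one entry per action class.
POLICY = {
    "send_status_email": Mode.HOTL,  # reversible, high volume
    "issue_refund": Mode.HITL,       # moves money, so approve first
}


class Gate:
    def __init__(self, policy: dict[str, Mode]):
        self.policy = dict(policy)
        self.pending: list[tuple[str, Callable[[], str]]] = []  # awaiting approval
        self.audit_log: list[tuple[str, str]] = []              # for later review

    def submit(self, action_class: str, run: Callable[[], str]) -> Optional[str]:
        # Assumption: action classes missing from the policy fall back to HITL.
        mode = self.policy.get(action_class, Mode.HITL)
        if mode is Mode.HITL:
            self.pending.append((action_class, run))
            return None  # nothing has executed yet
        result = run()
        self.audit_log.append((action_class, result))
        return result

    def approve(self, index: int = 0) -> str:
        """A human approves one pending action, which then executes."""
        action_class, run = self.pending.pop(index)
        result = run()
        self.audit_log.append((action_class, result))
        return result

    def promote(self, action_class: str) -> None:
        """Move a single action class from HITL to HOTL once trust is earned."""
        self.policy[action_class] = Mode.HOTL
```

Keeping the policy keyed by action class means promotion is a one-line, reviewable change for a single class rather than a global switch.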

Question 4

When should I build an AI agent in-house versus hire a freelancer?

Accepted Answer

Build in-house when you have one well-scoped workflow and two engineers free to learn for 6-10 weeks. Hire a freelancer for the first agent when your team has no LLM-engineering experience or when you need a single proof-of-concept in two weeks. For compliance-heavy workflows (finance, healthcare, GDPR), hire someone who has shipped a compliant agent before.

Question 5

How do I test an AI agent before production?

Accepted Answer

Use four eval layers: golden tasks (10-50 realistic inputs with expected behavior), snapshot assertions on output structure and tool-call shape, cost-budget assertions per task (these catch silent prompt drift), and offline replay with mocked tools. Add a weekly shadow review in which an engineer reads 20 random production runs to find emerging patterns.
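The four layers can share one small harness. The sketch below is illustrative only: `GoldenTask`, `RunResult`, `fake_agent`, the tool name, and the cost figures are assumptions, and the agent is a stub rather than a real model call. Offline replay is represented by passing mocked tools instead of live ones.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    output: dict       # structured agent output
    tool_calls: list   # ordered names of the tools the agent called
    cost_usd: float    # total spend for this run


@dataclass
class GoldenTask:
    name: str
    prompt: str
    expected_keys: set     # snapshot of the output structure
    expected_tools: list   # snapshot of the tool-call shape
    max_cost_usd: float    # per-task cost budget


def fake_agent(prompt: str, tools: dict) -> RunResult:
    """Hypothetical agent stub; a real one would call a model and pick tools."""
    order = tools["lookup_order"](order_id="42")
    return RunResult(
        output={"status": order["status"], "reply": f"Order is {order['status']}"},
        tool_calls=["lookup_order"],
        cost_usd=0.004,  # made-up figure for the sketch
    )


def check(task: GoldenTask, agent, tools: dict) -> list:
    """Runs one golden task against mocked tools and returns a list of failures."""
    result = agent(task.prompt, tools)
    failures = []
    if set(result.output) != task.expected_keys:
        failures.append("output structure changed")
    if result.tool_calls != task.expected_tools:
        failures.append("tool-call shape changed")
    if result.cost_usd > task.max_cost_usd:
        failures.append("cost budget exceeded")
    return failures


# Mocked tool for offline replay: no network, deterministic answer.
mocked_tools = {"lookup_order": lambda order_id: {"status": "shipped"}}

task = GoldenTask(
    name="order status",
    prompt="Where is order 42?",
    expected_keys={"status", "reply"},
    expected_tools=["lookup_order"],
    max_cost_usd=0.01,
)
failures = check(task, fake_agent, mocked_tools)
```

The cost check is the layer that tends to fire when a prompt quietly grows: output and tool calls still match the snapshot, but the run no longer fits its budget.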

Production AI Agent Architecture Playbook

What's inside

What "production agent" actually means

Router-Planner-Executor pattern

State and memory rules

Tool design that survives traffic

Testing before production

Monitoring and cost control

Human-in-the-loop vs human-on-the-loop

Build vs hire decision matrix