n8n AI Agent Workflow Examples: 5 Production Patterns

April 12, 2026 · 14 min read · n8n, ai-agents, claude, workflow, automation

I run n8n in production for content ops, email triage, and invoice parsing. The visual canvas is not the point. The point is that triggers, retries, queues, and credentials are free, and I can hand a workflow to a non-engineer to edit prompts without them breaking the integration layer.

This post covers five n8n AI agent workflow examples I actually ship or have shipped for clients. Each one includes the node graph, the Claude prompt, the cost per run, and the production gotchas. No toy demos.

If you are still deciding between platforms, my Make vs n8n for production workloads guide covers that tradeoff. This post assumes you already picked n8n.

Why n8n for AI workflows

Writing an agent in pure Python or TypeScript is fine. It is also the wrong call for maybe 60% of the work I see. Here is when n8n wins.

Triggers are free. Gmail, Slack, webhooks, IMAP, schedule, Telegram, Airtable, Postgres row change. In code you would wire each of these yourself. In n8n it is a dropdown.

Retries and error branches are native. Every node has a retry config and an error output. I do not need to wrap every API call in a try/except.

Credentials are a first-class object. I am not leaking API keys into git by accident. Credentials live in n8n’s encrypted store and get injected at runtime.

Human-in-the-loop is cheap. The Telegram node, Gmail node, or a Wait-for-webhook pattern lets me pause a run, send a human a draft, and resume on approval. Building that in code is a weekend. In n8n it is 4 nodes.

LangChain nodes are built in. The n8n.ai LangChain integration gives you agent nodes, memory buffers, vector store retrievers, and tool binding out of the box. You can do RAG in 6 nodes without writing any LangChain yourself.

Business users can edit prompts. This is the real win. My clients edit the Claude system prompt without touching the integration. The engineer wires the pipeline once, the business owns the prompt.

When code still wins: ultra-low-latency chat UIs, complex agent loops with more than five tool calls per turn, and heavy file processing pipelines. More on that at the end.

The pattern behind all 5 workflows

Every production AI workflow I run in n8n follows the same shape:

Trigger > Normalize input > Classify (Haiku) > Branch > Generate (Sonnet) > Act > Log

The key moves:

  1. Haiku for classification, Sonnet for generation. Haiku is roughly 10x cheaper than Sonnet. Use it for routing decisions (“is this urgent”, “what category”, “is this spam”). Use Sonnet for the actual content work. I cover model tradeoffs in more depth in my Claude API tool use guide.
  2. Structured output via tool use, not free-form JSON. Telling Claude “respond in JSON” works most of the time. Tool use with a schema works every time. Every example below uses the tool-use pattern.
  3. Every workflow has an error path. I route failed executions to Telegram so I hear about them immediately. Silent failure is how agents die in prod.
  4. Token usage gets logged. I write every run’s input and output token count to Postgres. Without this you cannot see cost drift when a prompt gets longer over time.

With that scaffolding in place, here are the five workflows.

1. Incoming email triage with Claude

Problem: My inbox gets a mix of client questions, invoices, cold sales, and notifications. Manual triage is 30 to 45 minutes a day. I want the inbox to come to me already sorted, with drafts for the things that need a reply.

Trigger: Gmail node (OAuth) polling every 5 minutes. IMAP works if you do not want to grant OAuth.

Node graph:

Gmail Trigger > Code (extract subject, from, body, attachments)
 > Claude (Haiku, tool use) > Switch (by category)
   > [invoices]  Notion Create Page + Gmail auto-reply
   > [support]   TickTick Create Task + Slack notify
   > [sales]     Gmail Forward + Slack notify
   > [other]     Gmail Label "Later"
 > Postgres (log classification + token count)

Claude node config. I use Haiku 4.5 here because classification is cheap and I do 200 emails a day. System prompt plus tool definition:

{
  "model": "claude-haiku-4-5-20251001",
  "system": "You classify incoming emails. Return exactly one category. Never ask clarifying questions. Never invent facts about the sender.",
  "tools": [{
    "name": "classify_email",
    "description": "Classify the email and extract metadata",
    "input_schema": {
      "type": "object",
      "properties": {
        "category": {"type": "string", "enum": ["invoice", "support", "sales", "other"]},
        "priority": {"type": "string", "enum": ["low", "medium", "high"]},
        "needs_response": {"type": "boolean"},
        "summary": {"type": "string", "maxLength": 200}
      },
      "required": ["category", "priority", "needs_response", "summary"]
    }
  }],
  "tool_choice": {"type": "tool", "name": "classify_email"}
}
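Forcing the tool call is only half the pattern; the Code node after the Claude node still has to pull the structured result out of the Messages API response. A minimal sketch of that step, assuming the standard response shape (`content` array with a `tool_use` block) and my own function name:

```javascript
// Sketch of the Code node that follows the Claude call. The function
// name is mine, not an n8n built-in. It finds the forced tool_use block
// in the Messages API response and fails loudly if it is missing, so
// the item rides the error branch instead of a guessed classification.
function extractToolInput(response, toolName) {
  const block = (response.content || []).find(
    (b) => b.type === "tool_use" && b.name === toolName
  );
  if (!block) {
    throw new Error(`No ${toolName} tool_use block in Claude response`);
  }
  return block.input; // e.g. {category, priority, needs_response, summary}
}
```

Throwing here is deliberate: with `tool_choice` forced, a missing tool block means something upstream is broken, and the error branch (Telegram, see below) should hear about it.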

Gotchas.

  • Gmail Trigger polls. If you want real-time, use the Gmail push subscription via Pub/Sub. For most use cases 5-minute polling is fine.
  • Attachments are not passed to Claude by default. Extract text from PDFs first (see workflow 5) or the model hallucinates content.
  • Put an error branch on the Claude node that routes to Telegram. When rate limits hit, you want to know.

Cost. Haiku 4.5 at ~200 tokens in, ~80 tokens out per email. About $0.0002 per classification. 200 emails a day costs roughly $0.04. Negligible.

2. Document Q&A bot with RAG (Qdrant + Claude)

Problem: A client has 400 PDF policy documents. Sales reps need to answer customer questions against these docs without reading them all. I want a chat endpoint that takes a question and returns a grounded answer with citations.

Trigger: Webhook node. The frontend chat UI POSTs {question, user_id} to the webhook URL.

Node graph:

Webhook > Embed question (Voyage or OpenAI embed node)
 > Qdrant Search (top 5, score threshold 0.7)
 > Code (format context with [1], [2] markers and source file names)
 > Claude (Sonnet 4.6) > Code (inject citation URLs)
 > Respond to Webhook

Indexing is a separate workflow. I do not index on-the-fly. I run a second n8n workflow on a schedule that walks the document folder, chunks PDFs (512 tokens with 50 overlap), embeds, and upserts to Qdrant. Indexing once, querying many times.
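The chunking step in the indexing workflow can live in a Code node. A minimal sketch, approximating tokens with whitespace-split words (good enough for sizing; swap in a real tokenizer if you need exact budgets), using the same numbers as above, 512-token chunks with 50 tokens of overlap:

```javascript
// Sliding-window chunker sketch for the indexing workflow. Word-based
// token approximation is an assumption, not what a production tokenizer
// would count.
function chunkText(text, chunkTokens = 512, overlapTokens = 50) {
  const words = text.split(/\s+/).filter(Boolean);
  const step = chunkTokens - overlapTokens; // advance 462 "tokens" per chunk
  const chunks = [];
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + chunkTokens).join(" "));
    if (start + chunkTokens >= words.length) break; // last window consumed the tail
  }
  return chunks;
}
```

Each chunk repeats the last 50 words of the previous one, which is what keeps boundary-spanning answers alive (see the overlap gotcha below).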

Claude prompt. The trick is forcing the model to only use the retrieved context. My system block:

You answer questions using ONLY the provided context. Every factual claim must reference a source marker like [1] or [3]. If the context does not contain the answer, say "I could not find this in the provided documents" and stop. Do not use outside knowledge.

User message template:

Context:
[1] {chunks[0].text} (source: {chunks[0].source})
[2] {chunks[1].text} (source: {chunks[1].source})
...

Question: {question}
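The Code node that assembles this template is small. A sketch, assuming `payload.text` and `payload.source` are the field names your Qdrant points carry (they are my conventions, not Qdrant defaults):

```javascript
// Context-formatting Code node sketch. Numbers the retrieved chunks as
// [1], [2], ... so the model's citations map back to source files.
function buildContext(hits, question) {
  const context = hits
    .map((h, i) => `[${i + 1}] ${h.payload.text} (source: ${h.payload.source})`)
    .join("\n");
  // Cheap sanitization: strip fake [n] citation markers the user might
  // smuggle into the question to spoof a grounded-looking answer.
  const clean = question.replace(/\[\d+\]/g, "");
  return `Context:\n${context}\n\nQuestion: ${clean}`;
}
```

This is also where the injection gotcha bites: the user-controlled string goes in last, after the context, and never gets to write its own `[n]` markers.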

Gotchas.

  • Score threshold matters. Below 0.7 on Voyage embeddings I get junk context and the model hallucinates anyway. Test your threshold against a labeled set before shipping.
  • Chunk overlap is not optional. Without overlap, answers that span paragraph boundaries fail.
  • Cache the system prompt. Set cache_control: {"type": "ephemeral"} on the system block. On a high-traffic bot that cuts input cost by 90% per cached hit.
  • Never let the user inject into the context block directly. Sanitize.

Cost. Sonnet 4.6 at ~1500 tokens in (system + 5 chunks + question), ~300 tokens out. Roughly $0.009 per question. At 1000 queries a day, $9. Cache hits bring that to around $3.

For a deeper dive on the full RAG stack, including the embed pipeline, see my RAG pipeline tutorial.

3. Scheduled content generation pipeline

Problem: Every Monday morning I want a weekly business report in my inbox and in Slack. Revenue, key metrics, notable events. Manually pulling this took me 40 minutes every Monday.

Trigger: Schedule node, every Monday at 08:00 Europe/Madrid.

Node graph:

Schedule > HTTP Request (GA4 Data API, last 7 days)
         > HTTP Request (Stripe, last 7 days payments)
         > HTTP Request (Plausible, pageviews)
 > Merge (combine all three JSON blobs)
 > Code (flatten into a compact brief: top 10 metrics with deltas vs prior week)
 > Claude (Sonnet 4.6, tool use for structured report)
 > Split in Batches
   > Slack Post Message (#weekly-report)
   > Gmail Send (stakeholders list)
   > Notion Create Page (archive)

Prompt design. The mistake people make is dumping raw JSON into Claude. Do not. Pre-digest it first with a Code node into a compact markdown brief. Then the model writes narrative from clean input.
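A sketch of that pre-digest step. The metric names and input shape are illustrative, not the real GA4 or Stripe payloads; the point is that Claude sees ten clean lines with deltas instead of three raw JSON blobs:

```javascript
// Pre-digest Code node sketch: one compact markdown line per metric,
// with a week-over-week delta computed here, not by the model.
function briefLine(name, current, prior, unit = "") {
  const delta = prior === 0 ? 0 : ((current - prior) / prior) * 100;
  const sign = delta >= 0 ? "+" : "";
  return `- ${name}: ${current}${unit} (${sign}${delta.toFixed(1)}% vs prior week)`;
}

function buildBrief(metrics) {
  // metrics: [{name, current, prior, unit}] flattened from the Merge node
  return metrics.map((m) => briefLine(m.name, m.current, m.prior, m.unit)).join("\n");
}
```

Doing the arithmetic in the Code node also means the deltas are deterministic; the model only writes narrative, it never computes.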

System prompt for this one:

You write weekly business reports for a SaaS founder. Lead with what changed vs last week. Use plain numbers, not percentages alone. No filler. No "we saw" or "we observed" openers. If a metric is flat, say so in one line and move on.

Tool schema returns {headline, kpi_table, highlights, risks, next_week_focus}. I render the tool output into markdown in a Code node before posting.

Gotchas.

  • Timezone. Schedule node uses the n8n server timezone. Set it explicitly in your config or your Monday 08:00 is someone else’s Sunday 22:00.
  • GA4 quota. The API throttles. Add a 2-second wait before each call if you run multiple reports back to back.
  • Slack’s message length limit is 40,000 chars. Notion block size is 2000 chars per paragraph block. Split accordingly.
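For the Notion limit, I split in a Code node before the Notion node. A minimal sketch that packs paragraphs into blocks under the 2000-char cap and hard-splits any single oversized paragraph:

```javascript
// Splitter sketch for Notion's 2000-char paragraph block limit.
// Prefers paragraph boundaries; falls back to hard slicing.
function splitForNotion(text, limit = 2000) {
  const blocks = [];
  let current = "";
  for (const para of text.split("\n\n")) {
    const candidate = current ? current + "\n\n" + para : para;
    if (candidate.length <= limit) {
      current = candidate; // still fits, keep packing
    } else {
      if (current) blocks.push(current);
      let rest = para;
      while (rest.length > limit) {
        blocks.push(rest.slice(0, limit)); // oversized paragraph: hard split
        rest = rest.slice(limit);
      }
      current = rest;
    }
  }
  if (current) blocks.push(current);
  return blocks;
}
```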

Cost. Sonnet 4.6 at ~800 tokens in, ~500 tokens out. About $0.008 per report. Once a week, so roughly $0.40 a year. The value is the 40 minutes a week I get back.

4. Customer support auto-response with approval

Problem: A client runs a SaaS help desk. Many tickets are repeat questions with known answers. They want AI to draft replies but a human must approve before anything goes out.

This is the human-in-the-loop pattern. The key move is the Telegram node acting as an approval gate.

Trigger: Webhook from the support form.

Node graph:

Webhook > Code (normalize ticket payload)
 > Claude (Haiku, classify urgency and category)
 > IF (urgency == "high")
   YES: Slack page on-call, stop workflow
   NO:  Claude (Sonnet, generate draft reply with tool use)
        > Telegram Send Message (draft + Approve/Edit/Reject buttons)
        > Wait for Webhook (approval callback)
        > Switch on approval decision
          > [approved] Gmail Send
          > [edited]   Gmail Send with edited text
          > [rejected] TickTick Create Task (escalate to human)

The Telegram approval trick. I send the draft with inline buttons. Each button hits a callback URL, which is another n8n webhook. The workflow resumes on the “Wait for Webhook” node using the resumeUrl pattern.
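A sketch of the message body that approval gate sends. This uses URL buttons pointing at the resume webhook with the decision as a query parameter; the names and the `decision` parameter are my conventions, and the `callback_data` variant would instead need a bot update handler listening on its own webhook:

```javascript
// Approval-gate message sketch for the Telegram Bot API sendMessage
// payload. resumeUrl is the value of n8n's $execution.resumeUrl
// expression, passed in from the workflow.
function approvalMessage(chatId, draft, resumeUrl) {
  return {
    chat_id: chatId,
    text: `Draft reply:\n\n${draft}\n\nApprove?`,
    reply_markup: {
      inline_keyboard: [[
        { text: "Approve", url: `${resumeUrl}?decision=approved` },
        { text: "Edit", url: `${resumeUrl}?decision=edited` },
        { text: "Reject", url: `${resumeUrl}?decision=rejected` },
      ]],
    },
  };
}
```

The Wait node then reads `decision` off the incoming query string and the Switch node routes on it.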

I wrote up the Telegram bot pattern in building a Telegram bot with Claude. Same architecture. The approval gate is just a Telegram message with callback buttons.

Prompt for draft generation.

You draft email replies to customer support tickets. Match the customer's tone (formal or casual). Never promise refunds, timelines, or compensation. If the issue requires engineering input, write a holding reply that acknowledges the ticket and says a team member will follow up within 24h.

Tool schema: {draft_reply, confidence, flags} where flags is an array of things like refund_request or legal_mention that should route to a human regardless.

Gotchas.

  • Wait-for-webhook workflows stay resident. If your n8n runs in queue mode with a 10-minute execution timeout, set the Wait node’s timeout explicitly or you will lose runs.
  • Never auto-send without the human gate until you have logged at least 200 approved drafts and measured the edit rate. If edit rate is above 20%, the prompt is not ready.
  • Log the approver’s decision to Postgres. This is your training signal for improving the prompt.

Cost. Haiku classify ($0.0002) plus Sonnet draft ($0.01). $0.01 per ticket. At 500 tickets a month, $5. Labor saved, roughly 10 hours.

5. Invoice processing and categorization

Problem: Vendor invoices arrive as PDF attachments in email. I want them extracted into Airtable with vendor, amount, date, currency, and category fields, ready for reconciliation.

Trigger: Gmail with attachment filter (has:attachment filename:pdf).

Node graph:

Gmail Trigger > IF (attachment is PDF)
 > Extract from File (built-in PDF text node)
 > Claude (Sonnet 4.6, tool use for extraction)
 > IF (confidence < 0.8)
   YES: TickTick Create Task (manual review queue) + Telegram notify
   NO:  Airtable Append Record
        > Slack Post Message (#bookkeeping, summary line)
 > Gmail Add Label "processed"

The extraction tool.

{
  "name": "extract_invoice",
  "description": "Extract structured fields from an invoice PDF text dump",
  "input_schema": {
    "type": "object",
    "properties": {
      "vendor_name": {"type": "string"},
      "invoice_number": {"type": "string"},
      "issue_date": {"type": "string", "format": "date"},
      "due_date": {"type": "string", "format": "date"},
      "amount_total": {"type": "number"},
      "amount_tax": {"type": "number"},
      "currency": {"type": "string", "enum": ["EUR", "USD", "GBP", "CHF"]},
      "category": {"type": "string", "enum": ["hosting", "software", "contractor", "travel", "office", "other"]},
      "confidence": {"type": "number", "minimum": 0, "maximum": 1},
      "notes": {"type": "string"}
    },
    "required": ["vendor_name", "amount_total", "currency", "confidence"]
  }
}

System prompt forces conservative extraction:

Extract invoice fields from the text. If a field is missing or ambiguous, leave it empty and lower confidence. Never guess the amount. Never assume currency from vendor name alone. If the PDF text is clearly garbled (OCR failure), set confidence to 0 and explain in notes.

Gotchas.

  • n8n’s built-in PDF text extractor works on text-based PDFs. Scanned invoices (image PDFs) need OCR first. I use a Tesseract container via HTTP Request when needed. Budget for this if your vendors are image-heavy.
  • Always include confidence in the tool schema and route low-confidence runs to human review. Without this, bad extractions silently land in Airtable and poison your books.
  • The currency enum prevents the model from inventing “EURO” or “USDollar”. Enums are the cheapest validation in the world.
  • VAT rates vary by country. If you need deterministic VAT calculation, do it in a Code node after extraction, not in the prompt.
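On that last point, a sketch of the deterministic VAT split in a post-extraction Code node. The rates table is illustrative, not a complete list, and it assumes the extracted total is gross (tax-inclusive); adjust if your invoices quote net amounts:

```javascript
// Deterministic VAT split sketch, run after extraction, never in the
// prompt. Country code comes from your own vendor table (assumption).
const VAT_RATES = { ES: 0.21, DE: 0.19, FR: 0.2 }; // illustrative subset

function splitVat(amountTotal, countryCode) {
  const rate = VAT_RATES[countryCode];
  if (rate === undefined) return null; // unknown country: route to manual review
  const net = amountTotal / (1 + rate);
  return {
    net: Math.round(net * 100) / 100,
    vat: Math.round((amountTotal - net) * 100) / 100,
    rate,
  };
}
```

Returning `null` for unknown countries feeds the same manual-review queue as low-confidence extractions.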

Cost. Sonnet 4.6 at ~1200 tokens in (invoice text), ~200 tokens out. ~$0.006 per invoice. At 100 invoices a month, $0.60. Manual entry was 90 seconds per invoice. AI is 8 seconds and it does not get tired at 4pm.

Common patterns across these workflows

Use Haiku for routing, Sonnet for generation. Every workflow above follows this. The 10x cost difference matters at volume. Haiku 4.5 handles classification, intent detection, urgency scoring. Sonnet 4.6 handles drafting, summarization, extraction.

Structured output via tool use. I never ask Claude for “JSON only” in the prompt. I give it a tool, force tool choice, and parse the tool input. My Claude API tool use guide covers this pattern in depth.

Always have an error branch to Telegram. Every n8n node has an “On Error” output. Wire it to a Telegram message. When something breaks in production you hear about it in 30 seconds, not when a client emails you asking where their report went.

Log token usage to Postgres. I add a final Code node that writes {workflow_id, run_id, model, input_tokens, output_tokens, cost_usd, timestamp} to an llm_usage table. Weekly rollup queries catch the moment a prompt got longer by accident and your monthly bill doubled.
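A sketch of that final Code node. The per-million-token prices here are assumptions; pin them to your actual plan and verify against current Anthropic pricing before trusting the rollups:

```javascript
// Usage-logging Code node sketch. The row shape matches the llm_usage
// columns described above. PRICES_PER_MTOK values are placeholders,
// not quoted pricing.
const PRICES_PER_MTOK = {
  "claude-haiku": { input: 1.0, output: 5.0 },
  "claude-sonnet": { input: 3.0, output: 15.0 },
};

function usageRow(workflowId, runId, model, inputTokens, outputTokens) {
  const p = PRICES_PER_MTOK[model] || { input: 0, output: 0 };
  const cost = (inputTokens * p.input + outputTokens * p.output) / 1_000_000;
  return {
    workflow_id: workflowId,
    run_id: runId,
    model,
    input_tokens: inputTokens,
    output_tokens: outputTokens,
    cost_usd: Math.round(cost * 1e6) / 1e6, // micro-dollar precision
    timestamp: new Date().toISOString(),
  };
}
```

The Postgres node downstream takes this object as-is, one row per run.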

Rate limit gracefully. When Claude rate limits you, the error has a retry-after header. Use n8n’s Wait node in the error branch and respect it. Do not hammer the API.

Cache aggressively. On any workflow where the system prompt is stable, add cache_control: {"type": "ephemeral"} on the system block. Cache hits are 10% the cost of a cold read. See prompt caching for the full pattern.
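The shape of a cached request, sketched as the Messages API body an HTTP Request node would send. The model id is a parameter here because you should pin your own; only the `system` array shape with the `cache_control` breakpoint is the point:

```javascript
// Prompt-caching request sketch: the stable system prompt becomes a
// system content block with an ephemeral cache breakpoint, so repeat
// calls reuse the cached prefix.
function cachedRequest(modelId, systemPrompt, userMessage) {
  return {
    model: modelId,
    max_tokens: 1024,
    system: [
      {
        type: "text",
        text: systemPrompt,
        cache_control: { type: "ephemeral" }, // cache breakpoint on the stable prefix
      },
    ],
    messages: [{ role: "user", content: userMessage }],
  };
}
```

Only content before the breakpoint is cached, so keep the stable prompt first and the per-run material in the user message.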

Version your prompts. Put the prompt text in a Set node or a JSON file committed to git. Do not edit prompts live in the Claude node. You want a diff trail.

When not to use n8n for AI

I ship a lot of code agents too. Here is when I skip n8n and go straight to code.

Ultra-low-latency chat UIs. Every n8n webhook adds 100 to 300ms of overhead. For a user-facing chat where latency matters, write the agent in Node, Go, or Python and stream tokens directly to the client. Webhooks break streaming.

Agent loops with more than 5 tool calls per turn. n8n can do agent loops via the LangChain agent node, but debugging gets hard when the loop has 20 iterations. In code, a while loop with structured logs is clearer. For deep agent architecture I wrote production AI agent architecture which covers when to go code-first.

Heavy file processing. If you are streaming 500MB CSVs or chunking terabytes of video, do it in a script with proper streaming. n8n loads items into memory per execution. You will OOM.

Real-time event processing at scale. Kafka consumers, high-throughput webhook fanouts, anything over ~50 events per second. n8n’s queue mode helps but dedicated infrastructure is cheaper and faster.

CLI-first workflows. When the trigger is “a developer runs a command”, skip n8n and write a CLI. Claude Code SDK agents covers that pattern for agent tooling.

For everything else, n8n ships faster and is easier to hand off.

Deploying these on self-hosted n8n

All five workflows run on a self-hosted n8n instance on a Hetzner VPS. The managed cloud works too, but self-hosting gives you unlimited executions at flat cost, which matters once you hit about 500 runs a day. My n8n self-hosting guide walks through the Docker Compose setup, Postgres, Redis for queue mode, and the reverse proxy.

The workflow JSON exports from each of these are stable across n8n versions. Import them, wire up your credentials (Claude API key, Gmail OAuth, Qdrant URL), and they run.

Download the AI Automation Checklist (PDF)