Telegram Bot Claude API: Build an AI Assistant in Your Pocket
I have a Telegram bot wired to Claude Opus 4.7 that I talk to from anywhere. Train, couch, cafe, bed. It reads my TickTick tasks, writes code against my repos, runs shell commands on my VPS, and sends me a morning briefing at 06:30 Madrid time. The whole thing is a bash script and a systemd unit. No frontend. No hosting bill. No auth pages to build.
This guide walks through exactly how to build one. Two architectures (bash long-polling and a TypeScript webhook server), full runnable code, attachment handling, MCP tool integration, and the security steps most tutorials skip. The primary stack is a Telegram bot Claude API wiring that runs on any Linux box with a few hundred megs of RAM.
If you want an AI assistant in your pocket, this is the shortest path.
Why Telegram as the AI interface
Before I landed on Telegram, I tried a React chat UI, a Slack app, and a Discord bot. All three wasted weekends. Telegram wins for a specific reason: the interface problem is already solved.
- One client, every surface. Native mobile, native desktop (macOS, Windows, Linux), and a web client. The same bot reaches all of them with no extra code.
- Push notifications are free and reliable. No APNs certificates, no FCM setup, no “notifications mostly work on Android” caveats.
- The Bot API is stable and free. No rate card, no enterprise tier gate, no deprecation treadmill. I have bots that have run untouched for three years.
- No auth flows to build. The chat ID is the identity. You allowlist your own ID and ship.
- Attachments are native. Photos, PDFs, voice notes, documents. All available via the same getUpdates endpoint.
The tradeoff: Telegram is not a compliance-ready channel for regulated data. Don’t pipe patient records or customer PII through a personal bot. For personal assistants, internal team tools, ops alerting, and content workflows, it is the fastest route to a working Telegram AI bot.
Two architectures, pick the one that fits
There are two ways to build a telegram bot with claude. Pick by use case, not by what feels more “professional”.
Long-polling bash daemon. One process, runs on your VPS, calls getUpdates in a loop, pipes each message through claude -p, sends the reply. Perfect for personal bots, internal team tools (up to maybe 20 users), ops bots. This is what I run.
Webhook server. Telegram POSTs every message to your HTTPS endpoint. You need a public domain, TLS, and a process that handles concurrent requests. Scales to thousands of users, survives bot restarts without losing messages, and plays nicely with serverless.
Rough decision matrix:
| Factor | Long-polling daemon | Webhook server |
|---|---|---|
| Users | 1 to 20 | 20 to unlimited |
| Setup time | 10 minutes | 1 to 2 hours |
| Infra | Any Linux box | Public domain + TLS |
| Concurrency | One message at a time | True parallel |
| Lost messages on restart | Yes (brief) | No |
| Cost | ~$0 (reuse existing VPS) | VPS + domain |
If you are reading this to build your first claude telegram integration for yourself, use long-polling. You can migrate to webhooks later if users pile up. Both versions are below.
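When you do outgrow long-polling, the migration starts with one API call: setWebhook tells Telegram where to POST updates, and deleteWebhook reverts to getUpdates. A hedged sketch — bot.example.com and WEBHOOK_SECRET are placeholders for your own endpoint and a random secret, and the guard keeps it a no-op until a token is set:

```shell
#!/usr/bin/env bash
# Webhook switch-over sketch. Telegram echoes secret_token back in the
# X-Telegram-Bot-Api-Secret-Token header so your server can verify the sender.
api_url() { echo "https://api.telegram.org/bot$1/$2"; }

if [ -n "${TELEGRAM_BOT_TOKEN:-}" ]; then
  curl -s -X POST "$(api_url "$TELEGRAM_BOT_TOKEN" setWebhook)" \
    --data-urlencode "url=https://bot.example.com/telegram" \
    --data-urlencode "secret_token=${WEBHOOK_SECRET:-change-me}"
  curl -s "$(api_url "$TELEGRAM_BOT_TOKEN" getWebhookInfo)"
  # To go back to long-polling:
  # curl -s "$(api_url "$TELEGRAM_BOT_TOKEN" deleteWebhook)"
fi
```

While a webhook is set, getUpdates returns an error, so the bash daemon and the webhook server cannot run against the same bot at the same time.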
Setup: bot token, chat ID, permissions
This part is the same regardless of stack.
Create the bot. In Telegram, search for @BotFather, send /newbot, pick a name and a username ending in bot. BotFather returns a token that looks like 7234567890:AAH.... Treat it like a password.
Disable privacy mode if you want the bot to read all messages in groups (optional, via /setprivacy). For personal use you don’t need it.
Save the token. I put mine in /home/user/.config/telegram-bot.env:
TELEGRAM_BOT_TOKEN=7234567890:AAH...
TELEGRAM_ALLOWED_CHAT_ID=123456789
ANTHROPIC_API_KEY=sk-ant-...
Then chmod 600 /home/user/.config/telegram-bot.env. The allowlist env var is non-negotiable; I’ll explain why in the security section.
Get your chat ID. Send any message to your bot, then:
source /home/user/.config/telegram-bot.env
curl -s "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/getUpdates" | jq '.result[0].message.chat.id'
The number you get back is your TELEGRAM_ALLOWED_CHAT_ID. Write it down. Without the allowlist, anyone who finds your bot username can rack up API bills on your account.
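If .result[0] is not your message (old updates pile up fast), list every chat ID the bot has seen and pick yours. A hedged helper — the function name is mine:

```shell
#!/usr/bin/env bash
set -euo pipefail

list_chat_ids() {
  # Reads getUpdates JSON on stdin, prints each unique chat ID on its own line.
  jq -r '[.result[].message.chat.id | select(. != null)] | unique | .[]'
}

# No-op until the env file is sourced and the token is set.
if [ -n "${TELEGRAM_BOT_TOKEN:-}" ]; then
  curl -s "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/getUpdates" | list_chat_ids
fi
```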
Bash version: 30 lines to a working bot
This is the pattern I actually run. It long-polls Telegram, routes messages through claude -p (which gives me Claude Code’s entire tool stack for free), and posts replies back. If you already have the Claude Code CLI installed on your VPS, you are ten minutes from a working AI Telegram bot.
#!/usr/bin/env bash
set -euo pipefail
source /home/user/.config/telegram-bot.env
API="https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}"
OFFSET=0
send() {
local chat="$1" text="$2"
# Telegram caps sendMessage at 4096 chars. Split long replies.
while [ -n "$text" ]; do
local chunk="${text:0:4000}"
text="${text:4000}"
curl -s -X POST "${API}/sendMessage" \
--data-urlencode "chat_id=${chat}" \
--data-urlencode "text=${chunk}" >/dev/null
done
}
handle_message() {
local chat="$1" user_text="$2"
if [ "$chat" != "$TELEGRAM_ALLOWED_CHAT_ID" ]; then
echo "$(date -Iseconds) rejected chat $chat" >&2
return
fi
# claude -p runs in print mode with the full Claude Code tool stack.
# Keep CWD consistent so MCP config is picked up.
local reply
reply=$(cd /home/user/claude && \
env -u CLAUDECODE -u CLAUDE_CODE_ENTRYPOINT \
claude -p --model sonnet --permission-mode ask \
--max-budget-usd 1 "$user_text" 2>&1) || reply="error: $reply"
send "$chat" "$reply"
}
while true; do
UPDATES=$(curl -s "${API}/getUpdates?offset=${OFFSET}&timeout=30")
echo "$UPDATES" | jq -c '.result[]?' | while read -r upd; do
OFFSET=$(($(echo "$upd" | jq '.update_id') + 1))
CHAT=$(echo "$upd" | jq '.message.chat.id')
TEXT=$(echo "$upd" | jq -r '.message.text // empty')
[ -n "$TEXT" ] && handle_message "$CHAT" "$TEXT"
echo "$OFFSET" > /tmp/telegram-bot.offset
done
OFFSET=$(cat /tmp/telegram-bot.offset 2>/dev/null || echo 0)
done
A few things to call out. The inner while read runs in a subshell so OFFSET doesn’t persist, which is why I round-trip through /tmp/telegram-bot.offset. claude -p in print mode inherits your MCP config, so if you have a TickTick MCP server wired up, the bot can manage tasks automatically. --max-budget-usd 1 caps cost per turn; raise it for heavier agent work. --permission-mode ask is a safety net for shell-executing tools (more in the security section).
A parse-mode note. If you set parse_mode=MarkdownV2, Telegram demands you escape every special character in the response: _ * [ ] ( ) ~ ` > # + - = | { } . ! — miss one and the entire message 400s. I just send plain text. If you insist on formatting, use parse_mode=HTML and escape with sed 's/&/\&amp;/g; s/</\&lt;/g; s/>/\&gt;/g' (ampersand first, or you double-escape your own entities).
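If you do go the HTML route, wrap the escaping in a small function so every send path uses it. A sketch:

```shell
escape_html() {
  # Escape the three characters HTML parse mode cares about.
  # Ampersand must come first, or you double-escape your own entities.
  printf '%s' "$1" | sed 's/&/\&amp;/g; s/</\&lt;/g; s/>/\&gt;/g'
}
```

Call it on any Claude output before wrapping it in your own <b> or <code> tags, never after.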
The 4096 character limit is real. My send function chunks at 4000 to be safe. For monstrous outputs (logs, long analyses), consider uploading a file via sendDocument instead of splitting text. I use 4000 instead of the hard 4096 for headroom: Telegram measures the limit in UTF-16 code units, so emoji and other characters outside the Basic Multilingual Plane count as two apiece.
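A sketch of that decision wired into the bash bot — below the threshold we chunk as before, above it we upload via sendDocument. The helper names are mine, and send_or_upload assumes the send() and API definitions from the script above:

```shell
reply_mode() {
  # Pure decision: "message" under the chunking threshold, else "document".
  if [ "${#1}" -le 4000 ]; then echo message; else echo document; fi
}

send_or_upload() {
  local chat="$1" text="$2"
  if [ "$(reply_mode "$text")" = "message" ]; then
    send "$chat" "$text"   # the chunking send() from the script above
  else
    local tmp
    tmp=$(mktemp /tmp/reply-XXXXXX.txt)
    printf '%s' "$text" > "$tmp"
    curl -s -X POST "${API}/sendDocument" \
      -F "chat_id=${chat}" -F "document=@${tmp}" >/dev/null
    rm -f "$tmp"
  fi
}
```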
For more on running Claude Code non-interactively with full tool access, see my Claude Code SDK agents guide.
TypeScript version: more control
When you need streaming responses, fine-grained error handling, or proper concurrency, move to TypeScript. I use grammy for Telegram and @anthropic-ai/sdk directly for Claude.
npm init -y
npm install grammy @anthropic-ai/sdk dotenv
import "dotenv/config";
import { Bot } from "grammy";
import Anthropic from "@anthropic-ai/sdk";
const ALLOWED = new Set(
  // Note the plural: unlike the bash version's single TELEGRAM_ALLOWED_CHAT_ID,
  // this one allowlists several users via TELEGRAM_ALLOWED_CHAT_IDS=123,456.
  (process.env.TELEGRAM_ALLOWED_CHAT_IDS ?? "").split(",").map(Number)
);
const bot = new Bot(process.env.TELEGRAM_BOT_TOKEN!);
const claude = new Anthropic();
bot.on("message:text", async (ctx) => {
if (!ALLOWED.has(ctx.chat.id)) return;
// Send a placeholder so the user sees activity while Claude thinks.
const placeholder = await ctx.reply("...");
let buffer = "";
let lastEdit = Date.now();
const stream = claude.messages.stream({
model: "claude-sonnet-4-6",
max_tokens: 4000,
system: [
{
type: "text",
text: "You are a concise assistant replying inside Telegram. Plain text, no markdown.",
cache_control: { type: "ephemeral" },
},
],
messages: [{ role: "user", content: ctx.message.text }],
});
for await (const chunk of stream) {
if (chunk.type === "content_block_delta" && chunk.delta.type === "text_delta") {
buffer += chunk.delta.text;
// Edit every 800ms so we don't hit Telegram's edit rate limit.
if (Date.now() - lastEdit > 800 && buffer.length > 20) {
await ctx.api.editMessageText(
ctx.chat.id,
placeholder.message_id,
buffer.slice(0, 4000)
);
lastEdit = Date.now();
}
}
}
// Final flush. Send a NEW message for notifications, or keep edit for silent updates.
await ctx.api.editMessageText(
ctx.chat.id,
placeholder.message_id,
buffer.slice(0, 4000)
);
});
bot.catch((err) => console.error("bot error", err));
bot.start();
The streaming edit pattern is the killer feature. Users see words appear in real time, which makes the bot feel fast even when Claude takes fifteen seconds to produce a long answer. Two subtleties: Telegram rate-limits editMessageText to roughly one edit per second per message, so I throttle. And edits do not trigger push notifications, which is great for streaming but bad for finished long-running tasks. Send a new message when work completes so the phone pings.
Prompt caching is free extra credit here. If your system prompt is more than a few hundred tokens (persona, instructions, tool list), marking it cache_control: ephemeral drops cost and latency on every follow-up within five minutes.
Handling attachments
Text-only bots get old fast. Telegram delivers photos and documents as file_id references; you fetch them in two hops.
bot.on("message:photo", async (ctx) => {
if (!ALLOWED.has(ctx.chat.id)) return;
const photo = ctx.message.photo.at(-1)!; // largest size
const file = await ctx.api.getFile(photo.file_id);
const url = `https://api.telegram.org/file/bot${process.env.TELEGRAM_BOT_TOKEN}/${file.file_path}`;
const bytes = Buffer.from(await (await fetch(url)).arrayBuffer());
const response = await claude.messages.create({
model: "claude-sonnet-4-6",
max_tokens: 1000,
messages: [
{
role: "user",
content: [
{
type: "image",
source: {
type: "base64",
media_type: "image/jpeg",
data: bytes.toString("base64"),
},
},
{ type: "text", text: ctx.message.caption ?? "What is in this image?" },
],
},
],
});
const text = response.content.find((b) => b.type === "text")?.text ?? "";
await ctx.reply(text.slice(0, 4000));
});
PDFs follow the same shape. Use message:document and send the bytes as a document content block (Claude supports PDF input up to 32MB). For voice notes, run the file through a transcription step first (Whisper, Deepgram, or AssemblyAI), then pass the transcript as text.
One gotcha: Telegram stores files for one year, but file_path expires in an hour. Download and either process immediately or stash to your own storage. For structured output from vision (item lists, receipts, schematics), pair this with the tool-use pattern from my Claude API structured output post.
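For voice specifically, the two-hop fetch looks like this in the bash stack. A hedged sketch — the function names are mine, it assumes the API and token variables from the daemon, and the whisper call at the end is a placeholder for whatever STT tool you actually run:

```shell
fetch_file_url() {
  # $1 = Telegram file_id; prints a direct download URL (valid for ~1 hour).
  local path
  path=$(curl -s "${API}/getFile?file_id=$1" | jq -r '.result.file_path')
  echo "https://api.telegram.org/file/bot${TELEGRAM_BOT_TOKEN}/${path}"
}

transcribe_voice() {
  # Download immediately (the URL expires), then hand off to transcription.
  local url tmp
  url=$(fetch_file_url "$1")
  tmp=$(mktemp /tmp/voice-XXXXXX.oga)
  curl -s "$url" -o "$tmp"
  whisper --model base --output_format txt "$tmp"   # placeholder STT step
  rm -f "$tmp"
}
```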
Adding MCP tools via claude -p
This is where a telegram ai bot stops being a chatbot and starts being an assistant.
claude -p inherits MCP servers from ~/.claude.json or whichever config you pass via --mcp-config. When I send “what’s on my plate today” to my bot, the message goes through claude -p, which has access to my TickTick MCP server, which queries the live TickTick API. The bot responds with my actual task list, prioritized, with overdue items flagged. No scraping. No hardcoded API calls in the bot itself.
The architecture looks like this:
Telegram -> bot daemon -> claude -p -> MCP server -> external API
                                    -> filesystem tools
                                    -> bash tool
Wiring up a custom MCP server is out of scope for this post, but I wrote a full walkthrough in build an MCP server in TypeScript. Once your MCP server is registered, the bot picks it up with zero additional code. That is the point. You extend capabilities by adding tools to the shared config, not by rewriting the bot.
A sample MCP wiring in ~/.claude.json:
{
"mcpServers": {
"ticktick": {
"command": "node",
"args": ["/home/user/ticktick-mcp/ticktick-mcp-server.js"]
}
}
}
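If you'd rather not hand-edit the JSON, the claude CLI can register the server for you — the exact command shape may vary by CLI version, so verify against claude mcp --help:

```shell
# Register the same TickTick server via the CLI instead of editing JSON.
claude mcp add ticktick -- node /home/user/ticktick-mcp/ticktick-mcp-server.js
claude mcp list   # confirm it registered
```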
Restart the bot, send “move my haircut task to Saturday”, and Claude picks the right MCP tool and calls it. This is the same pattern I wrote about in I run 10 AI agents in production, they’re all bash scripts. Boring stack, heavy payoff.
Multi-turn memory that doesn’t burn tokens
Out of the box, each message to the bash bot is a fresh context. That’s fine for 80 percent of queries. When you want real conversations, you have three options.
Stateless, per-message. The default. Cheapest, simplest, no regressions on bugs. Good for task execution style bots where each request is self-contained (“add buy milk to tomorrow”, “what’s the weather”).
Per-chat history in SQLite. Keep a rolling window of the last N messages keyed by chat_id. Prepend to every Claude call. A handful of lines (npm install better-sqlite3):
import Database from "better-sqlite3";
const db = new Database("chat.db");
db.exec(`CREATE TABLE IF NOT EXISTS msgs (
  chat_id INTEGER, role TEXT, content TEXT, ts INTEGER
)`);
// Call after every incoming user message and every bot reply.
function remember(chatId: number, role: "user" | "assistant", content: string) {
  db.prepare("INSERT INTO msgs VALUES (?, ?, ?, ?)").run(chatId, role, content, Date.now());
}
function history(chatId: number, limit = 10) {
  return db
    .prepare("SELECT role, content FROM msgs WHERE chat_id = ? ORDER BY ts DESC LIMIT ?")
    .all(chatId, limit)
    .reverse();
}
Prompt caching with long context. For agent-style workflows where the same big system prompt and tool definitions hit every turn, enable cache_control: ephemeral on the system block. Follow-ups inside a five-minute window read cached tokens at a fraction of the input cost. I combine this with a short SQLite window for the user-specific turns. Best of both.
Rule of thumb. If you are building an ops bot or task runner, stateless. If you are building something people hold actual conversations with, SQLite plus prompt caching.
Security, the part people skip
Most “build a Telegram bot” tutorials skip straight from /newbot to sendMessage and leave the bot wide open. Don’t. Five rules.
Allowlist chat IDs, always. The single most important line in the entire bash script is the early-return when chat != TELEGRAM_ALLOWED_CHAT_ID. Telegram bot usernames are discoverable. Without an allowlist, random people find your bot, spam it, and every message is an API call you pay for.
Rate-limit even yourself. I have accidentally written loops that echo their own output back into the bot and racked up dollar-scale Claude bills in minutes. Cap incoming messages at something like five per minute per chat. A single counter in memory is fine.
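A minimal in-memory version for the bash daemon, as a sketch. One caveat: in the getUpdates script above, the handler runs inside a while-read subshell, so to persist these counters across batches you would hoist the check into the outer loop or back it with a file, the same way OFFSET is handled:

```shell
RATE_MAX=5        # messages allowed...
RATE_WINDOW=60    # ...per this many seconds, per chat
declare -A RATE_COUNT RATE_START

rate_ok() {
  # Returns 0 if chat $1 is under the limit, 1 if it should be dropped.
  local chat="$1" now
  now=$(date +%s)
  if [ $(( now - ${RATE_START[$chat]:-0} )) -ge "$RATE_WINDOW" ]; then
    RATE_START[$chat]=$now    # window expired: start a fresh one
    RATE_COUNT[$chat]=0
  fi
  RATE_COUNT[$chat]=$(( ${RATE_COUNT[$chat]:-0} + 1 ))
  [ "${RATE_COUNT[$chat]}" -le "$RATE_MAX" ]
}
```

In handle_message, the guard is one line: rate_ok "$chat" || return.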
Budget cap per call. Use --max-budget-usd on claude -p. If a prompt accidentally triggers a tool loop, the call aborts instead of grinding through your credit balance.
Log every in and out. Append every incoming message and outgoing reply to a local file with timestamps. When something goes wrong at 2am, you need a transcript. journalctl is fine; a plain log file is fine; just have one.
Never trust message text as shell input. If your bot executes shell commands on behalf of the user, use --permission-mode ask (the default is plan), pre-validate against an allowlist of commands, or run the bot in a container with a read-only filesystem. Treat every incoming message as hostile even if it is “just you” sending it. Your phone could be compromised. The bot shouldn’t be able to rm -rf /.
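A sketch of the pre-validation approach. The allowlist contents and the helper name are mine — tune the list to the read-only commands you actually want reachable from a phone:

```shell
ALLOWED_CMDS="uptime df free journalctl systemctl"

is_allowed_cmd() {
  # Accept only if the first word of the message is on the allowlist.
  local first="${1%%[[:space:]]*}"
  case " $ALLOWED_CMDS " in
    *" $first "*) return 0 ;;
    *)            return 1 ;;
  esac
}
```

Anything that fails the check goes to Claude as plain text with no shell access at all.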
For the tool-use permissions model in more depth, see the Claude API tool use guide.
Running it as a systemd service
Long-polling daemons must restart on crash, boot with the server, and pipe logs somewhere you can read them. Systemd handles all three.
# /etc/systemd/system/telegram-bot.service
[Unit]
Description=Telegram Claude Bot
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
User=debian
WorkingDirectory=/home/user/claude
EnvironmentFile=/home/user/.config/telegram-bot.env
ExecStart=/home/user/bin/telegram-bot.sh
Restart=on-failure
RestartSec=5s
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
Install it:
sudo systemctl daemon-reload
sudo systemctl enable --now telegram-bot
sudo journalctl -u telegram-bot -f
That’s it. The bot starts on boot, restarts five seconds after any crash, and logs go to journalctl. If you want a health check, a cron that greps systemctl is-active telegram-bot and alerts when it reports inactive is five lines. I cover the full systemd pattern for AI services in running AI servers with systemd.
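The health check itself, sketched. NOTIFY defaults to a hypothetical alerting script — swap in whatever push mechanism you already have:

```shell
#!/usr/bin/env bash
# Hypothetical /home/user/bin/check-telegram-bot.sh
check_unit() {
  # Alerts when the given systemd unit is anything other than active.
  local state
  state=$(systemctl is-active "$1" 2>/dev/null || true)
  if [ "$state" != "active" ]; then
    "${NOTIFY:-/home/user/bin/notify.sh}" "$1 is ${state:-missing}"
  fi
}
```

Drop check_unit telegram-bot at the bottom, make it executable, and cron it every five minutes.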
Common failures
Things that have broken for me, in order of how often they break.
Bot hangs on a single request. A Claude call that triggers a huge tool loop can block the daemon for minutes. Add a timeout: timeout 120 claude -p .... The whole process returning 124 is preferable to a silent hang.
Telegram returns 429 Too Many Requests. Respect the retry_after in the JSON response and sleep. For streaming edits, the limit is roughly one per second per message. Back off, don’t hammer.
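A sketch of a backoff wrapper around sendMessage — the function name is mine, it assumes the API variable from the daemon, and jq pulls retry_after out of Telegram's error payload:

```shell
send_with_backoff() {
  # Retries up to 3 times, sleeping for Telegram's suggested retry_after.
  local chat="$1" text="$2" resp code wait
  for _ in 1 2 3; do
    resp=$(curl -s -X POST "${API}/sendMessage" \
      --data-urlencode "chat_id=${chat}" \
      --data-urlencode "text=${text}")
    code=$(echo "$resp" | jq '.error_code // 0')
    if [ "$code" != "429" ]; then return 0; fi
    wait=$(echo "$resp" | jq '.parameters.retry_after // 1')
    sleep "$wait"
  done
  return 1
}
```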
Updates get lost on restart. Without offset persistence, the bot either reprocesses old messages or skips new ones. The bash script above writes OFFSET to /tmp/telegram-bot.offset on every update and reads it on loop resume.
sendMessage 400s on the reply. Almost always a parse-mode escaping bug. Drop parse mode entirely, or switch to HTML and escape three characters instead of eighteen.
Edit-message versus new-message confusion. Edits don’t notify. If a long Claude run finishes and you edit the placeholder, the user’s phone stays silent. For anything over ten seconds of work, send a new message for the final answer.
Claude call returns empty text. Almost always a tool-use response that your code didn’t handle. Iterate through content blocks and concatenate text blocks; tool blocks need to be executed and their results fed back.
MCP server not picked up. claude -p loads MCP config from the working directory’s .claude.json first. If the bot runs from /home/debian but your config is in /home/user/claude, the servers don’t load. Set WorkingDirectory= in the systemd unit.
Real patterns I actually use
Concrete claude telegram integration patterns running on my VPS right now.
AI assistant in your pocket. Send any question. The bot has access to my repos, my task list, and the open web via MCP tools. “Summarize the latest issues in the graffiti repo” works. So does “what meetings do I have tomorrow and which ones can I decline”.
Morning briefing at 06:30 Madrid. A cron runs telegram-notify.sh with today’s tasks, overdue items, weather, and a one-paragraph summary from Claude. Pushed before I even pick up the phone.
Task inbox. I forward Upwork email alerts to the bot. It classifies the job, drafts a cover letter, saves it to TickTick as a subtask under the Upwork ticket. Zero manual copy-paste.
Code question on the go. On the train, I send “does the ralph loop handle empty task lists correctly”. The bot does claude -p with file tools, reads the relevant files, and returns the answer. Sometimes with a diff.
Pipeline status. My weekly content pipeline posts a summary to the bot when it finishes. Success or failure, with log links. No need to SSH in to check.
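Every push-style pattern above funnels through the same one-shot script. A hedged sketch of telegram-notify.sh — paths mirror the daemon's env file, and the guard makes it a no-op when that file is missing:

```shell
#!/usr/bin/env bash
# telegram-notify.sh "message" — one argument, one push notification.
notify() {
  curl -s -X POST "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/sendMessage" \
    --data-urlencode "chat_id=${TELEGRAM_ALLOWED_CHAT_ID}" \
    --data-urlencode "text=$1" >/dev/null
}

if [ -f /home/user/.config/telegram-bot.env ]; then
  source /home/user/.config/telegram-bot.env
  notify "${1:-ping}"
fi
```

The morning briefing is then a cron line that pipes claude -p output into this script.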
None of these are app-store apps. None of them have a frontend. All of them run in a few hundred lines of bash and TypeScript on the same 10 euro VPS. That is the point of a Telegram bot with Claude. Low effort, high utility, one client that reaches me wherever I am.
The minute one of these crosses into team use, I migrate it to the webhook architecture and add a proper authentication layer. Until then, long-polling bash is the right answer for 95 percent of personal AI tools.