Linux VPS AI Development Setup: Debian, Claude Code, MCP

March 24, 2026 · 12 min read · linux, vps, devops, ai-development, debian

My laptop sleeps. My agents do not. That is the whole reason I run a Linux VPS AI development setup instead of coding AI agents against a local Python venv and calling it a day.

Everything I ship (the TickTick MCP server, the Telegram bot that long-polls Claude Opus, the cron-driven morning briefings, the customer profiling pipeline) runs on one Debian box. No Kubernetes. No Docker Swarm. Just systemd, bash, and the Anthropic SDK. This tutorial is the exact sequence I use when I provision a new VPS for an AI project, from a fresh Hetzner image to a working Claude Code CLI with MCP clients wired up.

The verdict up front: Debian 13 on a Hetzner Cloud CX22 or CX32 is the right default for a Linux VPS doing AI work. You do not need a GPU for agent orchestration; you need persistence, a public IP, and systemd. Reserve GPU spend for actual inference and keep the orchestration layer on a $5 to $20 per month box.

Why a Linux VPS for AI dev

A laptop is the wrong substrate for agents. The moment you close the lid, your cron jobs stop, your webhook endpoints go dark, and any overnight evaluation run dies at 23:47 when the battery saver kicks in.

A self-hosted AI development VPS solves five specific problems:

  1. Persistent state. Cron jobs fire whether or not you are at the keyboard. Long-running agents keep a Redis queue warm for days. Overnight evaluation runs finish while you sleep.
  2. Full runtime control. I can install whatever system package I want, mount a tmpfs for hot caches, set kernel parameters, and run three Node versions side by side. A sandboxed notebook host will not let you do any of that.
  3. Predictable cost. A CX22 is a few euros a month. An AX41 bare-metal box is around €40. No surprise egress bills. No “you forgot to stop the notebook” charges.
  4. Always-on for bots and webhooks. My Telegram bot long-polls the Telegram API in a tight loop. Stripe and Claude webhooks need a public IP that answers on port 443. That is a VPS job, not a laptop job.
  5. One shell, many projects. Ten agents share the same Python toolchain, the same systemd, the same journald. Switching projects is cd, not a cloud console.

If you want context for how this scales, I wrote about exactly this pattern in “I run 10 AI agents in production, they are all bash scripts”.

Provider and OS choice

For most AI dev work, Hetzner Cloud in an EU region is what I reach for. It has honest network pricing, fast NVMe, and the API is clean enough to terraform. AWS, GCP, and Azure make sense only when you specifically need a managed service in the same region as your orchestration, Bedrock, Vertex, that class of thing. For a deeper cost and workload comparison read Hetzner vs AWS for AI workloads.

For the OS, pick Debian 13 (stable). Not Ubuntu, not Alpine.

  • Debian over Ubuntu because Ubuntu is a derivative that drifts. Snap pushes updates you did not ask for, the default install ships telemetry you have to opt out of, and apt pinning is more work than it is worth. Debian stable moves slowly on purpose, which is what you want under a bunch of long-running agents.
  • Debian over Alpine because musl breaks Python and Node wheel installs in annoying, hard-to-debug ways. psycopg2-binary, cryptography, grpcio: all happier on glibc. A Debian AI dev environment is the path of least resistance.

Provision the VPS with your SSH public key pre-seeded. Do not use password login from minute one.
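On Hetzner that can be a single hcloud CLI command. A sketch, with the caveat that the server name, key name, and image slug below are placeholders, not values from this setup:

```shell
# Sketch: provision with the hcloud CLI. Check valid slugs first with
# `hcloud image list` and `hcloud ssh-key list`; names here are examples.
provision_box() {
  hcloud server create \
    --name agents-1 \
    --type cx22 \
    --image debian-13 \
    --ssh-key my-laptop-key \
    --location fsn1
}
```

The key named in --ssh-key is pre-seeded into root's authorized_keys, so the first login is key-only from the start.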

First 10 minutes: hardening

This is the exact block I run after first SSH: paste, wait, done.

# 1. Update and install core tools
sudo apt update && sudo apt upgrade -y
sudo apt install -y ufw fail2ban unattended-upgrades \
  htop btop tmux vim git curl jq ripgrep fd-find

# 2. Create a non-root user (if your provider didn't)
sudo adduser --disabled-password --gecos "" user
sudo usermod -aG sudo user
sudo mkdir -p /home/user/.ssh
sudo cp ~/.ssh/authorized_keys /home/user/.ssh/
sudo chown -R user:user /home/user/.ssh
sudo chmod 700 /home/user/.ssh
sudo chmod 600 /home/user/.ssh/authorized_keys

# 3. Lock down SSH
sudo sed -i 's/^#*PasswordAuthentication.*/PasswordAuthentication no/' /etc/ssh/sshd_config
sudo sed -i 's/^#*PermitRootLogin.*/PermitRootLogin no/' /etc/ssh/sshd_config
sudo systemctl restart ssh

# 4. Firewall
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow 22/tcp
sudo ufw --force enable

# 5. fail2ban for SSH brute force
sudo systemctl enable --now fail2ban

# 6. Automatic security updates
sudo dpkg-reconfigure -plow unattended-upgrades

# 7. Timezone (I run Madrid locally, UTC in scripts)
sudo timedatectl set-timezone Europe/Madrid

One opinionated note on timezone. Vixie cron does not honor inline TZ= per line. If you need timezone-aware scheduling, use systemd timers, not cron. I learned that the hard way when a DST switch moved my morning briefing an hour off for three days.
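The timer equivalent of a timezone-pinned morning schedule looks like this, as a sketch (the unit name and 07:30 slot are illustrative):

```ini
# /etc/systemd/system/morning-briefing.timer  (sketch; names illustrative)
[Unit]
Description=Morning briefing at 07:30 Madrid time, DST-safe

[Timer]
OnCalendar=*-*-* 07:30:00 Europe/Madrid
Persistent=true
Unit=morning-briefing.service

[Install]
WantedBy=timers.target
```

Enable it with systemctl enable --now morning-briefing.timer; Persistent=true also catches up a run that was missed while the box rebooted, which cron silently drops.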

Core dev tooling

Once the box is locked down, install the language toolchains. Leave system Python alone; use venvs everywhere.

# Node via nvm (pick LTS)
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash
source ~/.bashrc
nvm install --lts
nvm alias default lts/*

# Python
sudo apt install -y python3 python3-venv python3-pip pipx
# Optional, much faster than pip
pipx install uv

# Go (apt lags, grab the tarball)
GO_VERSION=1.23.4
curl -LO https://go.dev/dl/go${GO_VERSION}.linux-amd64.tar.gz
sudo rm -rf /usr/local/go
sudo tar -C /usr/local -xzf go${GO_VERSION}.linux-amd64.tar.gz
echo 'export PATH=$PATH:/usr/local/go/bin' >> ~/.bashrc

# Extras I actually use daily
sudo apt install -y httpie sqlite3

For git, sign your commits. I keep a dedicated GPG key on each VPS, never reuse the laptop key.

gpg --full-generate-key   # ed25519, no expiry on this key, or set 2y
KEYID=$(gpg --list-secret-keys --keyid-format=long | grep sec | awk '{print $2}' | cut -d/ -f2)
git config --global user.signingkey $KEYID
git config --global commit.gpgsign true
git config --global tag.gpgsign true

Installing Claude Code and an Anthropic stack

This is the part people overthink. Two commands.

npm install -g @anthropic-ai/claude-code
claude --version

Auth. Either export the API key, or run the interactive login once:

# Option A: env var (good for headless)
echo 'export ANTHROPIC_API_KEY=sk-ant-...' >> ~/.bashrc
source ~/.bashrc

# Option B: interactive auth (if you have a browser on the VPS, you usually do not)
claude auth

On a headless box, the env var is simpler. Test end to end:

claude -p "Say hello in one word" --model claude-sonnet-4-6

If that returns a word, you are wired up. I use this exact pattern from cron jobs. The Telegram bot on this box pipes incoming messages into claude -p --model claude-opus-4-7 with MCP tools enabled and returns the result to the chat.
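That bot loop is less code than it sounds. A stripped-down sketch of the pipe (error handling, offset persistence, and per-chat auth omitted; the jq paths follow the Telegram getUpdates response shape):

```shell
#!/usr/bin/env bash
# Sketch of the long-poll loop; TG_TOKEN comes from the environment.
poll_loop() {
  local offset=0 updates text chat reply
  while true; do
    updates=$(curl -s "https://api.telegram.org/bot${TG_TOKEN}/getUpdates?timeout=60&offset=${offset}")
    text=$(jq -r '.result[0].message.text // empty' <<<"$updates")
    chat=$(jq -r '.result[0].message.chat.id // empty' <<<"$updates")
    # Advance past the last seen update; 0 when the result set is empty
    offset=$(jq -r '(.result[-1].update_id // -1) + 1' <<<"$updates")
    [ -n "$text" ] || continue
    # One claude -p shot per message, same flags as the cron jobs use
    reply=$(claude -p --model claude-opus-4-7 "$text")
    curl -s -X POST "https://api.telegram.org/bot${TG_TOKEN}/sendMessage" \
      -d chat_id="$chat" --data-urlencode "text=$reply" > /dev/null
  done
}
```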

The Python SDK for scripted work:

python3 -m venv ~/.venvs/ai
source ~/.venvs/ai/bin/activate
uv pip install anthropic openai pydantic python-dotenv httpx

A baseline script that proves the stack:

# ~/ai-test/hello.py
import os
from anthropic import Anthropic

client = Anthropic()
resp = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=128,
    messages=[{"role": "user", "content": "Return the current UTC hour as an integer."}],
)
print(resp.content[0].text)
print("usage:", resp.usage.input_tokens, "->", resp.usage.output_tokens)

Log those token counts. We will use them for cost tracking later.

MCP clients on the VPS

Claude Code on a VPS reads its config from ~/.claude.json. That is where you register Model Context Protocol servers so claude -p can call custom tools. A minimal block:

{
  "mcpServers": {
    "ticktick": {
      "command": "node",
      "args": ["/home/user/ticktick-mcp/ticktick-mcp-server.js"],
      "env": {
        "TICKTICK_USERNAME": "[email protected]",
        "TICKTICK_PASSWORD": "$TICKTICK_PASSWORD"
      }
    }
  }
}

That is the shape I use for the TickTick MCP server I wrote. It runs as a systemd service and claude -p talks to it over stdio. If you want the full walk-through on writing your own, I documented it in Build an MCP server in TypeScript. For orchestrating agents end to end, Claude Code SDK agents covers the programmatic path.

Secrets and env

Rule one: never commit a .env file. Rule two: never paste secrets into a systemd unit.

I keep one .env per project, loaded via EnvironmentFile= in the service unit:

# /etc/systemd/system/myagent.service
[Unit]
Description=My agent

[Service]
EnvironmentFile=/etc/myagent.env
ExecStart=/home/user/myagent/run.sh
User=user
Restart=on-failure

[Install]
WantedBy=multi-user.target

The .env file lives outside the repo, owned by root, mode 600:

sudo install -o root -g root -m 600 /dev/null /etc/myagent.env
sudo editor /etc/myagent.env

For Python loading, use python-dotenv in dev and EnvironmentFile in prod so the service unit is the single source of truth.
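A sketch of that split (the variable names are examples): the try/except makes the same module work in a prod venv that never installs python-dotenv.

```python
import os

def load_settings() -> dict:
    # Dev: pull a local .env into the environment.
    # Prod: systemd's EnvironmentFile= already populated os.environ before
    # Python started, so this block is a no-op (or dotenv is not installed).
    try:
        from dotenv import load_dotenv
        load_dotenv()
    except ImportError:
        pass
    return {
        "anthropic_key": os.environ.get("ANTHROPIC_API_KEY", ""),
        "tg_token": os.environ.get("TG_TOKEN", ""),
    }
```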

If you run more than three or four projects on one VPS, graduate to pass (GPG-backed) or age for encrypted secret files. Both integrate with systemd via an ExecStartPre= that decrypts into /run/ (a tmpfs), which means the cleartext never touches disk.
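With age that pattern is a few unit lines. A sketch (the key path and file names are assumptions):

```ini
# Sketch: cleartext secrets only ever exist under /run (tmpfs).
[Service]
RuntimeDirectory=myagent
RuntimeDirectoryMode=0700
ExecStartPre=/usr/bin/age -d -i /root/age.key -o /run/myagent/env /etc/myagent.env.age
EnvironmentFile=/run/myagent/env
ExecStart=/home/user/myagent/run.sh
```

RuntimeDirectory= makes systemd create and later wipe /run/myagent for you, so the decrypted file disappears when the service stops.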

Running persistent services

Every long-running thing on the VPS becomes a systemd unit. No nohup, no screen, no pm2. The Telegram bot, the MCP server, the webhook receiver, all of them.

I wrote a focused companion piece on the exact pattern I use: unit template, Restart=on-failure, journald log rotation, timer-based schedules. Read systemd services for AI servers for the full template. The short version:

sudo systemctl daemon-reload
sudo systemctl enable --now myagent.service
sudo systemctl status myagent
journalctl -u myagent -f

For scheduled jobs, use systemd timers over cron. Timers honor timezones, survive DST, and log to the same journal as the service they trigger. The cron I keep is only for a handful of legacy scripts.

A typical cron or timer pattern for an AI agent on this box looks like this:

#!/usr/bin/env bash
# /home/user/bin/morning-briefing.sh
set -euo pipefail
LOG=/var/log/morning-briefing.log

PROMPT="Summarize my TickTick tasks due today in 5 bullets."
OUTPUT=$(env -u CLAUDECODE -u CLAUDE_CODE_ENTRYPOINT \
  claude -p --model claude-sonnet-4-6 "$PROMPT" 2>>"$LOG")

curl -s -X POST "https://api.telegram.org/bot${TG_TOKEN}/sendMessage" \
  -d chat_id="${TG_CHAT_ID}" \
  --data-urlencode "text=$OUTPUT" >>"$LOG" 2>&1

Rotate those logs. One stanza in /etc/logrotate.d/ai-agents:

/var/log/*-briefing.log /var/log/*-followups.log {
    weekly
    rotate 8
    compress
    missingok
    notifempty
    create 0644 user user
}

Monitoring and cost tracking

Live view: htop or btop. Service logs: journalctl -u <unit> -f. For a browser dashboard with under one percent overhead, install netdata:

bash <(curl -sS https://my-netdata.io/kickstart.sh) --dont-wait
sudo ufw allow from <your-ip> to any port 19999

That gives you CPU, memory, disk, and per-service stats in a web UI. Do not expose 19999 to the public internet, allow only your home IP.

The piece most people skip is LLM cost tracking. Log every call into a SQLite file and query it weekly:

# ~/ai-test/track.py
import sqlite3, datetime
DB = "/home/user/.llm-usage.db"

def log_call(model: str, input_t: int, output_t: int, project: str):
    con = sqlite3.connect(DB)
    con.execute("""
      CREATE TABLE IF NOT EXISTS calls (
        ts TEXT, model TEXT, project TEXT,
        input_tokens INT, output_tokens INT
      )
    """)
    con.execute(
      "INSERT INTO calls VALUES (?,?,?,?,?)",
      (datetime.datetime.now(datetime.timezone.utc).isoformat(), model, project, input_t, output_t),
    )
    con.commit()
    con.close()

Wrap every client.messages.create(...) call with it. At the end of the week run one aggregate query per model and you know exactly where your spend went. This is the only way I have found to stop one misbehaving agent from silently doubling a monthly bill.
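The weekly query can be sketched like this. The per-million-token prices are placeholders, not real numbers; check the current Anthropic price sheet before trusting the dollar column.

```python
import sqlite3

# (input_usd, output_usd) per million tokens -- PLACEHOLDER prices
PRICES = {
    "claude-sonnet-4-6": (3.00, 15.00),
    "claude-opus-4-7": (15.00, 75.00),
}

def weekly_report(db_path: str):
    con = sqlite3.connect(db_path)
    rows = con.execute("""
        SELECT model, project, SUM(input_tokens), SUM(output_tokens)
        FROM calls
        WHERE substr(ts, 1, 10) >= date('now', '-7 days')
        GROUP BY model, project
        ORDER BY 4 DESC
    """).fetchall()
    con.close()
    for model, project, in_t, out_t in rows:
        in_p, out_p = PRICES.get(model, (0.0, 0.0))
        cost = in_t / 1e6 * in_p + out_t / 1e6 * out_p
        print(f"{project:<16} {model:<22} {in_t:>9} -> {out_t:>8}  ${cost:.2f}")
    return rows
```

The substr(ts, 1, 10) trick compares only the date prefix of the ISO timestamp, so it works whether or not the stored timestamps carry a timezone offset.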

Backups

restic to Backblaze B2 or any S3 compatible store. Encrypted, deduplicated, cheap.

sudo apt install -y restic
export RESTIC_REPOSITORY=b2:my-bucket:/vps
export RESTIC_PASSWORD_FILE=/root/.restic-pw
restic init

# Daily snapshot of the things that matter
restic backup \
  /home/user \
  /etc/systemd/system \
  /etc/nginx \
  --exclude '/home/user/.cache' \
  --exclude '/home/user/.venvs'

Wire it to a systemd timer, nightly at 03:00. If you run Postgres or SQLite databases for agents, add a pg_dump or sqlite3 .backup step before the restic run and snapshot the dump directory.
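A sketch of that pre-dump step (paths are illustrative): sqlite3 .backup takes a consistent snapshot of a live database, which a plain cp does not.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Dump live databases into one directory, then snapshot that directory.
dump_then_backup() {
  local dump_dir=/home/user/backups/db
  mkdir -p "$dump_dir"
  # Consistent copy of a live SQLite file
  sqlite3 /home/user/.llm-usage.db ".backup '$dump_dir/llm-usage.db'"
  # pg_dump -Fc agents > "$dump_dir/agents.dump"   # if Postgres is in play
  restic backup "$dump_dir"
}
```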

Test restore once a quarter on a throwaway VPS. A backup you have not restored is a hope, not a backup.

Deploying a personal AI app

Two patterns I actually use.

Pattern A, the simple one. SSH in, git pull, systemctl restart. That is it.

ssh user@vps "cd ~/myagent && git pull && sudo systemctl restart myagent"

For a one-person project with a handful of deploys a week, this is fine. Do not over-engineer it.

Pattern B, the push deploy. A bare git repo on the VPS with a post-receive hook. You git push vps main from your laptop and the hook builds and restarts.

# On the VPS, once
mkdir -p ~/repos/myagent.git && cd $_
git init --bare

cat > hooks/post-receive <<'EOF'
#!/usr/bin/env bash
set -euo pipefail
WORK=/home/user/myagent
git --work-tree=$WORK --git-dir=$PWD checkout -f main
cd $WORK
npm ci --omit=dev
sudo systemctl restart myagent
EOF
chmod +x hooks/post-receive

# On your laptop
git remote add vps user@vps:repos/myagent.git
git push vps main

Add user ALL=(ALL) NOPASSWD: /bin/systemctl restart myagent to /etc/sudoers.d/myagent so the hook does not prompt for a password.

What not to do

A short list of mistakes I have made on a Hetzner VPS AI setup so you do not have to:

  • Do not run LLM inference on a CPU VPS. A 7B model on a CX32 will give you two tokens a second and heat the datacenter. Use an API for orchestration, rent a GPU instance on demand for inference, keep them separate.
  • Do not expose MCP servers on a public port without auth. An MCP server over stdio on localhost is safe. An MCP server bound to 0.0.0.0 with no token is a prompt-injection pipeline from the open internet to your shell.
  • Do not hardcode API keys in systemd units. They end up in journalctl output, in your backup, in pastebins when you ask for help. Use EnvironmentFile= and chmod 600.
  • Do not rely on cron for timezone-sensitive jobs. systemd timers handle DST. Cron does not.
  • Do not skip the firewall. ufw on day one. I have seen fresh VPS instances get brute-forced within six hours of provisioning.

Which setup should you choose

If you are starting a new AI project this week, the default stack is: Debian 13, Hetzner CX22, the hardening block above, Node LTS, Python 3.12 with uv, Go from tarball, Claude Code CLI, systemd for every long-running thing, .env files loaded via EnvironmentFile=, restic to B2 nightly. You can be shipping agents in under an hour from ssh root@....

Scale up the box only when you see sustained CPU over 70 percent or your agent count pushes RAM past 4 GB. A single CX32 has carried 10 agents, a Telegram bot, an MCP server, an nginx reverse proxy, and a Postgres instance for me without complaint. You almost certainly do not need more hardware, you need better systemd units and tighter cron windows.
