Systemd Services for AI Servers: Production Setup on Linux
I run a TickTick MCP server, a Telegram bot that routes through Claude Opus, and ten scheduled AI agents on a single Debian VPS. None of them run in Docker. All of them run as systemd services or systemd timers.
This is the setup guide for running systemd services for AI servers the way I actually do it in production. Unit files, logs, timers, resource limits, and the security hardening that matters. No container orchestration, no Kubernetes, no Docker Compose YAML. Just systemd, because for single-host AI workloads it is the right tool.
If you are wiring up an LLM inference server, an MCP server, or a fleet of bash-based agents, this is the pattern that keeps things alive across reboots, OOM kills, and the DST transition nobody remembered to test.
Why systemd for small AI services
Docker is the default answer for “how do I deploy this”. For single-host AI services it is usually the wrong one. Here is why I moved everything off containers and onto systemd units.
Single-host apps do not need container isolation. My MCP server is one Node process. My Telegram bot is one bash loop. Wrapping each in a container adds a layer of indirection that solves a problem I do not have.
Logs integrate with the OS. journalctl -u ticktick-mcp -f gives me live tail with no extra tooling. No docker logs wrapper, no log driver config, no volume mount for persistence. Rotation is automatic and driven by journald.conf.
Reboots are free. systemctl enable --now ticktick-mcp means the service comes back after a kernel upgrade, a VPS migration, or a 3am OOM event. Docker with --restart=always gets you most of the way, but you still need the Docker daemon up first.
Resource limits without container overhead. MemoryMax=2G on a unit file is one line. No cgroup v2 yak-shaving, no Docker runtime flags.
Timers handle timezones natively. This is the big one for scheduled AI jobs. OnCalendar=*-*-* 06:30:00 Europe/Madrid handles DST automatically. Vixie cron on Debian does not accept inline TZ= and will silently run your 6:30am job at 5:30am twice a year.
When to still reach for Docker: multi-tenant deployments, CI pipelines that ship container images as artifacts, apps that need hard process isolation for security reasons, or when you deploy the same workload across heterogeneous hosts. For a single VPS running services I own, systemd wins.
If you are still picking hardware, see the Hetzner vs AWS comparison for AI workloads before committing to a platform.
A unit file, explained line by line
Here is the actual unit file that runs my TickTick MCP server in production. Drop it at /etc/systemd/system/ticktick-mcp.service.
[Unit]
Description=TickTick MCP server
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
User=debian
WorkingDirectory=/home/user/ticktick-mcp
ExecStart=/usr/bin/node /home/user/ticktick-mcp/ticktick-mcp-server.js
Restart=on-failure
RestartSec=5
Environment=NODE_ENV=production
EnvironmentFile=/home/user/ticktick-mcp/.env
[Install]
WantedBy=multi-user.target
Walking through each directive:
- Description: shows up in systemctl status and journal output. Keep it short and identifiable.
- After=network-online.target: wait until the network stack is up before starting. Critical for anything that makes outbound API calls on boot.
- Wants=network-online.target: pulls in the network-online target so it actually runs. After alone does nothing if the target is not activated.
- Type=simple: the process in ExecStart is the main service process. Use Type=notify if your app sends sd_notify("READY=1"), Type=forking for daemons that double-fork.
- User=debian: never run AI services as root. Drop to a dedicated user with only the file access it needs.
- WorkingDirectory: resolves relative paths in your app. Skipping this is the single most common reason "it works when I run it manually" services fail under systemd.
- ExecStart: absolute path to the interpreter, then absolute path to the script. Relative paths fail even when WorkingDirectory is set. which node gives you the binary path.
- Restart=on-failure: restart if the process exits with a non-zero status. Use Restart=always for long-running inference servers that can leak memory and need periodic bounces.
- RestartSec=5: wait 5 seconds between restarts. Prevents hot-loop crashes from hammering the system.
- Environment=: inline env vars. EnvironmentFile=: load secrets from a file not checked into git. One KEY=value per line.
- WantedBy=multi-user.target: the standard boot target for non-graphical servers. This is what enable hooks into.
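Before enabling a new unit, it is worth letting systemd lint it. systemd-analyze verify catches unknown directive names, malformed values, and missing executables before the first failed start:

```
systemd-analyze verify /etc/systemd/system/ticktick-mcp.service
```

No output means the unit parsed cleanly; a typo like Restrat=on-failure is reported immediately instead of being silently ignored at start time.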
For the TypeScript-specific details on what goes inside the MCP server process itself, see how to build an MCP server in TypeScript.
Logs are free and good
journalctl is the entire logging story. I never set up a log aggregator for my VPS because I do not need one.
# Tail live
journalctl -u ticktick-mcp -f
# Last 50 lines
journalctl -u ticktick-mcp -n 50
# Since a timestamp
journalctl -u ticktick-mcp --since "10 min ago"
journalctl -u ticktick-mcp --since "2026-04-17 06:00"
# Only errors
journalctl -u ticktick-mcp -p err
# Everything from this boot
journalctl -u ticktick-mcp -b
Rotation happens automatically based on /etc/systemd/journald.conf. By default journald caps disk usage at 10% of the filesystem. For a VPS with AI services logging structured JSON, I bump SystemMaxUse=1G and move on.
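The bump is a two-line change. A drop-in under /etc/systemd/journald.conf.d/ is the cleaner way to do it than editing the main file; the filename below is arbitrary:

```ini
# /etc/systemd/journald.conf.d/size.conf
[Journal]
SystemMaxUse=1G
```

Then sudo systemctl restart systemd-journald applies it without a reboot.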
Logging to stdout from your AI service is the right default. Do not write to /var/log/myapp.log yourself. Let journald handle it, let journalctl query it.
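One journald feature worth using from day one: lines written to stdout with a leading <N> are parsed as syslog priorities (SyslogLevelPrefix= is on by default for services), so journalctl -p err filtering works with no logging library at all. A minimal sketch for a bash agent; the helper names are my own:

```shell
#!/bin/sh
# journald interprets a leading <N> on stdout as a syslog priority
# (3 = err, 4 = warning, 6 = info) when SyslogLevelPrefix= is enabled,
# which is the default for systemd services.
log_err()  { printf '<3>%s\n' "$*"; }
log_warn() { printf '<4>%s\n' "$*"; }
log_info() { printf '<6>%s\n' "$*"; }

log_info "briefing started"
log_err  "Claude API returned 529, retrying"
```

Run under systemd, journalctl -u <name> -p err now shows only the log_err lines.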
Resource limits you actually need
LLM inference servers leak memory. Agent loops hold onto Claude API responses longer than they should. Bash scripts shelling out to Python can fork more than you expect. Three directives cover 90% of what matters:
[Service]
MemoryMax=4G
CPUQuota=200%
TasksMax=512
- MemoryMax=4G: hard cap. If the service hits it, the OOM killer takes it down and systemd restarts it per your Restart= policy. This is how you survive a memory leak in a long-running inference process.
- CPUQuota=200%: at most 2 full CPU cores. Use this when one service should not starve the others on the host.
- TasksMax=512: cap on threads and processes. Catches runaway fork bombs from shell agents that call subprocesses in a loop.
For an inference server with a known working set, I set MemoryMax to 80% of what the model weights plus a reasonable KV cache need, and pair it with Restart=always. Bounces happen, service stays up.
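That sizing rule is easy to script. A sketch of the arithmetic, assuming you have measured the weights and KV cache sizes yourself; the function name is hypothetical and the 80% factor mirrors the rule above:

```shell
#!/bin/sh
# Suggest a MemoryMax value as 80% of the measured working set:
# model weights plus a reasonable KV cache, both in whole GiB.
suggest_memorymax() {
  weights_gb=$1
  kv_cache_gb=$2
  echo "$((ated=0, (weights_gb + kv_cache_gb) * 80 / 100))G"
}

suggest_memorymax 10 5   # prints 12G for a 15G working set
```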
For more on sizing the underlying VPS, see my Linux VPS setup for AI development.
Timers beat cron for scheduled AI jobs
I moved every cron job on my VPS to systemd timers. The reason is timezone handling.
Vixie cron on Debian does not honor inline TZ= directives. You can set TZ=Europe/Madrid at the top of the crontab and it gets passed as an env var to the script, but it does not affect scheduling. A job set to run at 6:30 runs at 6:30 in the system timezone, which on most VPS images is UTC. Twice a year, at the DST switch, your “morning briefing” fires an hour early or late.
Systemd timers take a timezone directly. Here is the morning briefing I run at 6:30 Madrid time, every day.
/etc/systemd/system/morning-briefing.service:
[Unit]
Description=Morning briefing via Claude
[Service]
Type=oneshot
User=debian
ExecStart=/home/user/ticktick-mcp/morning-briefing.sh
/etc/systemd/system/morning-briefing.timer:
[Unit]
Description=Run morning briefing daily
[Timer]
OnCalendar=*-*-* 06:30:00 Europe/Madrid
Persistent=true
[Install]
WantedBy=timers.target
Then:
sudo systemctl daemon-reload
sudo systemctl enable --now morning-briefing.timer
systemctl list-timers morning-briefing.timer
Persistent=true means if the machine was off at 6:30, the job runs on next boot. OnCalendar with Europe/Madrid handles DST. systemctl list-timers shows the next trigger time, which is what you verify before walking away.
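When writing a new OnCalendar expression, systemd can also tell you when it will fire before you commit to it:

```
systemd-analyze calendar "*-*-* 06:30:00 Europe/Madrid"
```

It prints the normalized expression and the next elapse time, so a DST boundary or a malformed expression shows up on the spot rather than at 5:30 the next morning.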
This is the pattern I use for every scheduled agent. The ten AI agents running as bash scripts in production are all timer-plus-service pairs. The agents themselves are covered in more depth in the writeup on building agents with the Claude Code SDK.
Security hardening that matters
You do not need to enable every sandboxing directive. These five give you most of the isolation benefit without breaking your app.
[Service]
NoNewPrivileges=true
ProtectSystem=strict
PrivateTmp=true
ReadWritePaths=/home/user/ticktick-mcp /var/log/ticktick-mcp
RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX
- NoNewPrivileges=true: the service cannot gain privileges through setuid binaries. Always safe.
- ProtectSystem=strict: the entire filesystem is read-only to this service, except /dev, /proc, /sys.
- PrivateTmp=true: gets its own /tmp namespace. Prevents temp-file collisions and leaks between services.
- ReadWritePaths=: explicit list of directories the service can write to. This is what you carve out after setting ProtectSystem=strict.
- RestrictAddressFamilies=: only allow IPv4, IPv6, and Unix sockets. Blocks raw sockets, Netlink, etc.
Add these after the service works. Debugging “my app cannot write to its config file” while you are also writing the initial unit file is a bad time.
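Once it does work, systemd ships an auditor that scores a unit's exposure and suggests what else is worth tightening:

```
systemd-analyze security ticktick-mcp
```

It prints a per-directive table and an overall exposure score from 0 to 10, lower being better. Treat it as a checklist to pick from, not a score to drive to zero.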
Deployment and updates
The deployment flow, every time:
# 1. Drop the unit file
sudo cp ticktick-mcp.service /etc/systemd/system/
# 2. Reload systemd to pick up the new unit
sudo systemctl daemon-reload
# 3. Enable (start on boot) and start now
sudo systemctl enable --now ticktick-mcp
# 4. Verify
systemctl status ticktick-mcp
journalctl -u ticktick-mcp -n 50
Updating a running service:
# Edit your app code or unit file
sudo systemctl daemon-reload # only if the unit file changed
sudo systemctl restart ticktick-mcp
journalctl -u ticktick-mcp -f
Never skip the daemon-reload after editing a unit file. Systemd caches the parsed unit. Restarting without reload restarts the old version.
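For unit-file tweaks, a drop-in keeps your changes separate from the original file. sudo systemctl edit ticktick-mcp opens an override that systemd merges on top of the unit; the values below are illustrative:

```ini
# /etc/systemd/system/ticktick-mcp.service.d/override.conf
# created by: sudo systemctl edit ticktick-mcp
[Service]
MemoryMax=2G
RestartSec=10
```

systemctl edit runs the daemon-reload for you; a hand-written drop-in file still needs one.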
Failure modes
Three failure patterns cover most of what I see in practice.
Service keeps restarting. Check journalctl -u <name> -n 100. Usually an environment variable is missing, the working directory does not exist, or a config file path is wrong. The restart loop is the symptom; the actual error is in the log above the “Stopped” line.
start-limit-hit. Too many restart failures in a short window. Systemd gives up and marks the unit failed. Fix the underlying issue, then:
sudo systemctl reset-failed ticktick-mcp
sudo systemctl start ticktick-mcp
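If the service legitimately restarts often, you can also widen the start limit itself. Two [Unit] directives control the window; the numbers below are examples, not a recommendation:

```ini
[Unit]
# Allow up to 10 starts in any 5-minute window before systemd
# marks the unit failed (the defaults are 5 starts in 10 seconds).
StartLimitIntervalSec=300
StartLimitBurst=10
```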
Binary not found. Type=simple with a relative path in ExecStart fails. Always use absolute paths. /usr/bin/node, not node. /home/user/app/run.sh, not ./run.sh. WorkingDirectory sets the CWD for the process, but the binary lookup happens before that.
Permission denied on write. You added ProtectSystem=strict and forgot to put the log directory in ReadWritePaths=. Either add it or drop the strict mode until the app works.
When to use this pattern vs containers
Use systemd services when:
- You run one or two hosts and you own them.
- Your services are long-lived processes or scheduled jobs.
- You want OS-level log aggregation and resource limits without extra tooling.
- You need timezone-aware scheduling.
Use containers when:
- You deploy the same workload across heterogeneous hosts or clouds.
- Your CI pipeline produces container images as the release artifact.
- You need strong process isolation for multi-tenant reasons.
- Your team is larger than one and container orchestration is how you coordinate deployments.
For a single VPS running MCP servers, inference endpoints, and scheduled agents, systemd is enough. The tooling is already installed, the logs are already rotating, the reboot-survival story works by default.