Tagged: vLLM

2 posts

Claude Code with Local LLMs and ANTHROPIC_BASE_URL: Ollama, LM Studio, llama.cpp, vLLM

April 29, 2026 · 16 min read · guides
Run Claude Code against a local LLM via ANTHROPIC_BASE_URL, using the native Anthropic-compatible endpoints in Ollama, LM Studio, llama.cpp, and vLLM. Expect a 32K-token context floor.
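As a quick illustration of the pattern this post covers, pointing Claude Code at a local server comes down to overriding a couple of environment variables. The URL and token below are placeholder assumptions for illustration, not values taken from the post:

```shell
#!/bin/sh
# Point Claude Code at a local Anthropic-compatible endpoint.
# The URL and token are illustrative placeholders; substitute whatever
# your local server (Ollama, LM Studio, llama.cpp, vLLM) actually exposes.
export ANTHROPIC_BASE_URL="http://localhost:11434"   # assumed local server address
export ANTHROPIC_AUTH_TOKEN="local-dummy-key"        # many local servers accept any token

# Verify the override is in place before launching Claude Code.
echo "Base URL: $ANTHROPIC_BASE_URL"
```

With these exported, launching `claude` in the same shell should route requests to the local endpoint instead of Anthropic's API.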
Self-Hosted LLM on Kubernetes: A Production vLLM Deployment

April 5, 2026 · 16 min read · blog
A complete guide to self-hosting LLMs on Kubernetes: deploy vLLM on GPU nodes with manifests, HPA, monitoring, and cost modeling. Practitioner notes included.
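To give a flavor of the manifests this guide walks through, the sketch below emits a minimal vLLM Deployment. The image tag, model name, and GPU count are illustrative assumptions, not the guide's production values:

```shell
#!/bin/sh
# Write a minimal vLLM Deployment manifest of the kind the guide builds on.
# Model name and resource limits are placeholders; tune them for your cluster.
cat <<'EOF' > vllm-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vllm
spec:
  replicas: 1
  selector:
    matchLabels: {app: vllm}
  template:
    metadata:
      labels: {app: vllm}
    spec:
      containers:
      - name: vllm
        image: vllm/vllm-openai:latest                            # official vLLM server image
        args: ["--model", "mistralai/Mistral-7B-Instruct-v0.2"]   # placeholder model
        resources:
          limits:
            nvidia.com/gpu: 1                                     # schedule onto a GPU node
        ports:
        - containerPort: 8000                                     # vLLM's default serving port
EOF
echo "wrote vllm-deployment.yaml"
```

From here, `kubectl apply -f vllm-deployment.yaml` would schedule the pod onto a GPU node; a Service, HPA, and monitoring would layer on top as the guide describes.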