Enter your monthly tokens. See instantly when the Claude/OpenAI API, self-hosted vLLM, or a hybrid pattern is actually cheapest, with batch discount, prompt cache, and real GPU utilization factored in.
Methodology and full cost teardown: Self-Hosted LLM vs API Cost: Break-Even Analysis
API pricing is straightforward: you pay for tokens. Self-hosting is not straightforward: you pay for a GPU, for utilization, for engineering time, and for the operational tail. This calculator models all four costs against your real volume so the crossover point shows up where it actually sits, not where a vendor blog claims.
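The comparison described above can be sketched in a few lines. This is a minimal illustration, not the calculator's actual implementation: the class `SelfHostCosts`, the function names, and every number in the usage note are hypothetical. It assumes self-hosting is a fixed monthly cost up to a capacity ceiling, so the crossover is the volume at which the API bill equals that fixed cost.

```python
from dataclasses import dataclass
from typing import Optional

HOURS_PER_MONTH = 730  # average hours in a month (assumption)


@dataclass
class SelfHostCosts:
    """Hypothetical container for the four self-hosting cost drivers."""
    gpu_usd_per_hour: float           # GPU rental rate
    gpu_count: int                    # GPUs kept running all month
    engineering_usd_per_month: float  # engineering time, amortized
    ops_usd_per_month: float          # operational tail: on-call, monitoring
    max_tokens_per_month: float       # capacity at the utilization you really reach

    def monthly_usd(self) -> float:
        gpu = self.gpu_usd_per_hour * self.gpu_count * HOURS_PER_MONTH
        return gpu + self.engineering_usd_per_month + self.ops_usd_per_month


def api_monthly_usd(tokens: float, usd_per_million: float) -> float:
    """API cost scales linearly with token volume."""
    return tokens / 1_000_000 * usd_per_million


def crossover_tokens(costs: SelfHostCosts, api_usd_per_million: float) -> Optional[float]:
    """Monthly volume at which the API bill equals the self-hosting bill.

    Returns None when that volume exceeds what the GPUs can serve,
    meaning this configuration never breaks even.
    """
    tokens = costs.monthly_usd() / api_usd_per_million * 1_000_000
    return tokens if tokens <= costs.max_tokens_per_month else None
```

For example, `crossover_tokens(SelfHostCosts(2.50, 1, 1000.0, 500.0, 1e9), 5.0)` evaluates the break-even volume for one GPU at $2.50/h plus $1,500 of monthly engineering and operations cost against a made-up API price of $5 per million tokens. Below the returned volume the API is cheaper; above it, self-hosting is.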
Related: LLM API Cost Comparison, Self-Hosted LLM on Kubernetes, How to Choose an LLM for Production.
You see the crossover point. If you want a concrete deployment plan (vLLM config, autoscaling, fallback to API, monitoring), I deliver it in writing within 24 hours.
Request my scope
Four cost paths, modeled against your monthly token volume.
USD/EUR: 0.92. GPU spot rates: L40S $0.86/h, H100 $2.50/h. Throughput assumptions (output tokens/sec at 60% utilization): 8B class ~1,500, 70B class ~400, 405B class ~120. Verify your own rates before committing: spot pricing moves weekly.
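To turn these rates into a per-token figure, divide the hourly GPU price by the tokens produced in an hour. The sketch below is illustrative only. It assumes the throughput figures are sustained rates that already include the 60% utilization, that 0.92 means 1 USD = 0.92 EUR, and that each model class runs on the GPU shown. The source does not state these pairings, and a 405B-class model would in practice need several GPUs, hence the hypothetical `gpu_count` parameter.

```python
SECONDS_PER_HOUR = 3600
EUR_PER_USD = 0.92  # assumption: the stated USD/EUR rate means 1 USD = 0.92 EUR


def usd_per_million_output_tokens(
    gpu_usd_per_hour: float,
    tokens_per_sec: float,
    gpu_count: int = 1,
) -> float:
    """GPU cost per one million output tokens at a sustained throughput.

    tokens_per_sec is the throughput of the whole deployment,
    not of a single GPU.
    """
    tokens_per_hour = tokens_per_sec * SECONDS_PER_HOUR
    return gpu_usd_per_hour * gpu_count / tokens_per_hour * 1_000_000


def usd_to_eur(usd: float, eur_per_usd: float = EUR_PER_USD) -> float:
    """Convert a USD amount to EUR at the assumed rate."""
    return usd * eur_per_usd


# Assumed pairings (not stated in the source): 8B class on one L40S,
# 70B class on one H100.
cost_8b = usd_per_million_output_tokens(0.86, 1500)
cost_70b = usd_per_million_output_tokens(2.50, 400)

print(f"8B on L40S:  ${cost_8b:.3f} per 1M output tokens")
print(f"70B on H100: ${cost_70b:.3f} per 1M output tokens "
      f"(about {usd_to_eur(cost_70b):.3f} EUR)")
```

These figures cover GPU rental only. Engineering time and operational overhead sit on top, which is why the per-token number alone does not settle the comparison.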