<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Rag on René Zander | AI Automation Consultant</title><link>https://renezander.com/tags/rag/</link><description>Recent content in Rag on René Zander | AI Automation Consultant</description><generator>Hugo</generator><language>en</language><lastBuildDate>Fri, 08 May 2026 10:00:00 +0200</lastBuildDate><atom:link href="https://renezander.com/tags/rag/index.xml" rel="self" type="application/rss+xml"/><item><title>Pinecone vs RunPod: Vector DB vs GPU Host (You Probably Need Both)</title><link>https://renezander.com/guides/pinecone-vs-runpod/</link><pubDate>Fri, 08 May 2026 10:00:00 +0200</pubDate><guid>https://renezander.com/guides/pinecone-vs-runpod/</guid><description>&lt;p>People search &amp;ldquo;pinecone vs runpod&amp;rdquo; because they&amp;rsquo;ve heard both names, both seem AI-related, and they&amp;rsquo;re trying to pick one. The premise is wrong. They aren&amp;rsquo;t competitors. They sit in different layers of a typical AI stack, and most production RAG systems use one of each.&lt;/p>
&lt;p>This guide untangles what each is, what you should actually be comparing, and when your architecture needs both.&lt;/p>
&lt;h2 id="pinecone-in-one-paragraph">Pinecone in one paragraph&lt;/h2>
&lt;p>Pinecone is a managed vector database. You give it embeddings (high-dimensional numeric vectors that represent text, images, or audio) and metadata, and it returns the nearest neighbours to a query vector. The pitch is that you get billion-vector scale, sub-100ms p99 latency, and hybrid search with filters, without running your own database. You pay based on index size and query volume.&lt;/p></description></item><item><title>Qdrant vs Pinecone vs Weaviate: Production Vector DB Comparison 2026</title><link>https://renezander.com/guides/qdrant-vs-pinecone-vs-weaviate/</link><pubDate>Fri, 03 Apr 2026 13:00:00 +0200</pubDate><guid>https://renezander.com/guides/qdrant-vs-pinecone-vs-weaviate/</guid><description>&lt;p>Three vector databases keep showing up in every RAG stack in 2026: Qdrant, Pinecone, and Weaviate. I get asked which one to pick at least once a week, usually by someone who already spent two days reading benchmarks and still has no answer.&lt;/p>
&lt;p>The short version, because you have real work to do: &lt;strong>Qdrant for most self-hosted production RAG in 2026. Pinecone when the requirement is &amp;ldquo;managed, don&amp;rsquo;t touch the servers&amp;rdquo;. Weaviate when you need the extra primitives like GraphQL or the module ecosystem.&lt;/strong> I run Qdrant in production for Teedian and recommend it to most consulting clients. The reasons are below, including the edge cases where I pick something else.&lt;/p></description></item><item><title>RAG Pipeline Tutorial: Build a Production Document Q&amp;A System with Qdrant and Claude</title><link>https://renezander.com/blog/rag-pipeline-tutorial/</link><pubDate>Wed, 01 Apr 2026 09:00:00 +0200</pubDate><guid>https://renezander.com/blog/rag-pipeline-tutorial/</guid><description>&lt;p>Most RAG tutorials ship a toy. You paste a PDF, it answers one question, and the moment you point it at 500 documents the retrieval goes sideways and Claude hallucinates half the citations. This one is the opposite. I am going to walk through the pipeline I actually run in production, line by line, with the tradeoffs called out where they bit me.&lt;/p>
&lt;p>The verdict first. If your corpus is under 200k tokens and rarely changes, skip RAG and stuff it all into Claude&amp;rsquo;s context window. If your corpus is larger, changes often, or you need hard citations, work through this RAG pipeline tutorial end to end with Qdrant, a local embedding model, and Claude Sonnet 4.6. That is the sweet spot for cost and quality in 2026.&lt;/p></description></item></channel></rss>