<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>RAG on Arshad Siddiqui</title><link>https://arshadhs.github.io/tags/rag/</link><description>Recent content in RAG on Arshad Siddiqui</description><generator>Hugo</generator><language>en-us</language><atom:link href="https://arshadhs.github.io/tags/rag/index.xml" rel="self" type="application/rss+xml"/><item><title>Retrieval-Augmented Generation (RAG)</title><link>https://arshadhs.github.io/docs/ai/genai/rag/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/genai/rag/</guid><description>&lt;h1 id="retrieval-augmented-generation-rag">
 Retrieval-Augmented Generation (RAG)
 
 &lt;a class="anchor" href="#retrieval-augmented-generation-rag">#&lt;/a>
 
&lt;/h1>
&lt;p>&lt;strong>Retrieval-Augmented Generation (RAG)&lt;/strong> is a system design pattern that improves an LLM’s answers by:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>Retrieving&lt;/strong> relevant information from an external knowledge source, and then&lt;/li>
&lt;li>&lt;strong>Augmenting&lt;/strong> the LLM prompt with that retrieved context before generating the final response.&lt;/li>
&lt;/ol>
&lt;p>RAG helps an LLM &lt;strong>look things up first&lt;/strong>, then &lt;strong>answer using evidence&lt;/strong>.&lt;/p>
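&lt;p>The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the sample documents, the word-overlap scoring, and the names &lt;code>DOCUMENTS&lt;/code>, &lt;code>tokenize&lt;/code>, &lt;code>retrieve&lt;/code>, and &lt;code>augment&lt;/code> are all assumptions made for this example. Real systems typically retrieve with embeddings and a vector index, and the final call to the LLM is left out here.&lt;/p>

```python
from collections import Counter

# Hypothetical in-memory knowledge source (assumption for this sketch);
# a real system would query a document store or vector index instead.
DOCUMENTS = [
    "Employees may carry over up to five unused vacation days per year.",
    "The office VPN requires multi-factor authentication.",
    "Expense reports are due by the fifth business day of each month.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return cleaned.split()


def retrieve(query, documents, top_k=1):
    """Step 1 (Retrieve): rank documents by shared words with the query.

    Word overlap is a toy stand-in for embedding similarity.
    """
    query_counts = Counter(tokenize(query))

    def overlap(doc):
        doc_counts = Counter(tokenize(doc))
        return sum(min(query_counts[w], doc_counts[w]) for w in query_counts)

    return sorted(documents, key=overlap, reverse=True)[:top_k]


def augment(query, passages):
    """Step 2 (Augment): put the retrieved passages in the prompt before the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


query = "How many vacation days can I carry over?"
passages = retrieve(query, DOCUMENTS)
prompt = augment(query, passages)
print(prompt)  # this augmented prompt is what would be sent to the LLM
```

&lt;p>The model itself is untouched; only the prompt changes, and the retrieved passage can be shown to the user as the source of the answer.&lt;/p>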
&lt;hr>
&lt;h2 id="why-rag-is-useful">
 Why RAG is Useful
 
 &lt;a class="anchor" href="#why-rag-is-useful">#&lt;/a>
 
&lt;/h2>
&lt;p>RAG is commonly used when:&lt;/p>
&lt;ul>
&lt;li>Your knowledge is in &lt;strong>private documents&lt;/strong> (PDFs, policies, internal wiki)&lt;/li>
&lt;li>You need &lt;strong>up-to-date information&lt;/strong> (things not in the model’s training data)&lt;/li>
&lt;li>You want to reduce &lt;strong>hallucinations&lt;/strong> by grounding answers in retrieved sources&lt;/li>
&lt;li>You want &lt;strong>traceability&lt;/strong> (show “where the answer came from”)&lt;/li>
&lt;/ul>
&lt;blockquote class="book-hint info">
&lt;p>RAG does not change the model weights.&lt;br>
It changes what the model &lt;em>sees&lt;/em> at inference time by adding retrieved context.&lt;/p>&lt;/blockquote></description></item></channel></rss>