# RAG Flow with Index Creation

The diagram shows two main processes:

  1. RAG Index Creation Process (top flow):
     1. Start with a document collection
     2. Split documents into smaller chunks
     3. Generate embeddings for each chunk using an embedding model
     4. Store these vectors in a vector database
     5. Create the RAG index from these vectors

  2. Query Processing Flow (bottom flow):
     1. User submits a query ("What is TCP?")
     2. The query is embedded using the same embedding model
     3. The system performs vector similarity search using the pre-created RAG index
     4. Retrieves the top-K most similar chunks
     5. Feeds these chunks along with the original query into the LLM's prompt
     6. LLM generates the final response
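
The index creation steps (chunk, embed, store) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a production pipeline: `chunk_text`, `toy_embed`, and `build_index` are hypothetical names, the hashed bag-of-words vector is only a stand-in for a real embedding model, and a plain Python list stands in for a vector database.

```python
import hashlib
import math


def chunk_text(text, chunk_size=8, overlap=2):
    """Split text into word-based chunks that overlap by `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def toy_embed(text, dim=64):
    """Assumed stand-in for an embedding model: hashed word counts, L2-normalized."""
    vec = [0.0] * dim
    for word in text.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def build_index(documents, chunk_size=8, overlap=2):
    """Chunk every document, embed each chunk, and store vector plus text together."""
    index = []
    for doc_id, text in enumerate(documents):
        for chunk in chunk_text(text, chunk_size, overlap):
            index.append({"doc_id": doc_id, "text": chunk, "vector": toy_embed(chunk)})
    return index


documents = [
    "TCP is a transport protocol that provides reliable ordered delivery of bytes",
    "UDP is a transport protocol that sends datagrams without delivery guarantees",
]
index = build_index(documents)
print(len(index), "chunks indexed")
```

With the two sample documents and the default settings, each document is split into two overlapping chunks, so the index holds four entries. Keeping the chunk text next to its vector is what later lets the retrieval step hand readable text to the LLM.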

The two processes are connected by the arrow labeled "Index is used for", which runs from "RAG Index Creation" down into the query processing flow. It indicates that the index built in the first process is what the "Vector Similarity Search" step searches in the second. Together, the two flows show both how the knowledge base is prepared and how it is used at query time.
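
To make that connection concrete, here is a small sketch of the retrieval step, assuming an index that was built ahead of time. The names `embed`, `cosine`, and `retrieve_top_k` are hypothetical, and the word-count vector is only a toy substitute for a real embedding model. The point it illustrates is that the query goes through the same `embed` function that produced the indexed vectors, so the similarity scores are comparable.

```python
import heapq
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding (an assumption): sparse word-count vector over lowercase tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_top_k(query, index, k=2):
    """Embed the query with the same function as the index, then rank chunks."""
    query_vec = embed(query)
    scored = ((cosine(query_vec, vec), text) for text, vec in index)
    return heapq.nlargest(k, scored)  # highest similarity first


# Index creation happens once, before any query arrives.
chunks = [
    "TCP is a reliable transport protocol",
    "UDP is a connectionless transport protocol",
    "Bananas are rich in potassium",
]
index = [(text, embed(text)) for text in chunks]

# Query time: only the query is embedded; chunk vectors are reused.
results = retrieve_top_k("What is TCP?", index, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

In this toy setup the TCP chunk ranks first because it shares two tokens with the query, and the UDP chunk ranks second with one shared token. A real system would typically delegate the ranking to the vector database's nearest-neighbor search rather than scoring every chunk.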

## 1. RAG Index Creation Process
    +----------------------+     +----------------------+     +----------------------+
    |                      |     |                      |     |                      |
    | Document Collection  +---->+  Document Chunking   +---->+   Chunk Embedding    |
    |                      |     |                      |     |                      |
    +----------------------+     +----------------------+     +----------+-----------+
                                                                         |
                                                                         |
    +----------------------+     +----------------------+                |
    |                      |     |                      |                |
    |  RAG Index Creation  +<----+   Vector Storage     +<---------------+
    |                      |     |                      |
    +----------+-----------+     +----------------------+
               |
               |
               |  Index is used for
               |
               v
## 2. Query Processing Flow
    +----------------------+     +----------------------+     +----------------------+
    |                      |     |                      |     |                      |
    |     User Query       +---->+   Embed the Query    +---->+ Vector Similarity    |
    |   "What is TCP?"     |     |                      |     |       Search         |
    |                      |     |                      |     |                      |
    +----------------------+     +----------------------+     +----------+-----------+
                                                                         |
                                                                         v
    +----------------------+     +----------------------+     +----------------------+
    |                      |     |                      |     |                      |
    |  Generate Response   +<----+ Feed Chunks + Query  +<----+  Retrieve Top-K      |
    | "TCP is a trans..."  |     |   into LLM Prompt    |     |      Chunks          |
    |                      |     |                      |     |                      |
    +----------------------+     +----------------------+     +----------------------+
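
The "Feed Chunks + Query into LLM Prompt" box can be illustrated with a short sketch. The function name `build_prompt` and the template wording are assumptions made for illustration; the source does not prescribe a prompt format, and the call to the LLM itself is left out because it depends on the provider.

```python
def build_prompt(query, chunks):
    """Combine retrieved chunks and the original query into a single prompt string."""
    # Number the chunks so the model (and a human reader) can tell them apart.
    context = "\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\n"
        "Answer:"
    )


# Hypothetical top-K chunks, as they might come back from the retrieval step.
top_chunks = [
    "TCP is a reliable transport protocol",
    "TCP uses a three-way handshake to open a connection",
]
prompt = build_prompt("What is TCP?", top_chunks)
print(prompt)
```

The resulting string is what would be sent to the LLM in the "Generate Response" step. Placing the context before the question is one common layout, not the only workable one.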