<Bold Prompts> #8
Retrieve and analyze information from your documents using RAG (Retrieval Augmented Generation)
Welcome to the 8th issue of <Bold Prompts>: the weekly newsletter that sharpens your AI skills, one clever prompt at a time.
Every week, I send you an advanced prompt inspired by a real-world application. Think of these emails as mini-courses in prompt engineering.
Today’s prompt is about retrieving information from long documents.
In a way, all LLMs do is retrieve information. After training on massive amounts of data, they build models of “human knowledge” and use them to formulate their answers.
Whenever you ask your LLM a question, it searches the space of possible answers and produces the most likely one, one token at a time. Each token is selected from the probability distribution the model learned during training and fine-tuning. That’s why you often hear about “next token prediction.”
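To make “next token prediction” concrete, here’s a toy Python sketch. Everything in it is invented for illustration: a real LLM computes scores over a vocabulary of tens of thousands of tokens using a neural network, not a four-word list.

```python
# Toy illustration of next-token prediction (not a real model).
# The model assigns a score (logit) to every token in its vocabulary,
# turns the scores into probabilities, and emits the most likely token.
import math

# Hypothetical vocabulary and scores for the context "The capital of France is"
vocab = ["Paris", "London", "banana", "the"]
logits = [4.2, 2.1, -1.0, 0.5]

# Softmax: convert raw scores into a probability distribution.
exps = [math.exp(x) for x in logits]
probs = [e / sum(exps) for e in exps]

# Greedy decoding: pick the highest-probability token.
print(vocab[probs.index(max(probs))])  # -> Paris
```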
Two important comments here:
This “information retrieval” logic applies to the LLM’s training data: the data it “already knows,” so to speak.
LLMs can also apply a similar principle to external data: data they haven’t “seen” during the training phase.
How does RAG work anyway?
Suppose you copy-paste a meeting memo into your LLM’s context window. RAG lets you prompt your LLM to search for relevant information inside that memo. You can also ask for a summary, specific edits, or a translation.
Endless possibilities.
The only limitation is the length of your documents, and it’s a double limitation: you can’t paste all of Wikipedia into the context window, and even if you could, the longer the document, the worse your LLM tends to perform on it.
Don’t worry too much about it, though. AI labs keep enlarging context windows and tweaking their models to handle multiple formats (multimodality).
Several models already let you paste lengthy documents into the context window, and your documents don’t always have to be raw text: you can attach PDFs, images, and even PowerPoint presentations.
From there, you can instruct your model to transform the input in some way.
The most common framework used to carry out such a transformation is called RAG, or Retrieval Augmented Generation.
RAG is a fancy way of saying your AI system goes on a treasure hunt through a vast library of information to find the most relevant pieces, then combines those precious pieces into a relevant answer.
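Here’s a minimal sketch of that treasure hunt in Python, under two simplifying assumptions: the “library” is a single plain-text document, and a toy word-overlap score stands in for the vector embeddings real RAG systems use, so the example runs with no dependencies.

```python
# Minimal retrieve-then-generate sketch. Real pipelines swap score()
# for embedding similarity and keep the chunks in a vector database,
# but the overall shape is the same.

def chunk(text: str, size: int = 50) -> list[str]:
    """Split the document into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def score(query: str, passage: str) -> int:
    """Toy relevance score: how many query words appear in the passage."""
    return len(set(query.lower().split()) & set(passage.lower().split()))

def retrieve(query: str, document: str, k: int = 3) -> list[str]:
    """Return the k chunks most relevant to the query."""
    return sorted(chunk(document), key=lambda c: score(query, c), reverse=True)[:k]

def build_prompt(query: str, document: str) -> str:
    """Augment the question with the retrieved chunks before generation."""
    context = "\n---\n".join(retrieve(query, document))
    return f"Answer using only the excerpts below.\n\n{context}\n\nQuestion: {query}"
```

The “generation” half is simply the LLM answering the augmented prompt; the retrieval half decides which excerpts it gets to see.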
The first time I used RAG was to help an acquaintance who was studying to become a doctor. She was having trouble finding information inside a 400-page document.
“Hey, have you tried to use an LLM?”
She did and then sent me a screenshot. The first two messages were polite questions. Then she started “yelling” at the poor AI. She got angry because the model would often make up answers out of thin air and back them up with imaginary quotes.
“You should try harder with your prompts.”
But, like most people, she didn’t bother. Instead, she chose to stay stuck in 2022, the pre-LLM era when people used “Ctrl + F” to find information inside a document.
Today’s prompt will take you to 2024.
In the present, you can interact with your documents and retrieve accurate information using LLMs: locate passages, extract exact quotes, and generate faithful summaries.
The prompt will help you:
Extract key points from a document to answer your question.
Locate the key points inside the document and extract exact quotes.
Identify related information for a subsequent round of retrieval.
The last point makes all the difference because it compensates for potential losses of information during the initial search.
One potential weakness of RAG is that it may overlook useful information if that information doesn’t rank “high enough” in the system’s retrieval step.
Subsequent rounds of retrieval are what make today’s prompt special compared to other RAG prompts.
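To show why, here’s a sketch of a second round that reuses the retrieve() function from the earlier example. In a real session the related terms would be the ones the LLM surfaced in its first answer; here they’re supplied by hand.

```python
def retrieve_two_rounds(query: str, related_terms: list[str],
                        document: str, k: int = 3) -> list[str]:
    """Round 1 searches the original question; round 2 searches the
    related terms from the first answer, recovering chunks that
    ranked too low the first time. Reuses retrieve() from above."""
    first = retrieve(query, document, k)
    second = retrieve(" ".join(related_terms), document, k)
    seen, merged = set(), []  # de-duplicate while preserving order
    for c in first + second:
        if c not in seen:
            seen.add(c)
            merged.append(c)
    return merged
```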
Exciting, isn’t it?
Let’s get into the prompt itself: