Give your agent access to your KB, product docs, playbooks, and internal procedures—without stuffing everything into the prompt.
Why RAG matters (especially in production)
Large language models know a lot about the world, but they won’t reliably know the latest specifics about your organization—your product changes, your policies, your troubleshooting steps, or your internal routing rules. Retrieval‑augmented generation (RAG) solves that by letting an agent *retrieve* relevant text from your approved knowledge sources and then respond using those facts.
In FlowbotAI, RAG is implemented by calling the queryCorpus knowledge lookup tool and prompting the agent to use it whenever accuracy matters.
How FlowbotAI RAG works at a high level
FlowbotAI organizes knowledge into "Corpora" (called "Collections" in the Portal). Each corpus contains one or more "Sources", for example: a set of web pages to crawl or uploaded documents to ingest. Once ingested, the agent can query the corpus using the built‑in "queryCorpus" tool.
At runtime, the pattern looks like this:
- Caller asks a question.
- Agent uses queryCorpus with a focused query.
- Tool returns the most relevant text snippets.
- Agent answers using the returned content (and asks a follow‑up if the corpus doesn’t contain enough detail).
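The loop above can be sketched in a few lines of Python. This is a minimal illustration, not the real FlowbotAI runtime: `query_corpus` here is a hypothetical stub standing in for the built‑in queryCorpus tool, with a tiny hard-coded knowledge base.

```python
# Sketch of the retrieve-then-answer loop. query_corpus is a stub
# standing in for the real queryCorpus tool.

def query_corpus(corpus_id: str, query: str) -> list[str]:
    """Stub: return the most relevant snippets for a focused query."""
    kb = {
        "password": ["Go to Settings > Security and choose Reset Password."],
    }
    return [s for topic, snippets in kb.items()
            if topic in query.lower() for s in snippets]

def answer(corpus_id: str, question: str) -> str:
    snippets = query_corpus(corpus_id, question)  # focused query
    if not snippets:
        # Corpus lacks detail: ask a follow-up instead of guessing.
        return ("I couldn't find that in the knowledge base. "
                "Could you share a bit more detail?")
    return " ".join(snippets)  # answer only from retrieved content

print(answer("kb_main", "How do I reset my password?"))
```

The key property to preserve in a real prompt is the same as in the stub: when retrieval comes back empty, the agent asks a follow-up rather than answering from general knowledge.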
What to include in a voice‑ready knowledge base
Here are practical examples of content that works well for common voice agent use cases:
Customer Success & Support
- Product documentation: user guides, FAQs, common errors, troubleshooting steps, limitation notes
- Onboarding materials: getting started guides, best practices, curated training notes, implementation checklists
Customer Acquisition (sales + pre‑sales)
- Product information: features, packaging, pricing tiers, supported integrations, competitive positioning
- Sales enablement: qualification questions, discovery flows, objection handling, industry‑specific talk tracks
Operations (internal enablement)
- Internal processes: routing rules, escalation paths, department directories, SLAs, ticket triage matrices
- Survey materials: question banks, follow‑ups, rating scales, post‑call workflows
Voice tip: keep documents “answerable.” Short sections, clear headings, and explicit steps are easier to retrieve and speak back accurately.
Supported Knowledge Sources
- URLs: add one or more URLs per RAG source and manage the crawl depth.
- Files: PDF, DOC, DOCX, TXT, MD, PPT, PPTX
Set up RAG in the FlowbotAI Portal
The fastest way to create a knowledge base is through the FlowbotAI Portal. You’ll create a Collection, add at least one source, then call queryCorpus from your agent.
- Go to the Knowledge tab within the Agent configuration.
- Create a new Collection (corpus).
- Give it a clear Name and Description (these give the model context about the content).
- Add a source (website or documents):
- Select the Collection you created.
- Give the source a clear Name and Description as well.
- Choose Web or Document and enter the URLs or upload the docs you want FlowbotAI to crawl and ingest.
- Save, then wait a few moments for crawling + ingestion to complete.
Use the queryCorpus tool when the caller is asking for anything that should be answered from your knowledge base rather than the model’s general training. You can also use it to retrieve steps when the agent needs to perform specific tasks, such as troubleshooting.
Examples of suggested “must look up” topics:
- step-by-step instructions and workflows
- policies, limits, supported/unsupported behavior
- troubleshooting steps, known issues, error messages
- product pricing, configuration rules, defaults, and constraints
If the question, answer, or task is “official” or “specific,” the agent should retrieve first, then answer or perform based on the retrieved snippets.
To perform a knowledge lookup, the agent must be explicitly prompted to call the queryCorpus tool and be given the corpus ID. The agent should never guess the corpus ID.
The prompt must include:
use Tool queryCorpus ID=###
Recommended Prompting Behavior Rules
Use these rules as the default behavior for any agent that has access to a Knowledge (RAG) corpus.
1. Retrieve first for “official” questions
If the caller asks something that sounds like policy / steps / troubleshooting / limits, the agent should query the corpus before answering and respond using only the retrieved snippets.
2. If retrieval is weak, clarify and retry
If results are thin, conflicting, or missing, the agent should ask one targeted clarifying question, then run queryCorpus again using a refined query.
3. If the answer isn’t in the corpus, say so
If the tool returns nothing useful, the agent should be transparent that the KB doesn’t contain the answer and offer the next step (clarify, escalate, or point to the correct resource).
4. Keep the response voice-friendly (don’t dump raw text)
If retrieval returns long paragraphs, dense instructions, or large blocks of configuration details, the agent should summarize first and deliver information in small chunks:
- Lead with a 1–2 sentence direct answer
- Give only 2–4 steps at a time
- Pause and ask whether to continue before reading more
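The "small chunks" rule is easy to express as a helper. The sketch below is an illustration of the pacing logic, not part of FlowbotAI: `chunk_steps` is a hypothetical function that splits a retrieved step list into groups the agent can read one at a time, pausing in between.

```python
def chunk_steps(steps: list[str], size: int = 3) -> list[list[str]]:
    """Split a retrieved step list into small spoken chunks (2-4 steps each)."""
    return [steps[i:i + size] for i in range(0, len(steps), size)]

steps = ["Open Settings", "Go to Network", "Tap Wi-Fi", "Forget the network",
         "Restart the device", "Rejoin the network", "Test the connection"]

for chunk in chunk_steps(steps, size=3):
    print(" / ".join(chunk))
    # In a live call the agent would pause here:
    # "Want me to continue with the next steps?"
```

In a real prompt this behavior is stated in words ("give only 2–4 steps at a time, then ask to continue"); the code just makes the pacing concrete.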
Prompt Examples You Can Build From
Basic “retrieve then answer” (simple default)
Use this for general KB accuracy and basic troubleshooting.
If the user asks anything policy/steps/troubleshooting related, lookup first.
use Tool queryCorpus ID=123
If results are weak, ask one question then retry (reliable in production)
Use this to prevent guessing and force clarification + a second lookup.
If lookup is empty or unclear: ask one clarifying question, then lookup again.
use Tool queryCorpus ID=123
Voice-safe output control (prevents paragraph dumping)
Use this when calls are live and you want short, human-paced instructions.
When you use retrieved KB content, do not read long paragraphs. Give a 1–2 sentence answer, then only 2–4 steps. If more steps exist, ask if the caller wants to continue. If retrieved content is large, summarize into 2–3 bullets and ask what they’re trying to do.
use Tool queryCorpus ID=123
Troubleshooting mode (best for error messages and known issues)
Use this to force “retrieve-first” behavior for incident handling.
For any troubleshooting question or error message, you MUST use the KB first. Extract the exact steps from the KB and deliver them in small chunks (2–4 steps), pausing to confirm progress. If the KB has multiple causes, ask one question to narrow it down before continuing.
use Tool queryCorpus ID=123
Policy / Limits mode (best for “is this supported?” questions)
Use this when the caller needs an “official” answer and you want strict no-hallucination behavior.
For policy, limits, supported/unsupported behavior, or any “official” answer: lookup first and answer only from retrieved KB content. If the KB does not contain the answer, say so and ask one clarifying question or propose escalation. Do not guess.
use Tool queryCorpus ID=123
Simple but complete prompt insert (recommended system prompt snippet)
Replace XXXXXXX with your product name and ### with your corpus ID. The tool line is repeated because Llama-style agents tend to follow repeated hard constraints more consistently.
Knowledge Lookup (RAG):
- For any question about official steps, policies, limits, troubleshooting, or “how XXXXXXX works”, you MUST look it up before answering.
- Use the knowledge tool with the provided corpus. Never guess IDs.
- Tool instruction (required): use Tool queryCorpus ID=###
- Query style: short + specific keywords (feature name, task, error text).
- After results: answer using the retrieved content. If unclear or missing, ask one clarifying question and then retry lookup. Do not invent details.
- Output handling (voice-safe): never read full paragraphs; give a 1–2 sentence answer, then 2–4 steps max, then ask to continue.
use Tool queryCorpus ID=###
“Gold standard” output rules (what good looks like)
When the agent answers after retrieval, it should:
- lead with the direct answer (1–2 sentences)
- then give steps/bullets exactly as the KB supports them
- call out constraints/limits explicitly (if present)
- include a quick “If you don’t see X, do Y” fallback (only if KB supports it)
- never dump a full paragraph or more than a few steps at once
Using an external vector database or search service
If your documentation is already indexed in your own vector database (or a search service behind an API), you can still use the same RAG pattern in FlowbotAI—just create a custom tool that calls your lookup endpoint, then require the agent to use it for facts.
Design notes for external RAG tools:
- Return short, well‑scoped snippets (not entire pages).
- Include enough surrounding context so the agent can answer without hallucinating.
- If you can, return metadata (document title/URL/section) so humans can verify answers quickly.
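A custom lookup endpoint can enforce all three notes in one place. The sketch below is a hypothetical helper (the function name, field names, and 400-character limit are all assumptions, not a FlowbotAI API): it trims each snippet to a spoken-answer size and attaches verification metadata.

```python
def format_snippet(text: str, title: str, url: str,
                   section: str, max_chars: int = 400) -> dict:
    """Trim a snippet to a well-scoped size and attach metadata
    (title/URL/section) so humans can verify the answer quickly."""
    if len(text) <= max_chars:
        trimmed = text
    else:
        # Cut at a word boundary so the agent never reads a half-word.
        trimmed = text[:max_chars].rsplit(" ", 1)[0] + "…"
    return {"text": trimmed, "title": title, "url": url, "section": section}

snippet = format_snippet(
    "To reset a password, open Settings, choose Security, then Reset Password.",
    title="Account Guide", url="https://example.com/docs/accounts",
    section="Passwords")
print(snippet["text"])
```

Your endpoint would apply this to each hit before returning JSON to the agent's custom tool.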
Tuning for accuracy and speed
RAG quality is usually a balancing act between relevance, completeness, and latency. The fastest wins come from tuning retrieval settings and content structure.
Key knobs you’ll use most:
- Corpus scope: keep corpora focused (support KB separate from sales talk tracks) to reduce off‑topic retrieval.
- Voice‑first content hygiene:
  - Prefer short sections with explicit steps and clear headings.
  - Avoid giant “wall of text” pages—chunking improves retrieval.
  - Update docs proactively; RAG is only as good as the source content.
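If you pre-process your own documents before upload, a simple heading-based split is often enough to avoid wall-of-text pages. This is a generic sketch for markdown-style docs, not something FlowbotAI requires; real chunkers usually also cap chunk length.

```python
def split_by_headings(doc: str) -> list[str]:
    """Split a markdown-style doc into one chunk per heading section,
    so each retrievable unit has a clear topic."""
    chunks: list[str] = []
    current: list[str] = []
    for line in doc.splitlines():
        if line.startswith("#") and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return chunks

doc = "# Reset password\nGo to Settings > Security.\n# Change email\nOpen Profile."
for chunk in split_by_headings(doc):
    print(repr(chunk))
```

Each chunk keeps its heading, so retrieval can match on the section title as well as the body.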
Quick checklist before you ship
- Does the agent know when to use retrieval vs answer from general knowledge?
- Is corpus_id set as an override (so the agent never guesses identifiers)?
- Do test queries return the expected snippets? (Try 10–20 real questions.)
- Are you filtering weak results with minimum_score?
- Do you have a fallback when retrieval is empty (clarify or escalate)?
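The minimum_score check from the list above amounts to a one-line filter. The sketch below is illustrative (the result shape and the 0.5 default are assumptions, not the platform's actual format): drop weak hits before the agent ever sees them, so "thin" results trigger the clarify-and-retry path instead of a shaky answer.

```python
def filter_results(results: list[dict], minimum_score: float = 0.5) -> list[dict]:
    """Keep only retrieval hits at or above the score threshold;
    hits without a score are treated as weak and dropped."""
    return [r for r in results if r.get("score", 0.0) >= minimum_score]

hits = [
    {"text": "Official reset steps...", "score": 0.91},
    {"text": "Loosely related blog note...", "score": 0.22},
]
print(filter_results(hits))
```

If the filtered list comes back empty, that is the signal to ask one clarifying question or escalate rather than answer.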