Knowledge (RAG) Overview

Give your agent access to your KB, product docs, playbooks, and internal procedures—without stuffing everything into the prompt.

Why RAG matters (especially in production)

Large language models know a lot about the world, but they won’t reliably know the latest specifics about your organization—your product changes, your policies, your troubleshooting steps, or your internal routing rules. Retrieval‑augmented generation (RAG) solves that by letting an agent *retrieve* relevant text from your approved knowledge sources and then respond using those facts.

In FlowbotAI, RAG is implemented through the built-in queryCorpus knowledge lookup tool: you prompt the agent to call it whenever accuracy matters.


How FlowbotAI RAG works at a high level

FlowbotAI organizes knowledge into "Corpora" (knowledge base Collections). Each corpus contains one or more "Sources", for example a set of web pages to crawl or uploaded documents to ingest. Once ingested, the agent can query the corpus using the built‑in "queryCorpus" tool.

At runtime, the pattern looks like this:
  1. Caller asks a question.
  2. Agent uses queryCorpus with a focused query.
  3. Tool returns the most relevant text snippets.
  4. Agent answers using the returned content (and asks a follow‑up if the corpus doesn’t contain enough detail).
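
The runtime pattern above can be sketched as a minimal loop. This is an illustration only: `query_corpus` and `answer` are hypothetical helper names standing in for the built-in queryCorpus tool, which the agent runtime invokes for you.

```python
# Minimal sketch of the retrieve-then-answer loop. query_corpus is a
# stand-in for the built-in queryCorpus tool; all names are illustrative.

def query_corpus(corpus_id: str, query: str) -> list[str]:
    # Stub: a real lookup returns the most relevant text snippets.
    kb = {
        "reset password": [
            "To reset a password, open Settings > Security and choose Reset.",
        ],
    }
    return kb.get(query, [])

def answer(corpus_id: str, question: str) -> str:
    snippets = query_corpus(corpus_id, question)
    if not snippets:
        # Step 4, fallback branch: the corpus lacks enough detail,
        # so ask a follow-up instead of guessing.
        return "I couldn't find that in the knowledge base. Could you clarify?"
    # Step 4, happy path: answer using only the returned content.
    return snippets[0]
```

The key property to preserve in any real prompt is the same branch structure: answer from retrieved content when it exists, ask a follow-up when it doesn't.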


What to include in a voice‑ready knowledge base

Here are practical examples of content that works well for common voice agent use cases:

Customer Success & Support
  1. Product documentation: user guides, FAQs, common errors, troubleshooting steps, limitation notes
  2. Onboarding materials: getting started guides, best practices, curated training notes, implementation checklists

Customer Acquisition (sales + pre‑sales)
  1. Product information: features, packaging, pricing tiers, supported integrations, competitive positioning
  2. Sales enablement: qualification questions, discovery flows, objection handling, industry‑specific talk tracks

Operations (internal enablement)
  1. Internal processes: routing rules, escalation paths, department directories, SLAs, ticket triage matrices
  2. Survey materials: question banks, follow‑ups, rating scales, post‑call workflows

Info
Voice tip: keep documents “answerable.” Short sections, clear headings, and explicit steps are easier to retrieve and speak back accurately.


Supported Knowledge Sources

  1. URLs: Add one or more URLs per RAG source and manage the crawl depth.
  2. Files: PDF, DOC, DOCX, TXT, MD, PPT, PPTX


Set up RAG in the FlowbotAI Portal

The fastest way to create a knowledge base is through the FlowbotAI Portal. You’ll create a Collection, add at least one source, then call queryCorpus from your agent.
  1. Go to the Knowledge tab within the Agent configuration.
  2. Create a new Collection (corpus).
    1. Give it a clear Name and Description (this gives the model context about the content).
  3. Add a source (website or documents)
    1. Select the Collection you created.
    2. Give the source a clear Name and Description (again, this gives the model context about the content).
    3. Choose Web or Document and enter the URLs or upload the docs you want FlowbotAI to crawl and ingest.
  4. Save, then wait a few moments for crawling + ingestion to complete.


Using the queryCorpus Tool to Look Up Knowledge

Use the queryCorpus tool whenever the caller asks for anything that should be answered from your knowledge base rather than the model’s general training. You can also use it to retrieve steps when the agent needs to perform specific tasks like troubleshooting.

Examples of suggested “must look up” topics:
  1. step-by-step instructions and workflows
  2. policies, limits, supported/unsupported behavior
  3. troubleshooting steps, known issues, error messages
  4. product pricing, configuration rules, defaults, and constraints

If the question, answer, or task is “official” or “specific,” the agent should retrieve first, then answer or act based on the retrieved snippets.
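
One way to make “official or specific” concrete is a simple keyword gate. This is illustrative only; in FlowbotAI this routing normally lives in the prompt, not in application code, and the term list here is an assumption you would tune for your domain.

```python
# Illustrative heuristic: decide whether a question should trigger a
# knowledge lookup before answering. The keyword list is an assumption.
MUST_LOOKUP_TERMS = (
    "steps", "policy", "limit", "supported", "troubleshoot",
    "error", "pricing", "configure", "default", "workflow",
)

def should_retrieve_first(question: str) -> bool:
    # True when the question sounds "official" or "specific" enough
    # that the agent should call queryCorpus before answering.
    q = question.lower()
    return any(term in q for term in MUST_LOOKUP_TERMS)
```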

The one required instruction (the tool + corpus ID)

To perform a knowledge lookup, the agent must be explicitly prompted to call the queryCorpus tool and be given the corpus ID. The agent should never guess the corpus ID.
Notes
Prompt must include:
use Tool queryCorpus ID=###

Default retrieval rules

Use these rules as the default behavior for any agent that has access to a Knowledge (RAG) corpus.

1. Retrieve first for “official” questions
If the caller asks something that sounds like policy / steps / troubleshooting / limits, the agent should query the corpus before answering and respond using only the retrieved snippets.

2. If retrieval is weak, clarify and retry
If results are thin, conflicting, or missing, the agent should ask one targeted clarifying question, then run queryCorpus again using a refined query.

3. If the answer isn’t in the corpus, say so
If the tool returns nothing useful, the agent should be transparent that the KB doesn’t contain the answer and offer the next step (clarify, escalate, or point to the correct resource).

4. Keep the response voice-friendly (don’t dump raw text)
If retrieval returns long paragraphs, dense instructions, or large blocks of configuration details, the agent should summarize first and deliver information in small chunks:
  1. Lead with a 1–2 sentence direct answer
  2. Give only 2–4 steps at a time
  3. Pause and ask whether to continue before reading more
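
The chunking rule above can be sketched as a small helper (a hypothetical function, not part of FlowbotAI):

```python
def chunk_steps(steps: list[str], size: int = 3) -> list[list[str]]:
    # Split retrieved steps into small groups (2-4 works well for voice)
    # so the agent can read one group, then pause and ask to continue.
    return [steps[i:i + size] for i in range(0, len(steps), size)]
```

For example, seven retrieved steps become three groups of at most three, with a “shall I continue?” pause between groups.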

Prompt Examples You Can Build From

Basic “retrieve then answer” (simple default)
Use this for general KB accuracy and basic troubleshooting.
Notes
If the user asks anything policy/steps/troubleshooting related, lookup first.
use Tool queryCorpus ID=123

If results are weak, ask one question then retry (reliable in production)
Use this to prevent guessing and force clarification + a second lookup.
Notes
If lookup is empty or unclear: ask one clarifying question, then lookup again.
use Tool queryCorpus ID=123

Voice-safe output control (prevents paragraph dumping)
Use this when calls are live and you want short, human-paced instructions.
Notes
When you use retrieved KB content, do not read long paragraphs. Give a 1–2 sentence answer, then only 2–4 steps. If more steps exist, ask if the caller wants to continue. If retrieved content is large, summarize into 2–3 bullets and ask what they’re trying to do.
use Tool queryCorpus ID=123

Troubleshooting mode (best for error messages and known issues)
Use this to force “retrieve-first” behavior for incident handling.
Notes
For any troubleshooting question or error message, you MUST use the KB first. Extract the exact steps from the KB and deliver them in small chunks (2–4 steps), pausing to confirm progress. If the KB has multiple causes, ask one question to narrow it down before continuing.
use Tool queryCorpus ID=123

Policy / Limits mode (best for “is this supported?” questions)
Use this when the caller needs an “official” answer and you want strict no-hallucination behavior.
Notes
For policy, limits, supported/unsupported behavior, or any “official” answer: lookup first and answer only from retrieved KB content. If the KB does not contain the answer, say so and ask one clarifying question or propose escalation. Do not guess.
use Tool queryCorpus ID=123

Simple but complete prompt insert (recommended system prompt snippet)
Replace XXXXXXX with your product name and ### with your corpus ID. The tool line is repeated because Llama-style agents tend to follow repeated hard constraints more consistently.
Notes
Knowledge Lookup (RAG):
- For any question about official steps, policies, limits, troubleshooting, or “how XXXXXXX works”, you MUST look it up before answering.
- Use the knowledge tool with the provided corpus. Never guess IDs.
- Tool instruction (required): use Tool queryCorpus ID=### 
- Query style: short + specific keywords (feature name, task, error text).
- After results: answer using the retrieved content. If unclear or missing, ask one clarifying question and then retry lookup. Do not invent details.
- Output handling (voice-safe): never read full paragraphs; give a 1–2 sentence answer, then 2–4 steps max, then ask to continue.

use Tool queryCorpus ID=### 
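
If you assemble system prompts programmatically, you might template the snippet above so the corpus ID is always injected from configuration and never typed by hand. This is a sketch with a hypothetical builder function; the wording mirrors the recommended insert.

```python
def build_rag_snippet(corpus_id: str, product_name: str) -> str:
    # Template the system-prompt insert so the corpus ID comes from
    # configuration and is never typed (or guessed) by hand.
    return (
        "Knowledge Lookup (RAG):\n"
        f"- For any question about official steps, policies, limits, "
        f"troubleshooting, or how {product_name} works, you MUST look it "
        "up before answering.\n"
        f"- Tool instruction (required): use Tool queryCorpus ID={corpus_id}\n"
        "- Query style: short + specific keywords.\n"
        "- If results are unclear or missing, ask one clarifying question, "
        "then retry the lookup."
    )
```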


“Gold standard” output rules (what good looks like)

When the agent answers after retrieval, it should:
  1. lead with the direct answer (1–2 sentences)
  2. then give steps/bullets exactly as KB supports
  3. call out constraints/limits explicitly (if present)
  4. include a quick “If you don’t see X, do Y” fallback (only if KB supports it)
  5. never dump a full paragraph or more than a few steps at once


Using an external vector database or search service

If your documentation is already indexed in your own vector database (or a search service behind an API), you can still use the same RAG pattern in FlowbotAI—just create a custom tool that calls your lookup endpoint, then require the agent to use it for facts.

Info
Design notes for external RAG tools:
Return short, well‑scoped snippets (not entire pages).
Include enough surrounding context so the agent can answer without hallucinating.
If you can, return metadata (document title/URL/section) so humans can verify answers quickly.
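
A lookup endpoint following those notes might return results shaped like this. The field names (`text`, `title`, `url`, `score`) are assumptions to adapt to your own search service:

```python
from dataclasses import dataclass

@dataclass
class Snippet:
    text: str     # short, well-scoped excerpt, not a whole page
    title: str    # document title so humans can verify the answer
    url: str      # link back to the source
    score: float  # retrieval relevance

def to_tool_response(snippets: list[Snippet], max_results: int = 3) -> list[dict]:
    # Return only the top-scoring snippets, with the metadata the
    # agent (and a human reviewer) needs.
    top = sorted(snippets, key=lambda s: s.score, reverse=True)[:max_results]
    return [{"text": s.text, "title": s.title, "url": s.url} for s in top]
```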


Tuning for accuracy and speed

RAG quality is usually a balancing act between relevance, completeness, and latency. The fastest wins come from tuning retrieval settings and content structure.

Key knobs you’ll use most:
  1. Corpus scope: keep corpora focused (support KB separate from sales talk tracks) to reduce off‑topic retrieval.
  2. Voice‑first content hygiene
    1. Prefer short sections with explicit steps and clear headings.
    2. Avoid giant “wall of text” pages—chunking improves retrieval.
    3. Update docs proactively; RAG is only as good as the source content.
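
The “chunking improves retrieval” advice can be applied at ingestion time, for example by splitting documents on headings. This is a rough sketch for Markdown-style docs, assuming `## ` section headings:

```python
def split_on_headings(doc: str) -> list[str]:
    # Split a Markdown-style document into one chunk per "## " section
    # so each retrievable unit is a short, answerable passage.
    chunks, current = [], []
    for line in doc.splitlines():
        if line.startswith("## ") and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return chunks
```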


Quick checklist before you ship

  1. Does the agent know when to use retrieval vs answer from general knowledge?
  2. Is the corpus ID provided explicitly in the prompt (so the agent never guesses identifiers)?
  3. Do test queries return the expected snippets? (Try 10–20 real questions.)
  4. Are you filtering weak results with minimum_score?
  5. Do you have a fallback when retrieval is empty (clarify or escalate)?
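
Checklist item 4 can be sketched as a simple score filter. The `score` field and the 0.5 default threshold are assumptions; tune the cutoff against your own test queries:

```python
def filter_results(results: list[dict], minimum_score: float = 0.5) -> list[dict]:
    # Drop weak matches so the agent never answers from marginal
    # snippets; an empty result should trigger the clarify/escalate fallback.
    return [r for r in results if r.get("score", 0.0) >= minimum_score]
```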


    • Related Articles

    • FlowbotAI Tools Overview
    • Noise & VAD Overview
    • Explore Built-in Tools
    • Agent Prompting 101