Skip to content

Knowledge Bases & RAG

Fluxbase provides built-in support for Retrieval-Augmented Generation (RAG), allowing you to create knowledge bases that provide context to AI chatbots. This enables chatbots to answer questions based on your custom documentation, product information, or any text content.

Knowledge bases in Fluxbase enable:

  • RAG-Powered Chatbots: Chatbots automatically retrieve relevant context from knowledge bases
  • Vector Search: Semantic similarity search using pgvector
  • Document Management: Upload, chunk, and embed documents automatically
  • Multiple Chunking Strategies: Recursive, sentence, paragraph, or fixed-size chunking
  • Flexible Linking: Connect multiple knowledge bases to a single chatbot
graph LR
A[User Query] --> B[Embed Query]
B --> C[Vector Search]
C --> D[Retrieve Chunks]
D --> E[Build Context]
E --> F[LLM with RAG Context]
F --> G[Response]
subgraph Knowledge Base
H[Documents] --> I[Chunking]
I --> J[Embedding]
J --> K[(pgvector)]
end
C --> K
style B fill:#10a37f,color:#fff
style K fill:#336791,color:#fff
style F fill:#3178c6,color:#fff

The RAG pipeline:

  1. Documents are chunked into smaller segments
  2. Each chunk is embedded using an embedding model (e.g., text-embedding-3-small)
  3. Embeddings are stored in PostgreSQL using pgvector
  4. When a user asks a question, the query is embedded
  5. Similar chunks are retrieved via vector similarity search
  6. Retrieved context is injected into the chatbot’s system prompt
  7. The LLM generates a response using the provided context

Before using knowledge bases, ensure:

  1. pgvector Extension: Install the pgvector extension in PostgreSQL
  2. Embedding Provider: Configure an embedding provider (OpenAI, Azure, or Ollama)
  3. AI Feature Enabled: Enable the AI feature in Fluxbase settings
CREATE EXTENSION IF NOT EXISTS vector;

Automatic Fallback: If you already have an AI provider configured for chatbots (e.g., OpenAI API key set), embeddings will automatically work using those same credentials. No additional configuration is needed.

Explicit Configuration: For fine-grained control or to use a different provider for embeddings, set these environment variables:

Terminal window
# OpenAI (explicit configuration)
FLUXBASE_AI_EMBEDDING_ENABLED=true
FLUXBASE_AI_EMBEDDING_PROVIDER=openai
FLUXBASE_AI_EMBEDDING_MODEL=text-embedding-3-small
FLUXBASE_AI_OPENAI_API_KEY=sk-...
# Or Azure OpenAI
FLUXBASE_AI_EMBEDDING_ENABLED=true
FLUXBASE_AI_EMBEDDING_PROVIDER=azure
FLUXBASE_AI_AZURE_API_KEY=...
FLUXBASE_AI_AZURE_ENDPOINT=https://your-resource.openai.azure.com
FLUXBASE_AI_AZURE_EMBEDDING_DEPLOYMENT_NAME=text-embedding-ada-002
# Or Ollama (local)
FLUXBASE_AI_EMBEDDING_ENABLED=true
FLUXBASE_AI_EMBEDDING_PROVIDER=ollama
FLUXBASE_AI_OLLAMA_ENDPOINT=http://localhost:11434
FLUXBASE_AI_EMBEDDING_MODEL=nomic-embed-text

Default Models (when using AI provider fallback):

  • OpenAI: text-embedding-3-small
  • Azure: text-embedding-ada-002
  • Ollama: nomic-embed-text
  1. Navigate to Knowledge Bases in the sidebar
  2. Click Create Knowledge Base
  3. Configure:
    • Name: Unique identifier (e.g., product-docs)
    • Description: What content this KB contains
    • Chunk Size: Characters per chunk (default: 512)
    • Chunk Overlap: Overlap between chunks (default: 50)
  4. Click Create
Terminal window
curl -X POST http://localhost:8080/api/v1/admin/ai/knowledge-bases \
-H "Authorization: Bearer YOUR_SERVICE_ROLE_KEY" \
-H "Content-Type: application/json" \
-d '{
"name": "product-docs",
"description": "Product documentation",
"chunk_size": 512,
"chunk_overlap": 50
}'

Once you have a knowledge base, add documents to it. Documents are automatically chunked and embedded.

Terminal window
curl -X POST http://localhost:8080/api/v1/admin/ai/knowledge-bases/KB_ID/documents \
-H "Authorization: Bearer YOUR_SERVICE_ROLE_KEY" \
-H "Content-Type: application/json" \
-d '{
"title": "Getting Started Guide",
"content": "# Getting Started\n\nWelcome to our product! This guide will help you get started.",
"metadata": {
"category": "guides",
"version": "1.0"
}
}'

Documents are processed asynchronously. Status values: pending, processing, indexed, failed.

In addition to pasting text content, you can upload document files directly. Fluxbase automatically extracts text from various file formats.

FormatExtensionMIME Type
PDF.pdfapplication/pdf
Plain Text.txttext/plain
Markdown.mdtext/markdown
HTML.html, .htmtext/html
CSV.csvtext/csv
Word Document.docxapplication/vnd.openxmlformats-officedocument.wordprocessingml.document
Excel Spreadsheet.xlsxapplication/vnd.openxmlformats-officedocument.spreadsheetml.sheet
Rich Text.rtfapplication/rtf
EPUB.epubapplication/epub+zip
JSON.jsonapplication/json

Maximum file size: 50MB

  1. Navigate to Knowledge Bases and select your knowledge base
  2. Click Add Document
  3. Select the Upload File tab
  4. Drag and drop a file or click Browse Files
  5. Optionally provide a custom title
  6. Click Upload Document
Terminal window
curl -X POST http://localhost:8080/api/v1/admin/ai/knowledge-bases/KB_ID/documents/upload \
-H "Authorization: Bearer YOUR_SERVICE_ROLE_KEY" \
-F "file=@document.pdf" \
-F "title=My Document"
  1. Clean PDFs: Ensure PDFs are text-based, not scanned images (OCR not supported)
  2. Simple formatting: Documents with simpler formatting extract more cleanly
  3. File size: Smaller files process faster; split very large documents if needed
  4. Text density: Avoid uploading files with mostly images or charts

Choose the chunking strategy that best fits your content:

StrategyDescriptionBest For
recursiveSplits by paragraphs, then sentences, then charactersGeneral text, documentation
sentenceSplits by sentence boundariesQ&A content, conversational text
paragraphSplits by paragraph (double newlines)Well-structured documents
fixedFixed character count splitsCode, logs, structured data
  • Smaller chunks (256-512): More precise retrieval, better for specific facts
  • Larger chunks (1024-2048): More context per chunk, better for complex topics
  • Overlap (10-20% of chunk size): Prevents losing context at chunk boundaries

Connect knowledge bases to chatbots to enable RAG.

Add RAG annotations to your chatbot definition:

/**
* Product Support Bot
*
* @fluxbase:description Product support chatbot with RAG
* @fluxbase:knowledge-base product-docs
* @fluxbase:knowledge-base faq
* @fluxbase:rag-max-chunks 5
* @fluxbase:rag-similarity-threshold 0.7
*/
export default `You are a helpful product support assistant.
Use the provided context to answer questions about our product.
If you don't find relevant information in the context, say so honestly.
Current user ID: {{user_id}}
`;
AnnotationDescriptionDefault
@fluxbase:knowledge-baseName of knowledge base to use (can specify multiple)-
@fluxbase:rag-max-chunksMaximum chunks to retrieve5
@fluxbase:rag-similarity-thresholdMinimum similarity score (0.0-1.0)0.7

The SDK provides methods to manage knowledge base links on chatbots:

import { createClient } from "@nimbleflux/fluxbase-sdk";
const client = createClient("http://localhost:8080", "service-role-key");
// Link a knowledge base to a chatbot
const { data, error } = await client.admin.ai.linkKnowledgeBase("chatbot-id", {
knowledge_base_id: "kb-id",
priority: 1,
max_chunks: 5,
similarity_threshold: 0.7,
});
// Update link settings
await client.admin.ai.updateChatbotKnowledgeBase("chatbot-id", "kb-id", {
max_chunks: 10,
enabled: true,
});
// List linked knowledge bases
const { data: links } =
await client.admin.ai.listChatbotKnowledgeBases("chatbot-id");
// Unlink a knowledge base
await client.admin.ai.unlinkKnowledgeBase("chatbot-id", "kb-id");

When a user sends a message to a RAG-enabled chatbot:

  1. Query Embedding: The user’s message is embedded using the same model as documents
  2. Similarity Search: pgvector finds the most similar chunks across linked knowledge bases
  3. Context Building: Retrieved chunks are formatted into a context section
  4. Prompt Injection: Context is added to the system prompt before the LLM call
  5. Response Generation: The LLM uses the context to generate an informed response

The chatbot receives a system prompt like:

[Original System Prompt]
## Relevant Context
The following information was retrieved from the knowledge base and may be relevant to the user's question:
### From: Getting Started Guide
Welcome to our product! This guide will help you get started...
### From: FAQ
Q: How do I reset my password?
A: Navigate to Settings > Account > Reset Password...
---
Use this context to answer the user's question. If the context doesn't contain relevant information, say so.
  1. Navigate to Knowledge Bases
  2. Click on your knowledge base
  3. Use the Search tab to test queries
  4. Review similarity scores and retrieved content
Terminal window
curl -X POST http://localhost:8080/api/v1/admin/ai/knowledge-bases/KB_ID/search \
-H "Authorization: Bearer YOUR_SERVICE_ROLE_KEY" \
-H "Content-Type: application/json" \
-d '{
"query": "how do I reset my password",
"max_chunks": 5,
"threshold": 0.5
}'
  1. Clean Content: Remove unnecessary formatting, headers, footers
  2. Consistent Structure: Use consistent heading styles and formatting
  3. Complete Information: Ensure documents contain full context
  4. Regular Updates: Keep knowledge bases current with product changes
  1. Match Content Type: Use appropriate chunking for your content
  2. Test Different Sizes: Experiment to find optimal chunk size
  3. Monitor Retrieval: Check if retrieved chunks are relevant
  1. Index Size: Keep knowledge bases focused on relevant content
  2. Similarity Threshold: Higher thresholds (0.7-0.8) reduce noise
  3. Chunk Limit: Limit retrieved chunks to avoid context overflow
  1. Access Control: Use RLS policies on knowledge base tables if needed
  2. Sensitive Content: Avoid storing sensitive data in knowledge bases
  3. User Context: Consider user-specific knowledge bases for personalized content
Terminal window
curl -X POST http://localhost:8080/api/v1/admin/ai/knowledge-bases \
-H "Authorization: Bearer YOUR_SERVICE_ROLE_KEY" \
-H "Content-Type: application/json" \
-d '{
"name": "support-kb",
"description": "Customer support documentation",
"chunk_size": 512,
"chunk_overlap": 50
}'
Terminal window
KB_ID="<knowledge-base-id-from-step-1>"
curl -X POST "http://localhost:8080/api/v1/admin/ai/knowledge-bases/${KB_ID}/documents" \
-H "Authorization: Bearer YOUR_SERVICE_ROLE_KEY" \
-H "Content-Type: application/json" \
-d '{
"title": "Frequently Asked Questions",
"content": "## Account Questions\n\n### How do I create an account?\nVisit signup.example.com and fill out the registration form...\n\n### How do I reset my password?\nClick \"Forgot Password\" on the login page..."
}'
curl -X POST "http://localhost:8080/api/v1/admin/ai/knowledge-bases/${KB_ID}/documents" \
-H "Authorization: Bearer YOUR_SERVICE_ROLE_KEY" \
-H "Content-Type: application/json" \
-d '{
"title": "Troubleshooting Guide",
"content": "## Common Issues\n\n### Error: Connection Failed\n1. Check your internet connection\n2. Verify the service is running\n3. Clear browser cache..."
}'

Create chatbots/support-bot/index.ts:

/**
* Customer Support Bot
*
* @fluxbase:description AI-powered customer support with knowledge base
* @fluxbase:knowledge-base support-kb
* @fluxbase:rag-max-chunks 5
* @fluxbase:rag-similarity-threshold 0.7
* @fluxbase:allowed-tables support_tickets,users
* @fluxbase:allowed-operations SELECT
* @fluxbase:rate-limit 30
* @fluxbase:public true
*/
export default `You are a friendly customer support assistant.
## Your Role
- Answer questions using the provided knowledge base context
- Help users troubleshoot common issues
- Look up support ticket status when asked
## Guidelines
1. Always check the provided context first
2. If you can't find an answer in the context, say so politely
3. Offer to escalate to human support for complex issues
4. Be friendly and professional
## Available Actions
- Answer questions from knowledge base
- Look up user's support tickets (use execute_sql)
Current user ID: {{user_id}}
`;
Terminal window
# Sync chatbot
curl -X POST http://localhost:8080/api/v1/admin/ai/chatbots/sync \
-H "Authorization: Bearer YOUR_SERVICE_ROLE_KEY"

Or using the SDK:

import { createClient } from "@nimbleflux/fluxbase-sdk";
const client = createClient("http://localhost:8080", "service-role-key");
await client.admin.ai.sync();

Use the REST API to check knowledge base statistics:

Terminal window
curl http://localhost:8080/api/v1/admin/ai/knowledge-bases/KB_ID \
-H "Authorization: Bearer YOUR_SERVICE_ROLE_KEY"
Terminal window
curl http://localhost:8080/api/v1/admin/ai/knowledge-bases/KB_ID/documents \
-H "Authorization: Bearer YOUR_SERVICE_ROLE_KEY"
  • Verify FLUXBASE_AI_EMBEDDING_ENABLED=true
  • Confirm API keys are valid
  • Check provider endpoint is accessible
  • Lower the similarity threshold for testing (e.g., 0.3)
  • Ensure documents are properly chunked
  • Verify content is relevant to expected queries
  • Try different chunking strategies

Verify the chatbot has linked knowledge bases:

const client = createClient("http://localhost:8080", "service-role-key");
const { data: links } =
await client.admin.ai.listChatbotKnowledgeBases("chatbot-id");
console.log("Linked KBs:", links);

Check annotation syntax:

// Correct
* @fluxbase:knowledge-base my-kb-name
// Wrong (no asterisk in multi-line comment)
@fluxbase:knowledge-base my-kb-name

Fluxbase automatically extracts entities and relationships from documents when they are processed. This enables knowledge graph capabilities and improves search relevance.

The following entity types are automatically extracted from documents:

TypeDescriptionExamples
personPeople and namesJohn Smith, Dr. Jane Doe, CEO Bob
organizationCompanies and organizationsGoogle, Microsoft, Corp, LLC
locationGeographic locationsNew York, London, Paris, California
conceptAbstract concepts and ideasMachine Learning, Democracy
productProducts and servicesiPhone, AWS, Docker, Kubernetes
eventEvents and time-based occurrencesOlympics, World War II
tableDatabase tablesauth.users, public.orders
urlURLs and linkshttps://example.com, www.fluxbase.com/docs
api_endpointREST/GraphQL/RPC endpointsPOST /api/v1/users, GET /api/v1/auth/...
datetimeDates, times, durations2025-02-18, 2h 30m, next week
code_referenceFile paths, repos, code snippetsinternal/ai/kb.go, github.com/user/repo
errorError codes, exceptionsErrNotFound, 404 Unauthorized

Entities are connected by the following relationship types:

TypeDescriptionExample
works_atPerson works at organization”John works at Google”
located_inEntity located in location”Office in San Francisco”
founded_byOrganization founded by person”Apple founded by Steve Jobs”
ownsOwnership relationship”Company owns product”
part_ofHierarchical/part-of relationship”Chapter is part of book”
related_toGeneral relationship”Concept A related to Concept B”
knowsPerson knows person”Alice knows Bob”
foreign_keyDatabase foreign keyorders.user_id → users.id
depends_onDependency relationship”View depends on table”

Entities and relationships are extracted automatically when documents are processed:

  1. Documents are uploaded to a knowledge base
  2. Document content is chunked for embedding
  3. Rule-based entity extraction identifies entities and relationships
  4. Entities are stored in the knowledge graph
  5. Document-entity mentions are tracked

Export database tables as knowledge base documents with embedded schema information. This enables AI assistants to understand your database structure and answer schema-related questions.

import { createClient } from "@nimbleflux/fluxbase-sdk";
const client = createClient("http://localhost:8080", "service-role-key");
const { data, error } = await client.admin.ai.getTableDetails("auth", "users");
if (data) {
console.log("Columns:", data.columns.map((c) => c.name));
console.log("Primary key:", data.primary_key);
}
Terminal window
# List exportable tables
curl "http://localhost:8080/api/v1/admin/ai/tables?schema=auth" \
-H "Authorization: Bearer YOUR_SERVICE_ROLE_KEY"
# Export table to knowledge base
curl -X POST http://localhost:8080/api/v1/admin/ai/tables/auth/users/export \
-H "Authorization: Bearer YOUR_SERVICE_ROLE_KEY" \
-H "Content-Type: application/json" \
-d '{
"knowledge_base_id": "kb-uuid",
"include_foreign_keys": true,
"include_indexes": true,
"include_sample_rows": false
}'

For each exported table, three artifacts are created:

  1. Document: Markdown schema documentation with columns, types, primary keys
  2. Entity: Graph representation of the table
  3. Relationships: Foreign key connections to other tables
  • Schema Documentation: Automatically document your database structure
  • AI Assistant Context: Help AI understand your database schema
  • Relationship Discovery: Explore table relationships through the knowledge graph
  • Onboarding: Quickly share database structure with team members

Fluxbase provides Model Context Protocol (MCP) tools for interacting with the knowledge graph from AI assistants.

Query entities in the knowledge graph with filtering:

{
"name": "query_knowledge_graph",
"arguments": {
"knowledge_base_id": "kb-uuid",
"entity_type": "location",
"search_query": "San Francisco",
"limit": 50,
"include_relationships": true
}
}

Find entities related to a starting entity using graph traversal:

{
"name": "find_related_entities",
"arguments": {
"knowledge_base_id": "kb-uuid",
"entity_id": "entity-uuid",
"max_depth": 2,
"relationship_types": ["works_at", "located_in"],
"limit": 100
}
}

Browse the knowledge graph from a starting entity:

{
"name": "browse_knowledge_graph",
"arguments": {
"knowledge_base_id": "kb-uuid",
"start_entity": "entity-id-or-name",
"direction": "both",
"limit": 50
}
}

Knowledge graph tools require the read:vectors scope for authorization.

For multi-tenant applications, knowledge bases support user-scoped document isolation. When a chatbot retrieves context from a knowledge base, it can filter documents by user ID, ensuring users only see their own documents.

  1. Adding Documents with User Context: When adding documents, include user_id in the metadata
  2. Filter Expression: Configure chatbot-knowledge base links with filter expressions
  3. Automatic Filtering: RAG retrieval automatically applies user context during search
Terminal window
curl -X POST http://localhost:8080/api/v1/admin/ai/knowledge-bases/KB_ID/documents \
-H "Authorization: Bearer YOUR_SERVICE_ROLE_KEY" \
-H "Content-Type: application/json" \
-d '{
"title": "User Travel Notes",
"content": "My favorite restaurants in Tokyo...",
"metadata": {
"user_id": "user-123",
"category": "travel"
}
}'

Link a knowledge base with filtered access to enable user-scoped retrieval:

import { createClient } from "@nimbleflux/fluxbase-sdk";
const client = createClient("http://localhost:8080", "service-role-key");
await client.admin.ai.linkKnowledgeBase("chatbot-id", {
knowledge_base_id: "kb-id",
max_chunks: 5,
similarity_threshold: 0.7,
});

When a user chats with a RAG-enabled chatbot:

  1. The chat handler extracts the user ID from the authentication context
  2. RAG retrieval passes the user ID to the search function
  3. Documents with matching user_id in metadata are returned
  4. Documents without user_id are excluded unless explicitly allowed

Delete all documents for a user (e.g., account deletion):

Terminal window
curl -X DELETE "http://localhost:8080/api/v1/admin/ai/knowledge-bases/KB_ID/documents?user_id=user-to-delete" \
-H "Authorization: Bearer YOUR_SERVICE_ROLE_KEY"

The Fluxbase CLI provides commands for managing knowledge bases.

Terminal window
# List knowledge bases
fluxbase kb list
# Get knowledge base details
fluxbase kb get <kb-id>
# Create knowledge base
fluxbase kb create my-kb --description "My knowledge base"
# Update knowledge base
fluxbase kb update <kb-id> --description "Updated description"
# Delete knowledge base
fluxbase kb delete <kb-id>
# Show status and statistics
fluxbase kb status <kb-id>
Terminal window
# List documents
fluxbase kb documents <kb-id>
# Add document from text
fluxbase kb add <kb-id> --content "Document content" --title "My Doc"
# Add document from file
fluxbase kb add <kb-id> --from-file ./doc.txt --title "My Doc"
# Add document from stdin
echo "Content" | fluxbase kb add <kb-id> --title "Piped Doc"
# Add with user isolation
fluxbase kb add <kb-id> --content "..." --metadata '{"user_id":"user-123"}'
# Get document details
fluxbase kb documents get <kb-id> <doc-id>
# Delete single document
fluxbase kb documents delete <kb-id> <doc-id>
# Bulk delete by filter
fluxbase kb documents delete-by-filter <kb-id> --metadata-filter "user_id=user-123"
Terminal window
# Upload document file
fluxbase kb upload <kb-id> ./document.pdf --title "My PDF"
# Upload with OCR languages (for scanned PDFs)
fluxbase kb upload <kb-id> ./scanned.pdf --ocr-languages "eng,deu"
Terminal window
# Search knowledge base
fluxbase kb search <kb-id> "how to reset password"
# Search with options
fluxbase kb search <kb-id> "pricing" --limit 5 --threshold 0.7
Terminal window
# List exportable tables
fluxbase kb tables
fluxbase kb tables auth # Filter by schema
# Export table as document
fluxbase kb export-table <kb-id> --schema public --table users \
--include-fks --include-indexes --sample-rows 5
Terminal window
# View knowledge graph data
fluxbase kb graph <kb-id>
# List entities
fluxbase kb entities <kb-id>
fluxbase kb entities <kb-id> --type person
fluxbase kb entities <kb-id> --search "John"
# List chatbots using KB
fluxbase kb chatbots <kb-id>