Transform your documents into an intelligent voice assistant: ask questions naturally and get accurate, context-aware answers drawn from your own knowledge base.
VoiceRAG is an end-to-end voice-powered knowledge assistant that bridges the gap between natural conversation and document intelligence. Simply speak your questions and receive instant, accurate answers synthesized from your uploaded documents.
Upload your documents (PDFs, text files, reports, guides) and VoiceRAG automatically:
- Processes and embeds them into a searchable vector database
- Understands your spoken questions using advanced AI
- Retrieves the most relevant information from your documents
- Synthesizes natural, conversational responses
- Delivers answers back to you through voice
- Voice Interaction - ElevenLabs Voice Agents for natural conversation
- AI Intelligence - OpenAI GPT models for context-aware responses
- Document Retrieval - Vector embeddings and semantic search with Cohere reranking
- Automation - n8n workflows for seamless data pipeline orchestration
- Storage - Google Drive for documents, Supabase for vector database
- Personal knowledge base (research papers, notes, documentation)
- Business information lookup (policies, procedures, product specs)
- Customer support (FAQs, troubleshooting guides)
- Educational assistant (study materials, course content)
- Legal/medical document consultation
User Voice Input → ElevenLabs Agent → n8n Webhook → RAG Search (Supabase) → Cohere Rerank → GPT Response → Voice Output
                                                              ↑
                                                     Embedded Documents
                                                              ↑
                                              Google Drive Files → n8n Pipeline
- n8n (self-hosted or cloud)
- Supabase account
- OpenAI API key
- Cohere API key (for reranking)
- ElevenLabs account
- Google Cloud project (for Drive integration)
- Create Google OAuth Credentials
  - Follow the n8n Google OAuth guide
  - Enable the Google Drive API in your Google Cloud Console
- Connect to n8n
  - Add Google Drive credentials in n8n
  - Create a dedicated folder for VoiceRAG documents
  - Set up file watch triggers for automatic processing
Run the following SQL in your Supabase SQL Editor:
-- Enable the pgvector extension for vector similarity search
CREATE EXTENSION IF NOT EXISTS vector;

-- Create documents table with vector embeddings
CREATE TABLE documents (
  id BIGSERIAL PRIMARY KEY,
  content TEXT NOT NULL,
  metadata JSONB DEFAULT '{}',
  embedding VECTOR(1536),
  created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);

-- Create index for faster similarity searches
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100);

-- Create function to search documents by semantic similarity
CREATE OR REPLACE FUNCTION match_documents (
  query_embedding VECTOR(1536),
  match_count INT DEFAULT 5,
  filter JSONB DEFAULT '{}'
)
RETURNS TABLE (
  doc_id BIGINT,
  content TEXT,
  metadata JSONB,
  similarity FLOAT
)
LANGUAGE plpgsql
AS $$
BEGIN
  RETURN QUERY
  SELECT
    d.id AS doc_id,
    d.content,
    d.metadata,
    1 - (d.embedding <=> query_embedding) AS similarity
  FROM documents d
  WHERE d.metadata @> filter
  ORDER BY d.embedding <=> query_embedding
  LIMIT match_count;
END;
$$;

- pgvector extension: Enables vector similarity operations
- documents table: Stores text chunks with their vector embeddings (1536 dimensions for OpenAI)
- match_documents function: Performs semantic search and returns most relevant documents
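To make the ranking logic of `match_documents` concrete, here is a minimal in-memory JavaScript sketch of the same idea: cosine similarity (the SQL computes it as `1 - (a <=> b)`, since `<=>` is pgvector's cosine distance), a simplified containment check standing in for `metadata @> filter`, and a top-k cut. The function names and sample rows are illustrative assumptions, and the filter only handles flat key/value metadata. The real query runs inside Postgres.

```javascript
// Cosine similarity between two equal-length numeric vectors.
// pgvector's `<=>` returns cosine distance, so similarity = 1 - distance.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Simplified stand-in for `metadata @> filter`: every top-level key in
// `filter` must be present in `metadata` with a strictly equal value.
function metadataContains(metadata, filter) {
  return Object.entries(filter).every(([key, value]) => metadata[key] === value);
}

// In-memory analogue of match_documents(query_embedding, match_count, filter).
function matchDocumentsInMemory(docs, queryEmbedding, matchCount = 5, filter = {}) {
  return docs
    .filter((doc) => metadataContains(doc.metadata, filter))
    .map((doc) => ({
      doc_id: doc.id,
      content: doc.content,
      metadata: doc.metadata,
      similarity: cosineSimilarity(doc.embedding, queryEmbedding),
    }))
    .sort((x, y) => y.similarity - x.similarity) // most similar first
    .slice(0, matchCount);
}

// Toy 3-dimensional vectors for readability; real embeddings have 1536 dimensions.
const sampleDocs = [
  { id: 1, content: "Refund policy", metadata: { source: "policies.pdf" }, embedding: [1, 0, 0] },
  { id: 2, content: "Shipping times", metadata: { source: "faq.txt" }, embedding: [0, 1, 0] },
  { id: 3, content: "Return window", metadata: { source: "policies.pdf" }, embedding: [0.9, 0.1, 0] },
];

const topMatches = matchDocumentsInMemory(sampleDocs, [1, 0, 0], 2, { source: "policies.pdf" });
console.log(topMatches.map((m) => m.doc_id));
```

The filter drops the row from `faq.txt` before ranking, which mirrors how the `WHERE` clause narrows candidates before `ORDER BY` and `LIMIT` apply.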
The repository includes pre-configured n8n workflows with sensitive information redacted:
- File: `n8n_workflow_redacted.json`
- All API keys, credentials, and sensitive data replaced with `REDACTED`
- Import into your n8n instance and replace placeholders with your actual credentials
Purpose (Workflow 1, document ingestion): Automatically process uploaded files and store them as searchable embeddings
- Google Drive Trigger - Watches for new files (.txt, .pdf)
- File Content Extraction - Reads and parses document content
- OpenAI Embeddings - Converts document content to 1536-dim vectors
- Supabase Insert - Stores embeddings with metadata in database
- Embedding Model: `text-embedding-3-small`
- Supported Formats: .txt, .pdf
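The ingestion steps above store text chunks rather than whole files, so somewhere between extraction and embedding the content has to be split. The source does not say how the n8n workflow chunks text; the sketch below assumes a simple fixed-size character window with overlap. `chunkText`, `toDocumentRows`, the default sizes, and the `source` / `chunk_index` metadata keys are all hypothetical names chosen for illustration.

```javascript
// Split text into overlapping character windows.
// chunkSize and overlap defaults are assumptions, not values from the workflow.
function chunkText(text, chunkSize = 1000, overlap = 200) {
  if (overlap >= chunkSize) {
    throw new Error("overlap must be smaller than chunkSize");
  }
  const chunks = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once this window already reaches the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

// Shape each chunk like a row for the `documents` table.
// The embedding column would be filled in after calling the embeddings API.
function toDocumentRows(fileName, text, chunkSize, overlap) {
  return chunkText(text, chunkSize, overlap).map((content, index) => ({
    content,
    metadata: { source: fileName, chunk_index: index },
  }));
}

// Tiny sizes so the windows are easy to follow by eye.
const rows = toDocumentRows("notes.txt", "abcdefghij", 4, 1);
console.log(rows.map((r) => r.content)); // [ 'abcd', 'defg', 'ghij' ]
```

Overlap keeps a sentence that straddles a boundary retrievable from at least one chunk, at the cost of storing some text twice.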
Purpose (Workflow 2, voice query handling): Handle real-time voice queries by retrieving relevant context and generating responses
- Webhook Trigger - Receives query from ElevenLabs agent
- Query Embedding - Converts user question to vector using OpenAI
- Supabase Match - Finds top-k most relevant document chunks
- Cohere Reranking - Re-ranks results for improved relevance
- Context Assembly - Formats retrieved documents as context
- GPT Generation - Creates answer using retrieved knowledge with system prompt:
  # OBJECTIVE
  - Retrieve and synthesize the most relevant information from the vector database, which serves as your primary knowledge source.
  - Use this information to generate accurate, context-aware, and helpful responses to the user's requests.
  - Prioritize precision, clarity, and relevance in all outputs.
- Response - Returns formatted answer to voice agent
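The reranking, context assembly, and generation steps in the list above can be sketched as follows. This is not the workflow's actual node configuration: it assumes the reranker returns an array of `{ index, relevance_score }` entries whose `index` points back into the submitted chunks (the shape Cohere's rerank response is understood to use), and that the model is prompted with chat-style `{ role, content }` messages. `assembleContext`, `buildMessages`, and `topN` are hypothetical names.

```javascript
// `chunks`: the strings that were sent to the reranker, in their original order.
// `rerankResults`: array of { index, relevance_score } (assumed shape).
function assembleContext(chunks, rerankResults, topN = 3) {
  return [...rerankResults]
    .sort((a, b) => b.relevance_score - a.relevance_score) // best first
    .slice(0, topN)
    .map((result, rank) => `[${rank + 1}] ${chunks[result.index]}`)
    .join("\n\n");
}

// Combine the system prompt, the retrieved context, and the user's question
// into chat-style messages for the generation step.
function buildMessages(systemPrompt, context, question) {
  return [
    { role: "system", content: systemPrompt },
    { role: "user", content: `Context:\n${context}\n\nQuestion: ${question}` },
  ];
}

const retrievedChunks = [
  "Revenue grew 23% year-over-year.",
  "Office hours are 9 to 5.",
  "Customer retention improved to 94%.",
];
const sampleRerank = [
  { index: 2, relevance_score: 0.91 },
  { index: 0, relevance_score: 0.97 },
  { index: 1, relevance_score: 0.05 },
];

const context = assembleContext(retrievedChunks, sampleRerank, 2);
const messages = buildMessages(
  "Use the provided context as your primary knowledge source.",
  context,
  "What were the key findings from the Q4 report?"
);
console.log(messages[1].content);
```

Numbering the chunks gives the model a simple handle for referring to sources, and capping at `topN` keeps low-scoring chunks out of the prompt.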
- Method: POST
- Timeout: 30 seconds (for complex queries)
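For testing the webhook outside of ElevenLabs, a small client that mirrors these settings can be handy. The sketch below is an assumption-heavy illustration: the URL is a placeholder, and the `{ query }` payload shape is a guess, since the body your workflow expects depends on how the webhook node and the ElevenLabs tool are configured. It relies on the global `fetch` and `AbortSignal.timeout`, which are available in recent Node.js versions and modern browsers.

```javascript
// Hypothetical endpoint; replace with your own n8n webhook URL.
const WEBHOOK_URL = "https://your-n8n-host.example/webhook/voicerag";
const TIMEOUT_MS = 30_000; // matches the 30-second timeout above

// Build the request options separately so they can be inspected or tested
// without making a network call.
function buildWebhookRequest(query) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query }), // assumed payload shape
  };
}

// Send the query and give up if the workflow takes longer than TIMEOUT_MS.
async function askKnowledgeBase(query, url = WEBHOOK_URL) {
  const response = await fetch(url, {
    ...buildWebhookRequest(query),
    signal: AbortSignal.timeout(TIMEOUT_MS),
  });
  if (!response.ok) {
    throw new Error(`Webhook returned HTTP ${response.status}`);
  }
  return response.json();
}

const exampleRequest = buildWebhookRequest("What were the key findings from the Q4 report?");
console.log(exampleRequest.method, exampleRequest.body);
```

Calling `askKnowledgeBase("...")` against your own endpoint returns whatever JSON the workflow's response node emits.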
- Navigate to: ElevenLabs Agent Dashboard
- Create New Agent:
  - Choose voice and language
  - Select conversation model: GPT-4o-mini (recommended for speed/cost) or GPT-4o (for complex reasoning)
- Add RAG Tool:
  - Tool Name: `rag-knowledge-tool`
  - Description: "Search the knowledge base for relevant information to answer user questions"
  - Webhook URL: Your n8n webhook endpoint from Workflow 2
  - Method: POST
- System Prompt Example:
You are a helpful and knowledgeable assistant designed to use the 'rag-knowledge-tool' to answer user queries.
# Objective
- Leverage the 'rag-knowledge-tool' to retrieve and synthesize relevant information.
- Provide accurate, concise, and context-aware responses.
- Maintain a cooperative and informative tone.
# Personality
- Friendly, clear, and confident in explanations.
- Curious and proactive in finding the best possible answer.
- Professional and respectful, yet conversational and easy to follow.
- Focused on usefulness and precision rather than verbosity.
# Guidelines
1. Always query the 'rag-knowledge-tool' when additional context or factual information is needed.
2. When responding:
   - Be clear and well-structured.
   - Cite or reference retrieved knowledge naturally.
   - Keep the tone professional, approachable, and helpful.
3. If information cannot be found, acknowledge that and provide the best reasoning possible.
# Goal
Deliver precise, knowledge-grounded responses that help the user efficiently reach their objective.
- Use the test interface to verify RAG tool calls
- Check that document retrieval is working
- Tune the system prompt based on response quality
- Frontend Code: Located in the `frontend/` folder
- n8n Workflows: `n8n_workflow_redacted.json` (import and configure with your credentials)
- Direct integration with ElevenLabs Voice Agent SDK
- Visual feedback during voice interactions
- Document upload interface
- Conversation history
- Clone the Repository
  git clone https://github.com/petermartens98/VoiceRAG-AI-Powered-Voice-Assistant-with-Knowledge-Retrieval.git
  cd VoiceRAG-AI-Powered-Voice-Assistant-with-Knowledge-Retrieval/frontend
- Install Dependencies
  npm install
- Configure Your Agent ID
  - Open `frontend/src/App.js`
  - Replace the placeholder `agentId` with your own ElevenLabs Agent ID:
    const agentId = "your-agent-id-here"; // Replace with your actual agent ID
- Run the Application
  npm start

To update an existing checkout:
# Pull latest changes
git pull origin main
# Install any new dependencies
npm install
# Restart the application
npm start

User: "What were the key findings from the Q4 report?"
System Flow:
- Voice → ElevenLabs → n8n webhook
- n8n embeds query → searches Supabase
- Retrieves relevant Q4 report chunks
- Cohere re-ranks results for best relevance
- GPT synthesizes answer from top context
- Response → ElevenLabs → Voice output
Assistant: "According to your Q4 report, the key findings were: revenue grew 23% year-over-year, customer retention improved to 94%, and the new product line exceeded targets by 18%."
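The system flow above is a linear pipeline, which can be sketched as one function with each external service passed in as a dependency. This is only an outline under stated assumptions: in the project itself these steps are n8n nodes, and `answerQuery`, the `steps` object, its four method names, and the fallback message are all hypothetical. Stubs stand in for OpenAI, Supabase, and Cohere so the control flow can be run without credentials.

```javascript
// Orchestrate embed -> search -> rerank -> generate.
// `steps` supplies the four async operations; real versions would call
// the embeddings API, Supabase, the reranker, and the chat model.
async function answerQuery(question, steps, { matchCount = 5, topN = 3 } = {}) {
  const queryEmbedding = await steps.embed(question);
  const candidates = await steps.search(queryEmbedding, matchCount);
  if (candidates.length === 0) {
    // Illustrative fallback when retrieval returns nothing.
    return { answer: "I could not find anything relevant in the knowledge base.", sources: [] };
  }
  const reranked = await steps.rerank(question, candidates);
  const topChunks = reranked.slice(0, topN);
  const answer = await steps.generate(question, topChunks);
  return { answer, sources: topChunks };
}

// Stub implementations, for demonstration only.
const stubSteps = {
  embed: async (text) => [text.length, 0, 0],
  search: async () => ["chunk A", "chunk B", "chunk C"],
  rerank: async (_question, chunks) => [...chunks].reverse(),
  generate: async (_question, chunks) => `Answer based on ${chunks.length} chunks`,
};

answerQuery("What were the key findings from the Q4 report?", stubSteps, { topN: 2 })
  .then((result) => console.log(result));
```

Passing the steps in as functions keeps the flow testable and makes it easy to swap one provider without touching the others.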
Contributions are welcome! Please open an issue or submit a pull request.