VoiceRAG combines voice interaction with retrieval-augmented generation (RAG) for intelligent, conversational access to your stored knowledge. It integrates ElevenLabs Voice Agents, OpenAI GPT models, Supabase/Vector databases, and n8n automations for smooth knowledge-to-voice workflows.

petermartens98/VoiceRAG-AI-Powered-Voice-Assistant-with-Knowledge-Retrieval

🎙️ VoiceRAG: AI-Powered Voice Assistant with Knowledge Retrieval

Transform your documents into an intelligent voice assistant: ask questions naturally and get accurate, context-aware answers drawn from your own knowledge base.

🌟 Overview

VoiceRAG is an end-to-end voice-powered knowledge assistant that bridges the gap between natural conversation and document intelligence. Simply speak your questions and receive instant, accurate answers synthesized from your uploaded documents.

What It Does

Upload your documents (PDFs, text files, reports, guides) and VoiceRAG automatically:

  • Processes and embeds them into a searchable vector database
  • Interprets your spoken questions through the ElevenLabs voice agent
  • Retrieves the most relevant information from your documents
  • Synthesizes natural, conversational responses
  • Delivers answers back to you through voice

Technology Stack

  • 🗣️ Voice Interaction - ElevenLabs Voice Agents for natural conversation
  • 🧠 AI Intelligence - OpenAI GPT models for context-aware responses
  • 📚 Document Retrieval - Vector embeddings and semantic search with Cohere reranking
  • ⚡ Automation - n8n workflows for seamless data pipeline orchestration
  • 💾 Storage - Google Drive for documents, Supabase for vector database

Use Cases

  • Personal knowledge base (research papers, notes, documentation)
  • Business information lookup (policies, procedures, product specs)
  • Customer support (FAQs, troubleshooting guides)
  • Educational assistant (study materials, course content)
  • Legal/medical document consultation

🏗️ Architecture

User Voice Input → ElevenLabs Agent → n8n Webhook → RAG Search (Supabase) → Cohere Rerank → GPT Response → Voice Output
                                                            ↑
                                                    Embedded Documents
                                                            ↑
                                          Google Drive Files → n8n Pipeline
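
The query path in the diagram can be read as a chain of four stages. The sketch below is only an illustration of that ordering, not the repository's implementation (which lives in n8n workflows): `answer_query` and its `embed`, `search`, `rerank`, and `generate` parameters are hypothetical names, and the lambdas are toy stand-ins so the example runs without any external service.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]


def answer_query(
    question: str,
    embed: Callable[[str], Vector],
    search: Callable[[Vector, int], List[str]],
    rerank: Callable[[str, List[str]], List[str]],
    generate: Callable[[str, List[str]], str],
    top_k: int = 5,
) -> str:
    """Run one transcribed question through the retrieval stages in order."""
    query_vector = embed(question)            # stands in for the OpenAI embedding call
    candidates = search(query_vector, top_k)  # stands in for the Supabase similarity search
    ranked = rerank(question, candidates)     # stands in for the Cohere rerank step
    return generate(question, ranked)         # stands in for the GPT response step


# Toy stand-ins (assumptions for illustration only).
docs = ["Refunds take 5 days.", "Support hours are 9 to 5."]
reply = answer_query(
    "When is support open?",
    embed=lambda text: [float(len(text))],
    search=lambda vector, k: docs[:k],
    rerank=lambda q, chunks: sorted(chunks, key=lambda c: "Support" not in c),
    generate=lambda q, chunks: f"Based on your documents: {chunks[0]}",
)
print(reply)
```

Passing each stage in as a function keeps the ordering visible and makes it easy to swap a real service call for a stub while testing.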

🚀 Getting Started

Prerequisites


📁 1. Google Drive Setup

Enable File Storage for Documents

  1. Create Google OAuth Credentials

  2. Connect to n8n

    • Add Google Drive credentials in n8n
    • Create a dedicated folder for VoiceRAG documents
    • Set up file watch triggers for automatic processing

🗄️ 2. Supabase Database Setup

Initialize Vector Database

Run the following SQL in your Supabase SQL Editor:

-- Enable the pgvector extension for vector similarity search
CREATE EXTENSION IF NOT EXISTS vector;

-- Create documents table with vector embeddings
CREATE TABLE documents (
  id BIGSERIAL PRIMARY KEY,
  content TEXT NOT NULL,
  metadata JSONB DEFAULT '{}',
  embedding VECTOR(1536),
  created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);

-- Create index for faster similarity searches
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);

-- Create function to search documents by semantic similarity
CREATE OR REPLACE FUNCTION match_documents (
  query_embedding VECTOR(1536),
  match_count INT DEFAULT 5,
  filter JSONB DEFAULT '{}'
)
RETURNS TABLE (
  doc_id BIGINT,
  content TEXT,
  metadata JSONB,
  similarity FLOAT
)
LANGUAGE plpgsql
AS $$
BEGIN
  RETURN QUERY
  SELECT
    d.id AS doc_id,
    d.content,
    d.metadata,
    1 - (d.embedding <=> query_embedding) AS similarity
  FROM documents d
  WHERE d.metadata @> filter
  ORDER BY d.embedding <=> query_embedding
  LIMIT match_count;
END;
$$;

What This Does

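  • pgvector extension: Enables vector similarity operations such as the cosine distance operator (<=>)
  • documents table: Stores text chunks with their vector embeddings (1536 dimensions, matching OpenAI's text-embedding-3-small)
  • match_documents function: Performs semantic search and returns the match_count most similar chunks, optionally restricted by a metadata filter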
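
To make the `similarity` column concrete: pgvector's `<=>` operator returns cosine distance, so `1 - (embedding <=> query_embedding)` is cosine similarity. The following is a minimal in-memory sketch of the same ranking logic, assuming rows shaped like the `documents` table. The name `match_documents_in_memory` is hypothetical, and the metadata check only compares top-level keys, which is a simplification of JSONB containment (`@>`).

```python
import math
from typing import Any, Dict, List, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; pgvector's `<=>` returns 1 minus this value."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)  # undefined for zero vectors


def match_documents_in_memory(
    query_embedding: Sequence[float],
    rows: List[Dict[str, Any]],
    match_count: int = 5,
    metadata_filter: Optional[Dict[str, Any]] = None,
) -> List[Tuple[int, float]]:
    """Return (id, similarity) pairs, most similar first."""
    wanted = metadata_filter or {}
    scored = []
    for row in rows:
        meta = row["metadata"]
        # Simplified stand-in for `metadata @> filter`: top-level keys only.
        if all(key in meta and meta[key] == value for key, value in wanted.items()):
            scored.append((row["id"], cosine_similarity(row["embedding"], query_embedding)))
    # Highest similarity first is the same order as smallest cosine distance first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:match_count]


rows = [
    {"id": 1, "embedding": [1.0, 0.0], "metadata": {"source": "a"}},
    {"id": 2, "embedding": [0.0, 1.0], "metadata": {"source": "b"}},
    {"id": 3, "embedding": [1.0, 1.0], "metadata": {"source": "a"}},
]
top_two = match_documents_in_memory([1.0, 0.0], rows, match_count=2)
```

This version scans every row exactly, whereas the ivfflat index in the SQL above makes the database search approximate in exchange for speed.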

⚙️ 3. n8n Workflow Setup

📦 Import Pre-built Workflows

The repository includes pre-configured n8n workflows with sensitive information redacted:

  • File: n8n_workflow_redacted.json
  • All API keys, credentials, and sensitive data are replaced with REDACTED
  • Import into your n8n instance and replace placeholders with your actual credentials

🔹 Workflow 1: File Upload → Vector Embedding Pipeline

Purpose: Automatically process uploaded files and store them as searchable embeddings

[Image: n8n upload flow]

Flow Steps:

  1. Google Drive Trigger - Watches for new files (.txt, .pdf)
  2. File Content Extraction - Reads and parses document content
  3. OpenAI Embeddings - Converts document content to 1536-dim vectors
  4. Supabase Insert - Stores embeddings with metadata in database

Key Configuration:

  • Embedding Model: text-embedding-3-small
  • Supported Formats: .txt, .pdf
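
As a rough illustration of what this pipeline does before the Supabase insert, the sketch below filters files by extension, splits text into overlapping chunks, and builds a row matching the `documents` table. It rests on assumptions: the chunk size, the overlap, the metadata keys (`source`, `chunk_index`), and all function names are hypothetical and are not taken from the n8n workflow.

```python
from pathlib import PurePath
from typing import Any, Dict, List, Sequence

SUPPORTED_EXTENSIONS = {".txt", ".pdf"}  # formats listed above
EMBEDDING_DIM = 1536                     # matches VECTOR(1536) in the documents table


def is_supported(filename: str) -> bool:
    """True when the file extension is one the pipeline accepts."""
    return PurePath(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def split_into_chunks(text: str, chunk_size: int = 1000, overlap: int = 200) -> List[str]:
    """Split text into fixed-size character chunks that overlap (sizes are assumed)."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += chunk_size - overlap
    return chunks


def build_row(chunk: str, filename: str, index: int, embedding: Sequence[float]) -> Dict[str, Any]:
    """Build one record for the documents table; metadata keys are hypothetical."""
    if len(embedding) != EMBEDDING_DIM:
        raise ValueError(f"expected {EMBEDDING_DIM} dimensions, got {len(embedding)}")
    return {
        "content": chunk,
        "metadata": {"source": filename, "chunk_index": index},
        "embedding": list(embedding),
    }
```

Checking the embedding length before insert surfaces a model mismatch early, since a vector of any other size would be rejected by the `VECTOR(1536)` column.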

🔹 Workflow 2: Voice Agent RAG Webhook

Purpose: Handles real-time voice queries by retrieving relevant context and generating responses

[Image: n8n webhook flow]

Flow Steps:

  1. Webhook Trigger - Receives query from ElevenLabs agent
  2. Query Embedding - Converts user question to vector using OpenAI
  3. Supabase Match - Finds top-k most relevant document chunks
  4. Cohere Reranking - Re-ranks results for improved relevance
  5. Context Assembly - Formats retrieved documents as context
  6. GPT Generation - Creates answer using retrieved knowledge with system prompt:
    # OBJECTIVE
    - Retrieve and synthesize the most relevant information from the vector database, 
      which serves as your primary knowledge source.
    - Use this information to generate accurate, context-aware, and helpful responses 
      to the user's requests.
    - Prioritize precision, clarity, and relevance in all outputs.
    
  7. Response - Returns formatted answer to voice agent
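
Steps 4 to 6 can be sketched in a few lines. The example assumes the reranker returns entries with an `index` into the submitted chunks and a `relevance_score`, which mirrors the general shape of Cohere's rerank results. The function names, the `top_n` default, and the context layout are illustrative assumptions rather than the workflow's actual settings.

```python
from typing import Dict, List


def apply_rerank(chunks: List[str], results: List[Dict], top_n: int = 3) -> List[str]:
    """Reorder chunks by relevance score (highest first) and keep the top_n."""
    ordered = sorted(results, key=lambda r: r["relevance_score"], reverse=True)
    return [chunks[r["index"]] for r in ordered[:top_n]]


def assemble_context(chunks: List[str]) -> str:
    """Number the chunks so the model can refer to them individually."""
    return "\n\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))


def build_messages(system_prompt: str, question: str, chunks: List[str]) -> List[Dict[str, str]]:
    """Build chat-style messages: system prompt first, then context plus question."""
    context = assemble_context(chunks)
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ]


retrieved = ["Chunk A", "Chunk B", "Chunk C"]
rerank_results = [
    {"index": 0, "relevance_score": 0.1},
    {"index": 2, "relevance_score": 0.9},
    {"index": 1, "relevance_score": 0.5},
]
best = apply_rerank(retrieved, rerank_results, top_n=2)
messages = build_messages("Use the retrieved context.", "What changed?", best)
```

Keeping only the top few reranked chunks keeps the prompt short, which matters for a voice interaction where response latency is noticeable.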

Webhook Configuration:

  • Method: POST
  • Timeout: 30 seconds (for complex queries)
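
Before wiring up the voice agent, it can help to call the webhook directly. The sketch below uses only the Python standard library. The URL is a placeholder and the `query` field name is an assumption, so use whatever field your Webhook node actually reads.

```python
import json
import urllib.request

WEBHOOK_TIMEOUT_SECONDS = 30  # same limit as the webhook configuration above


def build_webhook_request(webhook_url: str, query: str) -> urllib.request.Request:
    """Build the POST request; the "query" field name is an assumption."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_webhook(webhook_url: str, query: str) -> str:
    """Send the query and return the raw response body as text."""
    request = build_webhook_request(webhook_url, query)
    with urllib.request.urlopen(request, timeout=WEBHOOK_TIMEOUT_SECONDS) as response:
        return response.read().decode("utf-8")


# Placeholder URL (hypothetical); nothing is sent until ask_webhook is called.
example = build_webhook_request("https://example.invalid/webhook/voicerag", "What is in the Q4 report?")
```

Calling `ask_webhook` with your real endpoint should return the same answer text the voice agent would receive, which makes retrieval problems easier to separate from voice problems.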

🗣️ 4. ElevenLabs Voice Agent Configuration

Setup Your Voice Agent

  1. Navigate to: ElevenLabs Agent Dashboard

  2. Create New Agent:

    • Choose voice and language
    • Select conversation model: GPT-4o-mini (recommended for speed/cost) or GPT-4o (for complex reasoning)
  3. Add RAG Tool:

    • Tool Name: rag_knowledge_tool (use the same name that the system prompt refers to)
    • Description: "Search the knowledge base for relevant information to answer user questions"
    • Webhook URL: Your n8n webhook endpoint from Workflow 2
    • Method: POST
  4. System Prompt Example:

You are a helpful and knowledgeable assistant designed to use the 'rag_knowledge_tool' to answer user queries.

# Objective
- Leverage the 'rag_knowledge_tool' to retrieve and synthesize relevant information.
- Provide accurate, concise, and context-aware responses.
- Maintain a cooperative and informative tone.

# Personality
- Friendly, clear, and confident in explanations.
- Curious and proactive in finding the best possible answer.
- Professional and respectful, yet conversational and easy to follow.
- Focused on usefulness and precision rather than verbosity.

# Guidelines
1. Always query the 'rag_knowledge_tool' when additional context or factual information is needed.
2. When responding:
   - Be clear and well-structured.
   - Cite or reference retrieved knowledge naturally.
   - Keep the tone professional, approachable, and helpful.
3. If information cannot be found, acknowledge that and provide the best reasoning possible.

# Goal
Deliver precise, knowledge-grounded responses that help the user efficiently reach their objective.
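
The prompt refers to the tool by name, so the name configured in the dashboard should ideally match it exactly, including hyphens versus underscores. Here is a small, hypothetical helper (`find_tool_name_mismatches` is not part of any SDK) that flags near-miss spellings in a prompt:

```python
import re
from typing import List


def find_tool_name_mismatches(configured_name: str, system_prompt: str) -> List[str]:
    """Return tokens that equal the tool name except for '-' versus '_'."""
    normalized = configured_name.replace("-", "_")
    tokens = set(re.findall(r"[A-Za-z0-9_-]+", system_prompt))
    return sorted(
        token
        for token in tokens
        if token.replace("-", "_") == normalized and token != configured_name
    )


prompt = "You are a helpful assistant designed to use the 'rag_knowledge_tool' to answer user queries."
mismatches = find_tool_name_mismatches("rag-knowledge-tool", prompt)
```

An empty result means every mention of the tool in the prompt uses the configured spelling.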

Testing Your Agent

  • Use the test interface to verify RAG tool calls
  • Check that document retrieval is working
  • Tune the system prompt based on response quality

💻 5. React Frontend (Optional)

Repository Structure

  • Frontend Code: Located in the frontend/ folder
  • n8n Workflows: n8n_workflow_redacted.json (import and configure with your credentials)

Features

  • Direct integration with ElevenLabs Voice Agent SDK
  • Visual feedback during voice interactions
  • Document upload interface
  • Conversation history

Installation & Setup

  1. Clone the Repository

    git clone https://github.com/petermartens98/VoiceRAG-AI-Powered-Voice-Assistant-with-Knowledge-Retrieval.git
    cd VoiceRAG-AI-Powered-Voice-Assistant-with-Knowledge-Retrieval/frontend

  2. Install Dependencies

    npm install

  3. Configure Your Agent ID

    • Open frontend/src/App.js
    • Replace the placeholder agentId with your own ElevenLabs Agent ID:
    const agentId = "your-agent-id-here"; // Replace with your actual agent ID

  4. Run the Application

    npm start

Updating the App

# Pull latest changes
git pull origin main

# Install any new dependencies
npm install

# Restart the application
npm start

📊 Usage Example

User: "What were the key findings from the Q4 report?"

System Flow:

  1. Voice → ElevenLabs → n8n webhook
  2. n8n embeds query → searches Supabase
  3. Retrieves relevant Q4 report chunks
  4. Cohere re-ranks results for best relevance
  5. GPT synthesizes answer from top context
  6. Response → ElevenLabs → Voice output

Assistant: "According to your Q4 report, the key findings were: revenue grew 23% year-over-year, customer retention improved to 94%, and the new product line exceeded targets by 18%."



📚 Resources


🀝 Contributing

Contributions are welcome! Please open an issue or submit a pull request.
