Quickstart
I3K RAG Enterprise is a self-hosted RAG platform that runs 100% on your own infrastructure. No US cloud dependencies, no data leaving your perimeter, AGPL-3.0 licensed. The one-command installer brings up the full stack — Qdrant, Ollama, the FastAPI backend, the React frontend and the OCR pipeline — in roughly one hour, or about 15 minutes on a fast connection.
This guide walks you from a clean Ubuntu host to your first RAG query.
Requirements
- OS: Ubuntu 20.04+ (22.04 recommended)
- RAM: 16 GB minimum, 32 GB recommended
- Storage: 50 GB or more
- GPU: NVIDIA CUDA (8–16 GB VRAM recommended), AMD ROCm, or CPU-only
- Network: 80+ Mbit/s recommended for the initial model pull
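A quick preflight check can save a failed install. The following is a minimal sketch, not part of the I3K installer: it compares the host's RAM and free disk space against the minimums listed above. The function names and the choice of / as the filesystem to inspect are assumptions made for illustration.

```python
import os
import shutil
from typing import List

# Minimums taken from the requirements list above.
MIN_RAM_GB = 16
MIN_DISK_GB = 50


def check_requirements(ram_gb: float, free_disk_gb: float) -> List[str]:
    """Return a list of human-readable problems; an empty list means the host passes."""
    problems = []
    if ram_gb < MIN_RAM_GB:
        problems.append(f"RAM: {ram_gb:.1f} GB found, {MIN_RAM_GB} GB required")
    if free_disk_gb < MIN_DISK_GB:
        problems.append(f"Storage: {free_disk_gb:.1f} GB free, {MIN_DISK_GB} GB required")
    return problems


def detect_ram_gb() -> float:
    """Total physical memory in GB (Linux: page size times number of pages)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1024**3


def detect_free_disk_gb(path: str = "/") -> float:
    """Free space in GB on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / 1024**3


if __name__ == "__main__":
    issues = check_requirements(detect_ram_gb(), detect_free_disk_gb())
    print("\n".join(issues) if issues else "Host meets the minimum requirements.")
```

Hosts sold as 16 GB usually report slightly less than that to the operating system, so a borderline machine may be flagged; lower MIN_RAM_GB a little if that applies to you.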
GPU vs CPU
GPU is strongly recommended for usable end-to-end latency on Qwen3:14b-q4_K_M. CPU-only works for development and small corpora; expect higher response times.
Installation (1 command)
Clone the repository and run the installer:
git clone https://github.com/I3K-IT/RAG-Enterprise.git
cd RAG-Enterprise
./install.sh
The installer is interactive. It asks you two things:
- GPU type: NVIDIA, AMD or CPU. It configures Ollama and the embedding runtime accordingly.
- LLM model: Qwen3:14b-q4_K_M (default, best quality on 16 GB VRAM) or Mistral 7B Q4 (lighter, fits on 8 GB).
The script then runs unattended. Total time is around one hour on a typical connection, ~15 minutes on a fast link.
What the script does
- Pulls and configures Qdrant as the vector store (port 6333)
- Installs Ollama (port 11434) and downloads the chosen LLM
- Sets up the FastAPI backend (port 8000) with the I3K retrieval orchestrator pipeline
- Builds and serves the React + Vite frontend (port 3000)
- Initializes the SQLite user database with JWT auth and the three roles (User, Super User, Admin)
- Installs Apache Tika and Tesseract for document parsing and OCR
- Downloads the BAAI/bge-m3 embedding model (29 languages)
When it finishes, the installer prints the admin credentials it generated. Save them.
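To confirm that the services came up, you can probe the ports listed above. This is a minimal sketch under the assumption that everything runs on the same host with the default ports; it only checks that each port accepts a TCP connection, not that the service behind it is healthy.

```python
import socket
from typing import Dict

# Service ports as listed in "What the script does".
SERVICE_PORTS = {
    "Qdrant": 6333,
    "Ollama": 11434,
    "FastAPI backend": 8000,
    "React frontend": 3000,
}


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


def check_services(host: str = "localhost") -> Dict[str, bool]:
    """Map each service name to whether its port accepts connections."""
    return {name: port_is_open(host, port) for name, port in SERVICE_PORTS.items()}


if __name__ == "__main__":
    for name, up in check_services().items():
        print(f"{name:<16} {'up' if up else 'DOWN'}")
```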
First login
Open the frontend:
http://localhost:3000
Log in with the admin account printed by the installer. From the left sidebar, go to Documents and upload your first file. Supported formats: PDF (with OCR for scanned pages), DOCX/DOC, PPTX/PPT, XLSX/XLS, TXT, MD, ODT, RTF, HTML, XML.
First upload and query
On upload, the backend pipeline runs:
- Extraction: Apache Tika parses the file; Tesseract handles scanned PDFs via OCR.
- Chunking: the text is split into semantic chunks.
- Embedding: each chunk is encoded with BAAI/bge-m3 (multilingual, 29 languages).
- Indexing: vectors are written to Qdrant with their metadata.
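To make the chunking and metadata steps concrete, here is a minimal sketch. It is not the platform's chunker: the guide does not say how semantic chunks are computed, so fixed-size overlapping word windows are used as a stand-in. The function names and the metadata field names (source, chunk_index, text) are hypothetical.

```python
from typing import Dict, List


def chunk_words(text: str, size: int = 200, overlap: int = 40) -> List[str]:
    """Split text into windows of `size` words that share `overlap` words with the previous window."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks


def to_records(source: str, chunks: List[str]) -> List[Dict[str, object]]:
    """Attach the metadata that would travel with each vector (illustrative field names)."""
    return [
        {"source": source, "chunk_index": i, "text": chunk}
        for i, chunk in enumerate(chunks)
    ]
```

In the real pipeline each record's text would then be embedded with BAAI/bge-m3 and written to Qdrant; the overlap keeps a sentence that straddles a boundary retrievable from either side.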
Once indexing completes, ask a question from the chat UI. The backend retrieves the relevant chunks from Qdrant, hands them to Ollama running your chosen LLM, and returns a grounded answer with citations to the source documents.
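The retrieve-then-generate flow described above can be sketched in a few lines. This is a toy illustration, not the backend's code: in the real system Qdrant performs the similarity search, and the guide does not state the distance metric or the prompt format, so cosine similarity and the numbered-context prompt below are assumptions.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query: Sequence[float],
    indexed: List[Tuple[str, Sequence[float]]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k (text, score) pairs most similar to the query vector."""
    scored = [(text, cosine(query, vec)) for text, vec in indexed]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


def build_prompt(question: str, passages: List[str]) -> str:
    """Number the retrieved passages so the model can cite them (assumed format)."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the numbered context and cite the passages you used.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```

Numbering the passages is one simple way to let the model point back at its sources, which is what makes citations possible in the answer.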
The same query path is also exposed as a REST API by the FastAPI backend on port 8000, so you can integrate I3K RAG Enterprise into your own applications. Endpoints are protected by JWT and honour the User / Super User / Admin role boundaries.
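As an illustration of calling the API from your own code, the sketch below builds an authenticated request with the standard library. The guide does not list the endpoint paths or payload schema, so the /api/query path, the question field and the Bearer header scheme are hypothetical placeholders; check the backend's actual routes before using it. FastAPI applications usually serve interactive documentation at /docs unless it has been disabled.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # FastAPI backend port from this guide
QUERY_PATH = "/api/query"           # HYPOTHETICAL path, not taken from the source


def build_query_request(
    token: str, question: str, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build a POST request carrying the JWT as a Bearer token (assumed scheme)."""
    body = json.dumps({"question": question}).encode("utf-8")  # field name is an assumption
    return urllib.request.Request(
        base_url + QUERY_PATH,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

To send it, pass the request to urllib.request.urlopen and decode the JSON response. A token issued for a User account should be expected to see less than one issued for an Admin, in line with the role boundaries above.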
Next steps
You now have a working single-node deployment. Read the architecture overview to understand how the components fit together, or jump to deployment topologies for multi-node, backup with rclone, and production hardening.