Features

A self-hosted RAG stack designed for regulated EU workloads.

I3K RAG Enterprise is a production-grade RAG platform that runs entirely inside your perimeter. It is built on FastAPI, React, Qdrant, Ollama and an in-house retrieval orchestrator, with no external service dependencies and no US cloud.

The stack

Every component is open-source, runs locally and can be audited end to end. No black boxes, no hidden network calls.

FastAPI backend

REST API on :8000. JWT auth, user management, query orchestration.

React + Vite frontend

Modern UI on :3000 with real-time updates and document management.

Qdrant vector store

On :6333. RBAC filtering applied at the storage layer.

Ollama LLM server

On :11434. Local inference for Qwen3:14b-q4_K_M and Mistral 7B Q4.

SQLite user DB

User accounts, roles, sessions. Single-file, easy to back up.

Apache Tika + Tesseract

Text extraction and OCR for 10+ document formats.
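
The page does not say how Tika is invoked, so the following is only a minimal sketch that assumes Tika runs in server mode on its default port 9998. The helper names `build_tika_request` and `extract_text` are hypothetical. When Tesseract is installed next to the Tika server, scanned documents are OCRed through the same endpoint.

```python
import urllib.request


def build_tika_request(
    file_bytes: bytes, tika_url: str = "http://localhost:9998"
) -> urllib.request.Request:
    """Build (but do not send) a plain-text extraction request for a Tika server.

    The URL is an assumption: 9998 is the Tika server default, not a port
    documented for this product.
    """
    return urllib.request.Request(
        f"{tika_url}/tika",
        data=file_bytes,
        headers={"Accept": "text/plain"},  # ask Tika for plain text output
        method="PUT",
    )


def extract_text(file_bytes: bytes, tika_url: str = "http://localhost:9998") -> str:
    """Send the request and return the extracted text."""
    request = build_tika_request(file_bytes, tika_url)
    with urllib.request.urlopen(request, timeout=120) as response:
        return response.read().decode("utf-8", errors="replace")
```

Building the request separately from sending it keeps the extraction call easy to inspect and test without a running Tika instance.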

The RAG pipeline

Four steps, end to end. The retrieval orchestrator is written in-house by I3K — no third-party agent framework, no US-based middleware in the request path.

01
Ingest
Upload via web UI or API. Apache Tika + Tesseract extract text from PDF, DOCX, PPTX, XLSX, ODT, RTF, HTML, XML and scanned documents (OCR).
Tika · OCR
02
Embed & store
Documents are chunked and embedded with BAAI/bge-m3 (29 languages). Vectors are stored in Qdrant together with the metadata used for RBAC filtering.
bge-m3 · Qdrant
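
As an illustration of the chunking step, here is a minimal sketch of overlapping character windows plus the metadata that could travel with each chunk. The chunk size, the overlap and the payload keys (`doc_id`, `chunk_index`, `allowed_roles`) are assumptions for the example, not the product's actual settings. Embedding with bge-m3 is left out.

```python
from typing import Any, Dict, List, Sequence


def chunk_text(text: str, chunk_size: int = 800, overlap: int = 100) -> List[str]:
    """Split text into overlapping character windows (illustrative defaults)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


def to_payloads(
    doc_id: str, chunks: Sequence[str], allowed_roles: Sequence[str]
) -> List[Dict[str, Any]]:
    """Attach hypothetical RBAC metadata to every chunk before storage."""
    return [
        {
            "doc_id": doc_id,
            "chunk_index": index,
            "text": chunk,
            "allowed_roles": list(allowed_roles),
        }
        for index, chunk in enumerate(chunks)
    ]
```

Storing the role list on every chunk is what lets the vector store filter by role at query time instead of after retrieval.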
03
Retrieve
Our in-house retrieval orchestrator runs semantic search against Qdrant with a configurable relevance threshold and top-K. Role-based filtering is applied inside the Qdrant query itself, at the storage layer.
I3K orchestrator
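
To make the threshold, top-K and role filter concrete, the sketch below builds a request body for Qdrant's REST search endpoint (`POST /collections/{name}/points/search`). It is not the I3K orchestrator. The payload key `allowed_roles` and the default values are assumptions.

```python
from typing import Any, Dict, Sequence


def build_search_body(
    query_vector: Sequence[float],
    user_roles: Sequence[str],
    top_k: int = 5,
    score_threshold: float = 0.5,
) -> Dict[str, Any]:
    """Build a Qdrant search body that only matches chunks the user may read."""
    return {
        "vector": list(query_vector),
        "limit": top_k,                      # top-K
        "score_threshold": score_threshold,  # relevance threshold
        "with_payload": True,                # return chunk text and metadata
        "filter": {
            "must": [
                # "allowed_roles" is a hypothetical payload key holding the
                # roles that are permitted to see a chunk.
                {"key": "allowed_roles", "match": {"any": list(user_roles)}}
            ]
        },
    }
```

Because the filter is part of the search request, chunks outside the caller's roles are excluded before scoring results are returned.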
04
Generate
Retrieved chunks are passed to EuLLM (default) or a compatible LLM for grounded answer generation. Default models: Qwen3:14b-q4_K_M and Mistral 7B Q4. Fully local, zero external calls.
EuLLM · Mistral 7B
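
A grounded generation call can be sketched as follows, assuming the model is served through Ollama's `/api/generate` endpoint on the port listed above. The prompt wording, the chunk dictionary keys (`source`, `text`) and the helper names are hypothetical. The model tag is copied from this page.

```python
import json
import urllib.request
from typing import Dict, List


def build_prompt(question: str, chunks: List[Dict[str, str]]) -> str:
    """Number the retrieved chunks and instruct the model to stay within them."""
    context = "\n\n".join(
        f"[{i}] ({chunk['source']}) {chunk['text']}"
        for i, chunk in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the numbered context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


def build_ollama_request(
    prompt: str,
    model: str = "Qwen3:14b-q4_K_M",
    base_url: str = "http://localhost:11434",
) -> urllib.request.Request:
    """Build (but do not send) a non-streaming generate request for Ollama."""
    body = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen` returns a JSON object whose `response` field holds the generated answer. Both endpoints are local, so nothing leaves the host.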

Powered by EuLLM

I3K RAG Enterprise ships with EuLLM as the recommended generation backend: an open-source LLM inference engine built in the EU. That means no US cloud dependencies in your AI stack, and full transparency on weights, training and governance.

EuLLM

Default inference engine — EU-trained foundation models

www.eullm.eu

Supported models

Defaults that work out of the box. Swap any model via the setup flow.

Role	Default	Notes
Embedding model	BAAI/bge-m3	29 languages out of the box. No per-language fine-tuning.
Generation LLM (recommended)	EuLLM	EU-sovereign inference engine. eullm.eu
Generation LLM (alternative)	Qwen3:14b-q4_K_M	Default Ollama model. Strong general capabilities, 8–16 GB VRAM.
Generation LLM (lighter)	Mistral 7B (Q4)	Smaller footprint, lower VRAM. Good for evaluation or CPU-only setups.

Document formats

Apache Tika and Tesseract extract structured text from everything you've got, including scanned PDFs and image-based documents.

PDF (with OCR)
DOCX / DOC
PPTX / PPT
XLSX / XLS
TXT
MD
ODT
RTF
HTML
XML
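
A client could check a file against the list above before uploading. This is a small sketch that assumes matching on the file extension only. The set mirrors the formats listed on this page, and `is_supported` is a hypothetical helper, not part of the product API.

```python
from pathlib import Path

# Extensions taken from the format list on this page.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".docx", ".doc", ".pptx", ".ppt", ".xlsx", ".xls",
    ".txt", ".md", ".odt", ".rtf", ".html", ".xml",
}


def is_supported(filename: str) -> bool:
    """Return True when the file extension is in the supported list."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```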

Structured extraction

Pro & Cloud only

Beyond retrieval, the Pro and Cloud editions ship a structured extraction engine: given a target schema (JSON, table columns, named fields), the engine pulls typed values out of unstructured documents with provenance back to the source chunk. Useful for contract analytics, KYC pipelines and regulatory reporting.

JSON-Schema as the contract between extraction request and result
Citation back to the source chunk for every extracted field
Confidence scores and rejection thresholds tunable per field
Batch and streaming modes
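
The contract idea in the list above can be illustrated with a short sketch. The Pro engine's real interfaces are not documented here, so everything below is hypothetical: the schema, the field names, the thresholds and the `ExtractedField` type. It shows only how a JSON-Schema-style contract, per-field confidence thresholds and chunk citations could fit together.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple

# Hypothetical extraction contract and per-field rejection thresholds.
SCHEMA = {
    "type": "object",
    "properties": {
        "counterparty": {"type": "string"},
        "contract_value_eur": {"type": "number"},
    },
    "required": ["counterparty"],
}
FIELD_THRESHOLDS = {"counterparty": 0.80, "contract_value_eur": 0.90}
JSON_TYPES = {"string": str, "number": (int, float)}


@dataclass
class ExtractedField:
    name: str
    value: Any
    confidence: float
    source_chunk_id: str  # citation back to the source chunk


def apply_contract(
    fields: List[ExtractedField],
    schema: Dict[str, Any],
    thresholds: Dict[str, float],
    default_threshold: float = 0.5,
) -> Tuple[Dict[str, Dict[str, Any]], List[Tuple[str, str]]]:
    """Keep fields that match the schema and clear their confidence threshold."""
    accepted: Dict[str, Dict[str, Any]] = {}
    rejected: List[Tuple[str, str]] = []
    for field in fields:
        spec = schema["properties"].get(field.name)
        if spec is None:
            rejected.append((field.name, "not in schema"))
            continue
        expected = JSON_TYPES[spec["type"]]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(field.value, bool) or not isinstance(field.value, expected):
            rejected.append((field.name, "wrong type"))
            continue
        if field.confidence < thresholds.get(field.name, default_threshold):
            rejected.append((field.name, "below confidence threshold"))
            continue
        accepted[field.name] = {
            "value": field.value,
            "confidence": field.confidence,
            "source_chunk_id": field.source_chunk_id,
        }
    already_rejected = {name for name, _ in rejected}
    for name in schema.get("required", []):
        if name not in accepted and name not in already_rejected:
            rejected.append((name, "missing required field"))
    return accepted, rejected
```

Returning the rejected fields with a reason, rather than dropping them silently, keeps the result auditable for use cases such as KYC or regulatory reporting.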

Backup & restore (built in)

Full system backup via rclone, with 70+ cloud and traditional storage providers supported. Includes cron scheduling, retention policies and zero-downtime backups. No add-on, no extra licence.

Cloud

S3 / MinIO
MEGA
Google Drive
OneDrive
Dropbox
Backblaze B2
pCloud

Traditional

WebDAV / Nextcloud
FTP
SFTP
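
For readers who want to see what an rclone-based backup with retention looks like, here is a minimal sketch. It is not the built-in backup job. The archive path, the remote name and the helper names are hypothetical, and it assumes the backup is a single archive file that is uploaded and then pruned by age with rclone's `--min-age` filter.

```python
from typing import List


def build_backup_commands(
    archive_path: str, remote_path: str, retention_days: int = 30
) -> List[List[str]]:
    """Return the rclone invocations for one backup run: upload, then prune."""
    if retention_days <= 0:
        raise ValueError("retention_days must be positive")
    upload = ["rclone", "copy", archive_path, remote_path]
    # Delete remote files older than the retention window.
    prune = ["rclone", "delete", remote_path, "--min-age", f"{retention_days}d"]
    return [upload, prune]


def cron_entry(minute: int, hour: int, command: str) -> str:
    """Format a daily crontab line for the given command."""
    return f"{minute} {hour} * * * {command}"
```

Each command list can be passed to `subprocess.run(cmd, check=True)`. Keeping them as argument lists avoids shell quoting issues in paths and remote names.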

Deployment topologies

One-command install on Ubuntu 20.04+. Scale to multi-server or run air-gapped on bare metal.

Single host

Recommended for most teams. The setup script takes about 1 hour. Production-ready for 10,000+ documents.

Air-gapped

Offline install from pre-staged packages. For defense, healthcare, critical infrastructure.

Multi-server

Qdrant and Ollama run on dedicated GPU nodes, with the FastAPI backend on a separate host. Wiring the services together is a manual step after install.
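
One way to handle that wiring is to have the backend read service URLs from the environment. The sketch below assumes this approach. The variable names `BACKEND_URL`, `QDRANT_URL` and `OLLAMA_URL` are hypothetical, and the defaults reuse the single-host ports listed on this page.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ServiceEndpoints:
    backend_url: str
    qdrant_url: str
    ollama_url: str


def load_endpoints(env: Mapping[str, str] = os.environ) -> ServiceEndpoints:
    """Resolve service URLs, falling back to the single-host defaults."""
    return ServiceEndpoints(
        backend_url=env.get("BACKEND_URL", "http://localhost:8000"),
        qdrant_url=env.get("QDRANT_URL", "http://localhost:6333"),
        ollama_url=env.get("OLLAMA_URL", "http://localhost:11434"),
    )
```

With this pattern a single-host install needs no configuration, while a multi-server deployment overrides only the URLs of the services that moved.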

One-command install

git clone https://github.com/I3K-IT/RAG-Enterprise.git
cd RAG-Enterprise
./install.sh

Compliance & security

GDPR and the EU AI Act are first-class engineering requirements, not a marketing checkbox. AGPL-3.0 source means everything is auditable.

JWT authentication with role-based access (User, Super User, Admin)
GDPR Art. 32 controls: TLS termination in front of the API, password hashing (bcrypt), session expiration
Audit trail at application level
Right-to-erasure: per-document and per-user deletion
Data residency enforced: no outbound calls in the request path
AGPL-3.0 source — full transparency, full auditability
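
To show how JWT authentication, session expiration and the three roles fit together, here is a standard-library sketch of HS256 signing and verification. It is illustrative only and not the product's auth code. The claim names (`role`, `exp`) and the ordering of roles from User up to Admin are assumptions. A production system would normally use a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Any, Dict, Optional

ROLES = ("User", "Super User", "Admin")  # assumed order: lowest to highest


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(claims: Dict[str, Any], secret: bytes) -> str:
    """Create an HS256-signed JWT from the given claims."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode("utf-8"))
    payload = _b64url(json.dumps(claims).encode("utf-8"))
    signature = hmac.new(secret, f"{header}.{payload}".encode("ascii"), hashlib.sha256)
    return f"{header}.{payload}.{_b64url(signature.digest())}"


def verify_token(
    token: str, secret: bytes, now: Optional[float] = None
) -> Optional[Dict[str, Any]]:
    """Return the claims if signature and expiry are valid, else None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        expected = hmac.new(
            secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
        ).digest()
        # Constant-time comparison to avoid leaking signature bytes.
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
    except ValueError:
        return None  # malformed token, bad base64 or bad JSON
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        return None
    if not isinstance(claims, dict):
        return None
    current = time.time() if now is None else now
    if claims.get("exp", 0) <= current:
        return None  # session expired (or no expiry set)
    return claims


def has_role(claims: Dict[str, Any], required: str) -> bool:
    """Check that the token's role is at least the required role."""
    role = claims.get("role")
    return role in ROLES and ROLES.index(role) >= ROLES.index(required)
```

Treating a token without an `exp` claim as expired is a deliberately strict choice in this sketch, so that session expiration cannot be bypassed by omitting the claim.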

Ready to run RAG on your own infrastructure?

Start with the open-source Community edition, or talk to us about Pro with structured extraction, SSO, audit log and SLA.
