Features
A self-hosted RAG stack designed for regulated EU workloads.
I3K RAG Enterprise is a production-grade RAG platform that runs 100% locally inside your perimeter. FastAPI, React, Qdrant, Ollama and an in-house retrieval orchestrator — no external dependencies, no US cloud.
The stack
Every component is open-source, runs locally and can be audited end to end. No black boxes, no hidden network calls.
FastAPI backend
REST API on :8000. JWT auth, user management, query orchestration.
React + Vite frontend
Modern UI on :3000 with real-time updates and document management.
Qdrant vector store
On :6333. RBAC filtering applied at the storage layer.
Ollama LLM server
On :11434. Local inference for Qwen3:14b-q4_K_M and Mistral 7B Q4.
SQLite user DB
User accounts, roles, sessions. Single-file, easy to back up.
Apache Tika + Tesseract
Text extraction and OCR for 10+ document formats.
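As a quick sanity check after installation, the ports listed above can be probed from the host. This is a minimal sketch, not part of the product: the ports come from the component list, while the host `127.0.0.1` and the names `SERVICES`, `is_listening` and `check_stack` are assumptions made for illustration.

```python
import socket

# Ports as listed in the component overview; the host is an assumption.
SERVICES = {
    "fastapi": 8000,
    "frontend": 3000,
    "qdrant": 6333,
    "ollama": 11434,
}


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_stack(host: str = "127.0.0.1") -> dict[str, bool]:
    """Probe every known service port and report which ones accept connections."""
    return {name: is_listening(host, port) for name, port in SERVICES.items()}


if __name__ == "__main__":
    for name, up in check_stack().items():
        print(f"{name:10s} {'up' if up else 'DOWN'}")
```

A successful TCP connection only shows that something is listening on the port; it does not confirm that the service behind it is healthy.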
The RAG pipeline
Four steps, end to end. The retrieval orchestrator is written in-house by I3K — no third-party agent framework, no US-based middleware in the request path.
- 01
Ingest
Upload via web UI or API. Apache Tika + Tesseract extract text from PDF, DOCX, PPTX, XLSX, ODT, RTF, HTML, XML and scanned documents (OCR).
Tika · OCR
- 02
Embed & store
Documents are chunked and embedded with BAAI/bge-m3 (29 languages). Vectors are stored in Qdrant with metadata for RBAC filtering.
bge-m3 · Qdrant
- 03
Retrieve
Our in-house retrieval orchestrator runs semantic retrieval against Qdrant. The relevance threshold and top-K are configurable, and role-based filtering is applied at the retrieval layer.
I3K orchestrator
- 04
Generate
Retrieved chunks are passed to EuLLM (default) or a compatible LLM for grounded answer generation. Default Ollama models: Qwen3:14b-q4_K_M, Mistral 7B Q4. Fully local, zero external calls.
EuLLM · Mistral 7B
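The chunk, retrieve and generate steps above can be illustrated with a small, dependency-free sketch. It is not the I3K orchestrator: the vectors are toy values standing in for bge-m3 embeddings, the in-memory list stands in for Qdrant, and every name (`Chunk`, `allowed_roles`, `chunk_text`, `retrieve`, `build_prompt`) and every default (chunk size, overlap, threshold, top-K) is a hypothetical choice for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    vector: list[float]            # toy embedding; the real stack uses bge-m3
    allowed_roles: frozenset[str]  # hypothetical RBAC metadata field


def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    if not text:
        return []
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, chunks, role, top_k=3, threshold=0.5):
    """Apply the role filter, then the relevance threshold, then top-K."""
    scored = [
        (cosine(query_vec, c.vector), c)
        for c in chunks
        if role in c.allowed_roles
    ]
    kept = [(score, c) for score, c in scored if score >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:top_k]


def build_prompt(question: str, hits) -> str:
    """Assemble a grounded prompt from the retrieved chunks."""
    context = "\n\n".join(f"[{i}] {c.text}" for i, (_, c) in enumerate(hits, 1))
    return (
        "Answer using only the numbered context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
```

Filtering by role before scoring means a chunk the caller may not see never reaches the prompt, whatever its relevance score.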
Powered by EuLLM
I3K RAG Enterprise ships with EuLLM as the recommended generation backend. Open-source LLM inference engine built in the EU. No US cloud dependencies in your AI stack — and full transparency on weights, training and governance.
EuLLM
Default inference engine — EU-trained foundation models
www.eullm.eu
Supported models
Defaults that work out of the box. Swap any model via the setup flow.
| Role | Default | Notes |
|---|---|---|
| Embedding model | BAAI/bge-m3 | 29 languages out of the box. No per-language fine-tuning. |
| Generation LLM (recommended) | EuLLM | EU-sovereign inference engine. eullm.eu |
| Generation LLM (alternative) | Qwen3:14b-q4_K_M | Default Ollama model. Strong general capabilities, 8–16 GB VRAM. |
| Generation LLM (lighter) | Mistral 7B (Q4) | Smaller footprint, lower VRAM. Good for evaluation or CPU-only setups. |
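For the Ollama-served models in the table, generation can be exercised directly against Ollama's HTTP API on port 11434. The sketch below uses Ollama's `/api/generate` endpoint with streaming disabled; the host `localhost` and the helper names `build_request` and `generate` are assumptions, and the model tag passed in must match what `ollama list` reports on your host.

```python
import json
import urllib.request

# Port from the stack overview; the host is an assumption.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for the given model tag."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to Ollama and return the generated text."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Swapping models then amounts to changing the `model` argument, provided the model has already been pulled on the Ollama host.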
Document formats
Apache Tika and Tesseract extract structured text from the formats listed below, including scanned PDFs and image-based documents.
- PDF (with OCR)
- DOCX / DOC
- PPTX / PPT
- XLSX / XLS
- TXT
- MD
- ODT
- RTF
- HTML
- XML
Structured extraction
Pro & Cloud only
Beyond retrieval, the Pro and Cloud editions ship a structured extraction engine: given a target schema (JSON, table columns, named fields), the engine pulls typed values out of unstructured documents with provenance back to the source chunk. Useful for contract analytics, KYC pipelines and regulatory reporting.
- JSON-Schema as the contract between extraction request and result
- Citation back to the source chunk for every extracted field
- Confidence scores and rejection thresholds tunable per field
- Batch and streaming modes
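The combination of typed fields, per-field rejection thresholds and citations can be pictured with a short sketch. This is not the Pro engine's API, which is not documented here: `FieldSpec`, `Candidate` and `select_fields` are hypothetical names, and a plain Python type check stands in for JSON-Schema validation.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FieldSpec:
    name: str
    type_: type            # simplified stand-in for a JSON-Schema type
    min_confidence: float  # per-field rejection threshold


@dataclass
class Candidate:
    field: str
    value: Any
    confidence: float
    source_chunk_id: str   # provenance back to the source chunk


def select_fields(
    specs: list[FieldSpec], candidates: list[Candidate]
) -> dict[str, Optional[dict]]:
    """Keep the best candidate per field that passes type and threshold checks."""
    result: dict[str, Optional[dict]] = {}
    for spec in specs:
        valid = [
            c for c in candidates
            if c.field == spec.name
            and isinstance(c.value, spec.type_)
            and c.confidence >= spec.min_confidence
        ]
        best = max(valid, key=lambda c: c.confidence, default=None)
        result[spec.name] = None if best is None else {
            "value": best.value,
            "confidence": best.confidence,
            "source_chunk_id": best.source_chunk_id,
        }
    return result
```

A field with no acceptable candidate comes back as `None` rather than a low-confidence guess, which is the behaviour a rejection threshold is meant to give.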
Backup & restore (built in)
Full system backup with rclone — 70+ cloud and traditional providers supported. Cron scheduling, retention policies, zero-downtime. No add-on, no extra licence.
Cloud
- S3 / MinIO
- MEGA
- Google Drive
- OneDrive
- Dropbox
- Backblaze B2
- pCloud
Traditional
- WebDAV / Nextcloud
- FTP
- SFTP
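The built-in backup tooling is not shown on this page, so the following is only a sketch of how an rclone-based copy with a retention window and a cron schedule could be expressed. It relies on the standard `rclone copy` and `rclone delete --min-age` commands; the paths, the remote name and the helper names `backup_commands`, `run_backup` and `crontab_line` are hypothetical, and it assumes the source directory holds timestamped backup archives.

```python
import shlex
import subprocess


def backup_commands(source_dir: str, remote_path: str, retention_days: int) -> list[list[str]]:
    """Return the rclone invocations for one backup run with retention."""
    if retention_days < 1:
        raise ValueError("retention_days must be at least 1")
    return [
        # Upload new archives without deleting anything on the remote.
        ["rclone", "copy", source_dir, remote_path],
        # Remove remote files older than the retention window.
        ["rclone", "delete", remote_path, "--min-age", f"{retention_days}d"],
    ]


def run_backup(source_dir: str, remote_path: str, retention_days: int) -> None:
    """Execute the commands in order, stopping at the first failure."""
    for cmd in backup_commands(source_dir, remote_path, retention_days):
        subprocess.run(cmd, check=True)


def crontab_line(schedule: str, script_path: str) -> str:
    """Format a crontab entry that runs the backup script on the given schedule."""
    return f"{schedule} {shlex.quote(script_path)}"
```

For example, `crontab_line("0 2 * * *", "/opt/rag/backup.sh")` describes a nightly run at 02:00 of a hypothetical wrapper script.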
Deployment topologies
One-command install on Ubuntu 20.04+. Scale to multi-server or run air-gapped on bare metal.
Single host
Recommended for most teams. The setup script takes about 1 hour. Production-ready for 10,000+ documents.
Air-gapped
Offline install from pre-staged packages. For defense, healthcare, critical infrastructure.
Multi-server
Qdrant and Ollama run on dedicated GPU nodes, with the FastAPI backend deployed separately. Requires manual post-install wiring.
One-command install
git clone https://github.com/I3K-IT/RAG-Enterprise.git
cd RAG-Enterprise
./install.sh
Compliance & security
GDPR and the EU AI Act are first-class engineering requirements, not a marketing checkbox. AGPL-3.0 source means everything is auditable.
- JWT authentication with role-based access (User, Super User, Admin)
- GDPR Art. 32 controls: TLS in front, password hashing (bcrypt), session expiration
- Audit trail at application level
- Right-to-erasure: per-document and per-user deletion
- Data residency enforced: no outbound calls in the request path
- AGPL-3.0 source — full transparency, full auditability
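To make the JWT, role and session-expiration controls above concrete, here is a stdlib-only sketch of HS256 token issuing and verification with a role check. It is illustrative rather than the product's implementation: the claim names, the helper names and the assumed role ordering (User below Super User below Admin) are hypothetical, and a production system would normally rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time

# Assumed ordering of the three roles named above.
ROLE_RANK = {"User": 0, "Super User": 1, "Admin": 2}


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: bytes) -> str:
    digest = hmac.new(secret, signing_input.encode("utf-8"), hashlib.sha256).digest()
    return _b64url(digest)


def issue_token(sub: str, role: str, secret: bytes, ttl_seconds: int, now=None) -> str:
    """Create an HS256-signed token that expires ttl_seconds from now."""
    now = int(time.time()) if now is None else now
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}, separators=(",", ":")).encode())
    claims = {"sub": sub, "role": role, "exp": now + ttl_seconds}
    payload = _b64url(json.dumps(claims, separators=(",", ":")).encode())
    return f"{header}.{payload}.{_sign(f'{header}.{payload}', secret)}"


def verify_token(token: str, secret: bytes, now=None) -> dict:
    """Check signature and expiry; return the claims or raise ValueError."""
    now = int(time.time()) if now is None else now
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header, payload, signature = parts
    expected = _sign(f"{header}.{payload}", secret)
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload))
    if claims["exp"] <= now:
        raise ValueError("token expired")
    return claims


def has_role(claims: dict, required: str) -> bool:
    """True if the token's role is at least the required role."""
    return ROLE_RANK[claims["role"]] >= ROLE_RANK[required]
```

The signature is checked before the payload is parsed, so tampered claims are rejected without ever being interpreted.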
Ready to run RAG on your own infrastructure?
Start with the open-source Community edition, or talk to us about Pro with structured extraction, SSO, audit log and SLA.