Features

A self-hosted RAG stack designed for regulated EU workloads.

I3K RAG Enterprise is a production-grade RAG platform that runs entirely inside your perimeter. It combines FastAPI, React, Qdrant, Ollama and an in-house retrieval orchestrator, with no dependency on external services and no US cloud.

The stack

Every component is open-source, runs locally and can be audited end to end. No black boxes, no hidden network calls.

FastAPI backend

REST API on :8000. JWT auth, user management, query orchestration.

React + Vite frontend

Modern UI on :3000 with real-time updates and document management.

Qdrant vector store

On :6333. RBAC filtering applied at the storage layer.
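
To illustrate what role filtering at the storage layer can look like, the sketch below builds the JSON body for a Qdrant search request whose filter only admits points tagged with one of the caller's roles. The payload field `allowed_roles` and the helper `build_search_body` are assumptions made for this example and are not taken from the RAG Enterprise codebase; the `must` / `match` / `any` structure follows Qdrant's documented filter syntax.

```python
import json


def build_search_body(query_vector, caller_roles, top_k=5, score_threshold=0.5):
    """Build a JSON body for Qdrant's points search endpoint.

    `allowed_roles` is a hypothetical payload field stored with each chunk.
    """
    return {
        "vector": list(query_vector),
        "limit": top_k,
        "score_threshold": score_threshold,
        "with_payload": True,
        "filter": {
            "must": [
                # Keep only points whose allowed_roles contains any caller role.
                {"key": "allowed_roles", "match": {"any": sorted(caller_roles)}}
            ]
        },
    }


body = build_search_body([0.1, 0.2, 0.3], {"User", "Admin"})
print(json.dumps(body, indent=2))
```

Because the filter travels with the search request, chunks the caller may not see are excluded by Qdrant itself rather than being dropped afterwards in application code.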

Ollama LLM server

On :11434. Local inference for Qwen3:14b-q4_K_M and Mistral 7B Q4.
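
As a rough sketch of local inference, the snippet below prepares a non-streaming request for Ollama's `/api/generate` endpoint on the port given above, using only the Python standard library. The helper names are hypothetical, and the model tag is copied from the text; the exact tag on a given installation may differ.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # port taken from the text above


def build_generate_request(prompt, model="Qwen3:14b-q4_K_M"):
    """Prepare (but do not send) a non-streaming generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="Qwen3:14b-q4_K_M", timeout=120):
    """Send the request to the local Ollama server and return the answer text."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

Calling `generate("Summarise the attached policy.")` only talks to `localhost`, which is the property the stack relies on for data residency.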

SQLite user DB

User accounts, roles, sessions. Single-file, easy to back up.

Apache Tika + Tesseract

Text extraction and OCR for 10+ document formats.

The RAG pipeline

Four steps, end to end. The retrieval orchestrator is written in-house by I3K — no third-party agent framework, no US-based middleware in the request path.

  1. Ingest

    Upload via web UI or API. Apache Tika + Tesseract extract text from PDF, DOCX, PPTX, XLSX, ODT, RTF, HTML, XML and scanned documents (OCR).

    Tika · OCR

  2. Embed & store

    Documents are chunked and embedded with BAAI/bge-m3 (29 languages). Vectors stored in Qdrant with metadata for RBAC filtering.

    bge-m3 · Qdrant

  3. Retrieve

    Our in-house retrieval orchestrator runs semantic retrieval against Qdrant, with a configurable relevance threshold and top-K. Role-based filtering is applied at the retrieval layer.

    I3K orchestrator

  4. Generate

    Retrieved chunks are passed to EuLLM (default) or a compatible LLM for grounded answer generation. Default models: Qwen3:14b-q4_K_M, Mistral 7B Q4. Fully local, zero external calls.

    EuLLM · Mistral 7B
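
The four steps above can be sketched in a few lines of plain Python. This is a toy illustration, not the I3K orchestrator: the vectors are hand-written stand-ins for bge-m3 embeddings, the in-memory list stands in for Qdrant, and the names `chunk_text`, `retrieve`, `build_prompt` and `allowed_roles` are assumptions made for the example.

```python
import math


def chunk_text(text, size=200, overlap=50):
    """Split text into overlapping character windows."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    if not text:
        return []
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, points, role, top_k=3, threshold=0.5):
    """Role filter first, then relevance threshold, then top-K by score."""
    scored = [
        (cosine(query_vec, p["vector"]), p)
        for p in points
        if role in p["allowed_roles"]
    ]
    kept = [(score, p) for score, p in scored if score >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:top_k]


def build_prompt(question, hits):
    """Assemble a grounded prompt from the retrieved chunks."""
    context = "\n".join(f"[{i + 1}] {p['text']}" for i, (_, p) in enumerate(hits))
    return f"Answer using only the context below.\n\n{context}\n\nQuestion: {question}"


# Toy store: two-dimensional vectors stand in for real embeddings.
points = [
    {"text": "Invoices are archived for ten years.", "vector": [1.0, 0.0],
     "allowed_roles": {"User", "Admin"}},
    {"text": "Salary bands are reviewed yearly.", "vector": [0.9, 0.1],
     "allowed_roles": {"Admin"}},
    {"text": "The cafeteria opens at noon.", "vector": [0.0, 1.0],
     "allowed_roles": {"User", "Admin"}},
]

hits = retrieve([1.0, 0.0], points, role="User")
print(build_prompt("How long are invoices kept?", hits))
```

In this toy run the `User` role never sees the salary chunk, and the cafeteria chunk is dropped by the relevance threshold, so only the invoice chunk reaches the prompt.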

Powered by EuLLM

I3K RAG Enterprise ships with EuLLM as the recommended generation backend. EuLLM is an open-source LLM inference engine built in the EU. Your AI stack has no US cloud dependencies, and you get full transparency on weights, training and governance.

EuLLM

Default inference engine — EU-trained foundation models

www.eullm.eu

Supported models

Defaults that work out of the box. Swap any model via the setup flow.

Role | Default | Notes
Embedding model | BAAI/bge-m3 | 29 languages out of the box. No per-language fine-tuning.
Generation LLM (recommended) | EuLLM | EU-sovereign inference engine. eullm.eu
Generation LLM (alternative) | Qwen3:14b-q4_K_M | Default Ollama model. Strong general capabilities, 8–16 GB VRAM.
Generation LLM (lighter) | Mistral 7B (Q4) | Smaller footprint, lower VRAM. Good for evaluation or CPU-only setups.

Document formats

Apache Tika and Tesseract extract structured text from everything you've got, including scanned PDFs and image-based documents.

  • PDF (with OCR)
  • DOCX / DOC
  • PPTX / PPT
  • XLSX / XLS
  • TXT
  • MD
  • ODT
  • RTF
  • HTML
  • XML

Structured extraction

Pro & Cloud only

Beyond retrieval, the Pro and Cloud editions ship a structured extraction engine: given a target schema (JSON, table columns, named fields), the engine pulls typed values out of unstructured documents with provenance back to the source chunk. Useful for contract analytics, KYC pipelines and regulatory reporting.

  • JSON-Schema as the contract between extraction request and result
  • Citation back to the source chunk for every extracted field
  • Confidence scores and rejection thresholds tunable per field
  • Batch and streaming modes
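
The per-field thresholds listed above can be pictured with a small sketch. The data shapes here (`ExtractedField`, `source_chunk_id`, `apply_thresholds`) are hypothetical and only illustrate the idea of accepting or rejecting each extracted value against its own confidence cut-off while keeping the citation; they are not the Pro or Cloud engine's API.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    name: str
    value: object
    confidence: float
    source_chunk_id: str  # provenance: which chunk the value came from


def apply_thresholds(fields, thresholds, default=0.8):
    """Split extracted fields into accepted values and rejected field names."""
    accepted, rejected = {}, []
    for field in fields:
        if field.confidence >= thresholds.get(field.name, default):
            accepted[field.name] = {
                "value": field.value,
                "source": field.source_chunk_id,
            }
        else:
            rejected.append(field.name)
    return accepted, rejected


fields = [
    ExtractedField("counterparty", "Acme GmbH", 0.97, "chunk-12"),
    ExtractedField("contract_value_eur", 125000, 0.62, "chunk-31"),
]
accepted, rejected = apply_thresholds(fields, {"contract_value_eur": 0.9})
print(accepted)
print(rejected)
```

A stricter threshold on monetary fields than on names, as in this example, is one way to route low-confidence values to manual review.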

Backup & restore (built in)

Full system backup with rclone — 70+ cloud and traditional providers supported. Cron scheduling, retention policies, zero-downtime. No add-on, no extra licence.

Cloud

  • S3 / MinIO
  • MEGA
  • Google Drive
  • OneDrive
  • Dropbox
  • Backblaze B2
  • pCloud

Traditional

  • WebDAV / Nextcloud
  • FTP
  • SFTP
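
As a hedged sketch of how scheduled backups with retention can be expressed on top of rclone, the code below builds the command lines for a dated snapshot and for purging snapshots older than the retention window. The directory layout (one dated folder per run), the remote name and the helper names are assumptions for this example; `rclone copy` and `rclone purge` are standard rclone subcommands. Nothing is executed here, the functions only return argument lists.

```python
import datetime


def snapshot_command(source_dir, remote_path, today):
    """Command that copies source_dir into a folder named after today's date."""
    return ["rclone", "copy", source_dir, f"{remote_path}/{today.isoformat()}"]


def expired_snapshots(snapshot_names, retention_days, today):
    """Return dated snapshot folders older than the retention window."""
    cutoff = today - datetime.timedelta(days=retention_days)
    expired = []
    for name in snapshot_names:
        label = name.strip("/")
        try:
            day = datetime.date.fromisoformat(label)
        except ValueError:
            continue  # ignore entries that are not dated snapshots
        if day < cutoff:
            expired.append(label)
    return sorted(expired)


def purge_commands(remote_path, snapshot_names, retention_days, today):
    """One `rclone purge` command per expired snapshot folder."""
    return [
        ["rclone", "purge", f"{remote_path}/{label}"]
        for label in expired_snapshots(snapshot_names, retention_days, today)
    ]


today = datetime.date(2024, 3, 31)
existing = ["2024-02-01/", "2024-03-15/", "notes/"]  # e.g. a folder listing of the remote
print(snapshot_command("/srv/rag-data", "backup:rag", today))
print(purge_commands("backup:rag", existing, retention_days=30, today=today))
```

Retention is decided from the folder names rather than file modification times, because rclone preserves source timestamps on backends that support them.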

Deployment topologies

One-command install on Ubuntu 20.04+. Scale to multi-server or run air-gapped on bare metal.

Single host

Recommended for most teams. The setup script takes about one hour. Production-ready for 10,000+ documents.

Air-gapped

Offline install from pre-staged packages. For defense, healthcare, critical infrastructure.

Multi-server

Qdrant and Ollama on dedicated GPU nodes, FastAPI backend separate. Manual post-install wiring.
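
For the multi-server case, the wiring mostly comes down to telling each service where the others live. The sketch below reads service URLs from environment variables and falls back to the single-host ports listed earlier on this page. The variable names `BACKEND_URL`, `QDRANT_URL` and `OLLAMA_URL` and the helper `load_endpoints` are hypothetical; the actual configuration keys used by the install script may differ.

```python
import os
from urllib.parse import urlparse

# Single-host defaults, using the ports from the stack overview.
DEFAULTS = {
    "BACKEND_URL": "http://localhost:8000",
    "QDRANT_URL": "http://localhost:6333",
    "OLLAMA_URL": "http://localhost:11434",
}


def load_endpoints(env=None):
    """Resolve service URLs from the environment, validating each one."""
    env = os.environ if env is None else env
    endpoints = {}
    for key, default in DEFAULTS.items():
        url = env.get(key, default)
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.hostname:
            raise ValueError(f"{key} is not a valid http(s) URL: {url!r}")
        endpoints[key] = url.rstrip("/")
    return endpoints


# Example: Qdrant and Ollama moved to a dedicated GPU node (hostname is made up).
print(load_endpoints({
    "QDRANT_URL": "http://gpu-node-1:6333",
    "OLLAMA_URL": "http://gpu-node-1:11434/",
}))
```

Validating the URLs at start-up surfaces wiring mistakes before the first query rather than during it.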

One-command install

```bash
git clone https://github.com/I3K-IT/RAG-Enterprise.git
cd RAG-Enterprise
./install.sh
```

Compliance & security

GDPR and the EU AI Act are first-class engineering requirements, not a marketing checkbox. AGPL-3.0 source means everything is auditable.

  • JWT authentication with role-based access (User, Super User, Admin)
  • GDPR Art. 32 controls: TLS in front, password hashing (bcrypt), session expiration
  • Audit trail at application level
  • Right-to-erasure: per-document and per-user deletion
  • Data residency enforced: no outbound calls in the request path
  • AGPL-3.0 source — full transparency, full auditability
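
To make the JWT and session-expiration bullets concrete, here is a minimal standard-library sketch of an HS256 token that carries a role and an expiry and is checked on every request. It is an illustration only, not the product's implementation; the function names are hypothetical, and a production service would normally rely on a maintained JWT library instead of hand-rolled code.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(username, role, secret, ttl_seconds=3600, now=None):
    """Create a signed HS256 token with subject, role and expiry claims."""
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {"sub": username, "role": role, "exp": now + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_token(token, secret, required_roles, now=None):
    """Check signature, algorithm, expiry and role; return the claims."""
    now = int(time.time()) if now is None else now
    try:
        header_b64, claims_b64, signature_b64 = token.split(".")
    except ValueError:
        raise PermissionError("malformed token")
    expected = hmac.new(
        secret, f"{header_b64}.{claims_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise PermissionError("bad signature")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise PermissionError("unexpected algorithm")
    claims = json.loads(_b64url_decode(claims_b64))
    if claims["exp"] <= now:
        raise PermissionError("token expired")
    if claims["role"] not in required_roles:
        raise PermissionError("role not allowed")
    return claims


token = issue_token("alice", "Admin", b"change-me")
print(verify_token(token, b"change-me", {"Admin", "Super User"}))
```

The signature is verified before any claim is trusted, and the role check uses the three roles named above (User, Super User, Admin) as plain strings.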

Ready to run RAG on your own infrastructure?

Start with the open-source Community edition, or talk to us about Pro with structured extraction, SSO, audit log and SLA.