High-Performance Multi-Model Database with Native AI/LLM Integration
"ThemisDB keeps its own llamas." Run LLaMA, Mistral, and Phi-3 directly in your database, with no external API calls or costs.
Native LLM Integration with llama.cpp (Optional)
[!NOTE] LLM integration is an optional feature that requires a build with `-DTHEMIS_ENABLE_LLM=ON` and a local clone of llama.cpp.
LLM Features (When Enabled)
| Feature | Description | Status |
|---|---|---|
| Embedded LLM Engine | llama.cpp integration for LLaMA/Mistral/Phi-3 (1B-70B params) | ✅ |
| Image Analysis AI | Multi-backend plugins (llama.cpp Vision, ONNX CLIP, OpenCV DNN) | ✅ |
| GPU Acceleration | NVIDIA CUDA support with significant speedup | ✅ |
| PagedAttention | Advanced memory management | ✅ |
| Continuous Batching | Handle concurrent inference requests | ✅ |
| Quantization | Q4_K_M, Q5_K_M, Q8_0 for efficient memory usage | ✅ |
| Monitoring | Grafana dashboards with metrics and alerts | ✅ |
| Plugin Architecture | Extensible LLM and image analysis backends | ✅ |
| Distributed RPC | Inter-shard communication for distributed LLM ops | ✅ |
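As a rough illustration of why quantization matters across the 1B-70B range, the sketch below estimates the size of the model weights at each listed quantization level. The bits-per-weight figures are approximations and should be treated as assumptions: Q8_0 stores 32 weights plus one 16-bit scale per block (8.5 bits per weight), while the K-quant values are typical averages that vary by model. `estimate_weight_gib` is a hypothetical helper, not a ThemisDB or llama.cpp API, and it ignores KV cache and runtime overhead.

```python
# Approximate average bits per weight; assumptions for illustration only.
BITS_PER_WEIGHT = {
    "F16": 16.0,
    "Q8_0": 8.5,     # 32 int8 weights + one fp16 scale per block
    "Q5_K_M": 5.7,   # typical average, varies by model
    "Q4_K_M": 4.85,  # typical average, varies by model
}


def estimate_weight_gib(params_billion: float, quant: str) -> float:
    """Estimate the size of the model weights in GiB (excludes KV cache)."""
    total_bits = params_billion * 1e9 * BITS_PER_WEIGHT[quant]
    return total_bits / 8 / 2**30


if __name__ == "__main__":
    for quant in BITS_PER_WEIGHT:
        print(f"7B @ {quant}: {estimate_weight_gib(7, quant):.1f} GiB")
```

An estimate like this is only a first check on whether a model fits in GPU or system memory; context length and batch size add to the total.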
[!TIP] Performance highlights:
- Significant speedup with GPU acceleration compared to CPU
- Memory savings with PagedAttention and prefix caching
- Kernel fusion for additional performance gains
- Comprehensive test coverage with unit tests
Documentation:
- LLM Complete Setup Guide (DE) - Complete guide to LLM setup and inference
- Overview - System architecture and design
ThemisDB is a production-ready multi-model database that combines relational, graph, vector, and document models in a single system with full ACID transaction support. It is built on RocksDB and includes advanced security and compliance features.
Available Editions
| Edition | License | Features |
|---|---|---|
| Community | Open Source (MIT) | Full-featured single-node database with all core capabilities |
| Enterprise | Commercial | + Horizontal scaling, advanced analytics, HA/replication, and more |
Database Capabilities
| Feature | Description | Community | Enterprise |
|---|---|---|---|
Quick Start
# Pull and run the latest version
docker pull themisdb/themisdb:latest
# Run with Docker
docker run -d \
--name themis \
-p 8080:8080 \
-p 18765:18765 \
-p 4318:4318 \
-v themis_data:/data \
themisdb/themisdb:latest
# Or use Docker Compose
docker compose up -d
# Verify installation
curl http://localhost:8080/health
[!TIP] Use Docker Compose for production deployments with proper configuration.
| Port | Protocol | Description |
|---|---|---|
| 8080 | HTTP/1.1 | REST API, GraphQL |
| 18765 | Binary | Wire Protocol, gRPC |
| 4318 | HTTP | OpenTelemetry/Prometheus |
[!NOTE] Complete Port Reference: See docs/deployment/PORT_REFERENCE.md.
- ✅ Image Analysis - Multi-backend AI plugins (v1.3.0+)
- ✅ GNN Embeddings - Graph Neural Network support
Modern Protocols
| Protocol | Status | Description |
|---|---|---|
| HTTP/1.1 | ✅ | REST API, GraphQL |
| HTTP/2 | ✅ | Server Push for CDC |
| HTTP/3 | 🚧 | QUIC (experimental) |
| WebSocket | ✅ | Bidirectional streaming |
| gRPC | ✅ | Binary RPC |
| MQTT | ✅ | IoT messaging |
| PostgreSQL Wire | ✅ | BI tool compatibility |
| MCP | ✅ | Model Context Protocol |
| SSE | ✅ | Server-Sent Events |
Transparency & Attribution
ThemisDB is built on proven open-source foundations with clear attribution:
- ✅ Transparent Attribution - Clear documentation of all dependencies
- ✅ Innovation Documentation - ThemisDB's unique contributions vs third-party features
- ✅ License Compliance - Full license information for all components
- ACID Transactions - Full snapshot isolation with MVCC
- Multi-Model - Relational, Graph, Vector, Document in one database
- High Performance - 45K writes/s, 120K reads/s, GPU-accelerated vector search
- Security - TLS 1.3, RBAC, field-level encryption, audit logging (Enterprise: HSM integration)
- Analytics - Time-series, aggregations (Enterprise: OLAP, CEP, materialized views)
- Distribution - Single-node optimized (Enterprise: horizontal sharding, replication, Kubernetes)
- AI-Ready - Hybrid search (RAG), embedding cache, FAISS integration, optional LLM engine with llama.cpp (v1.3.0+), image analysis AI plugins (v1.3.0+)
- Modern Protocols - HTTP/1.1, GraphQL, SSE, gRPC (v1.3.0), HTTP/2 with Server Push ✅, WebSocket ✅, MQTT ✅, HTTP/3 🚧, PostgreSQL Wire ✅, MCP ✅
- Transparent Attribution - Clear documentation of third-party dependencies vs ThemisDB innovations (see ATTRIBUTIONS.md)
- Image Analysis - Multi-backend AI plugin architecture (llama.cpp Vision, ONNX CLIP, OpenCV DNN)
Complete Port Reference: See docs/deployment/PORT_REFERENCE.md for all ports including optional protocols (MQTT, PostgreSQL Wire, MCP).
# Clone repository
git clone https://github.com/makr-code/ThemisDB.git
cd ThemisDB
# Setup and build (Linux/macOS)
./scripts/setup.sh
./scripts/build.sh
# Setup and build (Windows)
.\scripts\setup.ps1
.\scripts\build.ps1
# Start server
./build/themis_server --config config.yaml
Optional Protocol Support (Security: Opt-In by Default):
# Enable HTTP/2 with Server Push (explicit opt-in for security)
cmake -B build -S . -DTHEMIS_ENABLE_HTTP2=ON
# Enable WebSocket with CDC (explicit opt-in for security)
cmake -B build -S . -DTHEMIS_ENABLE_WEBSOCKET=ON
# Enable MQTT broker (explicit opt-in for security)
cmake -B build -S . -DTHEMIS_ENABLE_MQTT=ON
# Enable PostgreSQL Wire Protocol (explicit opt-in for security)
cmake -B build -S . -DTHEMIS_ENABLE_POSTGRES_WIRE=ON
# Enable MCP for LLM integration (explicit opt-in for security)
cmake -B build -S . -DTHEMIS_ENABLE_MCP=ON
# Enable HTTP/3 (explicit opt-in for security)
cmake -B build -S . -DTHEMIS_ENABLE_HTTP3=ON
# Default build only includes HTTP/1.1, GraphQL, SSE, gRPC (minimal attack surface)
See Protocol Documentation for details.
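The commands above enable one protocol at a time. CMake accepts several `-D` options in a single configure call, so the opt-in flags can be combined. The sketch below assembles such a command line from the option names shown above; `cmake_configure_args` and the short protocol keys are hypothetical names for illustration, and whether every combination builds cleanly is not confirmed by the source.

```python
import shlex

# Option names taken from the commands above; the short keys are illustrative.
PROTOCOL_FLAGS = {
    "http2": "THEMIS_ENABLE_HTTP2",
    "websocket": "THEMIS_ENABLE_WEBSOCKET",
    "mqtt": "THEMIS_ENABLE_MQTT",
    "postgres_wire": "THEMIS_ENABLE_POSTGRES_WIRE",
    "mcp": "THEMIS_ENABLE_MCP",
    "http3": "THEMIS_ENABLE_HTTP3",
}


def cmake_configure_args(protocols, build_dir="build", source_dir="."):
    """Return the argv list for one cmake configure call enabling `protocols`."""
    unknown = sorted(set(protocols) - set(PROTOCOL_FLAGS))
    if unknown:
        raise ValueError(f"unknown protocol(s): {', '.join(unknown)}")
    args = ["cmake", "-B", build_dir, "-S", source_dir]
    # Keep a stable order so the generated command is reproducible.
    for name in PROTOCOL_FLAGS:
        if name in protocols:
            args.append(f"-D{PROTOCOL_FLAGS[name]}=ON")
    return args


if __name__ == "__main__":
    print(shlex.join(cmake_configure_args({"http2", "mqtt"})))
```

Rejecting unknown keys keeps a typo from silently producing a default build with the protocol still disabled.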
# OPTIONAL: LLM support requires a local clone of llama.cpp
if (!(Test-Path "llama.cpp")) {
    git clone https://github.com/ggerganov/llama.cpp.git llama.cpp
}
# MSVC release build with LLM support
powershell -File scripts/build-themis-server-llm.ps1
# Sanity check
./build-msvc/bin/themis_server.exe --help
Notes:
- LLM support is optional and requires `-DTHEMIS_ENABLE_LLM=ON` at build time.
- `llama.cpp/` lives as a local clone in the project root and is excluded via `.gitignore` and `.dockerignore` (it is neither committed nor copied into Docker images).
- The build script targets Visual Studio 2022 (`-G "Visual Studio 17 2022"`) with `-A x64`, includes the vcpkg toolchain, and fixes MSVC-specific `char8_t` errors in the `llama` target.
→ Comprehensive Build Documentation | Build variants, platforms, troubleshooting
Linux (Debian/Ubuntu):
wget https://github.com/makr-code/ThemisDB/releases/latest/download/themisdb_1.3.0-1_amd64.deb
sudo apt install ./themisdb_1.3.0-1_amd64.deb
sudo systemctl start themisdb
macOS (Homebrew):
brew install themisdb
brew services start themisdb
Windows (Chocolatey):
choco install themisdb
# 1. Check server health
curl http://localhost:8080/health
# 2. Create an entity
curl -X PUT http://localhost:8080/entities/users:alice \
-H "Content-Type: application/json" \
-d '{"blob":"{\"name\":\"Alice\",\"age\":30,\"city\":\"Berlin\"}"}'
# 3. Create an index
curl -X POST http://localhost:8080/index/create \
-H "Content-Type: application/json" \
-d '{"table":"users","column":"city"}'
# 4. Query by index
curl -X POST http://localhost:8080/query \
-H "Content-Type: application/json" \
-d '{"table":"users","predicates":[{"column":"city","value":"Berlin"}],"return":"entities"}'
# 5. View metrics
curl http://localhost:8080/metrics
ThemisDB uses a unified storage architecture with specialized projection layers:
┌─────────────────────────────────────────────────────────┐
│                    Query Layer (AQL)                    │
│ SQL-like • Graph Traversals • Vector Search • Analytics │
├─────────────────────────────────────────────────────────┤
│                    Projection Layers                    │
│    Secondary Indices • Graph Adjacency • HNSW Vector    │
├─────────────────────────────────────────────────────────┤
│             Canonical Storage (Base Entity)             │
│           RocksDB LSM-Tree • MVCC Transactions          │
└─────────────────────────────────────────────────────────┘
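One way to read the diagram: every write lands in a canonical entity record, and the projection layers are derived structures updated alongside it. The toy in-memory model below sketches that idea for a single secondary index. `EntityStore` and its methods are hypothetical names for illustration and do not reflect ThemisDB's actual classes or on-disk layout.

```python
from collections import defaultdict


class EntityStore:
    """Toy canonical store with one derived secondary-index projection."""

    def __init__(self):
        self.entities = {}   # key -> dict (canonical record)
        self.indexes = {}    # column -> {value -> set(keys)}

    def create_index(self, column):
        index = defaultdict(set)
        for key, entity in self.entities.items():   # backfill existing data
            if column in entity:
                index[entity[column]].add(key)
        self.indexes[column] = index

    def put(self, key, entity):
        old = self.entities.get(key)
        for column, index in self.indexes.items():
            if old is not None and column in old:
                index[old[column]].discard(key)      # remove stale entry
            if column in entity:
                index[entity[column]].add(key)
        self.entities[key] = entity

    def query(self, column, value):
        keys = self.indexes[column].get(value, set())
        return [self.entities[k] for k in sorted(keys)]


if __name__ == "__main__":
    store = EntityStore()
    store.put("users:alice", {"name": "Alice", "city": "Berlin"})
    store.create_index("city")
    store.put("users:bob", {"name": "Bob", "city": "Berlin"})
    store.put("users:alice", {"name": "Alice", "city": "Munich"})
    print(store.query("city", "Berlin"))  # only Bob remains under Berlin
```

In a real engine the record and its index entries would be written in one transaction so that readers never observe one without the other; the sketch leaves that out.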
Core Components:
- Storage Engine: RocksDB TransactionDB with LSM-Tree
- Transaction Manager: MVCC with snapshot isolation
- Query Engine: Advanced Query Language (AQL) with graph/vector support
- Index Manager: Automatic maintenance of secondary, graph, and vector indexes
- Security: TLS 1.3, RBAC, field encryption, audit logging
- Observability: Prometheus metrics, OpenTelemetry tracing
→ Full Architecture Documentation
Multi-Model Support
- Relational: SQL-like queries with secondary indexes
- Graph: BFS, Dijkstra, A* traversals with path constraints
- Vector: HNSW and FAISS for similarity search (GPU-accelerated)
- Document: JSON storage with flexible schema
- Time-Series: Gorilla compression, continuous aggregates
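For the graph model, a depth-bounded breadth-first traversal is the simplest of the listed algorithms. The sketch below runs BFS over a plain adjacency dictionary. The adjacency representation and the `bfs` helper are assumptions for illustration, not ThemisDB's traversal API.

```python
from collections import deque


def bfs(adjacency, start, max_depth):
    """Return {vertex: depth} for all vertices within max_depth hops of start."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        if depths[vertex] == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbor in adjacency.get(vertex, ()):
            if neighbor not in depths:
                depths[neighbor] = depths[vertex] + 1
                queue.append(neighbor)
    return depths


if __name__ == "__main__":
    graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": []}
    print(bfs(graph, "a", max_depth=2))
```

Because BFS visits vertices in order of hop count, the recorded depth is also the length of the shortest unweighted path from the start vertex.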
ACID Transactions
- Full ACID guarantees with snapshot isolation
- Write-write conflict detection
- Atomic updates across all index types
- Session-based and direct API
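Snapshot isolation with write-write conflict detection can be sketched with a small versioned store. The model below uses commit-time validation (first committer wins), which is one common strategy; the source does not say which strategy ThemisDB uses. `Store`, `Transaction`, and `ConflictError` are hypothetical names.

```python
class ConflictError(Exception):
    """Raised when another transaction committed a key we also wrote."""


class Store:
    def __init__(self):
        self.versions = {}   # key -> list of (commit_ts, value), ascending
        self.clock = 0       # last assigned commit timestamp

    def begin(self):
        return Transaction(self, snapshot_ts=self.clock)


class Transaction:
    def __init__(self, store, snapshot_ts):
        self.store = store
        self.snapshot_ts = snapshot_ts
        self.writes = {}

    def get(self, key):
        if key in self.writes:                      # read your own writes
            return self.writes[key]
        for commit_ts, value in reversed(self.store.versions.get(key, [])):
            if commit_ts <= self.snapshot_ts:       # newest version in snapshot
                return value
        return None

    def put(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Validate every written key before applying anything (atomicity).
        for key in self.writes:
            history = self.store.versions.get(key, [])
            if history and history[-1][0] > self.snapshot_ts:
                raise ConflictError(key)            # write-write conflict
        self.store.clock += 1
        for key, value in self.writes.items():
            self.store.versions.setdefault(key, []).append((self.store.clock, value))


if __name__ == "__main__":
    store = Store()
    t1, t2 = store.begin(), store.begin()
    t1.put("users:alice", {"city": "Berlin"})
    t2.put("users:alice", {"city": "Munich"})
    t1.commit()
    try:
        t2.commit()
    except ConflictError as exc:
        print("conflict on", exc)
```

Readers never block in this model: a transaction that began before a commit keeps seeing the older version until it finishes.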
Analytics
- CEP Engine: Complex Event Processing with pattern matching
- OLAP: CUBE, ROLLUP, window functions
- Time-Series: Compression, retention policies, aggregates
- Hybrid Search: BM25 + vector for RAG workflows
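Hybrid search combines a lexical ranking with a vector ranking. The source names BM25 and vector similarity but not the fusion method, so the sketch below uses reciprocal rank fusion (RRF), a common choice, purely as an assumption. The two input rankings are assumed to be computed elsewhere, and the document ids are made up.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. k=60 is the value commonly used for RRF.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by fused score (descending), then by id for deterministic ties.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    bm25_ranking = ["doc3", "doc1", "doc7"]      # lexical matches, best first
    vector_ranking = ["doc1", "doc5", "doc3"]    # nearest neighbors, best first
    print(reciprocal_rank_fusion([bm25_ranking, vector_ranking]))
```

RRF works on ranks rather than raw scores, so BM25 scores and vector distances do not need to be normalized to a common scale first.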
Security
- TLS 1.3 with mTLS support
- Role-Based Access Control (RBAC)
- Field-level encryption
- Audit logging with SIEM integration
- Certificate pinning for HSM/TSA
- Secrets management (HashiCorp Vault)
Distribution
- Horizontal sharding with consistent hashing
- Leader-follower and multi-master replication
- RAID-like redundancy (MIRROR, STRIPE, PARITY)
- Kubernetes operator with CRDs
- Auto-rebalancing and cloud deployment
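Consistent hashing keeps most keys on their current shard when shards are added or removed, which is what makes rebalancing cheap. The ring below is a minimal sketch with virtual nodes. `HashRing`, the shard names, and the choice of SHA-256 are assumptions for illustration; the source does not describe ThemisDB's actual placement scheme.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (hash, node)
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value):
        return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")

    def add(self, node):
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def lookup(self, key):
        if not self._ring:
            raise LookupError("ring is empty")
        # First virtual node at or after the key's hash, wrapping around.
        index = bisect.bisect_left(self._ring, (self._hash(key),))
        return self._ring[index % len(self._ring)][1]


if __name__ == "__main__":
    ring = HashRing(["shard-1", "shard-2", "shard-3"])
    keys = [f"users:{i}" for i in range(1000)]
    before = {key: ring.lookup(key) for key in keys}
    ring.add("shard-4")
    moved = sum(ring.lookup(key) != before[key] for key in keys)
    print(f"{moved} of {len(keys)} keys moved to the new shard")
```

Every key that changes owner after `add` moves to the new shard; no key moves between the existing shards, unlike a plain `hash(key) % n` scheme.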
GPU Acceleration
- 10 backend options, including CUDA, Vulkan, HIP, OpenCL, DirectX, OneAPI, and ZLUDA
- 10-50x speedup for vector search
- Automatic platform detection and fallback
Full Documentation: https://makr-code.github.io/ThemisDB/
Getting Started
- Quick Start
- 5-Minute Tutorial
- Docker Deployment
- Building from Source
Core Concepts
- Architecture Overview
- Multi-Model Design
- Transaction Management
- AQL Query Language
Features
- Vector Search
- Graph Operations
- Time-Series Engine
- Security & Compliance
- Feature Overview
Operations
- Configuration Guide
- Monitoring & Metrics
- Backup & Recovery
- Performance Tuning
Development
- Build Guide
- Contributing
- API Reference
- Client SDKs
Enterprise & Strategy
- CMS Strategy Paper (DE) - ThemisDB for content management in government and enterprise
- Enterprise Edition - Enterprise features and licensing
- Governance - Data governance and policies
Production-Ready Features
- ✅ ACID transactions with MVCC
- ✅ Multi-model support (relational, graph, vector, document)
- ✅ Horizontal sharding and replication
- ✅ GPU acceleration (10 backends)
- ✅ Enterprise security features
- ✅ Client SDKs (7 languages)
- ✅ Kubernetes operator
- ✅ Native LLM integration (optional)
- ✅ Modern protocol support (HTTP/2, WebSocket, gRPC, MQTT, PostgreSQL Wire, MCP)
- 🚧 Query Optimizer - Advanced query optimization and execution plans
- 🚧 Multi-Datacenter - Cross-region deployment support
- 🚧 Advanced ML/GNN - Enhanced machine learning features
- 🚧 Production Hardening - Additional stability and performance improvements
- 📋 Modular Architecture - Split monolithic core into 11 focused libraries
- 📋 Real-Time Views - Materialized views with automatic updates
- 📋 Cross-Region Replication - Global data distribution
- 📋 Advanced Compliance - SOC 2, HIPAA certification
- 📋 Cloud-Native Optimizations - Enhanced cloud provider integrations
Detailed Planning:
Test Environment: Release build, Windows x64, 20 cores @ 3696 MHz
| Operation | Throughput | Latency (avg) | Notes |
|---|---|---|---|
| Entity PUT | 45,000 ops/s | 0.02 ms | Write throughput |
| Entity GET | 120,000 ops/s | 0.008 ms | Read throughput |
| Indexed Query | 3.4M queries/s | 0.29 μs | AQL WHERE clause |
| Graph Traverse | 9.56M ops/s | 0.105 μs | BFS (depth=3) |
| Vector Search (RGB) | 59.7M queries/s | 0.017 μs | Simple 3D vectors |
| Vector Insert (384D) | 411k vectors/s | 2.44 μs | Typical embeddings |
| RAG Search (Top-50) | 7.17M queries/s | 0.14 μs | LLM retrieval |
[!IMPORTANT] Performance Disclaimer: Benchmarks represent optimal conditions. Actual performance varies based on:
- Hardware configuration (CPU, RAM, storage)
- Data size and complexity
- Concurrent workload patterns
- Build configuration and optimizations
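When reading the table, note that each latency figure matches the reciprocal of the throughput in the same row, which suggests it is the mean time per operation at the measured rate. This is an observation from the numbers, not a statement from the benchmark authors. The sketch below reproduces the arithmetic for a few rows; `mean_time_per_op_us` is a hypothetical helper.

```python
def mean_time_per_op_us(ops_per_second: float) -> float:
    """Mean time per operation in microseconds at a given throughput."""
    return 1e6 / ops_per_second


if __name__ == "__main__":
    rows = {
        "Entity PUT": 45_000,
        "Entity GET": 120_000,
        "Indexed Query": 3.4e6,
        "Vector Insert (384D)": 411_000,
    }
    for name, throughput in rows.items():
        print(f"{name}: {mean_time_per_op_us(throughput):.3f} us/op")
```

Under this reading, 45,000 ops/s corresponds to about 22 μs (0.02 ms) per write, so the column should not be taken as a client-observed round-trip latency without further measurement.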
Detailed Analysis:
- Complete Benchmark Results
Community & Support
| Resource | Description | Link |
|---|---|---|
| Documentation | Complete guides and API reference | Docs Site |
| Issues | Report bugs or request features | GitHub Issues |
| Discussions | Community Q&A and discussions | GitHub Discussions |
| Contributing | How to contribute to ThemisDB | Contributing Guide |
| Security | Responsible disclosure policy | Security Policy |
License Information
ThemisDB Community Edition is released under the MIT License.
- ✅ Free to use, modify, and distribute
- ✅ Commercial use allowed
- ✅ Full feature set for single-node deployments
ThemisDB Enterprise Edition features (horizontal sharding, advanced analytics, HA/replication, etc.) are available under a commercial license.
Enterprise Inquiries: [email protected]
ThemisDB builds upon and is inspired by these excellent projects:
Inspirations & Foundations
| Project | Influence | Area |
|---|---|---|
| ArangoDB | Multi-model architecture | Design Philosophy |
| CozoDB | Hybrid relational-graph-vector | Data Models |
| Azure Cosmos DB | Multi-model with unified API | API Design |
| RocksDB | High-performance LSM-Tree storage | Storage Engine |
| FAISS | Efficient similarity search | Vector Search |
[!NOTE] For a complete list of third-party libraries and detailed feature attributions, see ATTRIBUTIONS.md.
Built with ❤️ for the database community
⭐ Star us on GitHub · Read the Docs · Contribute