On-Premises Deployment
Running Compass in your own environment for air-gapped and regulated use cases.
For organisations that cannot send data to external cloud services, Compass offers on-premises deployment on the Enterprise plan. This puts the entire stack — connectors, AI service, and application — inside your own infrastructure.
When to Consider On-Premises
- Air-gapped environments — No internet connectivity allowed
- Regulated industries — Financial services, healthcare, government with strict data handling rules
- Data sovereignty — All data must remain within your country or network boundary
- Internal policy — Corporate security policy prohibits external SaaS for sensitive data
What Gets Deployed
The on-premises deployment includes the full Compass stack:
| Component | Purpose |
|---|---|
| Web Application | Next.js frontend and API backend |
| AI Service | Python-based report generation engine |
| Connector Services | MCP servers for your IAM integrations |
| Database | PostgreSQL for reports, users, and configuration |
| Local LLM | Ollama or vLLM for AI processing without external API calls |
Deployment Options
Docker Compose
For smaller deployments and proof-of-concept:
```yaml
# docker-compose.yml (simplified)
services:
  web:
    image: compass/web:latest
    ports:
      - "3000:3000"
    environment:
      - DATABASE_URL=postgresql://...
      - AI_SERVICE_URL=http://ai:8000
  ai:
    image: compass/ai:latest
    ports:
      - "8000:8000"
    environment:
      - LLM_PROVIDER=ollama
      - OLLAMA_BASE_URL=http://ollama:11434
  ollama:
    image: ollama/ollama:latest
    volumes:
      - ollama-data:/root/.ollama
  db:
    image: postgres:16
    volumes:
      - pg-data:/var/lib/postgresql/data

# Named volumes must be declared for Compose to create them
volumes:
  ollama-data:
  pg-data:
```
Helm Chart (Kubernetes)
For production deployments with high availability:
- Horizontal scaling for the web application and AI service
- Database connection pooling
- Health checks and auto-restart
- Resource limits and requests
- Ingress configuration with TLS
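As a sketch, a Helm values override covering those capabilities might look like the following. The key names here are illustrative assumptions for a typical chart layout, not the published chart's actual schema:

```yaml
# values.yaml (illustrative — key names are assumptions, not the real chart schema)
web:
  replicaCount: 3            # horizontal scaling for the web application
  resources:
    requests: { cpu: "500m", memory: "1Gi" }
    limits:   { cpu: "2",    memory: "4Gi" }
ai:
  replicaCount: 2            # horizontal scaling for the AI service
database:
  connectionPooling:
    enabled: true            # e.g. PgBouncer in front of PostgreSQL
ingress:
  enabled: true
  tls:
    - secretName: compass-tls
      hosts:
        - compass.internal.example.com
```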
Local LLM Support
On-premises deployments use local language models instead of external API calls:
| Provider | Models | GPU Required |
|---|---|---|
| Ollama | Llama 3, Mistral, Mixtral | No (GPU recommended; CPU-only works, but slower) |
| vLLM | Any HuggingFace model | Yes (CUDA) |
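For the vLLM path, the AI service can be pointed at a vLLM container serving its OpenAI-compatible API. A sketch as a Compose override, where the `LLM_PROVIDER=vllm` value, the `VLLM_BASE_URL` variable name, and the model choice are assumptions for illustration (the GPU reservation syntax is standard Compose):

```yaml
# docker-compose override (illustrative) — use vLLM instead of Ollama
services:
  ai:
    environment:
      - LLM_PROVIDER=vllm                   # assumed value, mirroring LLM_PROVIDER=ollama
      - VLLM_BASE_URL=http://vllm:8001/v1   # assumed variable name
  vllm:
    image: vllm/vllm-openai:latest          # serves an OpenAI-compatible API
    command: ["--model", "mistralai/Mistral-7B-Instruct-v0.3", "--port", "8001"]
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]           # vLLM requires a CUDA GPU
```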
Hardware Requirements
| Component | Minimum | Recommended |
|---|---|---|
| CPU | 8 cores | 16 cores |
| RAM | 32 GB | 64 GB |
| GPU | None (CPU inference) | NVIDIA A10 / L4 (24GB VRAM) |
| Storage | 100 GB SSD | 500 GB NVMe SSD |
GPU acceleration significantly improves report generation speed (15–30 seconds vs 2–5 minutes on CPU).
Network Requirements
Fully Air-Gapped
- No internet access required after initial deployment
- LLM models are pre-loaded into the local provider
- Container images transferred via offline registry or USB
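The offline image transfer can be done with standard Docker tooling. A sketch, assuming the connected machine has access to the Compass registry (image names are taken from the Compose file above):

```shell
# On a connected machine: pull the images and bundle them into one archive
docker pull compass/web:latest
docker pull compass/ai:latest
docker pull ollama/ollama:latest
docker pull postgres:16
docker save -o compass-bundle.tar \
  compass/web:latest compass/ai:latest ollama/ollama:latest postgres:16

# Transfer compass-bundle.tar by approved media, then on the air-gapped host:
docker load -i compass-bundle.tar
```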
Partially Connected
- Internet access only for container image updates
- All IAM system access stays on your internal network
- Optional: external LLM API for higher quality (with your own API keys)
Updates
Air-Gapped Updates
- Download the update bundle on a connected machine
- Transfer to your air-gapped environment
- Apply with `docker compose pull` or `helm upgrade`
Connected Updates
- Pull latest container images from the Compass registry
- Apply database migrations automatically on startup
- Zero-downtime upgrades with rolling deployments on Kubernetes
Licensing
On-premises deployments require an Enterprise licence key:
- The licence key is validated on application startup
- For air-gapped environments, offline licence validation is supported
- Licence includes support, updates, and a defined number of users/connectors
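Offline licence validation generally works by shipping a signed licence file that the application can verify without calling home. A conceptual sketch of the idea — not Compass's actual scheme — using an HMAC over the licence payload (real products typically use asymmetric signatures such as Ed25519, so only a public verification key ships with the software):

```python
import hashlib
import hmac
import json
import time

# Hypothetical illustration only: the vendor signs a licence payload with a
# secret key, and the application verifies it offline.
SIGNING_KEY = b"vendor-secret"  # held by the vendor only

def issue_licence(expires_at: int, max_users: int, max_connectors: int) -> dict:
    """Build a licence file: a payload plus a signature over its canonical form."""
    payload = {
        "expires_at": expires_at,
        "max_users": max_users,
        "max_connectors": max_connectors,
    }
    blob = json.dumps(payload, sort_keys=True).encode()
    sig = hmac.new(SIGNING_KEY, blob, hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": sig}

def validate_licence(licence: dict, now: int) -> bool:
    """Recompute the signature and check expiry — no network access needed."""
    blob = json.dumps(licence["payload"], sort_keys=True).encode()
    expected = hmac.new(SIGNING_KEY, blob, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, licence["signature"]):
        return False  # tampered or forged
    return now < licence["payload"]["expires_at"]

lic = issue_licence(expires_at=2_000_000_000, max_users=50, max_connectors=10)
print(validate_licence(lic, now=int(time.time())))  # True while unexpired
lic["payload"]["max_users"] = 5000  # any tampering invalidates the signature
print(validate_licence(lic, now=int(time.time())))  # False
```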
Support
Enterprise on-premises customers receive:
- Dedicated support channel
- Deployment assistance and architecture review
- Priority bug fixes and security patches
- Quarterly business reviews
Contact sales@usecompass.io to discuss on-premises deployment for your organisation.