Skip to main content
CodeAlive

Your Code. Your Servers. Your AI.

Run the same product inside your perimeter. Same engine, same retrieval, same MCP. Plug in your own LLM and your own Git server.

  • Docker Compose or Kubernetes / Helm
  • Bring Your Own LLM (OpenAI-compatible)
  • Tenant Isolation + AES-256 Envelope Encryption

When the cloud version doesn't make it past procurement

If your code can't leave the network (for regulators, IP, or your own security team), the cloud product is a non-starter. This is the same product running on your hardware.

Strict Data Residency

  • Financial services and other regulated industries with regional data laws
  • Internal compliance policies that prohibit code leaving the perimeter
  • Companies preparing for security audits (SOC 2 / ISO 27001 of their own)
Nothing leaves your VPC. Ever.

IP & Secrets Protection

  • Proprietary algorithms, models, or trade secrets in the codebase
  • Strict third-party data sharing policies
  • NDA-driven engagements where outbound code paths are unacceptable
Source stays where you control the keys

Procurement & Vendor Constraints

  • Multinational companies with regional vendor approval lists
  • Customers who require running everything in their own VPC
  • Teams using internal LLM stacks instead of public APIs
No new vendors, no exception requests.

Same features as cloud.

Self-hosted ships the same binary as cloud. The matrix below is exhaustive.

Full Context Engine

  • CloudIncluded
  • Self-HostedIncluded

Multi-Repository Intelligence

  • CloudIncluded
  • Self-HostedIncluded

Code Review Agent

  • CloudIncluded
  • Self-HostedIncluded

Deep Research

  • CloudIncluded
  • Self-HostedIncluded

MCP Server

  • CloudIncluded
  • Self-HostedIncluded

API Access

  • CloudIncluded
  • Self-HostedIncluded

GraphRAG Knowledge Graph

  • CloudIncluded
  • Self-HostedIncluded

Incremental Indexing

  • CloudIncluded
  • Self-HostedIncluded

Team Collaboration

  • CloudIncluded
  • Self-HostedIncluded

RBAC & Permissions

  • CloudIncluded
  • Self-HostedIncluded

Tenant Isolation (per-org KEK binding)

  • CloudIncluded
  • Self-HostedIncluded

Encryption at Rest (AES-256-GCM envelope)

  • CloudIncluded
  • Self-HostedIncluded

Custom LLM Support

  • CloudNot included
  • Self-HostedIncluded

SSO / SAML

  • CloudRoadmap
  • Self-HostedAvailable on Enterprise (custom)

Data Residency Control

  • CloudLimited
  • Self-HostedFull

Two deployment paths

Two install paths: one for getting started, one for production at scale.

Quick Setup

Docker Compose

Single-node deployment for smaller teams.

Ideal for
Development teams, POCs, small enterprises
Recommended
16GB RAM, 8 vCPUs, 200GB SSD storage

Up in under an hour on one node.

Production Ready

Kubernetes / Helm

Horizontally scalable, high-availability deployment.

Ideal for
Large enterprises, platform teams
Compatibility
Managed K8s (EKS, GKE, AKS) and on-prem distributions

HA, autoscaling, thousands of repos.

All deployments include automated backup, monitoring endpoints, and health checks. Container images are pulled from our private registry (or your mirror).

Bring your own LLM.

Retrieval does the work, so a 70B-class open-weights model handles most queries. No H100 cluster required.

Most of what makes a frontier model look smart on code is context. We supply the context, you bring a smaller model that fits your GPU budget.

Recommended Open-Weights Models

Tested with these. Anything OpenAI-compatible also works.

  • gpt-oss-120b

    OpenAI open-weights generalist; strong default for coding/reasoning

  • GLM 5.1

    Zhipu AI flagship; long context, strong code understanding

  • Kimi 2.6

    Moonshot agentic model; large context window

  • DeepSeek V4

    DeepSeek MoE flagship; strong on coding tasks

  • Qwen 3 Coder

    Alibaba long-context coder model

Hardware footprint depends on the chosen model and quantization. Pick a model that fits the GPU budget you already have. CodeAlive's retrieval lets a smaller model behave like a larger one with better context.

Commercial API Support

For hybrid deployments.

  • OpenAI (GPT-class models)
  • Anthropic Claude
  • Google Gemini (default in current self-hosted builds)
  • DeepInfra
  • Azure OpenAI Service

How You Wire It In

CodeAlive talks to any OpenAI-compatible API. That covers vLLM, Ollama, TGI, LocalAI, SGLang, and most internal LLM gateways. Point the LLM endpoint at your inference stack, drop in an API key, done.

Smaller model with good retrieval beats a frontier model without it.

What's actually in the codebase today

No SOC 2 badge yet. Instead, here's what's actually in the codebase today.

Tenant Isolation

  • Every query auto-filtered by organization ID at the repository layer
  • Cross-tenant access throws WrongTenantAccessException, enforced in code, not policy
  • Sandboxed indexing containers per repo with no DB or gateway access
  • Covered by integration tests that run on every build

Access Control

  • Role-based access control (RBAC) via the Mandate model (Administrator / Manager / User / ReadOnly / Guest)
  • Organization-level workspaces and per-repository scoping
  • API keys bound to a single org and storable as SHA-256 hashes
  • SSO / SAML: on the roadmap; available today only for Enterprise via custom deployment

Data Protection

  • Source code is not permanently stored. Repos are pulled into a sandbox during indexing and deleted after
  • Code symbols stored encrypted: AES-256-GCM with envelope encryption
  • Per-org KEK binding: even with the master key, decrypting another org's data fails cryptographically
  • TLS 1.3 in transit; key material zeroed after use; KEKs versioned for zero-downtime rotation

Audit & Observability

  • Structured logging via OpenTelemetry (traces + logs + metrics)
  • Query traces include user/org context; consumable from your existing log backend (Grafana / Loki / Datadog / Splunk)
  • GDPR-aware consent logging for cookie/preference events
  • SIEM-ready format (JSON), no native SIEM connectors yet

Network Security

  • Runs entirely within your VPC / network for self-hosted
  • Default-deny network policies between services
  • Operator access via VPN; no public Kubernetes API
  • Secrets injected via External Secrets Operator, never in images or env files

Vulnerability Management

  • Container image signing
  • Regular security patches via versioned image releases
  • CVE monitoring on dependencies
  • SBOM available on request

Available on Request

  • NDA tailored to your requirements
  • Security overview document
  • Pen-test engagement window for Enterprise customers

What runs where

Every component, where it sits in your network, and what it talks to.

Component Descriptions

Context Engine
Indexes and queries your codebase
GraphRAG Engine
Builds and queries the knowledge graph of code relationships
Indexer Service
Processes repositories and extracts semantic information
Query Processor
Handles natural language queries and retrieves relevant context
Vector DB
Stores embeddings for semantic search
Your LLM
The language model you choose to run (fully under your control)

Resource Requirements

CodeAlive itself, excluding any local LLM you run alongside.

  • Minimum (POC)

    Repositories
    A handful
    Users
    A few
    CPU
    4 cores
    RAM
    8GB
    Storage
    50GB SSD
  • Production (small/medium team)

    Repositories
    Up to ~200
    Users
    Up to ~100
    CPU
    8+ cores
    RAM
    16-32GB
    Storage
    200GB+ SSD
  • Large team / multi-team

    Repositories
    Up to ~1,000
    Users
    Up to ~500
    CPU
    16+ cores
    RAM
    64GB+
    Storage
    500GB+ SSD
  • Enterprise

    Repositories
    Custom
    Users
    Custom
    CPU
    Sized with you
    RAM
    Sized with you
    Storage
    Sized with you

GPU is only required if you also run a local LLM on the same host. For an OpenAI-compatible LLM running elsewhere (your own vLLM cluster, an LLM gateway, or a cloud API), CodeAlive itself runs CPU-only. Real footprint depends on repo size, indexing cadence, and concurrent users. Happy to size with you on a deployment call.

Talks to your existing stack

The Git servers, identity providers, and observability stack you already run.

Git Providers (Internal)

  • GitLab Self-Managed
  • Bitbucket Data Center
  • Gitea
  • GitHub Enterprise Server
  • Azure DevOps Server

Identity Providers

Roadmap; available on Enterprise via custom deployment.

  • SAML 2.0 / OIDC (planned for general availability)
  • Okta
  • Azure AD / Entra ID
  • Ping Identity
  • OneLogin
  • Keycloak

Today:Email/password + API keys with org-scoped RBAC

AI Agents & IDEs (via MCP)

  • Cursor
  • Claude Code
  • Continue
  • Cline
  • VS Code (via extension)
  • JetBrains IDEs

Observability

  • Prometheus metrics endpoint
  • OpenTelemetry traces
  • Structured logging (JSON)
  • Integration with Grafana, Datadog, Splunk

CI/CD Integration

  • API for pipeline integration
  • Webhook support
  • GitHub Actions (self-hosted runners)
  • GitLab CI
  • Jenkins

Customer stories

Self-hosted customers land here once they go live and approve a quote. No invented personas in the meantime. Want to be the first reference? Talk to us.

For now, see the homepage testimonials: Hauke Feddersen (grasbyte GmbH), Zhaksylyk Ualiyev (Esqadra Technologies), Alexander Kolotov (Blockscout), Sergey Loginov, and Sergey Sarafinovich. Those are real, published with permission.

First answer in a day, production in a week

You're not figuring out the Helm chart alone. We deploy with you.

Deployment timeline

  1. Discovery Call

    Day 0
    • Understand your requirements
    • Assess infrastructure compatibility
    • Define success criteria
  2. Architecture Review

    Day 1-2
    • Finalize deployment topology
    • Plan integration points
    • Security review
  3. Deployment

    Day 3-5
    • Guided installation
    • Configuration assistance
    • Initial repository indexing
  4. Validation

    Day 5-7
    • Functional testing
    • Performance tuning
    • User acceptance
  5. Go Live

    Day 7+
    • Team onboarding
    • Ongoing support
    • Regular check-ins

Support Tiers

  • Standard

    Response Time
    24 hours
    Included With
    All self-hosted
  • Priority

    Response Time
    4 hours
    Included With
    Enterprise tier
  • Dedicated

    Response Time
    1 hour + Slack
    Included With
    Enterprise+

Common Questions About Self-Hosted Deployment

Ready to run it inside your perimeter?

Inside your VPC, against your LLM, talking to your internal Git server.

  • Same engine as the cloud product: same retrieval, same MCP, same agents
  • Deployable via Docker Compose or Kubernetes / Helm
  • Bring your own LLM; anything OpenAI-compatible
  • NDA available; security overview document on request