Self-Hosting Memintel

Memintel is open source and designed to be self-hosted. This page covers everything you need to run Memintel on your own infrastructure — from provisioning dependencies through to verifying a working deployment.

note

The hosted cloud version is the fastest way to get started. Self-hosting is for teams that need data residency, private infrastructure, or custom deployment configurations.

Architecture

Memintel has a minimal production footprint. There is no message queue, no scheduler, and no object storage dependency.

Your Application
      ↓
Memintel Backend  (Python / FastAPI)
      ↓           ↓
PostgreSQL 15+   Redis 7+
      ↑
LLM Provider (Anthropic / Any OpenAI-compatible endpoint / AWS Bedrock / Google Vertex AI)

Three dependencies. One process. No scheduler.

The application layer owns scheduling — your application calls POST /evaluate/full on whatever cadence it needs. Memintel does not include a built-in scheduler.

Prerequisites

Dependency	Minimum Version	Notes
Python	3.11+	Required to run the backend
PostgreSQL	15+	Primary database. Cloud-hosted recommended
Redis	7+	Cache layer. Cloud-hosted recommended
Node.js	18+	Required for TypeScript SDK only — not needed for the backend
Git	Any recent	To clone the repository
LLM Provider	—	Anthropic (cloud), any OpenAI-compatible endpoint (Ollama, vLLM, Azure OpenAI, on-premise), AWS Bedrock (planned), Google Vertex AI (planned)

Step 1 — Provision Dependencies

PostgreSQL

Set up a PostgreSQL 15+ instance. Recommended cloud providers:

Provider	Notes	Free Tier
Supabase	Easiest setup. Managed PostgreSQL with dashboard	Yes — 500MB
Neon	Serverless PostgreSQL. Good for variable workloads	Yes — 3GB
AWS RDS	Best for production at scale	No
Railway	Simple setup. Good for development and staging	Yes — limited

Once provisioned, your connection string will look like:

postgresql://username:password@host:5432/memintel

warning

Never put credentials in any config file committed to Git. Always use environment variables.

Redis

Set up a Redis 7+ instance. Recommended providers:

Provider	Notes	Free Tier
Upstash	Serverless Redis. Easiest setup. Pay per request	Yes — 10,000 req/day
Redis Cloud	Official Redis hosting	Yes — 30MB
AWS ElastiCache	Best for production at scale	No
Railway	Simple setup alongside the backend	Yes — limited

LLM Provider

Memintel supports two LLM providers today, with two more planned:

Provider	Config value	Notes
Anthropic	`provider: anthropic`	Simplest setup. Recommended for cloud deployments.
OpenAI-compatible	`provider: openai_compatible`	Covers any server that speaks the OpenAI chat completions API: Ollama, vLLM, LM Studio, Azure OpenAI, and any on-premise model deployment.
AWS Bedrock	`provider: bedrock`	Planned — not yet implemented.
Google Vertex AI	`provider: vertex`	Planned — not yet implemented.

The provider is selected in memintel_config.yaml:

# Cloud deployment — Anthropic
llm:
  provider: anthropic
  model: claude-sonnet-4-6
  api_key: ${ANTHROPIC_API_KEY}

# On-premise deployment — any OpenAI-compatible endpoint
# Covers: Ollama, vLLM, LM Studio, Azure OpenAI, private model servers
llm:
  provider: openai_compatible
  base_url: https://llm.internal.yourcompany.com/v1
  model: llama3.1:70b
  api_key: ${INTERNAL_LLM_KEY}   # omit entirely if your endpoint needs no auth
  timeout_seconds: 60             # increase for slower on-premise GPU servers
  ssl_verify: false               # set false for self-signed internal certificates

For Anthropic, get your API key from console.anthropic.com. For on-premise deployments, the api_key field is optional — omit it entirely if your internal endpoint requires no authentication.

On-premise deployments

When using an on-premise model via openai_compatible, the LLM inference, data evaluation, and all decision records remain entirely within your network. Nothing leaves your infrastructure at any point in the pipeline. The guardrails system bounds the LLM's output regardless of which model is used — a weaker on-premise model produces the same validated, deterministic compiled output as a cloud model. Tighter guardrails configuration compensates for any capability difference.

Backend Hosting

The FastAPI backend runs on any platform that supports Python:

Platform	Best For	Notes
Railway	Quickest deployment	Supports env vars, auto-deploy from GitHub
Render	Simple managed hosting	Free tier available. Auto-deploy from GitHub
AWS EC2 / ECS	Full control at scale	More configuration required
Google Cloud Run	Serverless containers	Good for variable workloads

Step 2 — Configure Environment Variables

warning

The server will refuse to start if any required variable is missing. This is by design.

Required Variables

Variable	Required	Default	Description
`DATABASE_URL`	Yes	—	Full PostgreSQL connection string including credentials
`REDIS_URL`	Yes	—	Full Redis connection string including credentials
`MEMINTEL_CONFIG_PATH`	Yes	—	Absolute path to `memintel_config.yaml` on the server

Optional Variables

Variable	Required	Default	Description
`ANTHROPIC_API_KEY`	No*	—	Required when `llm.provider: anthropic` and `USE_LLM_FIXTURES=false`
`MEMINTEL_ELEVATED_KEY`	Yes	—	Secret key for privileged API endpoints (registry, compile, guardrails, calibration). Use a strong random string
`USE_LLM_FIXTURES`	No	`true`	Set to `false` in production to use a real LLM provider
`MAX_RETRIES`	No	`3`	LLM refinement loop retries before failing
`DB_POOL_MIN`	No	`5`	Minimum asyncpg connection pool size
`DB_POOL_MAX`	No	`20`	Maximum asyncpg connection pool size
`DB_COMMAND_TIMEOUT`	No	`30`	Per-statement database timeout in seconds

tip

ANTHROPIC_API_KEY is only required when USE_LLM_FIXTURES=false and llm.provider is set to anthropic. For openai_compatible providers, set base_url in memintel_config.yaml instead — no additional environment variable is needed unless your endpoint requires an API key. During initial setup and smoke testing, leave USE_LLM_FIXTURES=true — this validates the full pipeline without consuming LLM credits regardless of which provider is configured.

note

If your memintel_config.yaml references connector environment variables (for example ANALYTICS_DB_HOST, ANALYTICS_DB_USER, ANALYTICS_DB_PASSWORD, ACCOUNTS_DB_HOST, ACCOUNTS_DB_USER, ACCOUNTS_DB_PASSWORD), those must also be set before starting the server. The server will exit with a ConfigError naming the missing variable if any are absent.

Setting Variables

On Linux / Mac:

# For Anthropic:
export DATABASE_URL="postgresql://user:password@host:5432/memintel"
export REDIS_URL="redis://host:6379/0"
export MEMINTEL_CONFIG_PATH="/etc/memintel/memintel_config.yaml"
export USE_LLM_FIXTURES="false"
export ANTHROPIC_API_KEY="sk-ant-..."

# For OpenAI-compatible / on-premise (set base_url in memintel_config.yaml):
export DATABASE_URL="postgresql://user:password@host:5432/memintel"
export REDIS_URL="redis://host:6379/0"
export MEMINTEL_CONFIG_PATH="/etc/memintel/memintel_config.yaml"
export USE_LLM_FIXTURES="false"
# Only set if your endpoint requires authentication:
export INTERNAL_LLM_KEY="your-internal-key"

On Railway, Render, or other cloud platforms: set these in the platform's environment variables dashboard. Never put them in a .env file committed to Git.

Step 3 — Install and Run

Clone the Repository

git clone https://github.com/SooperGenAI/memintel.git
cd memintel/backend/memintel-backend

Install Python Dependencies

Memintel requires Python 3.11. If you have a newer Python version installed, create a virtual environment explicitly using 3.11:

python3.11 -m venv venv

Activate it:

# On Mac/Linux:
source venv/bin/activate

# On Windows:
venv\Scripts\activate

Then install dependencies:

pip install -r requirements.txt

Verify key packages are installed:

python -c "import fastapi, pydantic, asyncpg, aioredis; print('OK')"

Run Unit Tests

python -m pytest tests/unit/ -q --tb=short

All unit tests should pass before proceeding.

Step 4 — Run Database Migrations

With DATABASE_URL set, run the Alembic migrations:

alembic upgrade head

Expected output:

INFO [alembic.runtime.migration] Running upgrade -> 0001, initial_schema
INFO [alembic.runtime.migration] Running upgrade 0001 -> 0002, add_application_context
INFO [alembic.runtime.migration] Running upgrade 0002 -> 0003, add_guardrails_versions
INFO [alembic.runtime.migration] Running upgrade 0003 -> 0004, add_decision_records

warning

If you see any error during migration, do not proceed. Fix the migration error before starting the server.

warning

If alembic reports a driver error such as "scheme is expected to be either postgresql or postgres", temporarily set DATABASE_URL with the asyncpg prefix for the migration command only:

DATABASE_URL=postgresql+asyncpg://user:password@host:5432/memintel alembic upgrade head

On Windows PowerShell:

$env:DATABASE_URL = "postgresql+asyncpg://user:password@host:5432/memintel"
alembic upgrade head

Then revert DATABASE_URL back to postgresql:// before starting the server.

Verify the schema — you should see all nine tables:

psql $DATABASE_URL -c "\dt"

Table	Purpose
`tasks`	Task definitions with version pinning. Carries three provenance fields: `context_version`, `guardrails_version`, `context_warning`
`definitions`	Concept, primitive, and condition definitions
`feedback`	User feedback for calibration
`calibration_tokens`	Single-use calibration tokens
`execution_graphs`	Compiled execution graphs (IR)
`jobs`	Async job tracking
`application_context`	Application context versions
`guardrails_versions`	Guardrails versions — created by Migration 0003. Stores all API-posted guardrails with full audit trail
`decisions`	Decision audit trail — every evaluation recorded with concept_value, threshold_applied, condition_version, ir_hash, input_primitives, signal_errors, and outcome

Step 5 — Start the Server

uvicorn app.main:app --host 0.0.0.0 --port 8000

For production with multiple workers:

uvicorn app.main:app --host 0.0.0.0 --port 8000 --workers 4

Verify Startup

The server logs should show:

INFO: Started server process
INFO: Waiting for application startup.
INFO: Application startup complete.

warning

If startup fails with ConfigError, check that all required environment variables are set and MEMINTEL_CONFIG_PATH points to a valid file.

Health Check

curl http://localhost:8000/openapi.json

If the openapi endpoint returns Not Found, verify the server is running by opening the Swagger API docs in your browser:

http://localhost:8000/docs

A working deployment will show the full Memintel API reference. If the page loads, the server is running correctly.

warning

If database or cache shows "disconnected", verify DATABASE_URL and REDIS_URL are correct and the services are reachable from the server.

Step 6 — Smoke Test

Run these tests with USE_LLM_FIXTURES=true before switching to a real LLM. This validates the full pipeline against your real database and Redis without consuming LLM credits.

Define Application Context

curl -X POST http://localhost:8000/context/context \
  -H "Content-Type: application/json" \
  -d '{
    "domain": {
      "description": "Smoke test deployment",
      "entities": [{"name": "user", "description": "test user"}],
      "decisions": ["test_decision"]
    }
  }'

Expected: HTTP 201 with context_id and "version": "v1"

Create a Task

curl -X POST http://localhost:8000/tasks \
  -H "Content-Type: application/json" \
  -d '{
    "intent": "Alert when churn score exceeds 0.8",
    "entity_scope": "user",
    "delivery": {"type": "webhook", "endpoint": "https://example.com/hook"}
  }'

Expected: HTTP 200 with task_id, condition_id, "context_version": "v1", "context_warning": null

Execute Full Pipeline

curl -X POST http://localhost:8000/evaluate/full \
  -H "Content-Type: application/json" \
  -d '{
    "concept_id": "<concept_id from task>",
    "concept_version": "<concept_version from task>",
    "condition_id": "<condition_id from task>",
    "condition_version": "<condition_version from task>",
    "entity": "user_test_001",
    "timestamp": "2024-01-15T09:00:00Z"
  }'

Expected: HTTP 200, result.deterministic: true

Verify Determinism

Run the same execute call three times with identical parameters. All three responses must return the same result.value and decision.value:

for i in 1 2 3; do
  curl -s -X POST http://localhost:8000/evaluate/full \
    -H 'Content-Type: application/json' \
    -d '{...same payload...}' | jq '.result.value'
done

If the values diverge, stop and investigate before going to production.

Step 7 — Switch to Real LLM

Once all smoke tests pass against the real database, update your environment variables and memintel_config.yaml to enable the real LLM provider.

For Anthropic:

export USE_LLM_FIXTURES=false
export ANTHROPIC_API_KEY="sk-ant-..."

Ensure memintel_config.yaml has:

llm:
  provider: anthropic
  model: claude-sonnet-4-6
  api_key: ${ANTHROPIC_API_KEY}

For OpenAI-compatible / on-premise:

export USE_LLM_FIXTURES=false
# Only if your endpoint requires authentication:
export INTERNAL_LLM_KEY="your-internal-key"

Ensure memintel_config.yaml has:

llm:
  provider: openai_compatible
  base_url: https://llm.internal.yourcompany.com/v1
  model: llama3.1:70b
  api_key: ${INTERNAL_LLM_KEY}   # omit this line if no auth required
  timeout_seconds: 60
  ssl_verify: false               # only if using self-signed certificates

Restart the server after changing these variables, then repeat the smoke test with the real LLM. This will consume LLM credits for Anthropic — run once to verify, not repeatedly.

Test one task creation for each strategy type to confirm the LLM is resolving correctly within guardrails:

Strategy	Test Intent
`threshold`	`"Alert when churn score exceeds 0.8"`
`z_score`	`"Alert when login frequency drops significantly"`
`equals`	`"Alert when user segment is high_risk"`
`composite`	`"Alert when churn is high AND engagement is low"`

Step 8 — Post-Deployment Setup

Define Application Context

The first action after deployment is verified — define your application context. This significantly improves task creation accuracy from the first real user request.

curl -X POST https://your-domain/context/context \
  -H "Content-Type: application/json" \
  -d @your-context-file.json

See the Application Context page in the Admin Guide for domain-specific examples and full schema documentation.

Define Guardrails (Recommended)

Post your guardrails policy via API. This takes effect immediately without a restart:

curl -X POST https://your-domain/guardrails \
  -H "Content-Type: application/json" \
  -H "X-Elevated-Key: your-elevated-key" \
  -d @guardrails.json

tip

Using POST /guardrails is preferred over editing memintel_guardrails.yaml — changes take effect immediately with no restart required, and every change is versioned. See the Admin Guide for full guardrails schema and domain-specific examples.

Monitoring

Memintel emits structured logs with trace_id, entity, concept_id, and condition_id on every execution. Configure your log aggregation tool to index on these fields:

Index trace_id for request tracing
Alert on execution_error log entries
Dashboard: execution duration p50 / p95 / p99
Dashboard: condition evaluation true/false ratio per condition_id

First Week

Day	Action
Day 1	Define application context (`POST /context/context`). Post guardrails (`POST /guardrails`). Create first task.
Day 2–3	Run executions. Collect initial feedback. Verify feedback is being recorded.
Day 4–5	Run first calibration cycle. Review recommended parameter adjustments.
Day 7	Review execution logs. Check for unexpected errors. Verify determinism is holding.

Estimated Deployment Time

Phase	Estimated Time
Infrastructure setup	30–60 minutes
Environment configuration	15 minutes
Database migration	5 minutes
Service startup and verification	15 minutes
Smoke testing (fixtures mode)	20 minutes
Real LLM integration test	30 minutes
Total	~2 hours

Known Issues

aioredis compatibility with Python 3.11

aioredis has a known incompatibility with Python 3.11 that causes this error on startup:

TypeError: duplicate base class TimeoutError

Fix it by patching the installed aioredis exceptions file after pip install completes.

On Mac/Linux:

sed -i 's/class TimeoutError(asyncio.TimeoutError, builtins.TimeoutError, RedisError):/class TimeoutError(asyncio.TimeoutError, RedisError):/' venv/lib/python3.11/site-packages/aioredis/exceptions.py

On Windows PowerShell:

(Get-Content venv\Lib\site-packages\aioredis\exceptions.py) -replace 'class TimeoutError\(asyncio\.TimeoutError, builtins\.TimeoutError, RedisError\):', 'class TimeoutError(asyncio.TimeoutError, RedisError):' | Set-Content venv\Lib\site-packages\aioredis\exceptions.py

This patch is applied to your local venv only and does not affect the source code. It will need to be reapplied if you recreate the venv.

Architecture​

Prerequisites​

Step 1 — Provision Dependencies​

PostgreSQL​

Redis​

LLM Provider​

Backend Hosting​

Step 2 — Configure Environment Variables​

Required Variables​

Optional Variables​

Setting Variables​

Step 3 — Install and Run​

Clone the Repository​

Install Python Dependencies​

Run Unit Tests​

Step 4 — Run Database Migrations​

Step 5 — Start the Server​

Verify Startup​

Health Check​

Step 6 — Smoke Test​

Define Application Context​

Create a Task​

Execute Full Pipeline​

Verify Determinism​

Step 7 — Switch to Real LLM​

Step 8 — Post-Deployment Setup​

Define Application Context​

Define Guardrails (Recommended)​

Monitoring​

First Week​

Estimated Deployment Time​

Known Issues​

aioredis compatibility with Python 3.11​

Further Reading​

Architecture

Prerequisites

Step 1 — Provision Dependencies

PostgreSQL

Redis

LLM Provider

Backend Hosting

Step 2 — Configure Environment Variables

Required Variables

Optional Variables

Setting Variables

Step 3 — Install and Run

Clone the Repository

Install Python Dependencies

Run Unit Tests

Step 4 — Run Database Migrations

Step 5 — Start the Server

Verify Startup

Health Check

Step 6 — Smoke Test

Define Application Context

Create a Task

Execute Full Pipeline

Verify Determinism

Step 7 — Switch to Real LLM

Step 8 — Post-Deployment Setup

Define Application Context

Define Guardrails (Recommended)

Monitoring

First Week

Estimated Deployment Time

Known Issues

aioredis compatibility with Python 3.11

Further Reading