Skip to main content

Self-Hosting Memintel

Memintel is open source and designed to be self-hosted. This page covers everything you need to run Memintel on your own infrastructure — from provisioning dependencies through to verifying a working deployment.

note

The hosted cloud version is the fastest way to get started. Self-hosting is for teams that need data residency, private infrastructure, or custom deployment configurations.


Architecture

Memintel has a minimal production footprint. There is no message queue, no scheduler, and no object storage dependency.

Your Application

Memintel Backend (Python / FastAPI)
↓ ↓
PostgreSQL 15+ Redis 7+

LLM Provider (Anthropic / Any OpenAI-compatible endpoint / AWS Bedrock / Google Vertex AI)

Three dependencies. One process. No scheduler.

The application layer owns scheduling — your application calls POST /evaluate/full on whatever cadence it needs. Memintel does not include a built-in scheduler.


Prerequisites

DependencyMinimum VersionNotes
Python3.11+Required to run the backend
PostgreSQL15+Primary database. Cloud-hosted recommended
Redis7+Cache layer. Cloud-hosted recommended
Node.js18+Required for TypeScript SDK only — not needed for the backend
GitAny recentTo clone the repository
LLM ProviderAnthropic (cloud), any OpenAI-compatible endpoint (Ollama, vLLM, Azure OpenAI, on-premise), AWS Bedrock (planned), Google Vertex AI (planned)

Step 1 — Provision Dependencies

PostgreSQL

Set up a PostgreSQL 15+ instance. Recommended cloud providers:

ProviderNotesFree Tier
SupabaseEasiest setup. Managed PostgreSQL with dashboardYes — 500MB
NeonServerless PostgreSQL. Good for variable workloadsYes — 3GB
AWS RDSBest for production at scaleNo
RailwaySimple setup. Good for development and stagingYes — limited

Once provisioned, your connection string will look like:

postgresql://username:password@host:5432/memintel
warning

Never put credentials in any config file committed to Git. Always use environment variables.

Redis

Set up a Redis 7+ instance. Recommended providers:

ProviderNotesFree Tier
UpstashServerless Redis. Easiest setup. Pay per requestYes — 10,000 req/day
Redis CloudOfficial Redis hostingYes — 30MB
AWS ElastiCacheBest for production at scaleNo
RailwaySimple setup alongside the backendYes — limited

LLM Provider

Memintel supports two LLM providers today, with two more planned:

ProviderConfig valueNotes
Anthropicprovider: anthropicSimplest setup. Recommended for cloud deployments.
OpenAI-compatibleprovider: openai_compatibleCovers any server that speaks the OpenAI chat completions API: Ollama, vLLM, LM Studio, Azure OpenAI, and any on-premise model deployment.
AWS Bedrockprovider: bedrockPlanned — not yet implemented.
Google Vertex AIprovider: vertexPlanned — not yet implemented.

The provider is selected in memintel_config.yaml:

# Cloud deployment — Anthropic
llm:
provider: anthropic
model: claude-sonnet-4-6
api_key: ${ANTHROPIC_API_KEY}

# On-premise deployment — any OpenAI-compatible endpoint
# Covers: Ollama, vLLM, LM Studio, Azure OpenAI, private model servers
llm:
provider: openai_compatible
base_url: https://llm.internal.yourcompany.com/v1
model: llama3.1:70b
api_key: ${INTERNAL_LLM_KEY} # omit entirely if your endpoint needs no auth
timeout_seconds: 60 # increase for slower on-premise GPU servers
ssl_verify: false # set false for self-signed internal certificates

For Anthropic, get your API key from console.anthropic.com. For on-premise deployments, the api_key field is optional — omit it entirely if your internal endpoint requires no authentication.

On-premise deployments

When using an on-premise model via openai_compatible, the LLM inference, data evaluation, and all decision records remain entirely within your network. Nothing leaves your infrastructure at any point in the pipeline. The guardrails system bounds the LLM's output regardless of which model is used — a weaker on-premise model produces the same validated, deterministic compiled output as a cloud model. Tighter guardrails configuration compensates for any capability difference.

Backend Hosting

The FastAPI backend runs on any platform that supports Python:

PlatformBest ForNotes
RailwayQuickest deploymentSupports env vars, auto-deploy from GitHub
RenderSimple managed hostingFree tier available. Auto-deploy from GitHub
AWS EC2 / ECSFull control at scaleMore configuration required
Google Cloud RunServerless containersGood for variable workloads

Step 2 — Configure Environment Variables

warning

The server will refuse to start if any required variable is missing. This is by design.

Required Variables

VariableRequiredDefaultDescription
DATABASE_URLYesFull PostgreSQL connection string including credentials
REDIS_URLYesFull Redis connection string including credentials
MEMINTEL_CONFIG_PATHYesAbsolute path to memintel_config.yaml on the server

Optional Variables

VariableRequiredDefaultDescription
ANTHROPIC_API_KEYNo*Required when llm.provider: anthropic and USE_LLM_FIXTURES=false
MEMINTEL_ELEVATED_KEYYesSecret key for privileged API endpoints (registry, compile, guardrails, calibration). Use a strong random string
USE_LLM_FIXTURESNotrueSet to false in production to use a real LLM provider
MAX_RETRIESNo3LLM refinement loop retries before failing
DB_POOL_MINNo5Minimum asyncpg connection pool size
DB_POOL_MAXNo20Maximum asyncpg connection pool size
DB_COMMAND_TIMEOUTNo30Per-statement database timeout in seconds
tip

ANTHROPIC_API_KEY is only required when USE_LLM_FIXTURES=false and llm.provider is set to anthropic. For openai_compatible providers, set base_url in memintel_config.yaml instead — no additional environment variable is needed unless your endpoint requires an API key. During initial setup and smoke testing, leave USE_LLM_FIXTURES=true — this validates the full pipeline without consuming LLM credits regardless of which provider is configured.

note

If your memintel_config.yaml references connector environment variables (for example ANALYTICS_DB_HOST, ANALYTICS_DB_USER, ANALYTICS_DB_PASSWORD, ACCOUNTS_DB_HOST, ACCOUNTS_DB_USER, ACCOUNTS_DB_PASSWORD), those must also be set before starting the server. The server will exit with a ConfigError naming the missing variable if any are absent.

Setting Variables

On Linux / Mac:

# For Anthropic:
export DATABASE_URL="postgresql://user:password@host:5432/memintel"
export REDIS_URL="redis://host:6379/0"
export MEMINTEL_CONFIG_PATH="/etc/memintel/memintel_config.yaml"
export USE_LLM_FIXTURES="false"
export ANTHROPIC_API_KEY="sk-ant-..."

# For OpenAI-compatible / on-premise (set base_url in memintel_config.yaml):
export DATABASE_URL="postgresql://user:password@host:5432/memintel"
export REDIS_URL="redis://host:6379/0"
export MEMINTEL_CONFIG_PATH="/etc/memintel/memintel_config.yaml"
export USE_LLM_FIXTURES="false"
# Only set if your endpoint requires authentication:
export INTERNAL_LLM_KEY="your-internal-key"

On Railway, Render, or other cloud platforms: set these in the platform's environment variables dashboard. Never put them in a .env file committed to Git.


Step 3 — Install and Run

Clone the Repository

git clone https://github.com/SooperGenAI/memintel.git
cd memintel/backend/memintel-backend

Install Python Dependencies

Memintel requires Python 3.11. If you have a newer Python version installed, create a virtual environment explicitly using 3.11:

python3.11 -m venv venv

Activate it:

# On Mac/Linux:
source venv/bin/activate

# On Windows:
venv\Scripts\activate

Then install dependencies:

pip install -r requirements.txt

Verify key packages are installed:

python -c "import fastapi, pydantic, asyncpg, aioredis; print('OK')"

Run Unit Tests

python -m pytest tests/unit/ -q --tb=short

All unit tests should pass before proceeding.


Step 4 — Run Database Migrations

With DATABASE_URL set, run the Alembic migrations:

alembic upgrade head

Expected output:

INFO [alembic.runtime.migration] Running upgrade -> 0001, initial_schema
INFO [alembic.runtime.migration] Running upgrade 0001 -> 0002, add_application_context
INFO [alembic.runtime.migration] Running upgrade 0002 -> 0003, add_guardrails_versions
INFO [alembic.runtime.migration] Running upgrade 0003 -> 0004, add_decision_records
warning

If you see any error during migration, do not proceed. Fix the migration error before starting the server.

warning

If alembic reports a driver error such as "scheme is expected to be either postgresql or postgres", temporarily set DATABASE_URL with the asyncpg prefix for the migration command only:

DATABASE_URL=postgresql+asyncpg://user:password@host:5432/memintel alembic upgrade head

On Windows PowerShell:

$env:DATABASE_URL = "postgresql+asyncpg://user:password@host:5432/memintel"
alembic upgrade head

Then revert DATABASE_URL back to postgresql:// before starting the server.

Verify the schema — you should see all nine tables:

psql $DATABASE_URL -c "\dt"
TablePurpose
tasksTask definitions with version pinning. Carries three provenance fields: context_version, guardrails_version, context_warning
definitionsConcept, primitive, and condition definitions
feedbackUser feedback for calibration
calibration_tokensSingle-use calibration tokens
execution_graphsCompiled execution graphs (IR)
jobsAsync job tracking
application_contextApplication context versions
guardrails_versionsGuardrails versions — created by Migration 0003. Stores all API-posted guardrails with full audit trail
decisionsDecision audit trail — every evaluation recorded with concept_value, threshold_applied, condition_version, ir_hash, input_primitives, signal_errors, and outcome

Step 5 — Start the Server

uvicorn app.main:app --host 0.0.0.0 --port 8000

For production with multiple workers:

uvicorn app.main:app --host 0.0.0.0 --port 8000 --workers 4

Verify Startup

The server logs should show:

INFO: Started server process
INFO: Waiting for application startup.
INFO: Application startup complete.
warning

If startup fails with ConfigError, check that all required environment variables are set and MEMINTEL_CONFIG_PATH points to a valid file.

Health Check

curl http://localhost:8000/openapi.json

If the openapi endpoint returns Not Found, verify the server is running by opening the Swagger API docs in your browser:

http://localhost:8000/docs

A working deployment will show the full Memintel API reference. If the page loads, the server is running correctly.

warning

If database or cache shows "disconnected", verify DATABASE_URL and REDIS_URL are correct and the services are reachable from the server.


Step 6 — Smoke Test

Run these tests with USE_LLM_FIXTURES=true before switching to a real LLM. This validates the full pipeline against your real database and Redis without consuming LLM credits.

Define Application Context

curl -X POST http://localhost:8000/context/context \
-H "Content-Type: application/json" \
-d '{
"domain": {
"description": "Smoke test deployment",
"entities": [{"name": "user", "description": "test user"}],
"decisions": ["test_decision"]
}
}'

Expected: HTTP 201 with context_id and "version": "v1"

Create a Task

curl -X POST http://localhost:8000/tasks \
-H "Content-Type: application/json" \
-d '{
"intent": "Alert when churn score exceeds 0.8",
"entity_scope": "user",
"delivery": {"type": "webhook", "endpoint": "https://example.com/hook"}
}'

Expected: HTTP 200 with task_id, condition_id, "context_version": "v1", "context_warning": null

Execute Full Pipeline

curl -X POST http://localhost:8000/evaluate/full \
-H "Content-Type: application/json" \
-d '{
"concept_id": "<concept_id from task>",
"concept_version": "<concept_version from task>",
"condition_id": "<condition_id from task>",
"condition_version": "<condition_version from task>",
"entity": "user_test_001",
"timestamp": "2024-01-15T09:00:00Z"
}'

Expected: HTTP 200, result.deterministic: true

Verify Determinism

Run the same execute call three times with identical parameters. All three responses must return the same result.value and decision.value:

for i in 1 2 3; do
curl -s -X POST http://localhost:8000/evaluate/full \
-H 'Content-Type: application/json' \
-d '{...same payload...}' | jq '.result.value'
done

If the values diverge, stop and investigate before going to production.


Step 7 — Switch to Real LLM

Once all smoke tests pass against the real database, update your environment variables and memintel_config.yaml to enable the real LLM provider.

For Anthropic:

export USE_LLM_FIXTURES=false
export ANTHROPIC_API_KEY="sk-ant-..."

Ensure memintel_config.yaml has:

llm:
provider: anthropic
model: claude-sonnet-4-6
api_key: ${ANTHROPIC_API_KEY}

For OpenAI-compatible / on-premise:

export USE_LLM_FIXTURES=false
# Only if your endpoint requires authentication:
export INTERNAL_LLM_KEY="your-internal-key"

Ensure memintel_config.yaml has:

llm:
provider: openai_compatible
base_url: https://llm.internal.yourcompany.com/v1
model: llama3.1:70b
api_key: ${INTERNAL_LLM_KEY} # omit this line if no auth required
timeout_seconds: 60
ssl_verify: false # only if using self-signed certificates

Restart the server after changing these variables, then repeat the smoke test with the real LLM. This will consume LLM credits for Anthropic — run once to verify, not repeatedly.

Test one task creation for each strategy type to confirm the LLM is resolving correctly within guardrails:

StrategyTest Intent
threshold"Alert when churn score exceeds 0.8"
z_score"Alert when login frequency drops significantly"
equals"Alert when user segment is high_risk"
composite"Alert when churn is high AND engagement is low"

Step 8 — Post-Deployment Setup

Define Application Context

The first action after deployment is verified — define your application context. This significantly improves task creation accuracy from the first real user request.

curl -X POST https://your-domain/context/context \
-H "Content-Type: application/json" \
-d @your-context-file.json

See the Application Context page in the Admin Guide for domain-specific examples and full schema documentation.

Post your guardrails policy via API. This takes effect immediately without a restart:

curl -X POST https://your-domain/guardrails \
-H "Content-Type: application/json" \
-H "X-Elevated-Key: your-elevated-key" \
-d @guardrails.json
tip

Using POST /guardrails is preferred over editing memintel_guardrails.yaml — changes take effect immediately with no restart required, and every change is versioned. See the Admin Guide for full guardrails schema and domain-specific examples.

Monitoring

Memintel emits structured logs with trace_id, entity, concept_id, and condition_id on every execution. Configure your log aggregation tool to index on these fields:

  • Index trace_id for request tracing
  • Alert on execution_error log entries
  • Dashboard: execution duration p50 / p95 / p99
  • Dashboard: condition evaluation true/false ratio per condition_id

First Week

DayAction
Day 1Define application context (POST /context/context). Post guardrails (POST /guardrails). Create first task.
Day 2–3Run executions. Collect initial feedback. Verify feedback is being recorded.
Day 4–5Run first calibration cycle. Review recommended parameter adjustments.
Day 7Review execution logs. Check for unexpected errors. Verify determinism is holding.

Estimated Deployment Time

PhaseEstimated Time
Infrastructure setup30–60 minutes
Environment configuration15 minutes
Database migration5 minutes
Service startup and verification15 minutes
Smoke testing (fixtures mode)20 minutes
Real LLM integration test30 minutes
Total~2 hours

Known Issues

aioredis compatibility with Python 3.11

aioredis has a known incompatibility with Python 3.11 that causes this error on startup:

TypeError: duplicate base class TimeoutError

Fix it by patching the installed aioredis exceptions file after pip install completes.

On Mac/Linux:

sed -i 's/class TimeoutError(asyncio.TimeoutError, builtins.TimeoutError, RedisError):/class TimeoutError(asyncio.TimeoutError, RedisError):/' venv/lib/python3.11/site-packages/aioredis/exceptions.py

On Windows PowerShell:

(Get-Content venv\Lib\site-packages\aioredis\exceptions.py) -replace 'class TimeoutError\(asyncio\.TimeoutError, builtins\.TimeoutError, RedisError\):', 'class TimeoutError(asyncio.TimeoutError, RedisError):' | Set-Content venv\Lib\site-packages\aioredis\exceptions.py

This patch is applied to your local venv only and does not affect the source code. It will need to be reapplied if you recreate the venv.


Further Reading