Self-Hosting Memintel
Memintel is open source and designed to be self-hosted. This page covers everything you need to run Memintel on your own infrastructure — from provisioning dependencies through to verifying a working deployment.
The hosted cloud version is the fastest way to get started. Self-hosting is for teams that need data residency, private infrastructure, or custom deployment configurations.
Architecture
Memintel has a minimal production footprint. There is no message queue, no scheduler, and no object storage dependency.
Your Application
↓
Memintel Backend (Python / FastAPI)
↓ ↓
PostgreSQL 15+ Redis 7+
↑
LLM Provider (Anthropic / Any OpenAI-compatible endpoint / AWS Bedrock / Google Vertex AI)
Three dependencies. One process. No scheduler.
The application layer owns scheduling — your application calls POST /evaluate/full on whatever cadence it needs. Memintel does not include a built-in scheduler.
Prerequisites
| Dependency | Minimum Version | Notes |
|---|---|---|
| Python | 3.11+ | Required to run the backend |
| PostgreSQL | 15+ | Primary database. Cloud-hosted recommended |
| Redis | 7+ | Cache layer. Cloud-hosted recommended |
| Node.js | 18+ | Required for TypeScript SDK only — not needed for the backend |
| Git | Any recent | To clone the repository |
| LLM Provider | — | Anthropic (cloud), any OpenAI-compatible endpoint (Ollama, vLLM, Azure OpenAI, on-premise), AWS Bedrock (planned), Google Vertex AI (planned) |
Step 1 — Provision Dependencies
PostgreSQL
Set up a PostgreSQL 15+ instance. Recommended cloud providers:
| Provider | Notes | Free Tier |
|---|---|---|
| Supabase | Easiest setup. Managed PostgreSQL with dashboard | Yes — 500MB |
| Neon | Serverless PostgreSQL. Good for variable workloads | Yes — 3GB |
| AWS RDS | Best for production at scale | No |
| Railway | Simple setup. Good for development and staging | Yes — limited |
Once provisioned, your connection string will look like:
postgresql://username:password@host:5432/memintel
Never put credentials in any config file committed to Git. Always use environment variables.
Redis
Set up a Redis 7+ instance. Recommended providers:
| Provider | Notes | Free Tier |
|---|---|---|
| Upstash | Serverless Redis. Easiest setup. Pay per request | Yes — 10,000 req/day |
| Redis Cloud | Official Redis hosting | Yes — 30MB |
| AWS ElastiCache | Best for production at scale | No |
| Railway | Simple setup alongside the backend | Yes — limited |
LLM Provider
Memintel supports two LLM providers today, with two more planned:
| Provider | Config value | Notes |
|---|---|---|
| Anthropic | provider: anthropic | Simplest setup. Recommended for cloud deployments. |
| OpenAI-compatible | provider: openai_compatible | Covers any server that speaks the OpenAI chat completions API: Ollama, vLLM, LM Studio, Azure OpenAI, and any on-premise model deployment. |
| AWS Bedrock | provider: bedrock | Planned — not yet implemented. |
| Google Vertex AI | provider: vertex | Planned — not yet implemented. |
The provider is selected in memintel_config.yaml:
# Cloud deployment — Anthropic
llm:
provider: anthropic
model: claude-sonnet-4-6
api_key: ${ANTHROPIC_API_KEY}
# On-premise deployment — any OpenAI-compatible endpoint
# Covers: Ollama, vLLM, LM Studio, Azure OpenAI, private model servers
llm:
provider: openai_compatible
base_url: https://llm.internal.yourcompany.com/v1
model: llama3.1:70b
api_key: ${INTERNAL_LLM_KEY} # omit entirely if your endpoint needs no auth
timeout_seconds: 60 # increase for slower on-premise GPU servers
ssl_verify: false # set false for self-signed internal certificates
For Anthropic, get your API key from console.anthropic.com. For on-premise deployments, the api_key field is optional — omit it entirely if your internal endpoint requires no authentication.
When using an on-premise model via openai_compatible, the LLM inference, data evaluation, and all decision records remain entirely within your network. Nothing leaves your infrastructure at any point in the pipeline. The guardrails system bounds the LLM's output regardless of which model is used — a weaker on-premise model produces the same validated, deterministic compiled output as a cloud model. Tighter guardrails configuration compensates for any capability difference.
Backend Hosting
The FastAPI backend runs on any platform that supports Python:
| Platform | Best For | Notes |
|---|---|---|
| Railway | Quickest deployment | Supports env vars, auto-deploy from GitHub |
| Render | Simple managed hosting | Free tier available. Auto-deploy from GitHub |
| AWS EC2 / ECS | Full control at scale | More configuration required |
| Google Cloud Run | Serverless containers | Good for variable workloads |
Step 2 — Configure Environment Variables
The server will refuse to start if any required variable is missing. This is by design.
Required Variables
| Variable | Required | Default | Description |
|---|---|---|---|
DATABASE_URL | Yes | — | Full PostgreSQL connection string including credentials |
REDIS_URL | Yes | — | Full Redis connection string including credentials |
MEMINTEL_CONFIG_PATH | Yes | — | Absolute path to memintel_config.yaml on the server |
Optional Variables
| Variable | Required | Default | Description |
|---|---|---|---|
ANTHROPIC_API_KEY | No* | — | Required when llm.provider: anthropic and USE_LLM_FIXTURES=false |
MEMINTEL_ELEVATED_KEY | Yes | — | Secret key for privileged API endpoints (registry, compile, guardrails, calibration). Use a strong random string |
USE_LLM_FIXTURES | No | true | Set to false in production to use a real LLM provider |
MAX_RETRIES | No | 3 | LLM refinement loop retries before failing |
DB_POOL_MIN | No | 5 | Minimum asyncpg connection pool size |
DB_POOL_MAX | No | 20 | Maximum asyncpg connection pool size |
DB_COMMAND_TIMEOUT | No | 30 | Per-statement database timeout in seconds |
ANTHROPIC_API_KEY is only required when USE_LLM_FIXTURES=false and llm.provider is set to anthropic. For openai_compatible providers, set base_url in memintel_config.yaml instead — no additional environment variable is needed unless your endpoint requires an API key. During initial setup and smoke testing, leave USE_LLM_FIXTURES=true — this validates the full pipeline without consuming LLM credits regardless of which provider is configured.
If your memintel_config.yaml references connector environment variables (for example ANALYTICS_DB_HOST, ANALYTICS_DB_USER, ANALYTICS_DB_PASSWORD, ACCOUNTS_DB_HOST, ACCOUNTS_DB_USER, ACCOUNTS_DB_PASSWORD), those must also be set before starting the server. The server will exit with a ConfigError naming the missing variable if any are absent.
Setting Variables
On Linux / Mac:
# For Anthropic:
export DATABASE_URL="postgresql://user:password@host:5432/memintel"
export REDIS_URL="redis://host:6379/0"
export MEMINTEL_CONFIG_PATH="/etc/memintel/memintel_config.yaml"
export USE_LLM_FIXTURES="false"
export ANTHROPIC_API_KEY="sk-ant-..."
# For OpenAI-compatible / on-premise (set base_url in memintel_config.yaml):
export DATABASE_URL="postgresql://user:password@host:5432/memintel"
export REDIS_URL="redis://host:6379/0"
export MEMINTEL_CONFIG_PATH="/etc/memintel/memintel_config.yaml"
export USE_LLM_FIXTURES="false"
# Only set if your endpoint requires authentication:
export INTERNAL_LLM_KEY="your-internal-key"
On Railway, Render, or other cloud platforms: set these in the platform's environment variables dashboard. Never put them in a .env file committed to Git.
Step 3 — Install and Run
Clone the Repository
git clone https://github.com/SooperGenAI/memintel.git
cd memintel/backend/memintel-backend
Install Python Dependencies
Memintel requires Python 3.11. If you have a newer Python version installed, create a virtual environment explicitly using 3.11:
python3.11 -m venv venv
Activate it:
# On Mac/Linux:
source venv/bin/activate
# On Windows:
venv\Scripts\activate
Then install dependencies:
pip install -r requirements.txt
Verify key packages are installed:
python -c "import fastapi, pydantic, asyncpg, aioredis; print('OK')"
Run Unit Tests
python -m pytest tests/unit/ -q --tb=short
All unit tests should pass before proceeding.
Step 4 — Run Database Migrations
With DATABASE_URL set, run the Alembic migrations:
alembic upgrade head
Expected output:
INFO [alembic.runtime.migration] Running upgrade -> 0001, initial_schema
INFO [alembic.runtime.migration] Running upgrade 0001 -> 0002, add_application_context
INFO [alembic.runtime.migration] Running upgrade 0002 -> 0003, add_guardrails_versions
INFO [alembic.runtime.migration] Running upgrade 0003 -> 0004, add_decision_records
If you see any error during migration, do not proceed. Fix the migration error before starting the server.
If alembic reports a driver error such as "scheme is expected to be either postgresql or postgres", temporarily set DATABASE_URL with the asyncpg prefix for the migration command only:
DATABASE_URL=postgresql+asyncpg://user:password@host:5432/memintel alembic upgrade head
On Windows PowerShell:
$env:DATABASE_URL = "postgresql+asyncpg://user:password@host:5432/memintel"
alembic upgrade head
Then revert DATABASE_URL back to postgresql:// before starting the server.
Verify the schema — you should see all nine tables:
psql $DATABASE_URL -c "\dt"
| Table | Purpose |
|---|---|
tasks | Task definitions with version pinning. Carries three provenance fields: context_version, guardrails_version, context_warning |
definitions | Concept, primitive, and condition definitions |
feedback | User feedback for calibration |
calibration_tokens | Single-use calibration tokens |
execution_graphs | Compiled execution graphs (IR) |
jobs | Async job tracking |
application_context | Application context versions |
guardrails_versions | Guardrails versions — created by Migration 0003. Stores all API-posted guardrails with full audit trail |
decisions | Decision audit trail — every evaluation recorded with concept_value, threshold_applied, condition_version, ir_hash, input_primitives, signal_errors, and outcome |
Step 5 — Start the Server
uvicorn app.main:app --host 0.0.0.0 --port 8000
For production with multiple workers:
uvicorn app.main:app --host 0.0.0.0 --port 8000 --workers 4
Verify Startup
The server logs should show:
INFO: Started server process
INFO: Waiting for application startup.
INFO: Application startup complete.
If startup fails with ConfigError, check that all required environment variables are set and MEMINTEL_CONFIG_PATH points to a valid file.
Health Check
curl http://localhost:8000/openapi.json
If the openapi endpoint returns Not Found, verify the server is running by opening the Swagger API docs in your browser:
http://localhost:8000/docs
A working deployment will show the full Memintel API reference. If the page loads, the server is running correctly.
If database or cache shows "disconnected", verify DATABASE_URL and REDIS_URL are correct and the services are reachable from the server.
Step 6 — Smoke Test
Run these tests with USE_LLM_FIXTURES=true before switching to a real LLM. This validates the full pipeline against your real database and Redis without consuming LLM credits.
Define Application Context
curl -X POST http://localhost:8000/context/context \
-H "Content-Type: application/json" \
-d '{
"domain": {
"description": "Smoke test deployment",
"entities": [{"name": "user", "description": "test user"}],
"decisions": ["test_decision"]
}
}'
Expected: HTTP 201 with context_id and "version": "v1"
Create a Task
curl -X POST http://localhost:8000/tasks \
-H "Content-Type: application/json" \
-d '{
"intent": "Alert when churn score exceeds 0.8",
"entity_scope": "user",
"delivery": {"type": "webhook", "endpoint": "https://example.com/hook"}
}'
Expected: HTTP 200 with task_id, condition_id, "context_version": "v1", "context_warning": null
Execute Full Pipeline
curl -X POST http://localhost:8000/evaluate/full \
-H "Content-Type: application/json" \
-d '{
"concept_id": "<concept_id from task>",
"concept_version": "<concept_version from task>",
"condition_id": "<condition_id from task>",
"condition_version": "<condition_version from task>",
"entity": "user_test_001",
"timestamp": "2024-01-15T09:00:00Z"
}'
Expected: HTTP 200, result.deterministic: true
Verify Determinism
Run the same execute call three times with identical parameters. All three responses must return the same result.value and decision.value:
for i in 1 2 3; do
curl -s -X POST http://localhost:8000/evaluate/full \
-H 'Content-Type: application/json' \
-d '{...same payload...}' | jq '.result.value'
done
If the values diverge, stop and investigate before going to production.
Step 7 — Switch to Real LLM
Once all smoke tests pass against the real database, update your environment variables and memintel_config.yaml to enable the real LLM provider.
For Anthropic:
export USE_LLM_FIXTURES=false
export ANTHROPIC_API_KEY="sk-ant-..."
Ensure memintel_config.yaml has:
llm:
provider: anthropic
model: claude-sonnet-4-6
api_key: ${ANTHROPIC_API_KEY}
For OpenAI-compatible / on-premise:
export USE_LLM_FIXTURES=false
# Only if your endpoint requires authentication:
export INTERNAL_LLM_KEY="your-internal-key"
Ensure memintel_config.yaml has:
llm:
provider: openai_compatible
base_url: https://llm.internal.yourcompany.com/v1
model: llama3.1:70b
api_key: ${INTERNAL_LLM_KEY} # omit this line if no auth required
timeout_seconds: 60
ssl_verify: false # only if using self-signed certificates
Restart the server after changing these variables, then repeat the smoke test with the real LLM. This will consume LLM credits for Anthropic — run once to verify, not repeatedly.
Test one task creation for each strategy type to confirm the LLM is resolving correctly within guardrails:
| Strategy | Test Intent |
|---|---|
threshold | "Alert when churn score exceeds 0.8" |
z_score | "Alert when login frequency drops significantly" |
equals | "Alert when user segment is high_risk" |
composite | "Alert when churn is high AND engagement is low" |
Step 8 — Post-Deployment Setup
Define Application Context
The first action after deployment is verified — define your application context. This significantly improves task creation accuracy from the first real user request.
curl -X POST https://your-domain/context/context \
-H "Content-Type: application/json" \
-d @your-context-file.json
See the Application Context page in the Admin Guide for domain-specific examples and full schema documentation.
Define Guardrails (Recommended)
Post your guardrails policy via API. This takes effect immediately without a restart:
curl -X POST https://your-domain/guardrails \
-H "Content-Type: application/json" \
-H "X-Elevated-Key: your-elevated-key" \
-d @guardrails.json
Using POST /guardrails is preferred over editing memintel_guardrails.yaml — changes take effect immediately with no restart required, and every change is versioned. See the Admin Guide for full guardrails schema and domain-specific examples.
Monitoring
Memintel emits structured logs with trace_id, entity, concept_id, and condition_id on every execution. Configure your log aggregation tool to index on these fields:
- Index
trace_idfor request tracing - Alert on
execution_errorlog entries - Dashboard: execution duration p50 / p95 / p99
- Dashboard: condition evaluation true/false ratio per
condition_id
First Week
| Day | Action |
|---|---|
| Day 1 | Define application context (POST /context/context). Post guardrails (POST /guardrails). Create first task. |
| Day 2–3 | Run executions. Collect initial feedback. Verify feedback is being recorded. |
| Day 4–5 | Run first calibration cycle. Review recommended parameter adjustments. |
| Day 7 | Review execution logs. Check for unexpected errors. Verify determinism is holding. |
Estimated Deployment Time
| Phase | Estimated Time |
|---|---|
| Infrastructure setup | 30–60 minutes |
| Environment configuration | 15 minutes |
| Database migration | 5 minutes |
| Service startup and verification | 15 minutes |
| Smoke testing (fixtures mode) | 20 minutes |
| Real LLM integration test | 30 minutes |
| Total | ~2 hours |
Known Issues
aioredis compatibility with Python 3.11
aioredis has a known incompatibility with Python 3.11 that causes this error on startup:
TypeError: duplicate base class TimeoutError
Fix it by patching the installed aioredis exceptions file after pip install completes.
On Mac/Linux:
sed -i 's/class TimeoutError(asyncio.TimeoutError, builtins.TimeoutError, RedisError):/class TimeoutError(asyncio.TimeoutError, RedisError):/' venv/lib/python3.11/site-packages/aioredis/exceptions.py
On Windows PowerShell:
(Get-Content venv\Lib\site-packages\aioredis\exceptions.py) -replace 'class TimeoutError\(asyncio\.TimeoutError, builtins\.TimeoutError, RedisError\):', 'class TimeoutError(asyncio.TimeoutError, RedisError):' | Set-Content venv\Lib\site-packages\aioredis\exceptions.py
This patch is applied to your local venv only and does not affect the source code. It will need to be reapplied if you recreate the venv.
Further Reading
- Application Context — configure domain context for more accurate task compilation
- Quickstart — your first task in 5 minutes
- API Reference — full endpoint documentation
- GitHub Repository — source code, issues, and discussions