Context Quilt is a persistent cognitive memory layer that sits between your app and its LLM. It learns from every conversation and injects relevant context into the next one — with zero latency impact.
LLMs are stateless by design. Every request starts with zero memory of the last one. Users re-explain the same context over and over.
Users interact with AI across Slack, email, meetings, and coding tools. Each platform's AI forgets everything when the session ends. No unified picture exists.
Most memory solutions force all traffic through their servers, adding latency and creating a single point of failure on your critical path.
When your app needs context for a new LLM call, it asks Context Quilt. Pre-computed context lives in Redis. Your app injects it into its own prompt.
After the user gets their response, your app sends the conversation to CQ. The extraction pipeline runs in the background — the user never waits.
Context Quilt is not a proxy. Your app sends LLM requests directly to the provider. CQ enhances your app with memory — it doesn't sit in the critical path.
Individual pieces of knowledge: who someone is, what they prefer, decisions they've made, commitments they've taken on. Stored in PostgreSQL, editable by the user.
The graph layer that connects patches. "Sarah works on Widget 2.0. Widget 2.0 has a deadline from Acme Corp. Acme's CTO is David." Follow any thread to surface connected context.
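Thread-following like this can be sketched as a bounded breadth-first walk over patch connections. This is a simplified model for illustration only — the patch shapes and names below are not CQ's internal schema, and the real traversal (plus entity matching and the `max_hops` cap used by `/v1/recall`) lives inside CQ:

```python
from collections import deque

# Toy patch graph: patch_id -> (fact, [connected patch_ids]).
# Illustrative data mirroring the example above; not CQ's storage format.
PATCHES = {
    "sarah":   ("Sarah works on Widget 2.0", ["widget2"]),
    "widget2": ("Widget 2.0 has a deadline from Acme Corp", ["acme"]),
    "acme":    ("Acme's CTO is David", []),
}

def surface_context(start_id: str, max_hops: int = 2) -> list[str]:
    """Collect facts reachable within max_hops of the starting patch."""
    seen = {start_id}
    facts = []
    queue = deque([(start_id, 0)])
    while queue:
        pid, depth = queue.popleft()
        fact, neighbors = PATCHES[pid]
        facts.append(fact)
        if depth < max_hops:  # stop following threads past the hop budget
            for nxt in neighbors:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, depth + 1))
    return facts
```

With `max_hops=2`, starting from Sarah surfaces all three connected facts; with `max_hops=1`, the Acme fact stays out of scope.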
Pre-computed context blocks ready to inject. The hot path only reads from here — no computation, no LLM calls, just a cache lookup. Rebuilt automatically when new knowledge arrives.
After your user gets their response, fire-and-forget the conversation to CQ. It returns immediately (202) and processes in the background.
The extraction pipeline picks out facts, entities, relationships, and communication patterns. It organizes them into the user's quilt and rebuilds the cache.
Before your next LLM call, ask CQ for relevant context. Inject it into your prompt. Your LLM now knows the user's history, preferences, and active projects.
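The loop above can be sketched with two small helpers — one that builds the fire-and-forget `POST /v1/memory` body, one that splices the `/v1/recall` context block into your own system prompt. The helper names and the prompt framing are illustrative, not part of CQ's API; the request fields match the endpoint documentation below:

```python
def memory_payload(user_id: str, content: str, response: str, **metadata) -> dict:
    """Body for the fire-and-forget POST /v1/memory call (step 1)."""
    return {
        "user_id": user_id,
        "interaction_type": "chat_log",
        "content": content,
        "response": response,
        "metadata": metadata,
    }

def inject_context(system_prompt: str, context: str) -> str:
    """Prepend the context block from /v1/recall to your prompt (step 3).

    The "What you know..." framing is an illustrative choice, not
    something CQ prescribes.
    """
    if not context:
        return system_prompt
    return f"{system_prompt}\n\nWhat you know about this user:\n{context}"
```

Send `memory_payload(...)` to CQ after the user already has their answer — since CQ returns 202 immediately, the call never blocks the user-facing path.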
Your AI remembers past decisions, action items, and who committed to what. Next meeting picks up where the last one left off.
Remember the user's codebase architecture, preferred patterns, active branches, and recent debugging sessions across IDE restarts.
Agents know the customer's history, open tickets, product tier, and past interactions without searching through systems.
Track deal context, stakeholder relationships, competitive intel, and next steps across every interaction with a prospect.
"What did we decide last week?" gets a blank stare from your AI
Users re-explain their role, project, and preferences every session
Context scattered across platforms with no unified picture
Generic, one-size-fits-all responses regardless of who's asking
Wasted tokens re-establishing context on every call
"You decided to go with Nova 3 for transcription last Tuesday"
AI already knows who the user is and adapts to their communication style
Knowledge graph connects facts across apps into a unified picture
Responses tailored to the user's style, history, and active projects
30-50% cost reduction from intelligent context injection
docker compose up -d
POST /v1/auth/register
POST /v1/memory
/v1/auth/register
Register a new application
No authentication required. Returns a client_secret (shown only once).
```
# Request
{ "app_name": "my-coding-assistant" }

# Response 200
{
  "app_id": "uuid",
  "app_name": "my-coding-assistant",
  "client_secret": "sk-...",
  "created_at": "2025-01-15T..."
}
```
/v1/auth/token
Get JWT access token
Exchange app_id + client_secret for a JWT. Token expires in 60 minutes.
```
# Request (form-encoded)
username={app_id}&password={client_secret}

# Response 200
{
  "access_token": "eyJ...",
  "token_type": "bearer",
  "expires_in": 3600
}
```
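Building that form-encoded exchange with the Python standard library might look like this — note the OAuth2-style field names, where the `app_id` goes in `username` and the `client_secret` in `password`. The base URL is an assumed self-hosted address:

```python
from urllib.parse import urlencode
from urllib.request import Request

def token_request(app_id: str, client_secret: str) -> Request:
    """Build the form-encoded POST /v1/auth/token request."""
    body = urlencode({"username": app_id, "password": client_secret})
    return Request(
        "http://localhost:8000/v1/auth/token",  # assumed self-hosted address
        data=body.encode(),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

Send the resulting request with `urllib.request.urlopen` (or your HTTP client of choice), and cache the returned token — it stays valid for 60 minutes.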
/v1/auth/apps
List registered applications
Returns all registered apps with their auth enforcement settings.
/v1/auth/apps/{app_id}
Update app settings
Toggle enforce_auth on/off for an application.
/v1/memory
Queue content for memory processing
The write path entry point. Accepts conversations, summaries, queries, sentiment, and tool calls. Returns immediately — processing happens asynchronously.
```
# Request — Bearer or X-App-ID auth
{
  "user_id": "user-123",
  "interaction_type": "chat_log",
  "content": "User discussed project timeline...",
  "response": "AI suggested Q2 deadline...",
  "metadata": { "meeting_id": "mtg-456", "project": "Widget 2.0" }
}

# Response 200
{
  "status": "queued",
  "message": "Memory update received for async processing"
}
```
summary
query
sentiment
tool_call
trace
chat_log
meeting_summary
/v1/recall
Get relevant context for a query
The hot path. Matches entities in the user's text against their knowledge graph and returns a pre-formatted context block. Target: <10ms. No LLM call involved.
```
# Request
{
  "user_id": "user-123",
  "text": "What's the status of Widget 2.0?",
  "max_hops": 2
}

# Response 200
{
  "context": "Sarah is a PM at Acme Corp. Prefers direct answers. Working on Widget 2.0. Decided on Nova 3 for transcription. Samples due Thursday. David (CTO) is sponsor.",
  "matched_entities": ["Widget 2.0", "Sarah"],
  "patch_count": 7
}
```
/v1/quilt/{user_id}
Get user's complete quilt
Returns all facts, action items, and patch connections for a user. Supports filtering by category and incremental sync via since timestamp.
```
# Response 200
{
  "user_id": "user-123",
  "facts": [
    {
      "patch_id": "uuid",
      "fact": "Product Manager at Acme",
      "category": "identity",
      "patch_type": "identity",
      "source": "inferred",
      "connections": [
        { "to_patch_id": "uuid", "role": "works_on", "label": "Widget 2.0" }
      ]
    }
  ],
  "action_items": [...],
  "server_time": "2025-01-15T..."
}
```
/v1/quilt/{user_id}/graph
Visual knowledge graph
Generates a force-directed graph visualization of the user's quilt. Returns SVG, PNG, or interactive HTML. Color-coded by patch type with edge coloring by relationship role.
/v1/quilt/{user_id}/patches/{patch_id}
Update a patch
Let users correct extracted facts. Accepts fact and category fields. Marks the patch as user-declared.
/v1/quilt/{user_id}/patches/{patch_id}
Delete a patch
Permanently remove a single patch from the user's quilt.
/v1/quilt/{user_id}
Delete all user data (GDPR)
Complete data deletion. Removes all patches, entities, relationships, and cached data for a user.
/v1/quilt/{user_id}/rename-speaker
Rename an entity
Rename a speaker or entity across all patches, relationships, and the entity index. Useful when the system labels someone as "Speaker 4" and the user knows their real name.
/v1/projects/{user_id}
List user's projects
Returns all projects for a user with status and patch counts.
/v1/projects/{user_id}
Create a project
Create a named project. Patches and entities can be scoped to projects for organized context retrieval.
/v1/projects/{user_id}/{project_id}
Update or archive a project
Rename a project or change its status to archived. Archiving cascades to child patches.
/v1/enrich
Template-based context injection
Pass a prompt template with [[placeholder]] syntax. CQ substitutes values from the user's profile. Supports defaults via [[key|fallback]].
```
# Request
{
  "user_id": "user-123",
  "template": "You are helping [[name|a user]], a [[role]]. They prefer [[communication_style]]. Active project: [[current_project|none]]."
}

# Response 200
{
  "enriched_prompt": "You are helping Sarah, a Product Manager. They prefer concise, direct answers. Active project: Widget 2.0.",
  "used_variables": ["name", "role", "communication_style", "current_project"],
  "missing_variables": []
}
```
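A simplified re-implementation of the `[[key|fallback]]` substitution shows the mechanics. This is not CQ's code, and the exact `used_variables`/`missing_variables` semantics here (a key counts as missing whenever it is absent from the profile, even if a fallback covers it) are an assumption:

```python
import re

# Matches [[key]] or [[key|fallback]].
PLACEHOLDER = re.compile(r"\[\[(\w+)(?:\|([^\]]*))?\]\]")

def enrich(template: str, profile: dict) -> dict:
    """Substitute placeholders from a profile dict, tracking usage."""
    used, missing = [], []

    def replace(m: re.Match) -> str:
        key, fallback = m.group(1), m.group(2)
        if key in profile:
            used.append(key)
            return str(profile[key])
        missing.append(key)
        # Use the fallback if one was given; otherwise leave the
        # placeholder in place so the gap is visible.
        return fallback if fallback is not None else m.group(0)

    return {
        "enriched_prompt": PLACEHOLDER.sub(replace, template),
        "used_variables": used,
        "missing_variables": missing,
    }
```

For example, `enrich("Hi [[name|friend]]", {})` falls back to `"Hi friend"`, while a populated profile substitutes the real value.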
/v1/profile/{user_id}
Get user's hydrated profile
Returns the cached user profile from Redis. Supports ?keys=key1,key2 to filter specific fields. Useful for building personalized UIs.
/v1/prewarm
Warm user cache
Hydrate a user's profile and entity index from PostgreSQL into Redis. Call at session start to ensure the first /v1/recall is a cache hit. Completes in <50ms.
/health
Health check
Returns service status and version. No authentication required.
A full admin API is available under /api/dashboard/ (requires X-Admin-Key header). Includes stats, user management, patch history, extraction metrics, cost tracking, pipeline testing, prompt management, and system health monitoring.
GET /stats
GET /users
GET /patches/recent
GET /metrics/cost
POST /test-pipeline
GET /health-check
GET /config
Interactive API documentation with request/response schemas, try-it-out capability, and full endpoint reference. Explore every endpoint live.
OpenRouter, OpenAI, Anthropic, Google Gemini, Ollama, vLLM, LiteLLM — anything OpenAI-compatible. Default extraction: Mistral Small 3.1 at $0.00009/call.
Events are grouped by meeting_id and consolidated before extraction. Batches flush on a time trigger (60 min) or a context-budget trigger (80%). One LLM call per batch.
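The two flush triggers can be expressed as a single predicate. The threshold values mirror the defaults described above (60 minutes, 80% of the extraction model's context budget); the function and parameter names are illustrative, not CQ configuration keys:

```python
def should_flush(batch_tokens: int, budget_tokens: int,
                 batch_age_seconds: float,
                 max_age_seconds: float = 60 * 60,
                 budget_fraction: float = 0.80) -> bool:
    """Flush a meeting_id batch when it is old enough or big enough.

    Either trigger alone is sufficient: a quiet meeting still flushes
    after an hour, and a busy one flushes before overflowing the
    extraction model's context window.
    """
    too_old = batch_age_seconds >= max_age_seconds
    too_big = batch_tokens >= budget_fraction * budget_tokens
    return too_old or too_big
```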
Extraction pipeline: Picker (facts & entities), Stitcher (organization), Designer (communication profile), Cataloger (summary). Single-call or multi-role mode.
Send arbitrary key-value metadata with events. Group by meeting_id, ticket_id, repo, deal_id — whatever your app needs.
Apps can only read/write patches they created. Built-in access control ensures multi-app environments stay isolated.
Docker Compose for the full stack (API + PostgreSQL + Redis + optional pgAdmin). Self-host on your infra. GPU-optional.
Your app talks to your LLM directly. CQ is a side-channel for context, not a gateway. No single point of failure on your critical path.
Facts are connected into a knowledge graph, not dumped in a vector store. Traverse relationships to surface context that keyword search would miss.
CQ learns how users communicate — formality, directness, technical depth, tone. Your AI adapts its personality, not just its knowledge.
Apache 2.0 licensed. Self-host the full stack. No cloud dependency, no usage-based surprises. Your data stays on your infrastructure.
Works with any OpenAI-compatible API. Switch LLM providers without changing your memory layer. No vendor lock-in.
Users can see, edit, and delete everything CQ knows about them. Transparency isn't an afterthought — it's a core API endpoint.
Everything the AI remembers is visible in the user's quilt. No hidden data, no black boxes.
Full CRUD on all patches. Users edit incorrect facts, delete anything they want. It's their data.
Quilts are isolated per user. Never shared, never used for training. Self-host for complete control.
Old facts fade naturally. Completed tasks and stale context auto-archive after configurable TTLs. The quilt stays clean.
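The auto-archive decision reduces to comparing a patch's age against its type's TTL. Everything below is illustrative — the patch shape, the per-type TTL values, and the rule that untyped patches never expire are assumptions standing in for CQ's configurable behavior:

```python
from datetime import datetime, timedelta, timezone

# Illustrative TTLs per patch type (CQ's actual values are configurable).
DEFAULT_TTL_DAYS = {"action_item": 30, "context": 90}

def should_archive(patch: dict, now: datetime) -> bool:
    """Return True once a patch has outlived its type's TTL."""
    ttl = DEFAULT_TTL_DAYS.get(patch["patch_type"])
    if ttl is None:
        return False  # e.g. identity patches: no TTL, never auto-archive
    return now - patch["updated_at"] > timedelta(days=ttl)
```

A periodic sweep applying this predicate is enough to keep completed tasks and stale context out of the active quilt while identity facts persist.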
Context Quilt is open source and ready to deploy. Add persistent, graph-connected memory to your AI application today.