engramia

OpenAI Assistants migration

Move off Assistants before 26 Aug 2026.

The OpenAI Assistants API sunsets on 26 August 2026. Threads, files, and run state stop persisting. Engramia replaces thread persistence with eval-weighted memory and plugs straight into the OpenAI Agents SDK.

What sunsets

The Assistants API sunset is not a small change of behaviour

From 26 Aug 2026, the entire /v1/assistants surface returns 410 Gone. Anything you have wired into Threads, Messages, Runs, or assistant-scoped Files needs a replacement.
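A quick way to size the work is to inventory where your code touches the Assistants surface. The sketch below is one possible approach, not an official tool: it scans Python sources for the `client.beta.assistants` and `client.beta.threads` attribute paths used by the openai Python client. The names `find_assistants_usage` and `scan_tree` are made up for this example, and the pattern does not catch file uploads or calls made through aliases.

```python
import re
from pathlib import Path

# Matches .beta.assistants, .beta.threads, .beta.threads.messages and
# .beta.threads.runs as they appear in openai Python client calls.
ASSISTANTS_USAGE = re.compile(
    r"\.beta\.(assistants|threads(?:\.(?:messages|runs))?)\b"
)


def find_assistants_usage(source: str) -> list[tuple[int, str]]:
    """Return (line number, matched surface) for each Assistants call site."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in ASSISTANTS_USAGE.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


def scan_tree(root: str) -> dict[str, list[tuple[int, str]]]:
    """Scan every .py file under root and keep only files with hits."""
    report = {}
    for path in Path(root).rglob("*.py"):
        text = path.read_text(encoding="utf-8", errors="ignore")
        hits = find_assistants_usage(text)
        if hits:
            report[str(path)] = hits
    return report
```

Running `scan_tree(".")` from the repository root gives a per-file list of call sites to replace.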

Thread persistence stops

Per-user message history is the most common Assistants integration. Engramia replaces it with scope-aware patterns that survive across runs and scale beyond one user.

Files / vector store rebind

Files attached to assistants need re-uploading. Engramia uses pgvector with an HNSW index and scope-aware search: re-import once, never rebind.

Vendor lock-in remains

The Assistants successor (Responses API) is still OpenAI-only. Engramia is provider-agnostic — OpenAI, Anthropic, and local embeddings work side-by-side today.

Concept mapping

Threads to scopes, messages to patterns

Most Assistants concepts map cleanly onto Engramia primitives. Where they don't, you get something stronger — eval-weighted ranking, RBAC, audit log.

| OpenAI Assistants | Engramia |
| --- | --- |
| Thread (per-user persistence) | Tenant + project scope (multi-user, RBAC-enforced) |
| Message history (linear, append-only) | Patterns (eval-weighted, decayable, scoped) |
| Files (vector store per assistant) | Embeddings (pgvector, HNSW index, scope-aware search) |
| Run state (transient per request) | Async job queue (Prefer: respond-async, durable status) |
| Tools (assistant-bound) | Skill registry (cross-tenant searchable, evaluated) |
| Single-vendor (OpenAI-only) | Multi-LLM (OpenAI + Anthropic providers, swappable) |
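The "Prefer: respond-async" entry refers to the standard HTTP Prefer header (RFC 7240), which asks a server to accept the request and finish the work in the background, typically answering 202 Accepted. A minimal sketch of building such a request with the standard library follows. The host name, payload fields, and bearer-token auth are assumptions for illustration only; check the Engramia API reference for the real request shape and for which endpoints honour the header.

```python
import json
import urllib.request


def build_async_request(url: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build a POST request that asks the server to process it asynchronously."""
    body = json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(url, data=body, method="POST")
    req.add_header("Content-Type", "application/json")
    # Bearer auth is an assumption, not confirmed by the source.
    req.add_header("Authorization", f"Bearer {api_key}")
    # RFC 7240: the client prefers an asynchronous response.
    req.add_header("Prefer", "respond-async")
    return req
```

Sending it is then `urllib.request.urlopen(req)`. How job status is polled afterwards depends on the server and is not shown here.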
Before / after

Three lines of glue, not a rewrite

The OpenAI Agents SDK adapter (engramia.sdk.openai_agents) is the recommended path — same gpt-4.1 model, same agent definition, recall and learn happen automatically.

Before — Assistants
```python
# Before: OpenAI Assistants API (sunset 26 Aug 2026)
from openai import OpenAI
client = OpenAI()

assistant = client.beta.assistants.create(
    name="coder",
    model="gpt-4.1",
    instructions="You are a senior developer.",
)

thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Build a CSV parser.",
)
run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id,
    assistant_id=assistant.id,
)
```
After — Engramia + Agents SDK
```python
# After: Engramia + OpenAI Agents SDK (production-ready)
import asyncio

from agents import Agent, Runner
from engramia import Memory
from engramia.sdk.openai_agents import EngramiaRunHooks, engramia_instructions

memory = Memory()  # picks up ENGRAMIA_DATABASE_URL + provider config

agent = Agent(
    name="coder",
    instructions=engramia_instructions(
        memory,
        base="You are a senior developer.",
    ),
    model="gpt-4.1",
)

hooks = EngramiaRunHooks(memory)


async def main() -> None:
    # Runner.run is a coroutine, so it must be awaited inside an event loop.
    result = await Runner.run(
        agent,
        "Build a CSV parser.",
        hooks=hooks,
    )
    print(result.final_output)
    # Memory captures the run, evaluates the output, and improves recall
    # for the next "Build a CSV parser" task automatically.


asyncio.run(main())
```
Migration runbook

Four steps, dual-write friendly

The path below assumes a production deployment with active Threads. Cut-over is reversible up to step 4 — run both in parallel until you have confidence.

1. Export Assistants threads

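Use the OpenAI API to export the messages of every thread you control. The Assistants API has no documented endpoint for listing threads, and threads are not bound to a single assistant, so start from the thread IDs your application stored at creation time. Each Thread becomes a (tenant_id, project_id) scope; each Message becomes a candidate Engramia pattern with the original timestamp.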
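The export step can be sketched as follows, assuming you already hold the thread IDs to migrate. It uses the openai Python client's `client.beta.threads.messages.list`, which returns a page object that follows pagination cursors when iterated. `ExportedMessage`, `message_text`, and `export_thread` are illustrative names, and the sketch keeps only text content, skipping image and file parts.

```python
from dataclasses import dataclass


@dataclass
class ExportedMessage:
    thread_id: str
    role: str        # "user" or "assistant"
    text: str
    created_at: int  # Unix seconds, as returned by the API


def message_text(message) -> str:
    """Join the text parts of a message; non-text parts are skipped."""
    parts = []
    for block in message.content:
        if block.type == "text":
            parts.append(block.text.value)
    return "\n".join(parts)


def export_thread(client, thread_id: str) -> list[ExportedMessage]:
    """Fetch all messages of one thread, oldest first."""
    page = client.beta.threads.messages.list(
        thread_id=thread_id,
        order="asc",  # oldest first, so pairs can be built in order later
        limit=100,
    )
    return [
        ExportedMessage(thread_id, m.role, message_text(m), m.created_at)
        for m in page
    ]
```

With `client = OpenAI()`, calling `export_thread(client, thread_id)` for each stored ID yields records ready for the import step.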

2. Bulk-import to Engramia

Group consecutive (user, assistant) messages into pattern pairs and call POST /v1/learn (one per pair) under the right scope. Carry original timestamps via run_id for replay determinism. The backfill script in the example repo does this end to end.
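The pairing logic might look like the sketch below. It is not the backfill script from the example repo: the payload field names (`task`, `output`, `run_id`) and the `backfill-<thread>-<timestamp>` run_id format are assumptions chosen for illustration, so check the `/v1/learn` reference for the real schema. A user message with no assistant reply is dropped, as is an assistant message with no preceding user message.

```python
from typing import Iterable


def pair_messages(thread_id: str, messages: Iterable[dict]) -> list[dict]:
    """Turn an ordered message list into (user, assistant) pattern pairs."""
    pairs = []
    pending_user = None
    for msg in messages:
        if msg["role"] == "user":
            # A newer user message replaces one that never got a reply.
            pending_user = msg
        elif msg["role"] == "assistant" and pending_user is not None:
            pairs.append(
                {
                    "task": pending_user["text"],
                    "output": msg["text"],
                    # Hypothetical run_id format carrying the original timestamp.
                    "run_id": f"backfill-{thread_id}-{pending_user['created_at']}",
                }
            )
            pending_user = None
    return pairs
```

Each returned dict would then become the body of one POST /v1/learn call under the scope chosen for that thread.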

3. Swap threads.create() for memory.recall()

Replace the Threads-bound run loop with an Engramia recall before each LLM call. The OpenAI Agents SDK adapter (engramia.sdk.openai_agents.EngramiaRunHooks) wires this in three lines and works with the same gpt-4.1 model.
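If you are not using the adapter, the general shape of "recall before each LLM call" can be sketched with stand-in types. Everything here is an assumption for illustration: `MemoryLike.recall` returning a list of strings is not the confirmed Engramia signature, and `llm` stands for whatever function you use to call your model.

```python
from typing import Callable, Protocol


class MemoryLike(Protocol):
    # Stand-in interface; the real Memory.recall signature may differ.
    def recall(self, task: str) -> list[str]: ...


def build_prompt(base: str, recalled: list[str]) -> str:
    """Append recalled patterns to the base instructions, if there are any."""
    if not recalled:
        return base
    notes = "\n".join(f"- {item}" for item in recalled)
    return f"{base}\n\nRelevant patterns from earlier runs:\n{notes}"


def run_with_recall(
    memory: MemoryLike,
    llm: Callable[[str, str], str],
    base: str,
    task: str,
) -> str:
    """Recall first, then call the model with the enriched instructions."""
    instructions = build_prompt(base, memory.recall(task))
    return llm(instructions, task)
```

The point of the design is that no thread ID is passed around: context is looked up by task at call time instead of being replayed from a stored conversation.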

4. Cut over

Run dual-write for one release cycle (OpenAI Threads + Engramia in parallel) to compare behaviour, then flip the read path to Engramia and stop creating new Threads. Audit log records the cut-over for compliance.
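One way to structure the dual-write period is a thin wrapper that writes to both backends, compares what they return, and serves reads from whichever side a flag selects. The sketch below uses a hypothetical `append`/`read` store interface rather than the real OpenAI or Engramia clients, and it compares results by plain equality; in practice recalled patterns will not match thread history verbatim, so a looser comparison is likely needed.

```python
from typing import Callable, Optional, Protocol


class HistoryStore(Protocol):
    # Hypothetical interface that both backends are adapted to.
    def append(self, user_id: str, text: str) -> None: ...
    def read(self, user_id: str) -> list[str]: ...


class DualWriteStore:
    def __init__(
        self,
        legacy: HistoryStore,
        candidate: HistoryStore,
        read_from_candidate: bool = False,
        on_mismatch: Optional[Callable[[str, list, list], None]] = None,
    ) -> None:
        self.legacy = legacy
        self.candidate = candidate
        self.read_from_candidate = read_from_candidate
        self.on_mismatch = on_mismatch

    def append(self, user_id: str, text: str) -> None:
        # Write to both sides so either can serve reads.
        self.legacy.append(user_id, text)
        self.candidate.append(user_id, text)

    def read(self, user_id: str) -> list[str]:
        old = self.legacy.read(user_id)
        new = self.candidate.read(user_id)
        if old != new and self.on_mismatch is not None:
            self.on_mismatch(user_id, old, new)  # log for comparison
        # Flipping the flag is the cut-over; flipping it back reverts.
        return new if self.read_from_candidate else old
```

Setting `read_from_candidate = True` flips the read path; stopping writes to the legacy side is the final, non-reversible part of step 4.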

Get started

Migration support is on the house for the first 30 customers