Astrocyte v1.0.0 Roadmap
This roadmap organizes the v1.0.0 path into milestones, ordered by dependency and impact. Each milestone is a shippable increment — earlier milestones unblock later ones. 0.9.x contains the pre-GA M8–M11 implementation; v1.0.0 is reserved for the GA declaration after eval gates pass.
Architecture references: See c4-deployment-domain.md (System Architecture, Domain Model, Application Architecture), ADR-001 (Deployment Models), ADR-002 (Identity Model), ADR-003 (Config Schema).
Milestone Overview
| # | Milestone | Version | Gaps Addressed | Dependencies | Scope |
|---|---|---|---|---|---|
| M1 | Identity & Context | v0.5.0 | Gap 1, Gap 5 | None | Core |
| M2 | Config Schema Evolution | v0.5.0 | Gap 4 | M1 (same release) | Core |
| M3 | Extraction Pipeline | v0.6.0 | Gap 7 | M2 | Core |
| M4 | External Data Sources | v0.7.0 | Gap 2 | M2, M3 | Core + Adapters |
| M5 | Production Storage Providers | v0.8.0 | Gap 6 | None (parallel) | Adapters |
| M6 | Standalone Gateway | v0.8.0 | Gap 3 | M1, M2 | Shipped in-repo (astrocyte-gateway-py); v0.8.x tags (see § Release numbering) |
| M7 | Structured recall authority | v0.8.0 | Gap 8 | M5 | Core |
| M8 | LLM wiki compile | v0.9.0 | — | M3, M5 | Core; opt-in per bank; manual brain.compile() plus gateway/MCP surfaces — llm-wiki-compile.md |
| M9 | Time travel | v0.9.0 | — | M5 | Core; retained_at + forgotten_at + as_of on VectorFilters + brain.history() |
| M10 | Gap analysis | v0.9.0 | — | M7, M9 | Core; brain.audit() → AuditResult(gaps, coverage_score) |
| M11 | Entity resolution | v0.9.0 | — | M5 (GraphStore) | Core; retain-time EntityResolver + evidence chains + astrocyte-age graph adapter |
Release pairing: M1 and M2 ship together in a single v0.5.0 tag (identity + structured context first; config schema immediately after in the same minor). Later milestones renumber as above.
v0.8.0 ships three milestones: M5 (production storage adapters), M6 (standalone gateway — deployment models / Gap 3), and M7 (structured recall authority). M7 depends on multi-store hybrid recall from M5; M6 is logically last but ships on the same v0.8.x line as M5/M7.
v0.9.0 ships M8–M11 (LLM wiki compile, time travel, gap analysis, and entity resolution). These complete the pre-GA feature surface. GA remains gated on the v0.9.x evaluation bar, not on additional API surface.
v1.0.0 general availability requires M1–M11 implemented and eval-validated. M9 (time travel), M10 (gap analysis), and M11 (entity resolution) constitute the full “third option” answer; they are implemented in v0.9.0 and remain under release-gate validation before GA. See platform-positioning.md for the diagnostic test framing.
Post-0.9 research track: Five research phases (R1–R5) target fundamental barriers to benchmark accuracy after the M8–M11 feature surface. Each phase has an eval gate before promotion to core. See § Research track below.
v0.7.4 closes out pre-v0.8 standalone gateway and repository packaging (image workflows, Helm, CI gates, docs alignment) before v0.8.0 milestone work.
M1: Identity & Context
Gap: Identity & Authorization Protocol (Gap 1) + User-Scoped Memory (Gap 5)
Why first: Every other milestone depends on structured identity. You can’t route ingest to the right bank without knowing who the data belongs to. You can’t scope memory without knowing user vs agent vs OBO.
Deliverables
- Structured AstrocyteContext (backwards-compatible)
  - Add actor: ActorIdentity | None (type, id, claims)
  - Add on_behalf_of: ActorIdentity | None (OBO delegation)
  - Keep principal: str working; derive it from actor.id when actor is set
  - Add effective_permissions() → intersection of actor + OBO grants
  - File: astrocyte-py/astrocyte/types.py
- Identity-driven bank resolution
  - BankResolver that maps (actor, on_behalf_of) → bank_id
  - Default: user:{id} → user-{id} bank (convention-based)
  - Configurable via identity: config section
  - File: new astrocyte-py/astrocyte/identity.py
- Integration adapter migration (Phase 1)
  - Add optional context parameter to all 19 adapters
  - No breaking changes: context=None keeps the current behavior
  - Files: astrocyte-py/astrocyte/integrations/*.py
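The identity deliverables above can be illustrated with a minimal sketch. It is not the shipped astrocyte/types.py: the "type:id" principal format, the grants table, and the resolve_bank helper are assumptions made for illustration, chosen to match the user:{id} → user-{id} convention described above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ActorIdentity:
    type: str  # e.g. "user" or "agent"
    id: str
    claims: dict[str, str] = field(default_factory=dict)


@dataclass
class AstrocyteContext:
    principal: Optional[str] = None
    actor: Optional[ActorIdentity] = None
    on_behalf_of: Optional[ActorIdentity] = None
    # Hypothetical grant table: "type:id" -> bank_id -> permissions.
    grants: dict[str, dict[str, set[str]]] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Derive the legacy principal string when only a structured actor is given
        # (the "type:id" format is an assumption).
        if self.principal is None and self.actor is not None:
            self.principal = f"{self.actor.type}:{self.actor.id}"

    def _grants_for(self, identity: ActorIdentity, bank_id: str) -> set[str]:
        key = f"{identity.type}:{identity.id}"
        return set(self.grants.get(key, {}).get(bank_id, set()))

    def effective_permissions(self, bank_id: str) -> set[str]:
        """Actor grants, narrowed to the delegating identity's grants under OBO."""
        if self.actor is None:
            return set()
        perms = self._grants_for(self.actor, bank_id)
        if self.on_behalf_of is not None:
            perms &= self._grants_for(self.on_behalf_of, bank_id)
        return perms


def resolve_bank(actor: ActorIdentity, on_behalf_of: Optional[ActorIdentity] = None) -> str:
    # Convention-based default; using the delegating identity as bank owner is an assumption.
    owner = on_behalf_of or actor
    return f"{owner.type}-{owner.id}"


agent = ActorIdentity(type="agent", id="support-bot")
user = ActorIdentity(type="user", id="123")
ctx = AstrocyteContext(
    actor=agent,
    on_behalf_of=user,
    grants={
        "agent:support-bot": {"user-123": {"read", "write"}},
        "user:123": {"user-123": {"read"}},
    },
)
print(ctx.principal, resolve_bank(agent, user), ctx.effective_permissions("user-123"))
```

Taking the intersection means an agent acting for a user can never hold more rights on a bank than either party holds alone.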
Acceptance Criteria
- AstrocyteContext(principal="user:123") still works unchanged
- AstrocyteContext(actor=ActorIdentity(type="user", id="123")) resolves principal automatically
- OBO: context.effective_permissions(bank_id) returns intersection
- Bank resolver maps identity to bank_id with configurable rules
- All existing tests pass without modification (tests/test_identity_m1.py, tests/test_types.py, etc.)
- ADR-002 implementation validated (library surface; HTTP JWT/OIDC mapping lives in astrocyte-gateway-py, see § M6)
M2: Config Schema Evolution
Gap: Config Schema for External World (Gap 4)
Why second: M3 and M4 need config sections to declare sources, agents, and extraction profiles. M6 needs the deployment: section.
Deliverables
- New config sections (all optional, backwards-compatible)

      sources:                      # External data source definitions
        tavus-transcripts:
          type: webhook
          extraction_profile: conversation
          target_bank_template: "user-{principal}"
          auth:
            type: hmac
            secret: ${TAVUS_WEBHOOK_SECRET}
      agents:                       # Agent registration and default bank mapping
        support-bot:
          principal: "agent:support-bot"
          default_bank: "shared-support"
          allowed_banks: ["shared-*", "user-*"]
      deployment:                   # Deployment mode configuration
        mode: library               # "library" | "standalone" | "plugin"
      identity:                     # Identity resolution configuration
        resolver: convention        # "convention" | "config" | "custom"
        obo_enabled: false
      extraction_profiles:          # Reusable extraction configurations
        conversation:
          chunking_strategy: dialogue
          entity_extraction: llm
          metadata_mapping:
            speaker: "$.participant_name"

- Config dataclass extensions
  - New dataclasses: SourceConfig, AgentRegistrationConfig, DeploymentConfig, IdentityConfig, ExtractionProfileConfig
  - Added to AstrocyteConfig as optional fields
  - File: astrocyte-py/astrocyte/config.py
- Config validation
  - Validate source references to extraction profiles exist
  - Validate agent bank patterns against declared banks
  - Validate deployment mode prerequisites
Acceptance Criteria
- Loading a config with no new sections produces an AstrocyteConfig identical to the current one
- New sections parse correctly with type validation
- ${ENV_VAR} substitution works in new sections
- Profile merge order preserved (compliance → profile → user config)
- ADR-003 implementation validated
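Two of the M2 behaviours, ${ENV_VAR} substitution and source → extraction profile cross-reference checks, can be sketched on plain dictionaries. The function names substitute_env and validate_source_profiles are hypothetical, and the sketch ignores built-in profiles; the real validation lives in astrocyte/config.py and may differ.

```python
import os
import re

_ENV_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute_env(value, env=None):
    """Recursively replace ${NAME} in strings; an unknown name raises KeyError."""
    env = os.environ if env is None else env
    if isinstance(value, str):
        return _ENV_PATTERN.sub(lambda m: env[m.group(1)], value)
    if isinstance(value, dict):
        return {k: substitute_env(v, env) for k, v in value.items()}
    if isinstance(value, list):
        return [substitute_env(v, env) for v in value]
    return value


def validate_source_profiles(config: dict) -> list[str]:
    """Return one error per source that names an undeclared extraction profile."""
    profiles = config.get("extraction_profiles") or {}
    errors = []
    for name, source in (config.get("sources") or {}).items():
        profile = source.get("extraction_profile")
        if profile is not None and profile not in profiles:
            errors.append(f"sources.{name}: unknown extraction_profile {profile!r}")
    return errors


raw = {
    "sources": {
        "tavus-transcripts": {
            "type": "webhook",
            "extraction_profile": "conversation",
            "auth": {"type": "hmac", "secret": "${TAVUS_WEBHOOK_SECRET}"},
        }
    },
    "extraction_profiles": {},
}
resolved = substitute_env(raw, {"TAVUS_WEBHOOK_SECRET": "s3cret"})
print(validate_source_profiles(resolved))
```

Collecting errors into a list rather than raising on the first one lets a loader report every broken reference in a single pass.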
M3: Extraction Pipeline
Gap: Extraction Pipeline (Gap 7)
Why third: External data sources (M4) push raw content through the extraction pipeline. Building the pipeline before the connectors means M4 just needs to wire sources to an existing pipeline.
Deliverables
- Extraction pipeline (inbound complement to retrieval pipeline)

      Raw Content → Content Normalizer → Chunker → Entity Extractor → brain.retain()

  - Content normalizer: handles different content types (transcript, email, document, event)
  - Routes to appropriate chunking strategy (sentence, paragraph, dialogue, fixed)
  - Applies extraction profile from config
  - File: new astrocyte-py/astrocyte/pipeline/extraction.py
- Dialogue chunking strategy
  - Detect turn boundaries (Speaker: text pattern)
  - Keep complete turns together
  - Group consecutive turns up to max_chunk_size
  - Fallback to sentence splitting for single oversized turns
  - File: extend astrocyte-py/astrocyte/pipeline/chunking.py
- Content type routing
  - content_type field on RetainRequest
  - Routes to appropriate chunking strategy
  - Extraction profile can override defaults
  - File: extend astrocyte-py/astrocyte/pipeline/orchestrator.py
- Extraction profile resolution
  - Load from config extraction_profiles: section
  - Source → profile mapping
  - Default profiles for common content types
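The dialogue chunking rules listed above (detect turns, keep turns whole, group up to max_chunk_size, fall back to sentence splitting) can be sketched as follows. This is an illustrative stand-in, not chunking._chunk_dialogue: the regular expression, the character-based size limit, and the function names are assumptions.

```python
import re

# "Speaker: text" at the start of a line; the 40-character name limit is an assumption.
_TURN = re.compile(r"^[^\s:][^:\n]{0,40}:(?:\s|$)")


def split_turns(text: str) -> list[str]:
    """Split text into turns; lines without a speaker prefix join the previous turn."""
    turns: list[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if _TURN.match(line) or not turns:
            turns.append(line)
        else:
            turns[-1] += " " + line
    return turns


def chunk_dialogue(text: str, max_chunk_size: int = 200) -> list[str]:
    """Group consecutive whole turns into chunks of at most max_chunk_size characters."""
    chunks: list[str] = []
    current = ""
    for turn in split_turns(text):
        if len(turn) > max_chunk_size:
            # Single oversized turn: flush, then fall back to sentence splitting.
            if current:
                chunks.append(current)
                current = ""
            chunks.extend(s for s in re.split(r"(?<=[.!?])\s+", turn) if s)
            continue
        candidate = f"{current}\n{turn}" if current else turn
        if len(candidate) > max_chunk_size:
            chunks.append(current)
            current = turn
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


transcript = "Alice: Hi there.\nBob: Hello Alice.\nHow are you?\nAlice: Fine."
print(chunk_dialogue(transcript, max_chunk_size=50))
```

In this sketch the sentence fallback keeps the speaker prefix only on the first sentence of an oversized turn; a production chunker that must preserve attribution would re-attach the speaker to each piece.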
Acceptance Criteria
- Dialogue chunker preserves speaker attribution in chunks (see chunking._chunk_dialogue + tests in tests/test_chunking.py)
- Content type routing selects chunking strategy; optional extraction_profile overrides (astrocyte.pipeline.extraction.resolve_retain_chunking, PipelineOrchestrator.retain)
- Extraction profiles load from YAML extraction_profiles:; built-ins builtin_text / builtin_conversation merged at runtime (merged_extraction_profiles / merged_user_and_builtin_profiles)
- Inbound chain: normalize → chunk → optional entity extract → embed → store; profile-driven metadata_mapping, tag_rules, and entity_extraction applied on retain (prepare_retain_input)
- Default RetainRequest.content_type remains text; omitting extraction_profile preserves prior behavior aside from light normalization and builtin profile table merge inside the orchestrator
M4: External Data Sources
Gap: External Data Sources / Inbound Connectors (Gap 2)
Why fourth: Requires M2 (config schema for source definitions) and M3 (extraction pipeline to process inbound data).
Deliverables
- IngestSource SPI (Protocol class)

      class IngestSource(Protocol):
          source_id: str
          source_type: str  # "webhook" | "stream" | "poll"

          async def start(self) -> None: ...
          async def stop(self) -> None: ...
          async def health_check(self) -> HealthStatus: ...

  File: new astrocyte-py/astrocyte/ingest/source.py
- Webhook receiver (first implementation)
  - HTTP endpoint that receives POST payloads
  - HMAC signature validation
  - Routes to extraction pipeline based on source config
  - File: new astrocyte-py/astrocyte/ingest/webhook.py
- Source registry
  - Register/deregister sources at runtime
  - Load from config sources: section
  - Health monitoring and error thresholds
  - File: new astrocyte-py/astrocyte/ingest/registry.py
- Proxy query adapter (for federated recall): M4.1 (see below)
  - Forward recall queries to external APIs
  - Merge results with local recall hits
  - Configurable via sources: with type: proxy
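The HMAC signature validation step of the webhook receiver can be sketched with the standard library. The roadmap does not state the digest or encoding, so HMAC-SHA256 with a hex-encoded signature is an assumption here, as are the function names sign_body and verify_signature.

```python
import hashlib
import hmac


def sign_body(secret: str, body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw request body (algorithm and encoding are assumptions)."""
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def verify_signature(secret: str, body: bytes, signature: str) -> bool:
    """Compare in constant time to avoid leaking how many characters matched."""
    expected = sign_body(secret, body)
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))


payload = b'{"text": "hello"}'
header_value = sign_body("s3cret", payload)
print(verify_signature("s3cret", payload, header_value))
```

The signature is computed over the raw bytes before JSON parsing, because re-serializing a parsed body can change whitespace or key order and invalidate the digest.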
Acceptance Criteria
- Webhook ingest validates HMAC (when auth.type: hmac), parses JSON body, resolves target bank, calls brain.retain(). Library API: astrocyte.ingest.handle_webhook_ingest (HTTP server binding is M6 / app-specific)
- Source registry loads type: webhook entries from sources: and manages start/stop/health (SourceRegistry, WebhookIngestSource)
- Proxy query adapter merges external recall with local recall: M4.1 / federated recall (astrocyte.recall.proxy, RRF in PipelineOrchestrator.recall; optional RecallRequest.external_context)
- Source health available via IngestSource.health_check() → HealthStatus (surfaced on GET /health/ingest and GET /v1/admin/sources in astrocyte-gateway-py; structured ingest logs via astrocyte.ingest.logutil when ASTROCYTE_LOG_FORMAT=json)
- Ingest uses brain.retain() so policy (PII, validation, rate limits, quotas) applies on the same path as interactive retains
M4.1 (implemented): Federated / proxy recall
Scope: sources: entries with type: proxy, url (query template with {query}), and target_bank forward recall queries to external HTTP APIs (JSON hits / results arrays). Hits merge with local vector/graph recall via RRF in PipelineOrchestrator.recall; callers may also pass RecallRequest.external_context. Engine-only recall merges proxy hits by score without double-counting HybridEngineProvider.
Implementation: astrocyte.recall.proxy (fetch_proxy_recall_hits, gather_proxy_hits_for_bank, merge_manual_and_proxy_hits); config validation in validate_astrocyte_config; dependency httpx for outbound GET.
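For readers unfamiliar with reciprocal rank fusion (RRF), the merge of local and proxy hits can be sketched as below. It shows the general technique only: the function name rrf_merge, the use of bare hit ids, and the constant k = 60 (a common default in the literature) are assumptions, not the behaviour of PipelineOrchestrator.recall.

```python
from collections import defaultdict


def rrf_merge(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked id lists: each list contributes 1 / (k + rank) per hit."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, hit_id in enumerate(ranking, start=1):
            scores[hit_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for deterministic output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


local_hits = ["a", "b", "c"]   # e.g. local vector/graph recall, best first
proxy_hits = ["b", "d"]        # e.g. hits returned by a type: proxy source
merged = rrf_merge([local_hits, proxy_hits])
print([hit_id for hit_id, _ in merged])
```

RRF uses only ranks, so scores from an external API never need to be calibrated against local similarity scores.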
Later (when we need them): a hosted redirect/callback server (or tighter gateway integration) for full browser OAuth UX, PKCE, and device / JWT bearer token grants — not required for the current library surface; see adr-003-config-schema.md (design note under proxy OAuth).
Relation to releases: Shipped after v0.7.0 (webhook ingest); bundled for users on the v0.8.0 line — see root CHANGELOG.md.
Thin HTTP binding (library): optional astrocyte[gateway] provides create_ingest_webhook_app (Starlette ASGI) → POST /v1/ingest/webhook/{source_id} forwarding raw body/headers to handle_webhook_ingest. Standalone JWT/OIDC, OpenAPI, Docker/Helm, and ops packaging are astrocyte-gateway-py (§ M6 — shipped in-repo).
v0.8.x connector track — shipped in-repo (incremental)
- Event streams: astrocyte-ingestion-kafka, astrocyte-ingestion-redis; type: stream with ingest_stream_drivers; gateway lifespan starts IngestSupervisor.
- API poll: astrocyte-ingestion-github; type: poll / api_poll with driver: github; same supervisor model.
- Further connectors (NATS, additional poll targets, gateway plugins) remain on the v0.8.x cadence per § Release Strategy; they are not deferred to post-GA only.
Deferred / remaining (see § Release Strategy)
- NATS and other stream backends as needed
- Additional poll drivers (beyond GitHub) as product priorities dictate
- Additional gateway plugin targets (AWS API Gateway, Envoy, etc.) as needed; see gateway-plugins/ for the shipped Kong, APISIX, and Azure APIM plugins
M5: Production Storage Providers
Gap: Production Storage Providers (Gap 6)
Parallel track: No dependency on M1-M4. Can be developed concurrently.
Deliverables
- Graph store: Neo4j adapter, PyPI package astrocyte-neo4j (adapters-storage-py/astrocyte-neo4j/)
  - Implements GraphStore SPI
  - Entity and relationship CRUD
  - Neighborhood traversal for graph-enhanced recall
  - Cypher query generation
- Document store: Elasticsearch/OpenSearch adapter, astrocyte-elasticsearch (adapters-storage-py/astrocyte-elasticsearch/)
  - Implements DocumentStore SPI
  - BM25 full-text search
  - Keyword retrieval for hybrid recall
  - Index lifecycle management
- Vector stores: astrocyte-postgres (PostgreSQL + pgvector), astrocyte-qdrant (Qdrant); both under adapters-storage-py/
  - Implement VectorStore SPI (pgvector may also expose document-oriented helpers where applicable)
  - Payload / collection management per backend
Acceptance Criteria
- Each adapter has integration tests exercising the VectorStore / GraphStore / DocumentStore API (see each package’s tests/; optional: authors may also run shared suites from astrocyte.testing)
- Hybrid recall (vector + graph + document) produces fused results. Core E2E: tests/test_m5_hybrid_recall_e2e.py, tests/test_astrocyte_tier1.py (in-memory); production adapters validated in adapters-storage-ci.yml
- Each adapter ships as a separate PyPI package (astrocyte-postgres, astrocyte-qdrant, astrocyte-neo4j, astrocyte-elasticsearch)
- Integration tests run against containerized instances in CI (.github/workflows/adapters-storage-ci.yml; pgvector also covered from root ci.yml)
- README with quick-start: adapters-storage-py/README.md plus per-adapter READMEs
Same release — v0.8.0: This milestone shares the v0.8.0 line with M6 (standalone gateway) and M7 (structured recall authority). Hybrid adapters, gateway, and precedence config ship on the same minor family; see § M6 and § M7.
M6: Standalone Gateway
Gap: Deployment Models (Gap 3)
Implementation status (as of 2026-04-12): The standalone gateway is shipped in this repository under astrocyte-services-py/astrocyte-gateway-py/ — FastAPI routes, auth modes, webhook ingest, health/admin, multi-stage Dockerfile, Compose + runbook, Helm chart, example configs, CI gates, GHCR publish with attestations, and observability hooks (request IDs, JSON access logs, optional OpenTelemetry extra). Release tagging follows the v0.8.x policy above (not a separate v0.9.0 tag for gateway-only).
Why last: Requires M1 (identity) and M2 (deployment config). The gateway is a thin HTTP adapter over the same Astrocyte core — all intelligence stays in the library.
Deliverables
- FastAPI-based standalone gateway (astrocyte-gateway-py package)
  - REST API: /v1/retain, /v1/recall, /v1/reflect, /v1/forget
  - JWT validation middleware (consumes tokens from external IdP)
  - Maps JWT claims → AstrocyteContext with structured identity
  - OpenAPI spec auto-generated
- Webhook receiver endpoint
  - /v1/ingest/webhook/{source_id}: routes to IngestSource registry
  - HMAC validation per source config
- Health and admin endpoints
  - /health: gateway + storage backend health
  - /v1/admin/sources: source registry status
  - /v1/admin/banks: bank listing and health
- Docker packaging
  - Multi-stage Dockerfile
  - docker-compose.yml with pgvector + gateway
  - Helm chart (basic) for Kubernetes
Acceptance Criteria
- Gateway exposes the Tier 1 core memory API via REST (/v1/retain, /recall, /reflect, /forget; OpenAPI at /docs). (Tier 2 engine HTTP surface remains library-first / product-specific.)
- AuthN maps to structured AstrocyteContext: api_key, jwt / jwt_hs256, jwt_oidc (JWKS, RS256) with ActorIdentity / claims; see astrocyte-gateway-py README and ADR-002.
- Webhook ingest works end-to-end through the gateway (/v1/ingest/webhook/{source_id}; CI + tests).
- Docker Compose + runbook bring up a working stack (astrocyte-services-py/; see services README).
- Performance: < 10 ms gateway-only overhead vs core latency. A benchmark harness exists in-repo (astrocyte-services-py/astrocyte-gateway-py/scripts/bench_gateway_overhead.py) and a manual GitHub Actions workflow can run it, but the < 10 ms threshold is not enforced as a release gate because results are environment-dependent.
- ADR-001: Standalone gateway deployment path is documented and implemented (package, container, Helm, ops docs). Full organizational “validated” sign-off stays a release / operator checklist outside this file.
Deferred (post–core gateway; scheduled from v0.8.x onward)
Gateway plugin integration (Kong, APISIX, Azure API Management, and similar), event stream connectors, and poll schedulers are in scope from the v0.8.0 baseline toward v1.0.0, shipped incrementally on v0.8.x tags per § Release numbering; see § v0.8.x — Connector & gateway integration track and § Stability: SPI, adapters, and config files.
- gRPC transport — wait for demand signal (may remain post–v1.0.0)
M7: Structured recall authority
Gap: Structured truth / authority precedence across heterogeneous stores (Gap 8); orthogonal to tiered_retrieval (cost / latency tiers) and to MIP (retain routing in mip.yaml).
Why with M5: Precedence across graph vs document vs vector (and optional proxy tiers) requires production GraphStore / DocumentStore / VectorStore adapters and validated hybrid recall behavior. M7 layers declarative authority and prompt- or code-enforced conflict policy on top. Release: v0.8.0, same line as M5 and M6 (adapters + gateway + authority on one minor family).
Product stance: Optional in config (default off). The default Astrocyte recall path remains multi-strategy fusion (RRF, weights). Naming and docs must keep cost tiers (tiered_retrieval) and truth tiers (this milestone) distinct. See built-in-pipeline.md §9.4.
Deliverables
- astrocyte.yaml schema: optional recall_authority: (RecallAuthorityConfig: tiers, tier_by_bank, rules_inline / rules_path, apply_to_reflect). Phase 1 formats fused hits and injects authority_context; per-backend binding and strategy (parallel_all / sequential_stop) remain future if we add multi-query tier retrieval.
- Runtime: apply_recall_authority on recall; pipeline / Astrocyte.reflect optional injection; retain-time metadata["authority_tier"] via tier_by_bank / extraction profiles. Code-side demotion of hits is not required for Phase 1 (see ADR-004).
- Documentation: docs/_design/adr/adr-004-recall-authority.md; cross-links in built-in-pipeline.md and roadmap.
- Tests: tests/test_recall_authority.py, tests/test_m7_reflect_authority.py, tests/test_m5_hybrid_recall_e2e.py (interaction with hybrid recall).
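To make the Phase 1 idea concrete, here is a rough sketch of labeling fused hits by authority tier and rendering a context block. The hit dictionary shape, the bracketed label format, and the function name build_authority_context are all hypothetical; ADR-004 defines the actual format produced by apply_recall_authority.

```python
def build_authority_context(hits, tiers, tier_by_bank, default_tier=None):
    """Group hit texts under tier labels, highest-authority tier first.

    Hits are labeled and regrouped only; nothing is dropped or rescored.
    """
    default_tier = default_tier or tiers[-1]
    grouped = {tier: [] for tier in tiers}
    for hit in hits:
        # Retain-time metadata wins; otherwise fall back to the bank mapping.
        tier = hit.get("metadata", {}).get("authority_tier")
        if tier is None:
            tier = tier_by_bank.get(hit["bank_id"], default_tier)
        grouped.setdefault(tier, []).append(hit["text"])
    lines = []
    for tier, texts in grouped.items():
        if texts:
            lines.append(f"[{tier}]")
            lines.extend(f"- {text}" for text in texts)
    return "\n".join(lines)


hits = [
    {"bank_id": "user-123", "text": "Prefers email"},
    {"bank_id": "policies", "text": "Refunds within 30 days"},
]
context = build_authority_context(
    hits,
    tiers=["canonical", "working"],
    tier_by_bank={"policies": "canonical"},
)
print(context)
```

Because conflict handling is left to the prompt in this phase, the block only needs to make each hit's tier visible to the model.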
Acceptance Criteria
- Disabled by default; enabling does not change behavior for existing configs without the new section
- Clear separation from tiered_retrieval in code, config keys, and docs (ADR-004 + built-in-pipeline.md §9.4)
- At least one end-to-end path: multi-store recall → authority_context / reflect injection (see tests above)
- Performance and token-budget implications documented (ADR-004: authority blocks add prompt tokens; combine with homeostasis / reflect limits)
M9: Time Travel
Why: Diagnostic test 3 from Familiarity is the Enemy: “What did we believe about X on March 1st?” Immutable history with as-of queries is required for compliance audit, post-mortems, and accountability. Without it, memory systems are a mutable black box.
Design: Every VectorItem gains retained_at: datetime (system-set on write, immutable) and forgotten_at: datetime | None (set on soft-delete via forget()). VectorFilters gains as_of: datetime | None. The recall path applies the filter: retained_at <= as_of AND (forgotten_at IS NULL OR forgotten_at > as_of). A new brain.history(bank_id, start, end) method returns the memory timeline.
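The as-of predicate stated above can be written out directly. The sketch below uses a stripped-down VectorItem with only the fields needed for the filter; the helper name visible_as_of and the choice to hide soft-deleted items when as_of is None are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class VectorItem:
    id: str
    retained_at: datetime                    # system-set on write, immutable
    forgotten_at: Optional[datetime] = None  # set on soft-delete via forget()


def visible_as_of(item: VectorItem, as_of: Optional[datetime]) -> bool:
    """retained_at <= as_of AND (forgotten_at IS NULL OR forgotten_at > as_of)."""
    if as_of is None:
        # Assumption: a query without as_of hides soft-deleted items.
        return item.forgotten_at is None
    return item.retained_at <= as_of and (
        item.forgotten_at is None or item.forgotten_at > as_of
    )


def march(day: int) -> datetime:
    return datetime(2025, 3, day, tzinfo=timezone.utc)


items = [
    VectorItem("a", retained_at=march(1)),
    VectorItem("b", retained_at=march(5)),
    VectorItem("c", retained_at=march(1), forgotten_at=march(3)),
]
print([item.id for item in items if visible_as_of(item, march(2))])
```

Item "c" is visible on March 2 even though it was later forgotten, which is the property that lets an as-of query reproduce what the system believed at that time.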
Deliverables
- VectorItem + VectorFilters type changes: retained_at: datetime, forgotten_at: datetime | None, as_of: datetime | None on filters
- brain.history(bank_id, start, end): returns memories retained and forgotten in a date range
- Soft-delete semantics for forget(): default mode writes forgotten_at; hard delete available via flag
- astrocyte-postgres + astrocyte-age adapter updates: implement as_of filter in SQL queries
- as_of on REST gateway: /v1/recall accepts as_of ISO timestamp parameter
Acceptance Criteria
- brain.recall(query, as_of=datetime(...)) returns only memories retained at or before that timestamp and not yet forgotten at that time
- forget() default mode writes forgotten_at; hard delete mode physically removes the item
- brain.history(bank_id, start, end) returns correctly scoped timeline
- All existing tests pass (no breaking changes to default retain/recall behavior)
- pgvector adapter SQL uses indexed retained_at column (no sequential scan)
M10: Gap Analysis
Why: Diagnostic test 1 from Familiarity is the Enemy: “What don’t we know?” The ability to reason about absence is what separates an intelligent memory system from a retrieval index. Compliance teams, engineering managers, and agents preparing for a task all need to know where coverage is thin.
Design: brain.audit(scope: str, bank_id: str) -> AuditResult. Implementation: scoped recall → LLM judge (given these memories, what’s missing?) → structured AuditResult. The LLM judge is prompted to identify absent topics, thin coverage, and potential contradictions. coverage_score: float is a 0–1 composite (memory density × recency × topic breadth).
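The result types and the composite score can be sketched as follows. The field names follow the deliverables listed for this milestone; how density, recency, and topic breadth are each measured is not specified here, so the sketch simply takes them as inputs already normalized to 0–1, which is an assumption.

```python
from dataclasses import dataclass, field
from typing import Literal


@dataclass
class GapItem:
    topic: str
    severity: Literal["high", "medium", "low"]
    reason: str


@dataclass
class AuditResult:
    scope: str
    gaps: list[GapItem] = field(default_factory=list)
    coverage_score: float = 0.0
    memories_scanned: int = 0


def _clamp(value: float) -> float:
    return max(0.0, min(1.0, value))


def coverage_score(density: float, recency: float, breadth: float) -> float:
    """Multiplicative composite: any factor at 0 drives the whole score to 0."""
    return _clamp(density) * _clamp(recency) * _clamp(breadth)


empty_bank = AuditResult(
    scope="incident response",
    gaps=[GapItem("runbooks", "high", "no memories in scope")],
    coverage_score=coverage_score(0.0, 1.0, 1.0),
)
print(empty_bank.coverage_score, coverage_score(1.0, 1.0, 1.0))
```

A product of factors satisfies both acceptance bounds at the extremes: an empty bank scores 0, and a topic that is dense, recent, and broad scores 1.0.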
Deliverables
- AuditResult type: gaps: list[GapItem], coverage_score: float, memories_scanned: int, scope: str
- GapItem type: topic: str, severity: Literal["high", "medium", "low"], reason: str
- brain.audit(scope, bank_id, *, max_memories=50): core implementation
- Audit judge: LLM prompt template in astrocyte/pipeline/audit.py; swappable via audit_judge: config
- /v1/audit REST endpoint: on the standalone gateway
Acceptance Criteria
- brain.audit("topic", bank_id) returns AuditResult with at least one gap when coverage is sparse
- coverage_score is 1.0 for a well-covered topic, < 0.5 for an empty bank
- Works with mock LLM provider (testable without API key)
- Audit is opt-in (no cost impact when not called)
M11: Entity Resolution
Why: Diagnostic test 2 from Familiarity is the Enemy: “Is ‘Calvin’ the same person as ‘the CTO’?” Cosine similarity can retrieve both documents, but it cannot assert they are the same entity with evidence. Entity resolution with evidence chains is what enables cross-document reasoning, graph traversal, and trustworthy recall.
Design: Retain-time pipeline stage. After fact extraction: extract named entities from the incoming content → query GraphStore for existing entity candidates → for each candidate above similarity threshold, call LLM to confirm with evidence quote → if confirmed, write EntityLink(link_type="alias_of", entity_a, entity_b, evidence, confidence) to graph. On recall, graph-enhanced retrieval automatically traverses entity links.
Default graph store: astrocyte-age (Apache AGE on PostgreSQL) — same instance as astrocyte-postgres, zero additional operational burden. Neo4j remains available as an alternative.
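The retain-time stage described in the design can be sketched with the similarity lookup and the LLM confirmation passed in as plain callables, so it runs without a graph store or API key. The function name resolve_entities and both callable signatures are assumptions; the real EntityResolver and GraphStore extensions may look different.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class EntityLink:
    link_type: str
    entity_a: str
    entity_b: str
    evidence: str
    confidence: float


def resolve_entities(
    mentions: list[str],
    known_entities: list[str],
    similarity: Callable[[str, str], float],
    confirm: Callable[[str, str], Optional[tuple[str, float]]],
    threshold: float = 0.5,
) -> list[EntityLink]:
    """Link each mention to known entities that pass the threshold and confirmation."""
    links: list[EntityLink] = []
    for mention in mentions:
        for candidate in known_entities:
            if candidate == mention or similarity(mention, candidate) < threshold:
                continue
            # confirm stands in for the LLM call: (evidence quote, confidence) or None.
            verdict = confirm(mention, candidate)
            if verdict is not None:
                evidence, confidence = verdict
                links.append(EntityLink("alias_of", mention, candidate, evidence, confidence))
    return links


retained_text = "Calvin, the CTO, approved the migration."


def stub_similarity(a: str, b: str) -> float:
    return 0.8 if {a, b} == {"Calvin", "the CTO"} else 0.1


def stub_confirm(a: str, b: str) -> Optional[tuple[str, float]]:
    # A real judge would quote the supporting span; the stub returns the whole sentence.
    return (retained_text, 0.9) if a in retained_text and b in retained_text else None


links = resolve_entities(["the CTO"], ["Calvin", "Dana"], stub_similarity, stub_confirm)
print(links)
```

The threshold check runs before confirmation so that the expensive LLM call is made only for plausible candidates.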
Deliverables
- EntityResolver class: astrocyte/pipeline/entity_resolution.py; configurable similarity threshold, LLM confirmation
- EntityLink type: type: str, entity_a: str, entity_b: str, evidence: str, confidence: float, created_at: datetime
- GraphStore SPI extensions: find_entity_candidates(name, threshold), store_entity_link(link)
- astrocyte-age adapter: new PyPI package astrocyte-age; Apache AGE on PostgreSQL; implements GraphStore SPI; docker-compose.yml updated to use PostgreSQL image with both pgvector + age extensions
- Retain pipeline integration: entity resolution runs after fact extraction, before dedup; opt-in via entity_resolution: enabled: true in config
- Recall integration: graph traversal includes entity links; query for “Calvin” returns memories tagged with “CTO” via link
Acceptance Criteria
- Entity resolution stage identifies that “Calvin” and “the CTO” are the same entity when evidence is present in the retained text
- EntityLink written to graph with evidence quote, not just cosine score
- Disabled by default; entity_resolution: enabled: true in config activates it
- astrocyte-age passes GraphStore SPI contract tests
- astrocyte-age + astrocyte-postgres work against the same PostgreSQL instance
- Docker Compose updated with AGE-enabled PostgreSQL image
Dependency Graph
graph LR
M1["M1: Identity & Context (v0.5.0)"] --> M2["M2: Config Schema (v0.5.0)"]
M2 --> M3["M3: Extraction Pipeline (v0.6.0)"]
M2 --> M6["M6: Standalone Gateway (v0.8.0)"]
M3 --> M4["M4: External Data Sources (v0.7.0)"]
M1 --> M6
M5["M5: Storage Providers (v0.8.0)"] -.->|parallel| M1
M5 -.->|parallel| M2
M5 -.->|parallel| M3
M5 --> M7["M7: Structured recall authority (v0.8.0)"]
M3 --> M8["M8: LLM Wiki Compile (v0.9.0)"]
M5 --> M8
M5 --> M9["M9: Time Travel (v0.9.0)"]
M7 --> M10["M10: Gap Analysis (v0.9.0)"]
M9 --> M10
M5 --> M11["M11: Entity Resolution (v0.9.0)"]
style M1 fill:#e1f5fe
style M2 fill:#e1f5fe
style M3 fill:#fff3e0
style M4 fill:#fff3e0
style M5 fill:#e8f5e9
style M6 fill:#fce4ec
style M7 fill:#fff9c4
style M8 fill:#f3e5f5
style M9 fill:#e0f2f1
style M10 fill:#e0f2f1
style M11 fill:#e0f2f1
Critical path to v1.0.0: M1 → M2 → M3 → M4 → M5 → M7 → M8/M9/M10/M11 → eval gates. M8–M11 are implemented in v0.9.0; GA is the validation milestone.
Parallel track: M5 (storage providers) can proceed independently of M1–M4
Gateway track: v0.5.0 (M1+M2) → M6 (can start after M2, parallel with M3/M4). M6 ships on the v0.8.0 line with M5/M7 (see § Milestone Overview).
Authority track: M5 → M7 (both ship v0.8.0)
Wiki compile track (v0.9.0): M3 + M5 → M8. It now runs alongside M9–M11 and is part of the pre-GA validation surface.
Third-option track (v0.9.0 → v1.0.0): M5 → M9 (time travel) → M10 (gap analysis); M5 → M11 (entity resolution). All three are implemented before v1.0.0 GA.
Research track (post-0.9): R1–R5 phases targeting benchmark accuracy barriers; eval gates promotion.
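The dependency structure behind these tracks can also be checked mechanically. The sketch below encodes the Dependencies column of the milestone overview table and derives a valid build order with the standard library; the helper names build_order and unblocked are illustrative only.

```python
from graphlib import TopologicalSorter

# Milestone -> prerequisites, taken from the Dependencies column of the overview table.
DEPENDENCIES = {
    "M1": set(),
    "M2": {"M1"},
    "M3": {"M2"},
    "M4": {"M2", "M3"},
    "M5": set(),
    "M6": {"M1", "M2"},
    "M7": {"M5"},
    "M8": {"M3", "M5"},
    "M9": {"M5"},
    "M10": {"M7", "M9"},
    "M11": {"M5"},
}


def build_order(deps: dict[str, set[str]]) -> list[str]:
    """Any order in which every milestone follows all of its prerequisites."""
    return list(TopologicalSorter(deps).static_order())


def unblocked(deps: dict[str, set[str]], done: set[str]) -> list[str]:
    """Milestones not yet done whose prerequisites are all done."""
    return sorted(m for m, pre in deps.items() if m not in done and pre <= done)


order = build_order(DEPENDENCIES)
print(order)
print(unblocked(DEPENDENCIES, {"M1", "M2"}))
```

With M1 and M2 done, the second call reports M3, M5, and M6 as startable, which matches the parallel and gateway tracks described above.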
Stability: SPI, adapters, and config files
This section is the single place for stability expectations ahead of v1.0.0. It applies to the v0.8.x connector/plugin track and to GA.
Core SPIs (library)
Protocols and abstract surfaces that third-party code implements or calls, including VectorStore, GraphStore, DocumentStore, IngestSource, integration adapter hooks, and pipeline entrypoints documented as public API.
- Through v1.0.0: Prefer additive changes (new optional fields, new methods with defaults). Breaking changes require a documented deprecation in release notes and, where applicable, an ADR; after v1.0.0, 1.x keeps SPIs backwards-compatible except in a major semver bump.
- Tier distinction: Tier 1 memory API and documented SPIs are stability-guaranteed; Tier 2 engine-only or experimental surfaces stay explicitly labeled in code/docs until promoted.
Adapter packages (adapters-storage-py/, PyPI astrocyte-*)
- Adapters implement core SPIs; they release on their own PyPI versions but declare a supported astrocyte core range in each package README.
- A breaking SPI change in core triggers coordinated minor/major adapter releases; CI (e.g. adapters-storage-ci.yml) is the enforcement backstop.
Ingest transport packages (adapters-ingestion-py/)
- Ingest transport adapters split from core (astrocyte-ingestion-kafka, astrocyte-ingestion-redis, …). Version and test like storage adapters when published; CI uses adapters-ingestion-ci.yml.
Vendor integration packages (adapters-integration-py/)
- Reserved for vendor / product integrations (outbound API clients, bidirectional flows, gateway-related shims) that are not storage SPIs and not generic ingest transports. Empty until packages land.
astrocyte.yaml / astrocyte.yml (declarative config)
- Schema evolution is additive-first: new sections and keys ship as optional with defaults that preserve behavior for existing files.
- Renames/removals of supported keys go through deprecation (warn → major) as described in ADR-003; validation errors should name the migration path.
- Cross-references (sources → extraction_profiles, recall_authority, etc.) stay validated so invalid configs fail at load time.
mip.yaml (Memory Intent Protocol)
- Routing contracts (intents, bank patterns, tool routing) are stable for integrators in the same sense as config: additive evolution; breaking changes require version bump of the MIP schema or file format if we introduce a version field, or deprecation of prior behavior.
- Orthogonality: MIP controls where retain/recall routes; tiered_retrieval, recall_authority, and hybrid fusion are separate knobs, so docs must not overload names (see § M7 and built-in-pipeline.md).
Release Strategy
Current release line
v0.8.0 and follow-on v0.8.x tags shipped M5–M7 (production storage adapters, standalone gateway, structured recall authority) and the connector track. v0.9.0 shipped M8–M11. Release notes: root CHANGELOG.md; tag and PyPI order: RELEASING.md.
Historical — v0.4.x era (pre–M1 snapshot)
- Library-first deployment narrative; opaque principal: str; early integrations
- Baseline pipeline, policy, MIP, MCP; superseded by v0.5.0+ milestones below
v0.5.0 — Identity & Context (M1) + Config Schema (M2)
- M1: Structured `AstrocyteContext` with `ActorIdentity` (backwards-compatible, `principal: str` still works)
- M1: On-behalf-of (OBO) delegation with permission intersection
- M1: Identity-driven bank resolution (`BankResolver`)
- M1: Phase 1 integration adapter migration (optional `context` parameter on all 19 adapters)
- M2: New optional config sections: `sources`, `agents`, `deployment`, extended `identity`, `extraction_profiles`
- M2: Config validation for cross-references (source → extraction profile, agent → bank patterns)
- M2: No breaking changes to existing `astrocyte.yml` files
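
Permission intersection under on-behalf-of delegation can be sketched as a plain set intersection: an agent acting for a user gets only the permissions that both hold. `Grant` and `effective_permissions` are hypothetical names for illustration, not the `AstrocyteContext` API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Grant:
    """Hypothetical stand-in for an identity and the permissions it holds."""

    identity: str
    permissions: frozenset[str]


def effective_permissions(actor: Grant, on_behalf_of: Optional[Grant]) -> frozenset[str]:
    """Without delegation the actor's own permissions apply; with it, the intersection."""
    if on_behalf_of is None:
        return actor.permissions
    return actor.permissions & on_behalf_of.permissions


agent = Grant("agent:support-bot", frozenset({"recall", "retain", "forget"}))
user = Grant("user:alice", frozenset({"recall", "retain"}))

# The agent may "forget" on its own, but not while acting for a user who cannot.
print(sorted(effective_permissions(agent, on_behalf_of=user)))
```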
v0.6.0 — Extraction Pipeline (M3)
- Inbound extraction pipeline: raw content → normalize → chunk → extract → retain
- Dialogue chunking strategy (speaker-aware, turn-preserving)
- Content type routing on `RetainRequest`
- Extraction profile resolution from config
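
One way to read "speaker-aware, turn-preserving" is that chunk boundaries fall only between turns and every line keeps its speaker label. The sketch below assumes that reading; `Turn`, `chunk_dialogue`, and the character budget are hypothetical and do not describe the shipped chunker.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    speaker: str
    text: str


def chunk_dialogue(turns: list[Turn], max_chars: int = 200) -> list[str]:
    """Group whole turns into chunks; a turn is never split across chunks."""
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for turn in turns:
        line = f"{turn.speaker}: {turn.text}"
        # Close the current chunk if this turn would push it past the budget.
        if current and size + len(line) > max_chars:
            chunks.append("\n".join(current))
            current, size = [], 0
        # A single oversized turn still becomes its own chunk, kept intact.
        current.append(line)
        size += len(line)
    if current:
        chunks.append("\n".join(current))
    return chunks


turns = [
    Turn("alice", "a" * 50),
    Turn("bob", "b" * 50),
    Turn("alice", "c" * 50),
]
chunks = chunk_dialogue(turns, max_chars=120)
for i, chunk in enumerate(chunks):
    print(f"chunk {i}: {chunk.count(chr(10)) + 1} turn(s)")
```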
v0.7.0 — External Data Sources (M4, ingest)
- `IngestSource` SPI (Protocol class)
- Webhook receiver with HMAC validation
- Source registry with health monitoring
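
HMAC validation for a webhook receiver usually looks like the following sketch, which uses only the Python standard library. The SHA-256 digest and the hex-encoded signature are assumptions; the actual header name and encoding depend on the source being integrated.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Hex-encoded HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_webhook(secret: bytes, body: bytes, signature: str) -> bool:
    """Compare in constant time so the check does not leak how many characters matched."""
    expected = sign(secret, body)
    return hmac.compare_digest(expected.encode(), signature.encode())


secret = b"shared-secret"          # hypothetical shared secret
body = b'{"event": "ticket.created"}'
signature = sign(secret, body)     # what the sender would put in its signature header

print(verify_webhook(secret, body, signature))
print(verify_webhook(secret, body + b" ", signature))
```

The signature must be computed over the raw bytes as received; re-serializing parsed JSON before hashing can change whitespace or key order and break verification.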
v0.7.1 — Federated recall (M4.1)
- Proxy recall: `sources:` with `type: proxy` + HTTP merge with local RRF (`astrocyte.recall.proxy`, `httpx`)
- Optional manual federated hits via `RecallRequest.external_context`
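
Reciprocal rank fusion (RRF) scores each hit by summing 1/(k + rank) over the rankings in which it appears. The sketch below merges a local and a proxy ranking; the function name, the memory IDs, and k = 60 (a commonly used default) are assumptions rather than the `astrocyte.recall.proxy` implementation.

```python
def rrf_merge(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked ID lists into one, best first."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


local_hits = ["m1", "m2", "m3"]   # hypothetical local recall ranking
proxy_hits = ["m2", "m4"]         # hypothetical ranking returned by a proxy source

merged = rrf_merge([local_hits, proxy_hits])
print(merged)  # ['m2', 'm1', 'm4', 'm3']
```

Because RRF uses ranks rather than raw scores, the local and remote stores do not need comparable similarity scales.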
v0.8.0 — Production storage (M5) + standalone gateway (M6) + structured recall authority (M7)
- M5 (adapters): Neo4j (`astrocyte-neo4j`), Elasticsearch (`astrocyte-elasticsearch`), Qdrant (`astrocyte-qdrant`), PostgreSQL/pgvector (`astrocyte-postgres`); packages under `adapters-storage-py/`; hybrid recall validated end-to-end (vector + graph + document)
- M6 (gateway): FastAPI `astrocyte-gateway-py` (implemented in repo): Tier 1 REST, JWT/OIDC → `AstrocyteContext`, webhook ingest, health/admin, Docker/Compose/Helm, GHCR; see § M6
- M7 (precedence): Optional `astrocyte.yaml` `recall_authority:` (ADR-004); labeled `authority_context` for recall/reflect; not the default recall path; full spec in § M7
v0.8.x — Connector & gateway integration track
Scope: Implemented from the v0.8.0 baseline. This line added ingest transports, gateway plugins, and gateway hardening before the v0.9.0 M8–M11 feature release.
- Additional event stream / poll connectors (beyond Kafka, Redis, GitHub) and NATS where needed
- Gateway plugins (shipped): Thin integration plugins for Kong (Lua), Apache APISIX (Lua), and Azure API Management (XML policy fragments + Bicep/Terraform/APIOps deployment). Located in `gateway-plugins/` at the repo root. Each plugin intercepts OpenAI-compatible `/chat/completions` requests and calls the standalone gateway for recall (pre-hook) and retain (post-hook). See `gateway-plugins/README.md`.
- Hardening: CORS, body limits, admin auth, rate limits at the edge (as product requires)
- LLM wiki compile (M8, v0.9.0): Async CompileEngine maintains rewritable `WikiPage` memories (entity/topic/concept) from raw memories, with provenance and cross-links; recall tiers wiki hits ahead of raw with fallback; periodic lint pass catches contradictions, stale claims, and orphans. Opt-in per bank; defaults off. Full design: `llm-wiki-compile.md`.

SPI, adapter, and astrocyte.yaml / mip.yaml stability rules for this track: see § Stability: SPI, adapters, and config files.
v1.0.0 — General Availability (M1–M11)
M8–M11 are implemented in v0.9.0 and are required for GA validation. Together with M1–M7 they constitute the full “third option” system that passes all four diagnostic tests once the eval gates clear.
| Milestone | Diagnostic test | What it delivers |
|---|---|---|
| M8: Wiki compile | Supports gap/entity/time-travel quality | brain.compile() + durable wiki pages and provenance |
| M9: Time travel | Test 3 — time travel | retained_at + forgotten_at + as_of filter + brain.history() |
| M10: Gap analysis | Test 1 — gap analysis | brain.audit() → AuditResult(gaps, coverage_score) |
| M11: Entity resolution | Test 2 — entity resolution | EntityResolver + EntityLink with evidence + astrocyte-age graph adapter |
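
The M9 fields suggest a simple visibility rule for an `as_of` query: a memory is visible if it was retained at or before that instant and had not yet been forgotten. The sketch below assumes that half-open reading; `Memory` and `visible_as_of` are hypothetical names, and the shipped filter lives on `VectorFilters`.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Memory:
    id: str
    retained_at: datetime
    forgotten_at: Optional[datetime] = None  # None means "never forgotten"


def visible_as_of(memory: Memory, as_of: datetime) -> bool:
    """Assumed rule: visible on the half-open interval [retained_at, forgotten_at)."""
    if memory.retained_at > as_of:
        return False
    return memory.forgotten_at is None or memory.forgotten_at > as_of


def utc(year: int, month: int, day: int) -> datetime:
    return datetime(year, month, day, tzinfo=timezone.utc)


memories = [
    Memory("old-address", retained_at=utc(2023, 1, 10), forgotten_at=utc(2024, 6, 1)),
    Memory("new-address", retained_at=utc(2024, 6, 1)),
]

# Replaying the bank as of early 2024 shows only the fact known at that time.
snapshot = [m.id for m in memories if visible_as_of(m, utc(2024, 1, 1))]
print(snapshot)  # ['old-address']
```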
astrocyte-age is the preferred graph store for the PostgreSQL reference stack — Apache AGE runs inside the same PostgreSQL instance as pgvector, eliminating the need for a separate graph database for most deployments.
v0.9.0 — M8–M11 feature release
v0.9.0 ships M8–M11: wiki compile, time travel, gap analysis, entity resolution, and the astrocyte-age Apache AGE adapter. Full design: llm-wiki-compile.md.
GA eval gate: validate quality gains and no unacceptable latency regression across the memory benchmarks before declaring v1.0.0.
Research track
The research track continues after the M8–M11 implementation. Eval gates determine promotion of each technique to core. All techniques are additive, with no breaking SPI changes.
Post-GA engineering candidates:
- Additional storage adapters (Pinecone, Weaviate, Memgraph)
- Multi-region / global deployment patterns
- Tavus CVI integration (bidirectional): HTTP client package `astrocyte-integration-tavus` (`adapters-integration-py/`); full bidirectional + e2e deferred
- Phase 3 identity migration (context required, principal-only deprecated)
- Graph-enhanced wiki: `WikiPage` entity cross-links traversable in graph store
Research track: path to 100% benchmark accuracy
Five research phases target the fundamental barriers expected to remain after v1.0.0. Each phase addresses a distinct barrier identified by category-level benchmark analysis. Phases are independent and can proceed in parallel on the v0.9.0 line; eval gates determine promotion to core. All techniques are additive, with no breaking SPI changes.
See platform-positioning.md § Research Agenda for the full diagnostic-test grounding.
Phase R1: Retrieval recall (getting the right chunks into context)
Barrier: Recall hit rate plateaus at ~76% with current embeddings — 24% of queries never retrieve a relevant chunk. No synthesis improvement helps if the right content isn’t in the window.
| Technique | Mechanism | Expected lift | Eval gate |
|---|---|---|---|
| Hypothetical Document Embedding (HyDE) | LLM generates hypothetical answer → embed that → search; bridges paraphrase gap between query and memory | +5–10pp recall hit rate | LoCoMo recall_hit_rate ≥ 0.85 |
| Hierarchical indexing | Index at sentence, paragraph, session, and topic granularity; recall searches across levels | +3–5pp MRR | LongMemEval extraction ≥ 90% |
| Exhaustive multi-pass reformulation | N diverse query reformulations (beyond current multi-query expansion); union-dedupe results | +2–4pp recall hit rate | No regression on adversarial abstention |
| Learned retrieval fine-tuning | Domain-adapted embedding model trained on retain/recall pairs from operator query logs | +5–8pp recall precision | Requires operator opt-in; not benchmarked centrally |
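
The two retrieval metrics named in this table can be computed as in the following sketch. The definitions are the standard ones (a hit means at least one relevant ID was retrieved; MRR averages the reciprocal rank of the first relevant ID), but the function names and toy data are illustrative, and the benchmark harness may restrict results to a top-k cutoff.

```python
def recall_hit_rate(results: list[list[str]], relevant: list[set[str]]) -> float:
    """Fraction of queries whose retrieved list contains at least one relevant ID."""
    if not results:
        return 0.0
    hits = sum(1 for got, want in zip(results, relevant) if want.intersection(got))
    return hits / len(results)


def mean_reciprocal_rank(results: list[list[str]], relevant: list[set[str]]) -> float:
    """Average of 1/rank of the first relevant ID; a query with no hit contributes 0."""
    if not results:
        return 0.0
    total = 0.0
    for got, want in zip(results, relevant):
        for rank, doc_id in enumerate(got, start=1):
            if doc_id in want:
                total += 1.0 / rank
                break
    return total / len(results)


# Four toy queries: retrieved IDs in rank order, and the gold relevant IDs.
results = [["a", "b"], ["c", "d"], ["e"], ["f", "g"]]
relevant = [{"a"}, {"d"}, {"x"}, {"g"}]

print(recall_hit_rate(results, relevant))       # 0.75
print(mean_reciprocal_rank(results, relevant))  # 0.5
```
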
Phase R2: Synthesis quality (extracting the right answer from retrieved context)
Barrier: Even with correct chunks retrieved, gpt-4o-mini synthesis misses details, conflates entities, or hallucinates ~10–15% of the time.
| Technique | Mechanism | Expected lift | Eval gate |
|---|---|---|---|
| Chain-of-thought extraction | “Step 1: Which memories mention X? Step 2: What do they say? Step 3: Synthesize”, which forces explicit grounding | +3–5pp reflect accuracy | LoCoMo multi-hop ≥ 90% |
| Retrieval-augmented verification | After LLM generates answer, recall using the answer as query; verify retrieved memories support it; abstain if not | +3–5pp on open-domain (fewer false positives) | No regression on single-hop |
| Contrastive reflect | Generate N answers with different chunk orderings; keep the most consistent; abstain if divergent | +2–3pp reflect accuracy | Reflect p95 latency increase < 50% |
| Fine-tuned synthesis model | Small model trained specifically on “given these chunks, extract the answer” — plugs in as Tier 2 engine via SPI | +5–10pp across all categories | LongMemEval overall ≥ 85% |
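
Contrastive reflect, as described in the table, can be sketched as majority voting over shuffled chunk orderings with an agreement threshold. Everything here is an assumption for illustration: the function names, the 60% threshold, and the two stub functions that stand in for an LLM call.

```python
import itertools
import random
from collections import Counter
from typing import Callable, Optional


def contrastive_reflect(
    chunks: list[str],
    answer_fn: Callable[[list[str]], str],
    n: int = 5,
    min_agreement: float = 0.6,
    seed: int = 0,
) -> Optional[str]:
    """Return the majority answer across n chunk orderings, or None to abstain."""
    rng = random.Random(seed)
    answers = []
    for _ in range(n):
        ordering = list(chunks)
        rng.shuffle(ordering)  # same evidence, different order
        answers.append(answer_fn(ordering))
    answer, votes = Counter(answers).most_common(1)[0]
    return answer if votes / n >= min_agreement else None


# Stub "LLMs" for the demo: one ignores ordering, one never repeats itself.
def stable_llm(ordered: list[str]) -> str:
    return "Lisbon" if any("Lisbon" in c for c in ordered) else "unknown"


_calls = itertools.count()


def unstable_llm(ordered: list[str]) -> str:
    return f"guess-{next(_calls)}"


chunks = ["Alice moved to Lisbon in 2024.", "Alice likes tea.", "Bob lives in Oslo."]
print(contrastive_reflect(chunks, stable_llm))    # Lisbon
print(contrastive_reflect(chunks, unstable_llm))  # None (abstain)
```

Each call to `answer_fn` is a full synthesis pass, so cost and latency scale with `n`; that is why the table pairs this technique with a latency gate.
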
Phase R3: Structured memory representation
Barrier: Raw text chunks lose relational and temporal structure. Multi-hop and reasoning queries require structured knowledge that chunked text doesn’t encode.
| Technique | Mechanism | Expected lift | Eval gate |
|---|---|---|---|
| Relation extraction | Retain-time: extract typed triples (entity, relation, value, temporal) alongside raw chunks; store in graph | +5–8pp LoCoMo multi-hop | Entity resolution confidence ≥ 0.9 |
| Episodic memory indexing | Index by episode context (who, where, emotional valence, sequence) not just content — enables situational retrieval | +3–5pp on contextual queries | New benchmark category needed |
| Contradiction detection + resolution | Retain-time: compare incoming facts against existing; flag conflicts; prefer newer or higher-confidence | +5–8pp LongMemEval update | Update category ≥ 90% |
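
A minimal sketch of retain-time contradiction handling over extracted triples, assuming single-valued relations (one current value per entity and relation). The table says "prefer newer or higher-confidence" without fixing a policy; the rule below (confidence wins only when the gap is large, otherwise recency wins) is one assumed tie-break, and `Fact`, `conflicts`, and `prefer` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Fact:
    entity: str
    relation: str
    value: str
    observed: date
    confidence: float


def conflicts(existing: Fact, incoming: Fact) -> bool:
    """Two facts conflict when they fill the same (entity, relation) slot differently."""
    same_slot = (existing.entity, existing.relation) == (incoming.entity, incoming.relation)
    return same_slot and existing.value != incoming.value


def prefer(existing: Fact, incoming: Fact, margin: float = 0.2) -> Fact:
    """Assumed policy: a large confidence gap decides; otherwise the newer fact wins."""
    if abs(existing.confidence - incoming.confidence) >= margin:
        return max(existing, incoming, key=lambda f: f.confidence)
    return max(existing, incoming, key=lambda f: f.observed)


old = Fact("alice", "lives_in", "Berlin", date(2023, 1, 1), 0.80)
new = Fact("alice", "lives_in", "Lisbon", date(2024, 6, 1), 0.75)
rumor = Fact("alice", "lives_in", "Rome", date(2024, 7, 1), 0.30)

print(conflicts(old, new), prefer(old, new).value)      # True Lisbon
print(conflicts(old, rumor), prefer(old, rumor).value)  # True Berlin
```
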
Phase R4: Temporal reasoning
Barrier: LongMemEval temporal category is at ~11% (mock). LLMs can retrieve timestamped content but struggle to reason about ordering, duration, and change over time.
| Technique | Mechanism | Expected lift | Eval gate |
|---|---|---|---|
| Temporal knowledge graph | Time intervals on every graph edge; temporal queries become interval algebra (Allen’s), not LLM reasoning | +20–30pp LongMemEval temporal | Temporal category ≥ 50% |
| Timeline generation | Before answering temporal query, generate explicit timeline artifact; answer from that | +10–15pp temporal reasoning | No regression on extraction |
| Temporal embeddings | Encode when something was true, not just what it says; time-aware similarity search | Research-stage; lift TBD | Requires custom embedding model |
| Change detection pipeline | Retain-time: flag superseded facts, tag updates with preference_update / correction types | +5–8pp LongMemEval update | Update category ≥ 85% |
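
Allen's interval algebra defines 13 mutually exclusive relations between two intervals, which is what lets a temporal query be answered by comparison rather than by LLM reasoning. The sketch below classifies a pair of intervals; the tuple representation and the example intervals are assumptions, not a graph-store schema.

```python
from datetime import date

Interval = tuple[date, date]  # (start, end), with start < end


def allen_relation(a: Interval, b: Interval) -> str:
    """Return the Allen relation that holds from interval a to interval b."""
    (a0, a1), (b0, b1) = a, b
    if a1 < b0:
        return "before"
    if b1 < a0:
        return "after"
    if a1 == b0:
        return "meets"
    if b1 == a0:
        return "met_by"
    # From here on the two intervals share more than a single endpoint.
    if a0 == b0 and a1 == b1:
        return "equals"
    if a0 == b0:
        return "starts" if a1 < b1 else "started_by"
    if a1 == b1:
        return "finishes" if a0 > b0 else "finished_by"
    if b0 < a0 and a1 < b1:
        return "during"
    if a0 < b0 and b1 < a1:
        return "contains"
    return "overlaps" if a0 < b0 else "overlapped_by"


job_a = (date(2019, 1, 1), date(2021, 1, 1))
job_b = (date(2021, 1, 1), date(2023, 1, 1))
trip = (date(2020, 3, 1), date(2020, 9, 1))

print(allen_relation(job_a, job_b))  # meets
print(allen_relation(trip, job_a))   # during
print(allen_relation(job_b, trip))   # after
```
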
Phase R5: Calibrated abstention
Barrier: Perfect abstention means answering when you know and declining when you don’t: never hallucinating, never refusing something answerable. Current adversarial accuracy is 100%, but open-domain accuracy is 61.5%.
| Technique | Mechanism | Expected lift | Eval gate |
|---|---|---|---|
| Calibrated confidence scoring | Output confidence with every reflect; train calibration on benchmark pairs | +5–8pp open-domain | Calibration error < 0.05 |
| Retrieval sufficiency classifier | Lightweight model predicts “Is there enough info to answer?” before LLM synthesis; short-circuits to abstention | +5–8pp open-domain (fewer false positives) | No regression on single-hop |
| Self-consistency verification | Generate answer N times with different chunk orderings; divergence → abstain | +3–5pp across categories | Adversarial stays ≥ 98% |
| Coverage-aware retrieval | brain.audit() coverage map checked at recall time; abstain if query topic outside bank coverage | +3–5pp open-domain | Requires M10 (gap analysis) |
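
The "Calibration error < 0.05" gate does not name a metric. The sketch below assumes it means expected calibration error (ECE): bucket answers by stated confidence, then take the weighted average gap between mean confidence and observed accuracy in each bucket. The function name, bin count, and toy data are illustrative.

```python
def expected_calibration_error(
    confidences: list[float], correct: list[bool], n_bins: int = 10
) -> float:
    """Weighted mean of |mean confidence - accuracy| over equal-width confidence bins."""
    if not confidences:
        return 0.0
    bins: list[list[int]] = [[] for _ in range(n_bins)]
    for i, conf in enumerate(confidences):
        # A confidence of exactly 1.0 falls into the last bin.
        bins[min(int(conf * n_bins), n_bins - 1)].append(i)
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(confidences[i] for i in members) / len(members)
        accuracy = sum(correct[i] for i in members) / len(members)
        ece += (len(members) / total) * abs(mean_conf - accuracy)
    return ece


# Toy reflect outputs: stated confidence and whether the answer was right.
confidences = [0.9, 0.9, 0.9, 0.9, 0.6, 0.6]
correct = [True, True, True, False, True, False]

ece = expected_calibration_error(confidences, correct)
print(round(ece, 3), "passes gate" if ece < 0.05 else "fails gate")
```

In this toy data the model is overconfident in both buckets (0.9 stated vs 0.75 observed, 0.6 stated vs 0.5 observed), so the ECE is about 0.133 and the assumed gate fails.
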
Projected benchmark trajectory
| Phase | LoCoMo (canonical) | LongMemEval (canonical) | Primary categories improved |
|---|---|---|---|
| Current | ~70–78% | ~75% (20Q) | — |
| v0.9.x + M8–M11 | ~72–80% | ~60–68% (200Q) | Multi-session, knowledge-update, temporal, entity |
| v1.0.0 GA gates | ~78–85% | ~68–75% (200Q) | Multi-hop, temporal, reasoning, update |
| After R1 (retrieval) | ~82–88% | ~75–82% | All (higher recall hit rate) |
| After R2 (synthesis) | ~85–90% | ~80–85% | Multi-hop, open-domain, reasoning |
| After R3 (structured) | ~88–92% | ~83–88% | Multi-hop, update |
| After R4 (temporal) | ~89–93% | ~88–92% | Temporal, update |
| After R5 (abstention) | ~92–95% | ~90–93% | Open-domain, adversarial edge cases |
| Theoretical ceiling | ~95–97% | ~93–97% | Bounded by embedding + LLM model quality |
True 100% requires a purpose-built synthesis model fine-tuned on the “given structured memory, answer questions” task — not a general-purpose LLM. Astrocyte’s SPI model means such a model plugs in as a Tier 2 engine without changing the framework. The research track creates the structured substrate (temporal KG, episodic indexing, contradiction resolution) that makes such a model trainable.
Gap-to-Milestone Mapping
| Gap | Description | Milestone | Version | Status |
|---|---|---|---|---|
| Gap 1 | Identity & Authorization Protocol | M1 | v0.5.0 | Implemented in core (ADR-002 Phase 1); JWT/tenant_id enforcement = later milestones |
| Gap 2 | External Data Sources | M4 + M4.1 | v0.7.0 / v0.7.1 | Shipped: webhook ingest, source registry, stream + poll adapters (adapters-ingestion-py/), proxy federated recall (astrocyte.recall.proxy) |
| Gap 3 | Deployment Models | M6 | v0.8.0 | Shipped in-repo: astrocyte-gateway-py, Docker/Compose/Helm, CI, GHCR (ADR-001 standalone path). |
| Gap 4 | Config Schema Evolution | M2 | v0.5.0 | Implemented in core (ADR-003); ships with M1 in same tag |
| Gap 5 | User-Scoped Memory | M1 | v0.5.0 | Implemented in core (BankResolver, ACL, optional adapter context); HTTP UX via astrocyte-gateway-py auth + config |
| Gap 6 | Production Storage Providers | M5 | v0.8.0 | Implemented: adapters-storage-py/ packages + CI + publish workflows |
| Gap 7 | Extraction Pipeline | M3 | v0.6.0 | Implemented (pipeline/extraction, chunking, profiles) |
| Gap 8 | Structured recall authority (truth precedence vs cost-tiered retrieval) | M7 | v0.8.0 | Implemented (recall_authority, astrocyte.recall.authority, ADR-004) |
Next milestone focus
Current state: M1–M11 are implemented on the v0.9.x line. Active lines in order:
1. M8–M11 (v0.9.x): pre-GA feature surface, eval-gated. Ships the complete third-option implementation surface: wiki compile, time travel, gap analysis, entity resolution, and Apache AGE. Depends on M3 + M5 (both done).
2. Research track (R1–R5). Runs after the M8–M11 feature surface, before and after v1.0.0 GA as quality work warrants. Five research phases target fundamental benchmark accuracy barriers; each has its own eval gate. See § Research track.
3. Ecosystem composition (CocoIndex). Astrocyte is read-side; CocoIndex is build-side. The four-phase integration roadmap in cocoindex-integration.md covers (0) parallel writers to the same Postgres (already works, doc-only); (1) an upstream astrocyte target connector for CocoIndex apps; (2) adoption of four CocoIndex pipeline patterns inside Astrocyte (per-component update stats, push-source LiveMapFeed, stable-path retain trace, fingerprint memoization); (3) a speculative inverse direction. Phases 0+1 are scheduled alongside v1.0.0 GA work; Phase 2 items are sequenced by impact ÷ effort and can ship independently.
4. v1.0.0 GA: validation and stabilization. Required for GA:
   - Eval gates: M8–M11 pass the benchmark and latency release bar.
   - Docs and examples: operator and integrator docs match the shipped API surface.
   - Stability pass: SPI, adapter, `astrocyte.yaml`, and `mip.yaml` contracts are frozen or explicitly deprecated.
   - Tag v1.0.0 once M1–M11 all pass CI and release gates.

SPI, adapter, astrocyte.yaml, and mip.yaml stability: follow § Stability; encode deprecations before breaking changes.