
Observation evolution — live memory



Astrocyte’s memory layer doesn’t sit still. Observations accrue evidence over time and acquire a computed trend; mental-model documents are structured objects modified by typed delta operations rather than re-generated prose. This page is the user-facing guide to what landed in M21 and how to use it from your agent code or via the MCP server.

Every observation carries a computed trend derived from the timestamps of its supporting evidence:

| Trend | Meaning |
| --- | --- |
| new | All evidence is within the recent window (default 30 days). Fresh insight, just learned. |
| strengthening | Denser evidence recently than historically. The pattern is reinforcing. |
| stable | Evidence spread across time and continuing to the present. Long-term true. |
| weakening | Evidence mostly old, sparse recently. May still apply, but the agent is hearing less about it. |
| stale | No evidence in the recent window. The observation may no longer apply. |

The trend is computed algorithmically from the _obs_source_timestamps metadata, not generated by the LLM, so the same evidence list always yields the same trend. The thresholds (recent_days=30, old_days=90) match Hindsight’s defaults and are hard-coded in v0.15.0; we’ll expose them as AstrocyteConfig fields in a future release if users ask.
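To make the classification concrete, here is a minimal sketch of a timestamp-based trend classifier. It is not the library's compute_trend: this page only gives the trend definitions and the two thresholds, so the density comparison, the 1.5x tolerance factor, and the use of old_days as a floor on the historical window are all assumptions made for illustration. The Trend enum is redefined locally so the block runs standalone.

```python
from datetime import datetime, timedelta, timezone
from enum import Enum


class Trend(str, Enum):
    # Local stand-in for the library enum so this sketch runs standalone.
    NEW = "new"
    STRENGTHENING = "strengthening"
    STABLE = "stable"
    WEAKENING = "weakening"
    STALE = "stale"


def classify_trend(timestamps, now, recent_days=30, old_days=90, tolerance=1.5):
    """Hypothetical classifier; `tolerance` is an assumed knob, not a library setting."""
    # Age of each piece of evidence, in days.
    ages = [(now - ts).total_seconds() / 86400 for ts in timestamps]
    recent = [a for a in ages if a <= recent_days]
    older = [a for a in ages if a > recent_days]

    if not recent:
        return Trend.STALE  # nothing in the recent window (also covers no evidence)
    if not older:
        return Trend.NEW  # every piece of evidence is recent

    # Evidence per day inside the recent window.
    recent_rate = len(recent) / recent_days
    # Evidence per day before it; assumed to span at least old_days - recent_days.
    older_span = max(max(older) - recent_days, old_days - recent_days)
    older_rate = len(older) / older_span

    if recent_rate > tolerance * older_rate:
        return Trend.STRENGTHENING
    if recent_rate * tolerance < older_rate:
        return Trend.WEAKENING
    return Trend.STABLE


now = datetime(2026, 6, 1, tzinfo=timezone.utc)
evidence = [now - timedelta(days=d) for d in (5, 20, 40, 50, 70, 85)]
print(classify_trend(evidence, now).value)  # stable
```

Because the function depends only on its arguments, calling it twice with the same evidence list and the same `now` returns the same trend, which is the determinism property described above.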

from astrocyte.pipeline.observation import compute_observation_trend
from astrocyte.pipeline.trend import Trend

# A hit you got back from recall (observation strategy)
trend = compute_observation_trend(hit.metadata)
if trend in (Trend.NEW, Trend.STRENGTHENING):
    # surface as a "fresh insight" badge in your UI
    ...
elif trend == Trend.STALE:
    # demote or hide: the agent hasn't heard about this lately
    ...

// memory_list_observations response shape
{
  "observations": [
    {
      "id": "obs-abc123",
      "text": "User prefers async stand-ups",
      "trend": "new",
      "proof_count": 1,
      "source_ids": "[\"chunk-1\"]",
      "confidence": "0.9",
      "updated_at": "2026-05-18T19:35:01+00:00",
      "scope": "bank"
    }
  ]
}

Pass trend="new" (or any other trend value) to filter by trend; pass scope="..." to limit results to a topic.
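In the sample response, source_ids is a JSON-encoded string and confidence is a numeric string, so a client will probably want to decode both before using them. A minimal sketch, assuming the payload looks exactly like the sample; normalize_observation is a hypothetical helper, not part of Astrocyte:

```python
import json


def normalize_observation(obs: dict) -> dict:
    """Return a copy with source_ids as a list and confidence as a float."""
    out = dict(obs)
    # source_ids arrives as a JSON string such as '["chunk-1"]'.
    if isinstance(out.get("source_ids"), str):
        out["source_ids"] = json.loads(out["source_ids"])
    # confidence arrives as a string such as "0.9".
    if isinstance(out.get("confidence"), str):
        out["confidence"] = float(out["confidence"])
    return out


# The sample payload from this page, as a Python dict.
response = {
    "observations": [
        {
            "id": "obs-abc123",
            "text": "User prefers async stand-ups",
            "trend": "new",
            "proof_count": 1,
            "source_ids": "[\"chunk-1\"]",
            "confidence": "0.9",
            "updated_at": "2026-05-18T19:35:01+00:00",
            "scope": "bank",
        }
    ]
}

observations = [normalize_observation(o) for o in response["observations"]]
# Client-side equivalent of asking for the two "fresh" trends.
fresh = [o for o in observations if o["trend"] in ("new", "strengthening")]
```

Filtering on the client like this is only useful when you need more than one trend at once; for a single trend, the trend parameter keeps the filtering on the server.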

Mental models are now stored with an authoritative structured document schema (Pydantic-typed sections + blocks). Updates emit operations that target the structure rather than re-emitting the whole document:

// memory_update_mental_model request shape
{
  "model_id": "alice-prefs",
  "operations": [
    {
      "op": "append_block",
      "section_id": "tools",
      "block": { "type": "paragraph", "text": "Now also uses Linear." }
    },
    {
      "op": "rename_section",
      "section_id": "schedule",
      "new_heading": "Schedule (Q2 2026)"
    }
  ]
}

The operation types you can emit (see astrocyte.pipeline.delta_ops for full schemas):

| Op | Targets | Effect |
| --- | --- | --- |
| append_block | section_id | Add a block at the end |
| insert_block | section_id + index | Add a block at a position |
| replace_block | section_id + index | Swap a single block |
| remove_block | section_id + index | Drop a block |
| add_section | optional after_section_id | New section |
| remove_section | section_id | Drop a section |
| replace_section_blocks | section_id | Rebuild a section’s blocks in one go |
| rename_section | section_id | Change the heading (id stays stable) |

Block types: paragraph, bullet_list, ordered_list, code.
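A request body can be assembled and sanity-checked in Python before it is sent. The sketch below is a convenience layer built on assumptions, not the library's Pydantic schema: check_operation and build_update_request are hypothetical helpers, and only the op names, target fields, and block types listed on this page are taken from the source.

```python
# Required target fields per op, taken from the operations table.
# add_section has no required target because after_section_id is optional.
OP_TARGETS = {
    "append_block": ("section_id",),
    "insert_block": ("section_id", "index"),
    "replace_block": ("section_id", "index"),
    "remove_block": ("section_id", "index"),
    "add_section": (),
    "remove_section": ("section_id",),
    "replace_section_blocks": ("section_id",),
    "rename_section": ("section_id",),
}
BLOCK_TYPES = {"paragraph", "bullet_list", "ordered_list", "code"}


def check_operation(op: dict) -> list[str]:
    """Return a list of problems; an empty list means the op looks well-formed."""
    name = op.get("op")
    if name not in OP_TARGETS:
        return [f"unknown op: {name}"]
    problems = [f"missing {field}" for field in OP_TARGETS[name] if field not in op]
    block = op.get("block")
    if block is not None:
        block_type = block.get("type") if isinstance(block, dict) else None
        if block_type not in BLOCK_TYPES:
            problems.append(f"unknown block type: {block_type}")
    return problems


def build_update_request(model_id: str, operations: list[dict]) -> dict:
    """Assemble the request body, refusing ops that fail the local check."""
    for op in operations:
        problems = check_operation(op)
        if problems:
            raise ValueError(f"{op.get('op')}: {', '.join(problems)}")
    return {"model_id": model_id, "operations": operations}


request = build_update_request("alice-prefs", [
    {"op": "append_block", "section_id": "tools",
     "block": {"type": "paragraph", "text": "Now also uses Linear."}},
    {"op": "rename_section", "section_id": "schedule",
     "new_heading": "Schedule (Q2 2026)"},
])
```

A local check like this only catches shape errors early. It cannot know whether a section_id or index actually exists in the stored document; that is decided when the operations are applied.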

Why operations and not “send the new doc”


When an LLM is asked to regenerate a document, it drifts even on content it didn’t intend to change — bullet styling, casing, separator lines, paraphrasing all wobble around. Operations let the LLM specify just what changed; unmentioned sections and blocks are physically copied through by apply_operations. Drift on unchanged content is structurally impossible, not just discouraged.

Invalid operations (unknown section_id, out-of-range index, malformed payloads) are dropped with a logged reason in the response’s skipped list. The document never gets worse than its input:

// memory_update_mental_model response shape
{
  "changed": true,
  "revision": 4,
  "applied": [ { "op": "append_block", "section_id": "tools" } ],
  "skipped": [
    { "op": "remove_block", "section_id": "ghost", "reason": "unknown section_id: ghost" }
  ]
}

Even an LLM that emits 100% garbage produces zero changes — changed: false, revision unchanged. The structure can only get better or stay the same per refresh.
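The copy-through and skip-on-error behavior described above can be sketched in a few lines. This is an illustrative model only, not the real apply_operations in astrocyte.pipeline.delta_ops: the document is assumed to be a plain dict mapping section_id to a list of blocks, only three of the eight ops are modeled, and the return shape simply mirrors the response fields shown above.

```python
import copy


def apply_operations_sketch(doc: dict, revision: int, operations: list) -> dict:
    """Apply ops to a copy of `doc`; invalid ops are skipped, never raised."""
    sections = copy.deepcopy(doc)  # untouched sections are carried over verbatim
    applied, skipped = [], []

    for op in operations:
        # A non-dict payload is treated as malformed rather than crashing the loop.
        if not isinstance(op, dict):
            skipped.append({"op": None, "reason": "malformed operation"})
            continue
        name, sid = op.get("op"), op.get("section_id")
        summary = {"op": name, "section_id": sid}
        if name not in ("append_block", "remove_block", "remove_section"):
            skipped.append({**summary, "reason": f"unsupported op in sketch: {name}"})
        elif not isinstance(sid, str) or sid not in sections:
            skipped.append({**summary, "reason": f"unknown section_id: {sid}"})
        elif name == "append_block":
            if not isinstance(op.get("block"), dict):
                skipped.append({**summary, "reason": "malformed block"})
            else:
                sections[sid].append(op["block"])
                applied.append(summary)
        elif name == "remove_block":
            index = op.get("index")
            if not isinstance(index, int) or not 0 <= index < len(sections[sid]):
                skipped.append({**summary, "reason": f"index out of range: {index}"})
            else:
                del sections[sid][index]
                applied.append(summary)
        else:  # remove_section
            del sections[sid]
            applied.append(summary)

    changed = bool(applied)
    return {
        "document": sections,
        "changed": changed,
        # The revision only advances when at least one op was applied.
        "revision": revision + 1 if changed else revision,
        "applied": applied,
        "skipped": skipped,
    }
```

Feeding this function nothing but invalid operations returns the input document unchanged with changed set to False and the same revision, which is the "100% garbage" case from the paragraph above.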

Pre-M21 mental models stored a raw markdown content string with no structured representation. The first memory_update_mental_model call against a legacy row parses content via structured_doc.parse_markdown and persists the structured form going forward. One-shot, never repeated. Reads keep working through the migration window — the rendered content column stays as the source of truth for markdown consumers.
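The one-shot migration can be pictured as "parse on first write, then never again". The sketch below illustrates that pattern under stated assumptions: it is not structured_doc.parse_markdown, the row is modeled as a plain dict with hypothetical content and sections keys, and the toy parser only recognizes headings and paragraphs (no lists, no code blocks, no duplicate-heading handling).

```python
import re


def slugify(heading: str) -> str:
    """Lowercase the heading and join alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", heading.lower())) or "section"


def parse_markdown_sketch(content: str) -> list[dict]:
    """Split markdown at '#' headings; each run of text lines becomes a paragraph block."""
    sections = []
    current = None
    paragraph = []

    def flush():
        # Close the paragraph being accumulated, if any.
        if paragraph and current is not None:
            current["blocks"].append({"type": "paragraph", "text": " ".join(paragraph)})
        paragraph.clear()

    for line in content.splitlines():
        match = re.match(r"^#{1,6}\s+(.*)$", line)
        if match:
            flush()
            heading = match.group(1).strip()
            current = {"id": slugify(heading), "heading": heading, "blocks": []}
            sections.append(current)
        elif line.strip():
            if current is None:
                # Text before the first heading goes into an untitled section.
                current = {"id": "intro", "heading": "", "blocks": []}
                sections.append(current)
            paragraph.append(line.strip())
        else:
            flush()
    flush()
    return sections


def ensure_structured(row: dict) -> dict:
    """One-shot migration: parse legacy content only when no structured form exists."""
    if row.get("sections") is None:
        row["sections"] = parse_markdown_sketch(row["content"])
    return row
```

Calling ensure_structured a second time on the same row is a no-op, which matches the "one-shot, never repeated" behavior described above.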

User-curated directives (M18b/M19 deferred item)


Directives are user-authored hard rules — the architecturally correct replacement for the M18a-2 directive_compile auto-extraction path (which was deprecated in M19 after replicating a −30pp SSP regression because compressed auto-directives overrode original preference nuance).

// memory_create_directive request
{
  "rule_text": "Always confirm before sending money over $100",
  "scope": "bank"
}

The directive is stored as a MentalModel with kind="directive". It participates in the agentic reflect loop’s search_mental_models tool exactly like general / preference models, but the discriminator signals to the answerer that the rule should be applied as a preference override rather than as one input among many.
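One way to picture the discriminator's effect is a consumer that separates directives from other models before building its prompt. The sketch below is purely illustrative: the page does not describe how the answerer uses kind internally, so the dict shape (kind and content keys), the helper names, and the labels in the rendered text are all assumptions.

```python
def split_by_kind(models: list[dict]) -> tuple[list[dict], list[dict]]:
    """Separate directive models from all other kinds, preserving order."""
    directives = [m for m in models if m.get("kind") == "directive"]
    others = [m for m in models if m.get("kind") != "directive"]
    return directives, others


def render_context(models: list[dict]) -> str:
    """Put directives first under an explicit label so they read as overrides."""
    directives, others = split_by_kind(models)
    lines = []
    if directives:
        lines.append("Rules that override other preferences:")
        lines += [f"- {m['content']}" for m in directives]
    if others:
        lines.append("Background:")
        lines += [f"- {m['content']}" for m in others]
    return "\n".join(lines)


models = [
    {"kind": "preference", "content": "Prefers async stand-ups"},
    {"kind": "directive", "content": "Always confirm before sending money over $100"},
]
context = render_context(models)
```

The point of the sketch is the ordering and labeling: a hard rule is surfaced as a rule, not blended in with general and preference models as one input among many.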

| Tool | Purpose |
| --- | --- |
| memory_list_mental_models | List all mental models in a bank, optionally by scope |
| memory_create_mental_model | Author a new mental model (raw markdown OR sections) |
| memory_update_mental_model | Apply delta operations to an existing model |
| memory_delete_mental_model | Soft-delete a model |
| memory_create_directive | Author a user-curated hard rule |
| memory_list_observations | List observations, filter by trend / scope |
| memory_get_observation | Fetch one observation by id |
| memory_create_observation | Hand-author an observation (bypass consolidator) |
| memory_delete_observation | Delete one observation |

Plus the existing tools from earlier cycles (memory_retain, memory_recall, memory_reflect, memory_forget, memory_history, memory_audit, memory_compile, memory_banks, memory_health).

  • docs/_plugins/recall-vs-reflect.md — the consumption contract for memory_recall vs memory_reflect (M20).
  • docs/_plugins/mental-models.md — the original M9 architecture + storage model that M21 extends with structured docs.
  • docs/_design/m21-observation-evolution.md — the M21 cycle doc with full scope, rationale, and ship gate.
  • astrocyte/pipeline/structured_doc.py — the schema + renderer + parser.
  • astrocyte/pipeline/delta_ops.py — the operation types + apply_operations.
  • astrocyte/pipeline/trend.py — the Trend enum + compute_trend.