Multi-bank orchestration
Agents don’t have one memory - they have several. A support agent has personal context about the current user, shared team knowledge, and org-wide policies. Astrocyte orchestrates across these banks because it sits above the provider layer.
For the core API and provider architecture, see architecture-framework.md.
1. The problem
A single bank per agent is limiting:
- An agent serving multiple users needs per-user banks plus shared knowledge.
- A team of agents needs individual memory plus a shared team bank.
- Enterprise agents need personal + team + org-wide policy banks.
Without multi-bank orchestration, callers must query each bank individually, merge results, and handle per-bank policies themselves. That logic belongs in the framework.
The same patterns support human–agent collaboration: an agent’s bank_groups often include a per-user bank plus team and org banks. That is only safe when access control grants the agent principal read (and optionally write) on those banks. See access-control.md §1.4 and sandbox-awareness-and-exfiltration.md for how to wire identity so that sharing stays intentional rather than bleeding across environments.
2. Bank topology
Astrocyte supports three bank relationship patterns (2.1 to 2.3). Section 2.4 covers how a single bank is shared between principals.
2.1 Isolated banks
Each bank is independent. No cross-bank queries. Default behavior.

    flowchart TB
      UA[User A bank]
      UB[User B bank]
      SB[Shared bank]
      AA[Agent A]
      AB[Agent B]
      AA --> UA
      AA --> UB
      AB --> SB

2.2 Layered banks (cascade)
Banks are ordered by specificity. Query the most specific first; widen if results are thin.

    flowchart LR
      P[Personal - most specific] --> T[Team] --> O[Org - least specific]

2.3 Parallel banks (fan-out)
Query all banks simultaneously, then fuse results across banks.

    flowchart LR
      P[Personal] --> F[Fuse]
      T[Team] --> F
      O[Org] --> F
      F --> R[Results]

2.4 Shared banks (multi-principal, same data)
Sharing does not mean two principals in one API call. Each request has one AstrocyteContext.principal. Shared memory means the same bank_id appears in two principals’ effective grants. For example, user:calvin and agent:support-bot-1 both have read (and possibly write) on user-calvin.
Implications:
- Recall: When the user runs a recall against user-calvin and when the agent runs a recall against user-calvin (with its own principal), both see consistent content subject to their permission bits (e.g. agent without forget).
- Multi-bank recall: Strategies like cascade / parallel only include banks the current principal may read. A user might have admin on personal + team + org; an agent might be limited to personal read + team read. Orchestration config must match those grant shapes, or calls will fail with AccessDenied mid-query.
- Retain: Writes target one bank_id; provenance and audit should record which principal retained (see access-control.md §6 and memory-lifecycle.md).
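The grant check described above can be sketched as a pre-flight filter. This is a minimal illustration, assuming grants are available as a mapping from principal to per-bank permission sets; the `GRANTS` table and the `readable_banks` helper are hypothetical and not part of the Astrocyte API.

```python
# Hypothetical grant table: principal -> bank_id -> permissions.
GRANTS: dict[str, dict[str, set[str]]] = {
    "user:calvin": {
        "user-calvin": {"read", "write", "forget", "admin"},
        "team-support": {"read"},
        "org-policies": {"read"},
    },
    "agent:support-bot-1": {
        "user-calvin": {"read", "write"},
        "team-support": {"read"},
    },
}


def readable_banks(principal: str, requested: list[str]) -> tuple[list[str], list[str]]:
    """Split the requested banks into (readable, denied) for one principal."""
    grants = GRANTS.get(principal, {})
    readable = [b for b in requested if "read" in grants.get(b, set())]
    denied = [b for b in requested if b not in readable]
    return readable, denied


requested = ["user-calvin", "team-support", "org-policies"]
agent_ok, agent_denied = readable_banks("agent:support-bot-1", requested)
user_ok, user_denied = readable_banks("user:calvin", requested)
```

Running a check like this when a bank group is configured, rather than at query time, is one way to surface a mismatch between orchestration config and grants before it shows up as a denied call.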
3. API surface
3.1 Single-bank (existing, unchanged)

    hits = await brain.recall("What does Calvin prefer?", bank_id="user-calvin")

3.2 Multi-bank recall

    hits = await brain.recall(
        "What does Calvin prefer?",
        banks=["user-calvin", "team-support", "org-policies"],
        strategy="cascade",
    )

3.3 Strategy options
Implemented in astrocyte-py: Astrocyte.recall(..., banks=[...], strategy=...) accepts a string ("parallel" | "cascade" | "first_match") or a MultiBankStrategy instance. Omitting strategy with multiple banks keeps parallel merge (backward compatible). Cross-bank deduplication keeps the highest-scoring hit per distinct text.

    from dataclasses import dataclass
    from typing import Literal


    @dataclass
    class MultiBankStrategy:
        mode: Literal["cascade", "parallel", "first_match"] = "parallel"

        # Cascade-specific
        min_results_to_stop: int = 3  # Stop widening when we have enough
        cascade_order: list[str] | None = None  # Explicit order (default: banks= list order)

        # Parallel-specific
        bank_weights: dict[str, float] | None = None  # Weight results by bank
        dedup_across_banks: bool = True

| Strategy | Behavior | Use case |
|---|---|---|
| cascade | Query banks in order; stop when min_results_to_stop reached | Personal → team → org (most specific first) |
| parallel | Query all banks concurrently; fuse results with optional weights | Agent needs breadth across all knowledge |
| first_match | Query banks in order; return first bank’s results if non-empty | Fallback pattern (try primary, fall back to secondary) |
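The difference between cascade and first_match in the table above can be sketched with in-memory stand-ins. This is an illustration only: `FAKE_BANKS`, `recall_one`, `cascade`, and `first_match` are hypothetical helpers, not the astrocyte-py implementation, and the stub ignores the query text.

```python
import asyncio

# Hypothetical in-memory banks: bank_id -> list of (text, score) hits.
FAKE_BANKS: dict[str, list[tuple[str, float]]] = {
    "user-calvin": [("Calvin prefers dark mode", 0.9)],
    "team-support": [("Offer dark mode when asked", 0.7), ("Escalate billing issues", 0.4)],
    "org-policies": [("Never share account data", 0.6)],
}


async def recall_one(bank_id: str, query: str) -> list[tuple[str, float]]:
    """Stand-in for a single-bank recall; returns canned hits."""
    await asyncio.sleep(0)
    return list(FAKE_BANKS.get(bank_id, []))


async def cascade(query: str, banks: list[str], min_results_to_stop: int = 3):
    """Query banks in order and stop widening once enough hits are collected."""
    hits: list[tuple[str, float]] = []
    for bank_id in banks:
        hits.extend(await recall_one(bank_id, query))
        if len(hits) >= min_results_to_stop:
            break
    return hits


async def first_match(query: str, banks: list[str]):
    """Return the hits of the first bank that has any; never mixes banks."""
    for bank_id in banks:
        hits = await recall_one(bank_id, query)
        if hits:
            return hits
    return []


banks = ["user-calvin", "team-support", "org-policies"]
cascade_hits = asyncio.run(cascade("What does Calvin prefer?", banks))
first_hits = asyncio.run(first_match("What does Calvin prefer?", banks))
```

With these canned banks, cascade stops after team-support because three hits are enough, so org-policies is never queried, while first_match returns only the personal bank’s single hit.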
3.4 Multi-bank retain
Retain targets a single bank (you always know where to store). But metadata can reference other banks:

    await brain.retain(
        "Calvin prefers dark mode",
        bank_id="user-calvin",
        metadata={"also_relevant_to": ["team-support"]},
    )

3.5 Multi-bank reflect
Multi-bank reflect is not yet on the Astrocyte API; call recall with the desired banks / strategy, then synthesize out-of-band, or run reflect on a single bank_id. A first-class reflect(..., banks=...) can follow the same strategy machinery as recall.
4. Cross-bank fusion
When using the parallel strategy, results from different banks are fused:
- Per-bank recall: run recall against each bank concurrently.
- Deduplicate: remove identical memories that appear in multiple banks (by content hash).
- Weight: apply per-bank weights (e.g., personal bank gets 2x weight over org bank).
- Re-score: normalize scores across banks (different banks may use different scoring scales).
- RRF fusion: merge weighted, normalized results using reciprocal rank fusion.
- Token budget: truncate to fit the overall token budget.
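The fusion steps above can be sketched as weighted reciprocal rank fusion. This is a minimal sketch under stated assumptions, not the framework’s implementation: the `fuse` function, the constant `k = 60`, and the word-count token estimate are all hypothetical. Because RRF works on ranks, the sketch ranks hits within each bank instead of normalizing raw scores, and it sums the contributions of identical text found in several banks (standard RRF); keeping only the best contribution would use `max` instead.

```python
import hashlib
from collections import defaultdict


def fuse(per_bank, bank_weights=None, k=60, token_budget=None):
    """Fuse per-bank hits. per_bank maps bank_id -> list of (text, score)."""
    bank_weights = bank_weights or {}
    fused = defaultdict(float)  # content hash -> fused score
    texts = {}                  # content hash -> original text
    for bank_id, hits in per_bank.items():
        weight = bank_weights.get(bank_id, 1.0)
        # Rank within the bank so differing score scales do not matter.
        ranked = sorted(hits, key=lambda hit: hit[1], reverse=True)
        for rank, (text, _score) in enumerate(ranked, start=1):
            key = hashlib.sha256(text.encode("utf-8")).hexdigest()
            texts[key] = text  # identical content collapses to one entry
            fused[key] += weight / (k + rank)

    results, used = [], 0
    for key in sorted(fused, key=fused.get, reverse=True):
        cost = len(texts[key].split())  # crude token estimate (assumption)
        if token_budget is not None and used + cost > token_budget:
            break
        results.append((texts[key], fused[key]))
        used += cost
    return results


per_bank = {
    "user-calvin": [("Calvin prefers dark mode", 0.9)],
    "team-support": [("Calvin prefers dark mode", 12.0), ("Escalate billing issues", 7.5)],
    "org-policies": [("Never share account data", 0.6)],
}
weights = {"user-calvin": 2.0, "team-support": 1.5, "org-policies": 1.0}
fused_hits = fuse(per_bank, weights)
trimmed_hits = fuse(per_bank, weights, token_budget=7)
```

Note that team-support reports scores on a different scale (12.0, 7.5) than the other banks; ranking within each bank keeps that from distorting the merged order.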

    # Configuration
    multi_bank:
      default_strategy: cascade
      bank_groups:
        support_agent:
          banks: ["user-{user_id}", "team-support", "org-policies"]
          strategy: cascade
          cascade_order: ["user-{user_id}", "team-support", "org-policies"]
          bank_weights:
            "user-{user_id}": 2.0
            "team-support": 1.5
            "org-policies": 1.0

5. Per-bank policy enforcement
Each bank can have its own policy overrides (see use-case-profiles.md). Multi-bank orchestration respects per-bank policies:
- Rate limits: applied per-bank, not aggregated. A cascade across 3 banks counts as 3 rate-limited operations.
- PII barriers: applied to the query before it reaches any bank. Applied once, not per-bank.
- Token budgets: the overall budget is split across banks (proportional to weight, or configurable).
- Access control: the caller must have read access to every bank in the query (see access-control.md).
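The weight-proportional token budget split mentioned above could look like the following. This is a sketch, assuming integer token budgets and that tokens lost to rounding go to the heaviest bank; `split_token_budget` is a hypothetical helper, not an Astrocyte function.

```python
def split_token_budget(total: int, bank_weights: dict[str, float]) -> dict[str, int]:
    """Split a total token budget across banks in proportion to their weights."""
    weight_sum = sum(bank_weights.values())
    if weight_sum <= 0:
        raise ValueError("bank weights must sum to a positive number")
    shares = {bank: int(total * w / weight_sum) for bank, w in bank_weights.items()}
    # Rounding down can leave a few tokens unassigned; give them to the heaviest bank.
    leftover = total - sum(shares.values())
    heaviest = max(bank_weights, key=bank_weights.get)
    shares[heaviest] += leftover
    return shares


weights = {"user-calvin": 2.0, "team-support": 1.5, "org-policies": 1.0}
budgets = split_token_budget(900, weights)
uneven = split_token_budget(1000, weights)
```

Here a 900-token budget splits into 400, 300, and 200 tokens, matching the 2.0 : 1.5 : 1.0 weights from the configuration example.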
6. Bank templates
For applications that create banks dynamically (one per user), templates define the default configuration:

    bank_templates:
      user:
        name_pattern: "user-{user_id}"
        profile: personal
        auto_create: true
        access:
          - principal: "user:{user_id}"
            permissions: [read, write, forget, admin]
          - principal: "agent:{agent_id}"
            permissions: [read, write]
      team:
        name_pattern: "team-{team_id}"
        profile: support
        auto_create: false

When brain.recall(bank_id="user-123") is called and the bank doesn’t exist, the matching template creates it with the specified profile and access settings.
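How a bank_id might be matched against a template’s name_pattern can be sketched as follows. This is an assumption about the matching rule, not the documented behavior: `BANK_TEMPLATES`, `pattern_to_regex`, and `match_template` are hypothetical, and the sketch simply treats each `{placeholder}` as a named capture group.

```python
import re

# Hypothetical in-code mirror of the bank_templates config above.
BANK_TEMPLATES = {
    "user": {"name_pattern": "user-{user_id}", "profile": "personal", "auto_create": True},
    "team": {"name_pattern": "team-{team_id}", "profile": "support", "auto_create": False},
}


def pattern_to_regex(name_pattern: str) -> re.Pattern:
    """Turn 'user-{user_id}' into a regex with one named group per placeholder."""
    # re.split with a capture group alternates literal text and placeholder names.
    parts = re.split(r"\{(\w+)\}", name_pattern)
    regex = "".join(
        re.escape(part) if i % 2 == 0 else f"(?P<{part}>.+)"
        for i, part in enumerate(parts)
    )
    return re.compile(regex)


def match_template(bank_id: str):
    """Return (template_name, template, variables) for the first match, else None."""
    for name, template in BANK_TEMPLATES.items():
        m = pattern_to_regex(template["name_pattern"]).fullmatch(bank_id)
        if m:
            return name, template, m.groupdict()
    return None


user_match = match_template("user-123")
team_match = match_template("team-support")
no_match = match_template("org-policies")
```

In this sketch only the user template has auto_create enabled, so a caller would create user-123 on first use but leave a missing team bank as an error.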
7. Principle traceability
| Feature | Principle |
|---|---|
| Multiple bank types (personal, team, org) | P4: Heterogeneity - specialized subtypes |
| Cascade strategy (widen when thin) | P5: Metabolic coupling - adapt retrieval depth to supply |
| Per-bank policy enforcement | P3: Homeostasis - per-region regulation |
| Cross-bank fusion | P2: Tripartite synapse - mediate the exchange |
| Shared banks (users + agents) | P6: Barrier maintenance - explicit who may cross which bank |
8. Hybrid Tier-2 engine + Tier-1 pipeline (same bank_id)
When both a hosted engine and a local pipeline (vector / graph / document path) should answer for the same logical bank, use HybridEngineProvider (astrocyte-py). It implements EngineProvider: recall fans out to both backends, applies optional per-source weights, dedupes by text (highest score wins), then ranks and applies the request token budget. retain targets exactly one backend via retain_target="engine" or "pipeline".

    from astrocyte import Astrocyte, HybridEngineProvider
    from astrocyte.pipeline.orchestrator import PipelineOrchestrator

    engine = ...  # Tier-2 EngineProvider
    pipeline = PipelineOrchestrator(vector_store=..., llm_provider=...)
    hybrid = HybridEngineProvider(
        engine=engine,
        pipeline=pipeline,
        retain_target="engine",
        engine_recall_weight=1.0,
        pipeline_recall_weight=1.0,
    )
    brain = Astrocyte.from_config("astrocyte.yaml")
    brain.set_engine_provider(hybrid)

MemoryHit.source is set to tier2_engine or tier1_pipeline when not already present, for observability.