Complete reference for astrocyte.yaml — every key, type, default, and valid value.
Astrocyte loads configuration with this merge order (last wins):
1. Compliance profile (compliance_profile: pdpa): sets barriers, lifecycle, access control, DLP
2. Profile (profile: personal): sets defaults, homeostasis, signal quality
3. Your astrocyte.yaml: overrides everything above
All string values support ${ENV_VAR} substitution — unresolved vars are left as-is.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `provider_tier` | `"storage" \| "engine"` | `"engine"` | Tier 1 (storage) uses Astrocyte’s built-in pipeline with your own backends. Tier 2 (engine) delegates to a full memory engine (Mystique, Mem0, Zep, etc.) |
| `profile` | `string \| null` | `null` | Built-in profile name (minimal, personal, research, coding, support) or path to custom profile file (./my-profile.yaml) |
| `compliance_profile` | `string \| null` | `null` | Pre-built compliance preset: gdpr, hipaa, or pdpa. Sets barriers, lifecycle, access control, and DLP automatically |
| `fallback_strategy` | `"error" \| "local_llm" \| "degrade"` | `"error"` | How to handle provider failures |

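A minimal sketch of the top-level keys, using only values listed in the table above. The particular combination is illustrative rather than a recommended setup.

```yaml
# Illustrative combination of the top-level keys.
provider_tier: storage        # Tier 1: built-in pipeline with your own backends
profile: personal             # built-in profile name
compliance_profile: pdpa      # one of gdpr, hipaa, pdpa
fallback_strategy: degrade    # one of error, local_llm, degrade
```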
Used when provider_tier: storage.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `vector_store` | `string \| null` | `null` | Vector store provider: in_memory, pgvector, qdrant, etc. |
| `vector_store_config` | `dict \| null` | `null` | Provider-specific settings (connection URL, dimensions, etc.) |
| `graph_store` | `string \| null` | `null` | Graph store provider: neo4j, etc. |
| `graph_store_config` | `dict \| null` | `null` | Provider-specific settings |
| `document_store` | `string \| null` | `null` | Document store provider: elasticsearch, etc. |
| `document_store_config` | `dict \| null` | `null` | Provider-specific settings |

```yaml
# Example: pgvector + Neo4j hybrid
vector_store: pgvector
vector_store_config:
  connection_url: ${DATABASE_URL}
  embedding_dimensions: 1536
graph_store: neo4j
graph_store_config:
  uri: bolt://localhost:7687
  auth_password: ${NEO4J_PASSWORD}
```
Used when provider_tier: engine.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `provider` | `string \| null` | `null` | Engine provider name: mystique, mem0, zep, etc. |
| `provider_config` | `dict \| null` | `null` | Engine-specific settings (endpoint, API key, etc.) |

```yaml
provider: mystique
provider_config:
  endpoint: ${MYSTIQUE_ENDPOINT}
  api_key: ${MYSTIQUE_API_KEY}
```

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `llm_provider` | `string \| null` | `null` | LLM provider for reflect and extraction: openai, anthropic, litellm, mock, etc. |
| `llm_provider_config` | `dict \| null` | `null` | API key, model name, endpoint, etc. |
| `embedding_provider` | `string \| null` | `null` | Separate embedding provider (if different from LLM) |
| `embedding_provider_config` | `dict \| null` | `null` | Embedding-specific settings (dimensions, model, etc.) |

```yaml
llm_provider: openai
llm_provider_config:
  api_key: ${OPENAI_API_KEY}
embedding_provider: openai
embedding_provider_config:
  model: text-embedding-3-small
```
Rate limits, quotas, and token budgets.
```yaml
retain_max_content_bytes: 51200
rate_limits:
  global_per_minute: null  # optional global cap
quotas:
  retain_per_day: null  # optional daily cap
```

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `recall_max_tokens` | `int \| null` | `null` | Max tokens per recall operation |
| `reflect_max_tokens` | `int \| null` | `null` | Max tokens per reflect operation |
| `retain_max_content_bytes` | `int \| null` | `null` | Max bytes per retain operation |
| `rate_limits.retain_per_minute` | `int \| null` | `null` | Max retain operations per minute |
| `rate_limits.recall_per_minute` | `int \| null` | `null` | Max recall operations per minute |
| `rate_limits.reflect_per_minute` | `int \| null` | `null` | Max reflect operations per minute |
| `rate_limits.global_per_minute` | `int \| null` | `null` | Global rate limit across all operations |
| `quotas.retain_per_day` | `int \| null` | `null` | Max retains per 24 hours |
| `quotas.reflect_per_day` | `int \| null` | `null` | Max reflects per 24 hours |

Safety controls applied on every retain operation.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `mode` | `string` | `"regex"` | Detection mode: regex, ner, llm, rules_then_llm, disabled |
| `action` | `string` | `"redact"` | What to do when PII is found: redact, reject, warn |
| `countries` | `list[string] \| null` | `null` | Country-specific patterns: SG (NRIC), IN (Aadhaar), GB (NINO), US (SSN), AU (TFN), CA, JP, CN, DE, FR, IT, ES |
| `patterns` | `list[dict] \| null` | `null` | Custom regex patterns: `[{type: "custom_id", pattern: "\\d{8}"}]` |
| `type_overrides` | `dict \| null` | `null` | Override action per PII type: `{credit_card: {action: reject}}` |

```yaml
countries: [SG, IN, GB, US]
type_overrides:
  credit_card: {action: reject}
```

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `max_content_length` | `int` | `50000` | Reject content over this many bytes |
| `reject_empty_content` | `bool` | `true` | Reject empty or whitespace-only content |
| `reject_binary_content` | `bool` | `true` | Reject non-text content |
| `allowed_content_types` | `list[string] \| null` | `null` | Optional content type whitelist |


| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `blocked_keys` | `list[string]` | `["api_key", "password", "token", "secret"]` | Keys to scrub from metadata |
| `max_metadata_size_bytes` | `int` | `4096` | Max metadata size in bytes |

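A sketch of tightening metadata scrubbing with the two keys above. The reference does not say whether a custom `blocked_keys` list replaces or extends the default, so this example repeats the defaults and adds one hypothetical key (`authorization`). The fragment shows only this block's keys; nest it under whatever section key your configuration uses.

```yaml
# Defaults repeated on the assumption that a custom list replaces them.
blocked_keys: [api_key, password, token, secret, authorization]
max_metadata_size_bytes: 2048   # half of the 4096 default
```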
Deduplication and noise detection.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `enabled` | `bool` | `true` | Enable duplicate detection |
| `similarity_threshold` | `float` | `0.95` | Cosine similarity threshold for duplicates (0–1) |
| `action` | `string` | `"skip"` | What to do with duplicates: skip, warn, update |

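As an illustration of the duplicate-detection keys, the fragment below loosens the threshold and switches the action. It assumes content counts as a duplicate when its cosine similarity reaches the threshold, so a lower value should match more near-duplicates. Only this block's keys are shown, without the enclosing section key.

```yaml
enabled: true
similarity_threshold: 0.90   # default is 0.95
action: update               # one of skip, warn, update
```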

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `enabled` | `bool` | `true` | Enable noisy bank detection |
| `retain_spike_multiplier` | `float` | `5.0` | Retain rate spike threshold |
| `min_avg_content_length` | `int` | `20` | Minimum average chunk length (chars) |
| `max_dedup_rate` | `float` | `0.8` | Max dedup rate before flagging |
| `action` | `string` | `"warn"` | Action on noisy bank: warn, throttle, reject |

Circuit breaker and degraded mode.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `degraded_mode` | `string` | `"empty_recall"` | Fallback on failure: empty_recall, error, cache |
| `circuit_breaker.failure_threshold` | `int` | `5` | Failures before circuit opens |
| `circuit_breaker.recovery_timeout_seconds` | `float` | `30.0` | Seconds before half-open attempt |
| `circuit_breaker.half_open_max_calls` | `int` | `2` | Max calls in half-open state |

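A possible stricter circuit-breaker setup, built only from the keys and valid values in the table above. The numbers are illustrative, and the enclosing section key is left out.

```yaml
degraded_mode: cache              # one of empty_recall, error, cache
circuit_breaker:
  failure_threshold: 3            # open the circuit after 3 failures (default 5)
  recovery_timeout_seconds: 60.0  # wait 60 s before a half-open attempt (default 30.0)
  half_open_max_calls: 1          # allow one call while half-open (default 2)
```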
Similarity-based caching for repeated queries.
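
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `enabled` | `bool` | `false` | Enable recall cache |
| `similarity_threshold` | `float` | `0.95` | Cache hit threshold |
| `max_entries` | `int` | `256` | Max cached results |
| `ttl_seconds` | `float` | `300.0` | Cache entry lifetime (seconds) |
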
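The cache is disabled by default, so a minimal sketch of turning it on might look like the following. The values are illustrative, and the fragment omits the enclosing section key.

```yaml
enabled: true
similarity_threshold: 0.97   # stricter than the 0.95 default
max_entries: 1024            # default 256
ttl_seconds: 120.0           # entries expire after two minutes (default 300.0)
```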
Progressive retrieval strategy — tries cheaper/faster tiers first.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `enabled` | `bool` | `false` | Enable tiered retrieval |
| `min_results` | `int` | `3` | Min results per tier before advancing |
| `min_score` | `float` | `0.3` | Min relevance score threshold |
| `max_tier` | `int` | `3` | Maximum tier (0–4) |
| `full_recall` | `"pipeline" \| "hybrid"` | `"pipeline"` | Recall path for tier 3+. hybrid requires a hybrid engine provider |

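A sketch of enabling tiered retrieval with the keys above. The values are illustrative; `full_recall` is kept at `pipeline` because `hybrid` requires a hybrid engine provider. The enclosing section key is omitted.

```yaml
enabled: true
min_results: 5          # default 3
min_score: 0.4          # default 0.3
max_tier: 4             # valid range 0-4, default 3
full_recall: pipeline   # applies to tier 3 and above
```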
Structured truth precedence — labels fused hits for synthesis.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `enabled` | `bool` | `false` | Enable recall authority |
| `rules_inline` | `string \| null` | `null` | Inline authority rules |
| `rules_path` | `string \| null` | `null` | Path to authority rules file |
| `tiers` | `list` | `[]` | Precedence tiers: `[{id: "primary", priority: 1, label: "Verified"}]` |
| `tier_by_bank` | `dict` | `{}` | Map bank IDs to tier IDs: `{bank-1: "primary"}` |
| `apply_to_reflect` | `bool` | `true` | Inject authority context into reflect prompts |

```yaml
tiers:
  - label: "Verified sources"
  - label: "Inferred knowledge"
```
LLM-scored selective retention — only stores content above an importance threshold.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `enabled` | `bool` | `false` | Enable curated retain |
| `model` | `string \| null` | `null` | LLM model for importance scoring |
| `context_recall_limit` | `int` | `5` | Max context items for scoring |

Multi-factor ranking — blends recency, reliability, salience, and similarity.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `enabled` | `bool` | `false` | Enable curated recall |
| `freshness_weight` | `float` | `0.3` | Recency bonus weight |
| `reliability_weight` | `float` | `0.2` | Authority/source weight |
| `salience_weight` | `float` | `0.2` | Relevance/importance weight |
| `original_score_weight` | `float` | `0.3` | Vector similarity weight |
| `freshness_half_life_days` | `float` | `30.0` | Decay curve for recency |
| `min_score` | `float \| null` | `null` | Min final score threshold |

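The default weights sum to 1.0 (0.3 + 0.2 + 0.2 + 0.3). The reference does not say whether the weights are normalized automatically, so this recency-heavy sketch keeps the same total as a precaution. All values are illustrative, and the enclosing section key is omitted.

```yaml
enabled: true
freshness_weight: 0.5           # favor recent memories
reliability_weight: 0.1
salience_weight: 0.1
original_score_weight: 0.3      # weights total 1.0, like the defaults
freshness_half_life_days: 7.0   # shorter half-life than the 30.0 default
min_score: 0.4                  # drop results whose final score is below 0.4
```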

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `enabled` | `bool` | `false` | Enable ACL enforcement |
| `default_policy` | `string` | `"owner_only"` | Default when no grants match: owner_only, open, deny |

Identity-driven bank resolution.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `auto_resolve_banks` | `bool` | `false` | Auto-create banks from principal |
| `user_bank_prefix` | `string` | `"user-"` | Prefix for user-scoped banks |
| `agent_bank_prefix` | `string` | `"agent-"` | Prefix for agent-scoped banks |
| `service_bank_prefix` | `string` | `"service-"` | Prefix for service-scoped banks |
| `resolver` | `"convention" \| "config" \| "custom" \| null` | `null` | Bank resolution strategy |
| `obo_enabled` | `bool` | `false` | Enable on-behalf-of permission intersection |

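A sketch of convention-based bank resolution using the keys above. Under this setup a principal such as `user:alice` would presumably resolve to a bank named `user-alice`; that mapping is an assumption inferred from the prefix keys, not something the table states. The enclosing section key is omitted.

```yaml
auto_resolve_banks: true
resolver: convention            # one of convention, config, custom
user_bank_prefix: "user-"       # defaults shown for all three prefixes
agent_bank_prefix: "agent-"
service_bank_prefix: "service-"
obo_enabled: false
```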
Top-level access grants (merged with per-bank banks.*.access).
```yaml
- principal: "agent:support-bot"
  permissions: [read, write]
- permissions: [read, write, forget, admin]
```

| Field | Type | Description |
| --- | --- | --- |
| `bank_id` | `string` | Bank ID or glob pattern (`*` for all) |
| `principal` | `string` | Principal: `user:X`, `agent:X`, `service:X`, or `*` |
| `permissions` | `list[string]` | Permissions: read, write, forget, admin, `*` |

Data Loss Prevention — output scanning for PII in recall and reflect results.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `scan_recall_output` | `bool` | `false` | Scan recall results for PII |
| `scan_reflect_output` | `bool` | `false` | Scan reflect output for PII |
| `output_pii_action` | `string` | `"warn"` | Action on detected PII: redact, reject, warn |

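Both scans are off by default. A sketch of enabling them with redaction, using only the keys and values from the table above (enclosing section key omitted):

```yaml
scan_recall_output: true
scan_reflect_output: true
output_pii_action: redact   # one of redact, reject, warn
```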
Automatic memory archival and deletion based on age and activity.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `enabled` | `bool` | `false` | Enable lifecycle management |
| `ttl.archive_after_days` | `int` | `90` | Archive if not recalled in N days |
| `ttl.delete_after_days` | `int` | `365` | Delete if older than N days |
| `ttl.exempt_tags` | `list[string] \| null` | `null` | Tags that skip TTL (e.g. pinned, compliance) |
| `ttl.fact_type_overrides` | `dict \| null` | `null` | Override archive_after_days by fact type: `{world: 180, experience: null}` |

```yaml
ttl:
  exempt_tags: [pinned, compliance]
  fact_type_overrides:
    experience: null  # never auto-archive
```
Per-profile reasoning defaults — affect reflect synthesis behavior.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `skepticism` | `int` | `3` | How critical to source (1–5) |
| `literalism` | `int` | `3` | How literal vs. interpretive (1–5) |
| `empathy` | `int` | `3` | How empathetic in synthesis (1–5) |
| `preferred_fact_types` | `list[string] \| null` | `null` | Preference order: `[experience, world, observation]` |
| `tags` | `list[string] \| null` | `null` | Default tags for all retained content |


| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `otel_enabled` | `bool` | `false` | Enable OpenTelemetry spans |
| `prometheus_enabled` | `bool` | `false` | Enable Prometheus metrics |
| `log_level` | `string` | `"info"` | Log level: debug, info, warn, error |

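For reference, a sketch that turns on both telemetry outputs and raises log verbosity, using the keys from the table above (enclosing section key omitted):

```yaml
otel_enabled: true
prometheus_enabled: true
log_level: debug   # one of debug, info, warn, error
```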
MCP (Model Context Protocol) server settings — used by astrocyte.integrations.mcp.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `default_bank_id` | `string \| null` | `null` | Default bank for MCP calls |
| `expose_reflect` | `bool` | `true` | Allow reflect via MCP |
| `expose_forget` | `bool` | `false` | Allow forget via MCP |
| `max_results_limit` | `int` | `50` | Max items returned per request |
| `principal` | `string \| null` | `null` | Principal for MCP operations |

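A sketch of a conservative MCP setup. The bank and principal names are borrowed from other examples in this reference purely for illustration, and the enclosing section key is omitted.

```yaml
default_bank_id: shared-support   # illustrative bank name
expose_reflect: true
expose_forget: false              # keep forget unavailable over MCP (the default)
max_results_limit: 20             # default 50
principal: "agent:support-bot"    # illustrative principal
```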

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `mip_config_path` | `string \| null` | `null` | Path to MIP routing rules file (./mip.yaml) |

See Memory Intent Protocol for the full MIP DSL — match operators, actions, override hierarchy, and intent policy.
Per-bank overrides. Each key is a bank ID. Any top-level section can be overridden per bank.
```yaml
similarity_threshold: 0.90
access:
  - bank_id: sensitive-bank
    principal: "agent:analyst"
    permissions: [read, write]
```

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `profile` | `string \| null` | `null` | Override profile for this bank |
| `access` | `list[dict] \| null` | `null` | Bank-specific access grants |
| `homeostasis` | `HomeostasisConfig \| null` | `null` | Override homeostasis settings |
| `barriers` | `BarrierConfig \| null` | `null` | Override barrier settings |
| `signal_quality` | `SignalQualityConfig \| null` | `null` | Override signal quality settings |

External data source definitions for ingestion. See poll ingest guide for webhook, stream, and poll setup.
```yaml
extraction_profile: builtin_text
target_bank: webhook-data
secret: ${WEBHOOK_SECRET}
target_bank: github-issues
url: redis://localhost:6379
consumer_group: astrocyte-group
url: "https://api.example.com/search?q={query}"
```

| Key | Type | Description |
| --- | --- | --- |
| `type` | `string` | Source type: webhook, stream, poll / api_poll, proxy |
| `driver` | `string \| null` | Driver name: github, redis, kafka |
| `extraction_profile` | `string \| null` | Extraction profile name for ingested content |
| `target_bank` | `string \| null` | Destination bank ID |
| `target_bank_template` | `string \| null` | Template: `"bank-{source_id}"` |
| `auth` | `dict \| null` | Auth config (type-specific: hmac_sha256, bearer, etc.) |
| `path` | `string \| null` | Source-specific path (e.g. owner/repo for GitHub) |
| `url` | `string \| null` | Source URL (Redis URL, Kafka bootstrap servers, proxy endpoint) |
| `topic` | `string \| null` | Stream topic or Redis stream key |
| `consumer_group` | `string \| null` | Consumer group name |
| `interval_seconds` | `int \| null` | Poll interval (min 60 for GitHub) |
| `recall_method` | `string \| null` | Proxy only: GET (default) or POST |
| `recall_body` | `dict \| null` | Proxy POST only: request body template |

Registered agents with bank access and rate hints.
```yaml
principal: "agent:support-bot"
default_bank: shared-support
banks: [shared-support, team-*]
permissions: [read, write]
max_retain_per_minute: 60
max_recall_per_minute: 120
```

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `principal` | `string \| null` | `null` | Agent principal (e.g. agent:my-bot) |
| `banks` | `list[string] \| null` | `null` | Allowed bank IDs (glob patterns supported) |
| `allowed_banks` | `list[string] \| null` | `null` | Alias for banks |
| `default_bank` | `string \| null` | `null` | Default bank when not specified |
| `permissions` | `list[string] \| null` | `null` | Declared permissions (documentation/validation) |
| `max_retain_per_minute` | `int \| null` | `null` | Per-agent retain rate hint |
| `max_recall_per_minute` | `int \| null` | `null` | Per-agent recall rate hint |

Reusable extraction configurations for ingestion sources.
```yaml
chunking_strategy: dialogue
metadata_mapping:
  speaker: "$.participant_name"
tag_rules:
  - match: {source: slack}
```

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `content_type` | `string \| null` | `null` | Expected content type |
| `chunking_strategy` | `string \| null` | `null` | Strategy: sentence, paragraph, fixed, dialogue |
| `chunk_size` | `int \| null` | `null` | Max characters per chunk |
| `entity_extraction` | `bool \| string \| null` | `null` | Extract entities: true, false, ner, llm |
| `fact_type` | `string \| null` | `null` | Default fact type: world, experience, observation |
| `authority_tier` | `string \| null` | `null` | Recall authority tier ID (overrides recall_authority.tier_by_bank) |
| `metadata_mapping` | `dict \| null` | `null` | Map source fields to metadata keys |
| `tag_rules` | `list[dict] \| null` | `null` | Generate tags from metadata patterns |

Built-in profiles: builtin_text and builtin_conversation.
Standalone gateway settings — ignored in library mode.
```yaml
cors_origins:
  - "http://localhost:3000"
tls:
  cert_path: /path/to/cert.pem
  key_path: /path/to/key.pem
```

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `mode` | `"library" \| "standalone" \| "plugin"` | `"library"` | Deployment mode |
| `host` | `string \| null` | `null` | Bind address (standalone only) |
| `port` | `int \| null` | `null` | Port (standalone only) |
| `workers` | `int \| null` | `null` | Worker processes (standalone only) |
| `cors_origins` | `list[string] \| null` | `null` | CORS allowed origins |
| `tls.cert_path` | `string \| null` | `null` | TLS certificate path |
| `tls.key_path` | `string \| null` | `null` | TLS private key path |

Pre-built compliance presets that configure barriers, lifecycle, access control, and DLP. Set compliance_profile at the top level — your explicit config overrides any values the profile sets.

| Profile | PII mode | PII action | Lifecycle | Access default | DLP |
| --- | --- | --- | --- | --- | --- |
| `pdpa` | `rules_then_llm` | `redact` | 5-year retention | `owner_only` | Reflect output scanned |
| `gdpr` | `rules_then_llm` | `redact` | 2-year retention | `deny` | Reflect output scanned |
| `hipaa` | `rules_then_llm` | `reject` | 7-year retention | `deny` | Recall + reflect scanned |

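To illustrate the override rule, the sketch below selects the `hipaa` preset and then relaxes its access default. The section name `access_control` is an assumption made for this example; check your schema for the actual key that holds `default_policy`.

```yaml
compliance_profile: hipaa       # preset sets the access default to deny

# Explicit config wins over the preset.
# NOTE: `access_control` is an assumed section name, used for illustration only.
access_control:
  default_policy: owner_only    # one of owner_only, open, deny
```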
Any string value in astrocyte.yaml can reference environment variables with ${VAR_NAME}:
```yaml
connection_url: ${DATABASE_URL}
api_key: ${OPENAI_API_KEY}
secret: ${WEBHOOK_SECRET}
```
Unresolved variables (not set in the environment) are left as the literal string ${VAR_NAME}.
```yaml
connection_url: ${DATABASE_URL}
api_key: ${OPENAI_API_KEY}
mip_config_path: ./mip.yaml
default_policy: owner_only
```
Memory API reference — retain/recall/reflect/forget signatures and examples
Authentication setup — auth modes, OIDC providers, JWT claim mapping
Storage backend setup — pgvector, Qdrant, Neo4j, Elasticsearch install and config
Monitoring & observability — health endpoints, logging, tracing, metrics
Access control setup — grants, OBO, common patterns
Bank management — bank creation, multi-bank queries, lifecycle recipes
MIP developer guide — writing and testing MIP routing rules
Production-grade HTTP service — full production checklist