# Storage backend setup
How to install, configure, and run each Astrocyte storage adapter. All adapters are optional PyPI packages that register via Python entry points — install the one you need, set the config key, and Astrocyte picks it up automatically.
## Which backend do I need?

| Store type | Role | Adapters | When to use |
|---|---|---|---|
| Vector store | Semantic search (embeddings) | pgvector, qdrant | Always — required for Tier 1 recall |
| Graph store | Entity relationships and traversal | neo4j | When you need “who knows whom” or relationship-aware recall |
| Document store | BM25 full-text / keyword search | elasticsearch | When you need keyword recall alongside semantic |
You can combine stores for hybrid recall — e.g. pgvector (semantic) + Neo4j (graph) + Elasticsearch (keyword). Results are fused with reciprocal rank fusion (RRF).
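Reciprocal rank fusion scores each item 1/(k + rank) in every ranked list that returns it, then sums the scores, so items that appear near the top of several lists rise. A self-contained sketch of the idea (the constant k=60 is the common default from the RRF literature, not necessarily what Astrocyte uses):

```python
from collections import defaultdict

def rrf_fuse(ranked_lists: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked result lists with reciprocal rank fusion."""
    scores: dict[str, float] = defaultdict(float)
    for results in ranked_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["m7", "m2", "m9"]   # e.g. pgvector order
keyword  = ["m2", "m5", "m7"]   # e.g. Elasticsearch order
print(rrf_fuse([semantic, keyword]))  # m2 and m7 rise: each appears in both lists
```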
## In-memory (development only)

For quick prototyping, use the built-in in-memory store (no install required):

```yaml
provider_tier: storage
vector_store: in_memory
llm_provider: mock
```

No persistence — data is lost on restart.
## PostgreSQL + pgvector

Recommended default for production. Stores embeddings in PostgreSQL with the pgvector extension.
### Install

```shell
pip install astrocyte-pgvector
# or from source:
cd adapters-storage-py/astrocyte-pgvector && pip install -e .
```

### Run PostgreSQL

```shell
# Docker (quickest)
docker run -d --name astrocyte-pg \
  -e POSTGRES_USER=astrocyte \
  -e POSTGRES_PASSWORD=astrocyte \
  -e POSTGRES_DB=astrocyte \
  -p 5433:5432 \
  pgvector/pgvector:pg16
```

Or use the included Docker Compose stack:

```shell
cd astrocyte-services-py
cp .env.example .env
docker compose up -d postgres
```

### Configure

```yaml
provider_tier: storage
vector_store: pgvector
vector_store_config:
  dsn: ${DATABASE_URL}
  embedding_dimensions: 1536  # must match your embedding model
  bootstrap_schema: true      # auto-create tables (dev)
```

| Key | Type | Default | Description |
|---|---|---|---|
| `dsn` | string | required | PostgreSQL connection URI |
| `table_name` | string | `astrocyte_vectors` | Table name |
| `embedding_dimensions` | int | 128 | Vector width — must match your embedding model output |
| `bootstrap_schema` | bool | true | Auto-create extension, table, and indexes on first use |
Connection string format:

```
postgresql://user:password@host:port/database
postgresql://astrocyte:astrocyte@127.0.0.1:5433/astrocyte
postgresql://user:pass@host:5432/db?sslmode=require
```

### Production: run migrations
For production, disable `bootstrap_schema` and run migrations explicitly:

```shell
export DATABASE_URL='postgresql://astrocyte:astrocyte@127.0.0.1:5433/astrocyte'
cd adapters-storage-py/astrocyte-pgvector
./scripts/migrate.sh
```

```yaml
vector_store_config:
  dsn: ${DATABASE_URL}
  embedding_dimensions: 1536
  bootstrap_schema: false  # migrations already applied
```

Migrations are plain SQL in `migrations/`:
| File | What it does |
|---|---|
| `001_extension.sql` | Install pgvector extension |
| `002_astrocytes_vectors.sql` | Create vectors table |
| `003_indexes.sql` | B-tree on `bank_id`, HNSW on embeddings |
| `004_memory_layer.sql` | Add memory layer column |
Requires: `psql` client on PATH, PostgreSQL 15+.
### Production checklist

- Set `bootstrap_schema: false` and run `migrate.sh` before deploying
- Match `embedding_dimensions` to your embedding model (OpenAI `text-embedding-3-small` = 1536)
- Use `?sslmode=require` in DSN for remote databases
- Store DSN in env var or secrets manager, not in YAML
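A malformed DSN or a missing `sslmode` is cheaper to catch in CI than at startup. A small illustrative check using only the standard library (the rules enforced here are just the checklist above, not anything Astrocyte validates itself):

```python
from urllib.parse import urlparse, parse_qs

def check_dsn(dsn: str, require_ssl: bool = True) -> list[str]:
    """Return a list of problems found in a PostgreSQL DSN (empty = OK)."""
    problems = []
    parsed = urlparse(dsn)
    if parsed.scheme not in ("postgresql", "postgres"):
        problems.append(f"unexpected scheme: {parsed.scheme!r}")
    if not parsed.hostname:
        problems.append("missing host")
    if not (parsed.path and parsed.path != "/"):
        problems.append("missing database name")
    if require_ssl and parse_qs(parsed.query).get("sslmode") != ["require"]:
        problems.append("sslmode=require not set")
    return problems

print(check_dsn("postgresql://user:pass@db.example.com:5432/astrocyte?sslmode=require"))  # []
```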
## Qdrant

Cloud-native vector database with built-in collection management.

### Install

```shell
pip install astrocyte-qdrant
# or from source:
cd adapters-storage-py/astrocyte-qdrant && pip install -e .
```

### Run Qdrant

```shell
docker run -d --name astrocyte-qdrant \
  -p 6333:6333 \
  qdrant/qdrant:v1.17.0
```

### Configure

```yaml
provider_tier: storage
vector_store: qdrant
vector_store_config:
  url: http://localhost:6333
  collection_name: astrocyte_mem
  vector_size: 1536  # must match your embedding model
```

| Key | Type | Default | Description |
|---|---|---|---|
| `url` | string | required | Qdrant HTTP API URL |
| `collection_name` | string | required | Collection name |
| `vector_size` | int | required | Embedding dimension |
| `api_key` | string | null | API key for authentication |
| `timeout` | float | 30.0 | Request timeout in seconds |
Collections are created automatically on first use with cosine distance.
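Cosine distance ranks vectors by angle rather than magnitude, which is also why both sides of a comparison must share the same dimension (`vector_size`). The metric itself, in a short self-contained sketch:

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity: 0 = same direction, 1 = orthogonal, 2 = opposite."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # 1.0 (orthogonal)
```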
### Qdrant Cloud

```yaml
vector_store: qdrant
vector_store_config:
  url: https://your-cluster.cloud.qdrant.io
  collection_name: astrocyte_mem
  vector_size: 1536
  api_key: ${QDRANT_API_KEY}
```

### Production checklist

- Match `vector_size` to your embedding model
- Use `api_key` if Qdrant is network-accessible
- Monitor collection size and memory usage
## Neo4j

Graph database for entity relationships and neighborhood-aware recall.
### Install

```shell
pip install astrocyte-neo4j
# or from source:
cd adapters-storage-py/astrocyte-neo4j && pip install -e .
```

### Run Neo4j

```shell
docker run -d --name astrocyte-neo4j \
  -p 7687:7687 -p 7474:7474 \
  -e NEO4J_AUTH=neo4j/your-password \
  neo4j:5
```

Web browser: http://localhost:7474

### Configure

```yaml
graph_store: neo4j
graph_store_config:
  uri: bolt://localhost:7687
  user: neo4j
  password: ${NEO4J_PASSWORD}
  database: neo4j
```

| Key | Type | Default | Description |
|---|---|---|---|
| `uri` | string | required | Bolt URI (`bolt://host:port`) |
| `user` | string | required | Neo4j username |
| `password` | string | required | Neo4j password |
| `database` | string | `neo4j` | Database name |
### Graph model

Astrocyte stores entities and relationships isolated by bank:

- Nodes: `AstrocyteEntity` with properties `entity_id`, `bank`, `name`, `entity_type`, `aliases`
- Relationships: `ENTITY_LINK` with `link_type` and `metadata`
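Given that model, a bank-scoped entity upsert looks roughly like the Cypher below. This is illustrative only (Astrocyte's adapter generates its own queries); the node label, relationship type, and property names come from the model above, while the parameter names are assumptions:

```python
def entity_upsert_query() -> str:
    """Illustrative Cypher for upserting two bank-scoped entities and a link."""
    return """
    MERGE (a:AstrocyteEntity {entity_id: $src_id, bank: $bank})
      SET a.name = $src_name, a.entity_type = $src_type
    MERGE (b:AstrocyteEntity {entity_id: $dst_id, bank: $bank})
    MERGE (a)-[r:ENTITY_LINK {link_type: $link_type}]->(b)
      SET r.metadata = $metadata
    """

# With the official neo4j Python driver this would run as, e.g.:
#   session.run(entity_upsert_query(), src_id="e1", dst_id="e2", bank="default", ...)
print(entity_upsert_query())
```

Scoping every `MERGE` by `bank` is what keeps banks isolated from one another in a shared database.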
### Production checklist

- Use Neo4j 5+ for best compatibility
- Store credentials in env vars or secrets manager
- Monitor transaction throughput and heap usage
- Consider Neo4j Aura (managed) for production
## Elasticsearch

BM25 full-text search for keyword-based recall. Complements vector search for hybrid retrieval.

### Install

```shell
pip install astrocyte-elasticsearch
# or from source:
cd adapters-storage-py/astrocyte-elasticsearch && pip install -e .
```

### Run Elasticsearch

```shell
docker run -d --name astrocyte-es \
  -p 9200:9200 \
  -e discovery.type=single-node \
  -e xpack.security.enabled=false \
  -e ES_JAVA_OPTS="-Xms512m -Xmx512m" \
  docker.elastic.co/elasticsearch/elasticsearch:8.15.3
```

### Configure

```yaml
document_store: elasticsearch
document_store_config:
  url: http://localhost:9200
  index_prefix: astrocyte_docs
```

| Key | Type | Default | Description |
|---|---|---|---|
| `url` | string | required | Elasticsearch HTTP URL |
| `index_prefix` | string | `astrocyte_docs` | Index name prefix — one index per bank (`{prefix}_{bank_id}`) |
Indexes are created automatically on first use.
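BM25 scores a document by how often query terms occur in it, discounted by how common those terms are across the index and by document length. A toy scorer showing the shape of the ranking; the parameters k1=1.5 and b=0.75 are common defaults, and Elasticsearch's production implementation is Lucene's, not this:

```python
import math
from collections import Counter

def bm25_scores(query: list[str], docs: list[list[str]], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Toy BM25: score each tokenized doc against a tokenized query."""
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    df = Counter(term for d in docs for term in set(d))  # document frequency
    scores = []
    for d in docs:
        tf = Counter(d)  # term frequency within this doc
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] * (k1 + 1) / (tf[term] + k1 * (1 - b + b * len(d) / avg_len))
            score += idf * norm
        scores.append(score)
    return scores

docs = [["postgres", "vector", "search"],
        ["keyword", "search", "bm25"],
        ["graph", "store"]]
print(bm25_scores(["bm25", "search"], docs))  # the doc containing both terms scores highest
```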
### Elastic Cloud

```yaml
document_store: elasticsearch
document_store_config:
  url: https://user:password@your-cluster.es.cloud.elastic.co:9243
  index_prefix: astrocyte_docs
```

### Production checklist

- Use Elasticsearch 8.12+ with security enabled (`xpack.security`)
- Configure index lifecycle management (ILM) for old indices
- Size heap appropriately (50% of available RAM, max 32 GB)
- Monitor disk usage — especially for high-volume ingestion
## Hybrid recall (multiple backends)

Combine vector, graph, and document stores for richer retrieval. Results are fused with reciprocal rank fusion (RRF).

```yaml
provider_tier: storage

# Semantic search
vector_store: pgvector
vector_store_config:
  dsn: ${DATABASE_URL}
  embedding_dimensions: 1536
  bootstrap_schema: false

# Entity relationships
graph_store: neo4j
graph_store_config:
  uri: bolt://localhost:7687
  user: neo4j
  password: ${NEO4J_PASSWORD}

# Keyword / full-text search
document_store: elasticsearch
document_store_config:
  url: http://localhost:9200

# LLM for embedding + reflect
llm_provider: openai
llm_provider_config:
  api_key: ${OPENAI_API_KEY}
  model: gpt-4o-mini
```

Recall automatically queries all configured stores and fuses results. No additional configuration needed — just install the adapter packages and add the config sections.
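The `${VAR}` references in these configs are resolved from the environment. If you want to replicate that expansion in your own tooling, for example to validate a config in CI, the standard library's `os.path.expandvars` performs the same substitution; this is a generic sketch, not Astrocyte's own loader:

```python
import os

def expand_config_value(value: str) -> str:
    """Expand ${VAR} / $VAR references from the environment; unknown vars are left as-is."""
    return os.path.expandvars(value)

os.environ["DATABASE_URL"] = "postgresql://astrocyte:astrocyte@127.0.0.1:5433/astrocyte"
print(expand_config_value("${DATABASE_URL}"))
```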
## Comparison

| | pgvector | Qdrant | Neo4j | Elasticsearch |
|---|---|---|---|---|
| Store type | Vector | Vector | Graph | Document |
| Search | Semantic (HNSW) | Semantic (HNSW) | Neighborhood traversal | BM25 keyword |
| Managed options | Any managed Postgres | Qdrant Cloud | Neo4j Aura | Elastic Cloud |
| Best for | Default choice; existing Postgres | Dedicated vector DB; large scale | Relationship-heavy domains | Keyword recall alongside semantic |
| Persistence | SQL (full ACID) | On-disk snapshots | On-disk | Lucene segments |
| Python package | `astrocyte-pgvector` | `astrocyte-qdrant` | `astrocyte-neo4j` | `astrocyte-elasticsearch` |
| Config key | `vector_store: pgvector` | `vector_store: qdrant` | `graph_store: neo4j` | `document_store: elasticsearch` |
## Docker quick reference

Start all backends for local development:

```shell
# pgvector
docker run -d --name pg -p 5433:5432 \
  -e POSTGRES_USER=astrocyte -e POSTGRES_PASSWORD=astrocyte -e POSTGRES_DB=astrocyte \
  pgvector/pgvector:pg16

# Qdrant
docker run -d --name qdrant -p 6333:6333 qdrant/qdrant:v1.17.0

# Neo4j
docker run -d --name neo4j -p 7687:7687 -p 7474:7474 \
  -e NEO4J_AUTH=neo4j/testpass neo4j:5

# Elasticsearch
docker run -d --name es -p 9200:9200 \
  -e discovery.type=single-node -e xpack.security.enabled=false \
  -e ES_JAVA_OPTS="-Xms512m -Xmx512m" \
  docker.elastic.co/elasticsearch/elasticsearch:8.15.3
```

## Further reading
- Configuration reference — full `astrocyte.yaml` schema
- Memory API reference — retain/recall/reflect/forget signatures
- Bank management — bank creation, multi-bank queries, hybrid recall patterns
- Provider SPI — build your own storage adapter
- Storage adapter packages — architecture and entry point conventions
- Storage and data planes — retrieval vs export architecture