Stack Decisions

Every architectural choice in Sci was deliberate. Here's what we chose, what we rejected, and why.

Postgres + pgvector over Milvus / Qdrant

Chosen: Postgres + pgvector
Rejected: Milvus, Qdrant, Pinecone

A dedicated vector store adds an entire service dependency for a capability Postgres already has. pgvector colocates relational data (profiles, write queue, auth) with vector data. For the scale Sci operates at — millions of memories per user, not billions — pgvector's HNSW index is more than sufficient. Simpler ops, simpler backup, simpler sovereignty story.
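
To illustrate what colocating relational and vector data buys, here is a minimal sketch of an HNSW index and a single recall query that joins a profile table to vector search. The table and column names (memories, profiles, handle, embedding) are hypothetical, not Sci's actual schema; the hnsw index type, the vector_cosine_ops operator class, and the <=> cosine-distance operator are standard pgvector.

```typescript
// Serialize an embedding into pgvector's text input format, e.g. "[0.1,0.2,0.3]".
function toVectorLiteral(embedding: number[]): string {
  if (embedding.length === 0 || embedding.some((x) => !Number.isFinite(x))) {
    throw new Error("embedding must be a non-empty array of finite numbers");
  }
  return `[${embedding.join(",")}]`;
}

// Hypothetical schema: memories(id, profile_id, content, embedding vector(768)).
const createIndexSql = `
CREATE INDEX IF NOT EXISTS memories_embedding_hnsw
  ON memories USING hnsw (embedding vector_cosine_ops)`;

// One round trip: relational filter (profile) plus nearest-neighbour ordering.
// $1 = vector literal, $2 = profile handle, $3 = result limit.
const recallSql = `
SELECT m.id, m.content, m.embedding <=> $1::vector AS cosine_distance
FROM memories m
JOIN profiles p ON p.id = m.profile_id
WHERE p.handle = $2
ORDER BY m.embedding <=> $1::vector
LIMIT $3`;

// Parameters as they might be passed to a Postgres client's query call.
const recallParams = [toVectorLiteral([0.1, 0.2, 0.3]), "alice", 5];
```

With a dedicated vector store, the profile filter and the similarity search would live in two systems and have to be reconciled in application code.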

Custom TypeScript MCP server over OpenMemory

Chosen: @sci/mcp (custom)
Rejected: OpenMemory (Mem0), Letta/MemGPT

OpenMemory is tied to Mem0's infrastructure and Python ecosystem. Letta/MemGPT has strong opinions about agent architecture. Sci needs to serve any MCP-compatible agent without being opinionated about the agent's internals. Custom TypeScript gives full control over the MCP tool surface, auth tiers, and profile scoping.

SQLite + hnswlib for cloud backends over "Postgres everywhere"

Chosen: SQLite + hnswlib (cloud)
Rejected: Remote Postgres / managed database

Cloud storage backends (Dropbox, S3, iCloud) need to work with files — not running servers. SQLite is the right database for "a file that is a database." hnswlib is a fast HNSW implementation that works without a server process. Two files in your Dropbox is a more honest implementation of "your data is literally in your Dropbox" than a Postgres instance that syncs to Dropbox.

LLM Wiki v2 consolidation pattern over real-time summarization

Chosen: Nightly batch consolidation
Rejected: Real-time summarization on every write

Summarizing on every write is expensive and lossy. Storing raw episodic memories preserves the original signal. Nightly consolidation promotes high-value items to semantic nodes using decay scoring modeled on the Ebbinghaus forgetting curve, which describes how human memory retention falls off over time. The implementation is ~300 lines and is fully auditable.
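
As a rough illustration of decay scoring, the sketch below uses the common exponential form of the forgetting curve, R = exp(-t / S), where t is the time since last access and S is a stability term. The field names, the 7-day base stability, the access-count boost, and the 0.5 promotion threshold are all assumptions made for the example, not values taken from Sci's consolidation job.

```typescript
interface EpisodicMemory {
  id: string;
  importance: number; // hypothetical weight in the range 0..1
  lastAccessedAt: Date;
  accessCount: number;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Exponential forgetting curve: R = exp(-t / S).
function retention(elapsedDays: number, stabilityDays: number): number {
  return Math.exp(-Math.max(0, elapsedDays) / stabilityDays);
}

// Assumption: each access makes a memory decay more slowly.
function decayScore(memory: EpisodicMemory, now: Date, baseStabilityDays = 7): number {
  const elapsedDays = (now.getTime() - memory.lastAccessedAt.getTime()) / MS_PER_DAY;
  const stabilityDays = baseStabilityDays * (1 + memory.accessCount);
  return memory.importance * retention(elapsedDays, stabilityDays);
}

// A nightly batch would keep the items whose score clears the threshold.
function selectForPromotion(
  memories: EpisodicMemory[],
  now: Date,
  threshold = 0.5,
): EpisodicMemory[] {
  return memories.filter((m) => decayScore(m, now) >= threshold);
}

const batchTime = new Date("2024-01-08T00:00:00Z");
const candidates: EpisodicMemory[] = [
  { id: "recent", importance: 0.9, lastAccessedAt: new Date("2024-01-07T00:00:00Z"), accessCount: 3 },
  { id: "stale", importance: 0.9, lastAccessedAt: new Date("2023-12-01T00:00:00Z"), accessCount: 0 },
];
const promoted = selectForPromotion(candidates, batchTime);
```

Because the raw episodic memories are kept, a scoring rule like this can be changed and re-run later without losing anything.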

BGE-base-en-v1.5 (local) over Voyage AI as default

Chosen: FastEmbed + BGE-base-en-v1.5 (local, 768-dim)
Rejected: Voyage AI as default, OpenAI embeddings

Sovereignty by default. Every embedding query to a cloud provider is a data exposure event. Local embeddings mean raw memory content never leaves the machine for the embedding step. Voyage AI (already in use in Threadline) is available as a Pro tier upgrade for users who want higher quality and accept the tradeoff.

SHA-256 hash chain provenance over blockchain

Chosen: SHA-256 hash of token, stored only
Rejected: Blockchain, distributed ledger

Tamper-evident without overhead. Only the SHA-256 hash of a token is stored; the plaintext is shown once at issuance and never persisted. A developer can verify a token by recomputing its hash with standard cryptographic tools.
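
The store-only-the-hash pattern can be sketched with Node's built-in crypto module. The function names and the 32-byte token length are assumptions for illustration; only createHash, randomBytes, and timingSafeEqual are real Node APIs.

```typescript
import { createHash, randomBytes, timingSafeEqual } from "node:crypto";

// SHA-256 of the token, hex-encoded. Only this value would be persisted.
function hashToken(token: string): string {
  return createHash("sha256").update(token, "utf8").digest("hex");
}

// Issue a token: the plaintext is returned once to the caller, the hash is stored.
function issueToken(): { plaintext: string; storedHash: string } {
  const plaintext = randomBytes(32).toString("hex");
  return { plaintext, storedHash: hashToken(plaintext) };
}

// Verify by recomputing the hash and comparing in constant time.
function verifyToken(presented: string, storedHash: string): boolean {
  const actual = Buffer.from(hashToken(presented), "hex");
  const expected = Buffer.from(storedHash, "hex");
  return actual.length === expected.length && timingSafeEqual(actual, expected);
}

const issued = issueToken();
const accepted = verifyToken(issued.plaintext, issued.storedHash);
const rejected = verifyToken("not-the-token", issued.storedHash);
```

The same check can be reproduced outside the codebase with any SHA-256 tool, which is what makes the scheme easy to audit.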

Compromise.js NER over spaCy

Chosen: compromise.js (pure JavaScript)
Rejected: spaCy (Python)

spaCy is more accurate, but it requires a Python sidecar process, a model download, and a subprocess communication layer. Compromise.js runs in-process with no external dependencies. For the NER quality needed (PERSON, PLACE, ORG detection in English), compromise.js is sufficient and the operational simplicity is worth more than marginal accuracy gains. Custom entity loading from identity_facts covers the gap for user-specific names.

RRF (k=60) over learned re-ranking

Chosen: Reciprocal Rank Fusion (k=60)
Rejected: Learned re-rankers (cross-encoders, ColBERT)

RRF is deterministic, adds no inference cost, and has been a strong, widely used baseline for fusing ranked lists since it was introduced in 2009. Each item's fused score is the sum of 1/(k + rank) over the result lists it appears in. Learned re-rankers would require an additional model call per query. For a system that already runs two DB queries per recall, adding an inference step on top is a significant latency increase with marginal benefit at this scale.
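
A minimal sketch of RRF follows, assuming the two DB queries are a vector search and a keyword search that each return memory IDs in ranked order. The IDs and the function name are hypothetical; ranks are 1-based and each list is assumed to contain an ID at most once.

```typescript
// Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank(d)).
function reciprocalRankFusion(
  rankedLists: string[][],
  k = 60,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      const rank = index + 1; // ranks start at 1
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Sort by fused score, breaking ties by ID so the output stays deterministic.
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score || a.id.localeCompare(b.id));
}

// Hypothetical result lists from the two queries.
const vectorHits = ["m3", "m1", "m7"];
const keywordHits = ["m1", "m9", "m3"];
const fused = reciprocalRankFusion([vectorHits, keywordHits]);
// Fused order: m1, m3, m9, m7. Items found by both queries rise to the top.
```

Only ranks are used, so the two queries never need comparable score scales, and the whole step is a few map updates rather than a model call.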

MCP stdio transport over HTTP

Chosen: stdio (spawned by client)
Rejected: HTTP/SSE server

For local use, stdio is simpler — no port management, no authentication surface beyond the token, no server process to manage. The client spawns the server and owns its lifecycle. HTTP transport is available for Phase 6+ multi-client scenarios but isn't needed for the initial use case.
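
To show how little machinery stdio needs, here is a sketch of the framing involved: the MCP stdio transport exchanges JSON-RPC messages delimited by newlines over the server's stdin and stdout. The decoder class below is a hypothetical helper written for illustration, not part of @sci/mcp or the MCP SDK.

```typescript
// Accumulates raw stdout chunks and yields complete newline-delimited JSON messages.
class LineDelimitedJsonDecoder {
  private buffer = "";

  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an incomplete line (or "" if the chunk ended on a newline).
    this.buffer = lines.pop() ?? "";
    return lines
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line) as unknown);
  }
}

// Chunks can split a message anywhere; the decoder waits for the newline.
const decoder = new LineDelimitedJsonDecoder();
const firstBatch = decoder.push('{"jsonrpc":"2.0","id":1,"me');
const secondBatch = decoder.push('thod":"ping"}\n{"jsonrpc":"2.0","id":2,"method":"ping"}\n');
```

There is no port to bind and no listener to secure: the pipe exists only between the client and the process it spawned.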