# Architecture
DB9 is a PostgreSQL-compatible distributed SQL database built on TiKV. It separates a control plane (API server, CLI, SDK) from a data plane (the SQL engine that speaks the PostgreSQL wire protocol and stores data in TiKV).
This page explains how the pieces fit together, how queries flow from client to storage, and where the architectural boundaries are.
## Mental model

```
┌─────────────────────────────────────────────────┐
│                     Clients                     │
│        psql · ORMs · drivers · SDK · CLI        │
└────────────────────┬────────────────────────────┘
                     │
         ┌───────────┴───────────┐
         │                       │
         ▼                       ▼
┌─────────────────┐     ┌──────────────────┐
│  Control Plane  │     │    Data Plane    │
│  (db9-backend)  │     │   (db9-server)   │
│                 │     │                  │
│  REST API       │     │  pgwire protocol │
│  Auth & tokens  │     │  SQL engine      │
│  DB lifecycle   │     │  Extensions      │
│  Branching      │     │  Worker engine   │
│  Observability  │     │                  │
└────────┬────────┘     └────────┬─────────┘
         │                       │
         └───────────┬───────────┘
                     ▼
            ┌────────────────┐
            │      TiKV      │
            │  (distributed  │
            │   KV storage)  │
            └────────────────┘
```

Control plane — manages database lifecycle, authentication, tokens, branching, and observability. You interact with it through the db9 CLI, the TypeScript SDK, or the REST API.
Data plane — runs SQL. The db9-server process accepts PostgreSQL wire protocol connections, parses and optimizes SQL, executes queries, and reads/writes data in TiKV. Extensions like fs9, HTTP, embedding, vector search, and pg_cron run here as compiled-in capabilities.
TiKV — the storage layer. A distributed key-value store with Raft consensus and ACID transactions. Each DB9 database gets its own isolated keyspace in TiKV, so tenants share infrastructure but never data.
## Core components

### db9-server (data plane)

The SQL engine. A single async Rust process running on the Tokio runtime.
| Component | Role |
|---|---|
| pgwire listener | Accepts PostgreSQL wire protocol connections on port 5433. Handles startup, TLS, authentication, and the Simple Query / Extended Query protocols. |
| SQL parser | Parses SQL text into an AST using sqlparser-rs. |
| Analyzer | Single-pass name resolution and type checking. Resolves column references, infers types, applies coercions, and tracks subquery scopes. |
| Optimizer | Cost-based query optimization. Converts analyzed queries into logical plans, applies rewrite rules (predicate pushdown, join reordering via DPccp), then selects physical operators. Uses table statistics from ANALYZE when available. |
| Executor | Volcano-model pull-based iterator pipeline. Operators include table scan, index range scan, HNSW k-NN scan, hash join, nested-loop join, sort, aggregate, and window functions. |
| TiKV store | Facade over the TiKV client. Manages transactions (pessimistic locking, snapshot isolation), key encoding, and per-keyspace connection pooling. |
| Extension runtime | Compiled-in extensions registered at startup. Each extension can add SQL functions, types, operators, and background workers. |
| Worker engine | Unified async task queue stored in TiKV. Drives pg_cron schedules, async triggers, auto-analyze, and HNSW index background merges. |
### db9-backend (control plane)

The API server. An Axum web application backed by a metadata database (SQLite or PostgreSQL).
| Component | Role |
|---|---|
| REST API | Database CRUD, user management, token lifecycle, branching, migrations, observability, and SQL execution via HTTP. |
| Auth service | Anonymous trial accounts, Auth0 SSO, API tokens, and connect tokens (short-lived pgwire credentials). |
| PD client | Manages TiKV keyspaces — creates and deletes the isolated storage namespaces that back each database. |
| PG client | Connects to db9-server to execute tenant setup SQL (roles, default extensions, seed data). |
| Reconciler | Background process that detects and recovers failed provisioning or deletion operations. |
### db9 CLI and TypeScript SDK

Client-side tools that talk to both planes:

- Control plane — `db9 create`, `db9 db list`, `db9 token create`, `db9 db branch create`, and other management commands go through the REST API.
- Data plane — `db9 sql`, `db9 db psql`, and the SDK’s `connectToken()` connect directly to db9-server over pgwire.
## Query execution flow

When a client sends a SQL query, it follows this path through db9-server:

```
SQL text
 │
 ├─ Parse (sqlparser-rs → AST)
 │
 ├─ Dispatch
 │   ├─ Data statements (SELECT, INSERT, UPDATE, DELETE) → Analyzer
 │   ├─ DDL / utility statements → direct handler
 │   └─ Transaction control (BEGIN, COMMIT, SAVEPOINT) → session state
 │
 ├─ Analyze (name resolution, type checking, scope tracking)
 │
 ├─ Optimize
 │   ├─ Logical plan (AnalyzedQuery → LogicalPlan)
 │   ├─ Rewrite rules (predicate pushdown, join reordering, subquery decorrelation)
 │   └─ Physical plan (scan selection, join selection, aggregate strategy)
 │
 ├─ Execute (Volcano iterator pipeline)
 │   ├─ BEGIN transaction (pessimistic)
 │   ├─ Pull rows through operator tree
 │   ├─ Read/write TiKV (get, scan, put, delete)
 │   └─ COMMIT or ROLLBACK
 │
 └─ Respond (pgwire RowDescription + DataRow messages)
```

### Scan strategies

The physical planner selects from several scan methods:
| Scan type | When used |
|---|---|
| Table scan | No matching index, or large result sets where a sequential scan is cheaper. |
| Index range scan | BTree index matches the predicate (equality, range, IN-list). Supports expression indexes and partial indexes. |
| HNSW k-NN scan | `ORDER BY vec <-> query LIMIT k` with a vector index. Uses the usearch library. |
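As an illustrative sketch of how these strategies map to queries, the statements below use a hypothetical `items` table with a BTree index and an HNSW vector index (the table, index names, and `USING hnsw` index syntax are assumptions, not taken from this page):

```sql
-- Hypothetical schema: a BTree index for range scans and an
-- HNSW index for k-NN scans (names are illustrative).
CREATE TABLE items (
  id        bigint PRIMARY KEY,
  status    text,
  embedding vector(3)
);
CREATE INDEX items_status_idx    ON items (status);
CREATE INDEX items_embedding_idx ON items USING hnsw (embedding);

-- Likely an index range scan: the equality predicate matches the BTree index.
SELECT id FROM items WHERE status = 'active';

-- Likely an HNSW k-NN scan: ORDER BY <-> with a LIMIT matches the vector index.
SELECT id FROM items ORDER BY embedding <-> '[0.1, 0.2, 0.3]' LIMIT 10;

-- Likely a table scan: no index covers this predicate.
SELECT id FROM items WHERE length(status) > 5;
```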
## Transaction model

- Default isolation: Snapshot Isolation (SI), backed by TiKV’s MVCC and pessimistic locking. This corresponds to PostgreSQL’s `REPEATABLE READ` level.
- Serializable: Accepted syntactically but downgraded to Repeatable Read. TiKV does not provide PostgreSQL-style serializable snapshot isolation (SSI).
- Savepoints: Full PostgreSQL-style `SAVEPOINT` / `ROLLBACK TO` with undo records tracked in TiKV.
- Autocommit: Statements outside an explicit transaction run in autocommit mode.
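A minimal sketch of the savepoint behavior described above, using a hypothetical `accounts` table:

```sql
-- Runs at REPEATABLE READ (snapshot isolation), the default level.
BEGIN;
INSERT INTO accounts (id, balance) VALUES (1, 100);
SAVEPOINT before_debit;
UPDATE accounts SET balance = balance - 50 WHERE id = 1;
-- Undo records in TiKV let this revert the UPDATE while keeping the INSERT.
ROLLBACK TO before_debit;
COMMIT;
```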
## Multi-tenant isolation

DB9 isolates tenants at the storage layer using TiKV keyspaces.

```
TiKV Cluster
├─ Keyspace A (tenant_a)
│  ├─ System metadata (tables, schemas, types, extensions, roles)
│  ├─ Table data (rows keyed by table ID + primary key)
│  ├─ Index data (secondary index entries)
│  └─ Worker state (cron jobs, task queue)
│
├─ Keyspace B (tenant_b)
│  └─ ... (completely independent copy)
│
└─ Keyspace C (tenant_c)
   └─ ...
```

How it works:

- The client connects with username `tenant_id.role` (e.g., `a1b2c3d4e5f6.admin`).
- db9-server parses the username to extract the tenant ID and look up the keyspace.
- A `TenantHandle` is acquired from the connection pool, binding all storage operations to that keyspace.
- All key encoding includes the keyspace prefix, so reads and writes are physically separated in TiKV.
Each tenant gets independent:
- Table and schema namespaces
- Roles and privileges
- Extension installation state
- Worker queue and cron schedules
- Memory accounting (`DB9_TENANT_MEMORY_QUOTA_BYTES`)
There is no cross-tenant query path. A connection can only access its own keyspace.
## Extension system

Extensions are compiled into the db9-server binary and registered at startup. They are not dynamically loaded plugins.
Each extension adds some combination of:
- SQL functions (e.g., `http_get()`, `fs9_read()`, `embedding()`)
- Types (e.g., `vector(384)`)
- Operators (e.g., `<->` for vector distance)
- Index methods (e.g., HNSW for vector search)
- Background workers (e.g., pg_cron scheduler)
Extensions are installed per-tenant with `CREATE EXTENSION`:

```sql
CREATE EXTENSION http;
CREATE EXTENSION fs9;
CREATE EXTENSION vector;
CREATE EXTENSION embedding;
CREATE EXTENSION pg_cron;
```

Installation state is stored in TiKV under the tenant’s keyspace, so each database independently controls which extensions are active.
Some extensions are enabled by default when a database is created (`http`, `pg_cron`). Others require explicit installation.
For the full list, see the Extensions overview.
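As an illustrative sketch of what installed extensions expose at the SQL level (the `cron.schedule()` signature follows the standard pg_cron API and is an assumption; the `events` table is hypothetical):

```sql
-- http extension: issue HTTP requests from SQL.
SELECT http_get('https://example.com/api/status');

-- pg_cron extension: schedule a nightly cleanup job
-- (assumes the standard pg_cron cron.schedule() interface).
SELECT cron.schedule(
  'nightly-cleanup',
  '0 3 * * *',
  $$DELETE FROM events WHERE created_at < now() - interval '30 days'$$
);
```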
## Connection handling

db9-server uses a semaphore-based admission control model:
- Max connections: 1000 by default (configurable).
- Per-connection: Each client connection gets its own async Tokio task with session-local state (transaction, settings, search path).
- Connection pool: `TikvClientPool` caches per-keyspace `TikvStore` instances. Idle tenants are evicted after 300 seconds by default.
- TLS: Supported via cert/key configuration. SCRAM-SHA-256 is the default authentication method.
For connection string format, authentication options, and driver compatibility, see Connect to DB9.
## Constraints and boundaries

| Area | Current state |
|---|---|
| Execution model | Single-process, single-node. No distributed query execution across multiple db9-server instances. |
| Plan cache | Session-local only. No shared prepared statement cache across connections. |
| GIN indexes | Planner generates GIN scan plans, but runtime execution falls back to table scan. |
| Parallel query | Not supported. Queries execute on a single Tokio task. |
| Foreign data wrappers | Not supported. Use the HTTP extension for external data access. |
| Logical replication | Not supported. Use branching or the REST API for data movement. |
These are architectural boundaries, not bugs. They reflect the current stage of the system. Check the SQL limits page for detailed compatibility notes.
## When this architecture fits well

- Agent and automation workloads — instant provisioning, per-task isolation, and built-in capabilities (embeddings, files, HTTP, cron) reduce the number of external services.
- Multi-tenant SaaS — keyspace isolation provides strong data separation without managing separate database clusters.
- Development and CI — branching and disposable databases are first-class operations, not workarounds.
- Workloads that fit in a single node — the SQL engine is mature and the optimizer is cost-based, but there is no distributed execution layer today.
## When to consider alternatives

- You need distributed query execution across multiple nodes for analytical workloads.
- You depend on PostgreSQL extensions that DB9 does not support — check Extensions and SQL limits.
- You need logical replication, foreign data wrappers, or custom C extensions.
## Next steps

- Production Checklist — authentication, observability, recovery, and operational limits
- Connect to DB9 — connection strings, drivers, and authentication
- Why DB9 for AI Agents — how the architecture enables agent workflows
- Extensions — fs9, HTTP, vector search, pg_cron, and more
- SQL Reference — supported SQL syntax and compatibility notes
- Compatibility Matrix — full PostgreSQL compatibility surface
- CLI Reference — full command reference for the control plane