
Architecture

DB9 is a PostgreSQL-compatible distributed SQL database built on TiKV. It separates a control plane (API server, CLI, SDK) from a data plane (the SQL engine that speaks the PostgreSQL wire protocol and stores data in TiKV).

This page explains how the pieces fit together, how queries flow from client to storage, and where the architectural boundaries are.

┌─────────────────────────────────────────────────┐
│                     Clients                     │
│        psql · ORMs · drivers · SDK · CLI        │
└────────────────────┬────────────────────────────┘
                     │
         ┌───────────┴───────────┐
         │                       │
         ▼                       ▼
┌─────────────────┐     ┌─────────────────┐
│  Control Plane  │     │   Data Plane    │
│  (db9-backend)  │     │  (db9-server)   │
│                 │     │                 │
│  REST API       │     │ pgwire protocol │
│  Auth & tokens  │     │ SQL engine      │
│  DB lifecycle   │     │ Extensions      │
│  Branching      │     │ Worker engine   │
│  Observability  │     │                 │
└────────┬────────┘     └────────┬────────┘
         │                       │
         └───────────┬───────────┘
                     │
             ┌────────────────┐
             │      TiKV      │
             │  (distributed  │
             │   KV storage)  │
             └────────────────┘

Control plane — manages database lifecycle, authentication, tokens, branching, and observability. You interact with it through the db9 CLI, the TypeScript SDK, or the REST API.

Data plane — runs SQL. The db9-server process accepts PostgreSQL wire protocol connections, parses and optimizes SQL, executes queries, and reads/writes data in TiKV. Extensions like fs9, HTTP, embedding, vector search, and pg_cron run here as compiled-in capabilities.

TiKV — the storage layer. A distributed key-value store with Raft consensus and ACID transactions. Each DB9 database gets its own isolated keyspace in TiKV, so tenants share infrastructure but never data.

The data plane (db9-server) is the SQL engine: a single async Rust process running on the Tokio runtime.

| Component | Role |
| --- | --- |
| pgwire listener | Accepts PostgreSQL wire protocol connections on port 5433. Handles startup, TLS, authentication, and the Simple Query / Extended Query protocols. |
| SQL parser | Parses SQL text into an AST using sqlparser-rs. |
| Analyzer | Single-pass name resolution and type checking. Resolves column references, infers types, applies coercions, and tracks subquery scopes. |
| Optimizer | Cost-based query optimization. Converts analyzed queries into logical plans, applies rewrite rules (predicate pushdown, join reordering via DPccp), then selects physical operators. Uses table statistics from ANALYZE when available. |
| Executor | Volcano-model pull-based iterator pipeline. Operators include table scan, index range scan, HNSW k-NN scan, hash join, nested-loop join, sort, aggregate, and window functions. |
| TiKV store | Facade over the TiKV client. Manages transactions (pessimistic locking, snapshot isolation), key encoding, and per-keyspace connection pooling. |
| Extension runtime | Compiled-in extensions registered at startup. Each extension can add SQL functions, types, operators, and background workers. |
| Worker engine | Unified async task queue stored in TiKV. Drives pg_cron schedules, async triggers, auto-analyze, and HNSW index background merges. |
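The executor's Volcano model can be sketched with Python generators: each operator pulls rows one at a time from its child. This is a minimal illustration of the pull-based pipeline, not DB9's actual Rust implementation; the operator names and sample table are invented.

```python
# Minimal Volcano-style pull pipeline: each operator is a generator
# that pulls rows from its child one at a time.

def table_scan(rows):
    # Leaf operator: yields every row from the base table.
    for row in rows:
        yield row

def filter_op(child, predicate):
    # Pulls from child, emits only rows matching the predicate.
    for row in child:
        if predicate(row):
            yield row

def project(child, columns):
    # Emits a subset of columns from each row.
    for row in child:
        yield {c: row[c] for c in columns}

# Roughly: SELECT name FROM users WHERE age >= 30
users = [
    {"id": 1, "name": "ada", "age": 36},
    {"id": 2, "name": "bob", "age": 24},
    {"id": 3, "name": "eve", "age": 31},
]
plan = project(filter_op(table_scan(users), lambda r: r["age"] >= 30), ["name"])
print(list(plan))  # [{'name': 'ada'}, {'name': 'eve'}]
```

Because every operator is lazy, a `LIMIT` at the top of the tree stops pulling early and the scan never reads the remaining rows.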

The control plane (db9-backend) is the API server: an Axum web application backed by a metadata database (SQLite or PostgreSQL).

| Component | Role |
| --- | --- |
| REST API | Database CRUD, user management, token lifecycle, branching, migrations, observability, and SQL execution via HTTP. |
| Auth service | Anonymous trial accounts, Auth0 SSO, API tokens, and connect tokens (short-lived pgwire credentials). |
| PD client | Manages TiKV keyspaces — creates and deletes the isolated storage namespaces that back each database. |
| PG client | Connects to db9-server to execute tenant setup SQL (roles, default extensions, seed data). |
| Reconciler | Background process that detects and recovers failed provisioning or deletion operations. |

Client-side tools that talk to both planes:

  • Control plane: db9 create, db9 db list, db9 token create, db9 db branch create, and other management commands go through the REST API.
  • Data plane: db9 sql, db9 db psql, and the SDK’s connectToken() connect directly to db9-server over pgwire.

When a client sends a SQL query, it follows this path through db9-server:

SQL text
├─ Parse (sqlparser-rs → AST)
├─ Dispatch
│ ├─ Data statements (SELECT, INSERT, UPDATE, DELETE) → Analyzer
│ ├─ DDL / utility statements → direct handler
│ └─ Transaction control (BEGIN, COMMIT, SAVEPOINT) → session state
├─ Analyze (name resolution, type checking, scope tracking)
├─ Optimize
│ ├─ Logical plan (AnalyzedQuery → LogicalPlan)
│ ├─ Rewrite rules (predicate pushdown, join reordering, subquery decorrelation)
│ └─ Physical plan (scan selection, join selection, aggregate strategy)
├─ Execute (Volcano iterator pipeline)
│ ├─ BEGIN transaction (pessimistic)
│ ├─ Pull rows through operator tree
│ ├─ Read/write TiKV (get, scan, put, delete)
│ └─ COMMIT or ROLLBACK
└─ Respond (pgwire RowDescription + DataRow messages)
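The dispatch step above can be sketched as a simple keyword-based router. The keyword sets below are illustrative only, not DB9's actual grammar or handler names:

```python
# Rough sketch of statement dispatch: route a statement by its kind.
# Data statements go through the full analyze/optimize/execute path,
# transaction control only mutates session state, everything else is
# handled directly (DDL and utility statements).

DATA = {"SELECT", "INSERT", "UPDATE", "DELETE"}
TXN = {"BEGIN", "COMMIT", "ROLLBACK", "SAVEPOINT", "RELEASE"}

def dispatch(sql: str) -> str:
    keyword = sql.lstrip().split(None, 1)[0].upper()
    if keyword in DATA:
        return "analyzer"       # analyze -> optimize -> execute
    if keyword in TXN:
        return "session-state"  # transaction control
    return "direct-handler"     # DDL / utility statements

assert dispatch("select 1") == "analyzer"
assert dispatch("BEGIN") == "session-state"
assert dispatch("CREATE TABLE t (id int)") == "direct-handler"
```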

The physical planner selects from several scan methods:

| Scan type | When used |
| --- | --- |
| Table scan | No matching index, or large result sets where a sequential scan is cheaper. |
| Index range scan | BTree index matches the predicate (equality, range, IN-list). Supports expression indexes and partial indexes. |
| HNSW k-NN scan | ORDER BY vec <-> query LIMIT k with a vector index. Uses the usearch library. |
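The trade-off between a table scan and an index range scan comes down to estimated cost. Here is a toy cost model showing the shape of that decision; the constants are invented for illustration, while the real planner derives estimates from ANALYZE statistics:

```python
# Toy cost model for scan selection. Constants are invented for
# illustration; a real planner calibrates them and uses statistics.

SEQ_ROW_COST = 1.0       # cost to read one row sequentially
INDEX_ROW_COST = 2.0     # cost per row fetched via index (random access)
INDEX_SETUP_COST = 10.0  # fixed cost to descend the BTree

def choose_scan(table_rows: int, selectivity: float, has_index: bool) -> str:
    seq = table_rows * SEQ_ROW_COST
    if not has_index:
        return "table scan"
    idx = INDEX_SETUP_COST + table_rows * selectivity * INDEX_ROW_COST
    return "index range scan" if idx < seq else "table scan"

# A highly selective predicate favors the index...
assert choose_scan(1_000_000, 0.001, has_index=True) == "index range scan"
# ...while a predicate matching most rows favors a sequential scan.
assert choose_scan(1_000_000, 0.9, has_index=True) == "table scan"
```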
Transaction semantics:

  • Default isolation: Snapshot Isolation (SI), backed by TiKV’s MVCC and pessimistic locking. This corresponds to PostgreSQL’s REPEATABLE READ level.
  • Serializable: Accepted syntactically but downgraded to Repeatable Read. TiKV does not provide PostgreSQL-style serializable snapshot isolation (SSI).
  • Savepoints: Full PostgreSQL-style SAVEPOINT / ROLLBACK TO with undo records tracked in TiKV.
  • Autocommit: Statements outside an explicit transaction run in autocommit mode.
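The savepoint mechanism, with undo records, can be sketched as a log of prior values plus a position marker per savepoint. This is a simplified illustration of the idea, assuming a single-key-value-per-write model, not DB9's actual TiKV record format:

```python
# Sketch of savepoint semantics with undo records: each write logs the
# key's prior value; ROLLBACK TO pops undo entries back to the mark
# recorded when the savepoint was created.

class Txn:
    def __init__(self):
        self.data = {}
        self.undo = []        # list of (key, old_value) records
        self.savepoints = {}  # savepoint name -> undo-log position

    def put(self, key, value):
        self.undo.append((key, self.data.get(key)))
        self.data[key] = value

    def savepoint(self, name):
        self.savepoints[name] = len(self.undo)

    def rollback_to(self, name):
        mark = self.savepoints[name]
        while len(self.undo) > mark:
            key, old = self.undo.pop()
            if old is None:
                self.data.pop(key, None)  # key did not exist before
            else:
                self.data[key] = old

txn = Txn()
txn.put("a", 1)
txn.savepoint("sp1")
txn.put("a", 2)
txn.put("b", 3)
txn.rollback_to("sp1")
print(txn.data)  # {'a': 1}
```

Writes after the savepoint are undone in reverse order, while writes before it survive, which matches PostgreSQL's ROLLBACK TO behavior.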

DB9 isolates tenants at the storage layer using TiKV keyspaces.

TiKV Cluster
├─ Keyspace A (tenant_a)
│ ├─ System metadata (tables, schemas, types, extensions, roles)
│ ├─ Table data (rows keyed by table ID + primary key)
│ ├─ Index data (secondary index entries)
│ └─ Worker state (cron jobs, task queue)
├─ Keyspace B (tenant_b)
│ └─ ... (completely independent copy)
└─ Keyspace C (tenant_c)
   └─ ...

How it works:

  1. The client connects with username tenant_id.role (e.g., a1b2c3d4e5f6.admin).
  2. db9-server parses the username to extract the tenant ID and look up the keyspace.
  3. A TenantHandle is acquired from the connection pool, binding all storage operations to that keyspace.
  4. All key encoding includes the keyspace prefix, so reads and writes are physically separated in TiKV.

Each tenant gets independent:

  • Table and schema namespaces
  • Roles and privileges
  • Extension installation state
  • Worker queue and cron schedules
  • Memory accounting (DB9_TENANT_MEMORY_QUOTA_BYTES)
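Per-tenant memory accounting (cf. DB9_TENANT_MEMORY_QUOTA_BYTES) amounts to charging allocations against a quota and rejecting work once it is exhausted. A minimal sketch of that bookkeeping, with invented class and method names:

```python
# Toy per-tenant memory account: allocations are charged against a
# fixed quota and rejected once the quota would be exceeded.

class MemoryAccount:
    def __init__(self, quota_bytes: int):
        self.quota = quota_bytes
        self.used = 0

    def charge(self, nbytes: int):
        if self.used + nbytes > self.quota:
            raise MemoryError("tenant memory quota exceeded")
        self.used += nbytes

    def release(self, nbytes: int):
        self.used = max(0, self.used - nbytes)

acct = MemoryAccount(quota_bytes=1024)
acct.charge(800)
acct.release(800)
acct.charge(900)   # fits again after the release
try:
    acct.charge(200)
except MemoryError as e:
    print(e)  # tenant memory quota exceeded
```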

There is no cross-tenant query path. A connection can only access its own keyspace.

Extensions are compiled into the db9-server binary and registered at startup. They are not dynamically loaded plugins.

Each extension adds some combination of:

  • SQL functions (e.g., http_get(), fs9_read(), embedding())
  • Types (e.g., vector(384))
  • Operators (e.g., <-> for vector distance)
  • Index methods (e.g., HNSW for vector search)
  • Background workers (e.g., pg_cron scheduler)

Extensions are installed per-tenant with CREATE EXTENSION:

SQL
CREATE EXTENSION http;
CREATE EXTENSION fs9;
CREATE EXTENSION vector;
CREATE EXTENSION embedding;
CREATE EXTENSION pg_cron;

Installation state is stored in TiKV under the tenant’s keyspace, so each database independently controls which extensions are active.

Some extensions are enabled by default when a database is created (http, pg_cron). Others require explicit installation.

For the full list, see the Extensions overview.

For connection handling, db9-server uses a semaphore-based admission control model:

  • Max connections: 1000 by default (configurable).
  • Per-connection: Each client connection gets its own async Tokio task with session-local state (transaction, settings, search path).
  • Connection pool: TikvClientPool caches per-keyspace TikvStore instances. Idle tenants are evicted after 300 seconds by default.
  • TLS: Supported via cert/key configuration. SCRAM-SHA-256 is the default authentication method.
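Semaphore-based admission control means at most N sessions run at once and further connection attempts wait for a slot. A minimal asyncio sketch of the idea (the limit of 3 is for illustration; DB9 defaults to 1000):

```python
import asyncio

# Sketch of semaphore-based admission control: each connection runs as
# its own async task, but only MAX_CONNECTIONS may hold the semaphore
# (i.e., be admitted) at the same time.

MAX_CONNECTIONS = 3
active = 0
peak = 0

async def handle_connection(sem: asyncio.Semaphore, conn_id: int):
    global active, peak
    async with sem:                # admission: waits when at capacity
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.01)  # stand-in for session work
        active -= 1

async def main():
    sem = asyncio.Semaphore(MAX_CONNECTIONS)
    # 10 clients connect, but at most 3 are ever active together.
    await asyncio.gather(*(handle_connection(sem, i) for i in range(10)))

asyncio.run(main())
assert 1 <= peak <= MAX_CONNECTIONS
print(f"peak concurrent sessions: {peak}")
```

Each admitted connection keeps its own session-local state (as described above), so the semaphore bounds concurrency without any shared mutable session data.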

For connection string format, authentication options, and driver compatibility, see Connect to DB9.

| Area | Current state |
| --- | --- |
| Execution model | Single-process, single-node. No distributed query execution across multiple db9-server instances. |
| Plan cache | Session-local only. No shared prepared statement cache across connections. |
| GIN indexes | Planner generates GIN scan plans, but runtime execution falls back to table scan. |
| Parallel query | Not supported. Queries execute on a single Tokio task. |
| Foreign data wrappers | Not supported. Use the HTTP extension for external data access. |
| Logical replication | Not supported. Use branching or the REST API for data movement. |

These are architectural boundaries, not bugs. They reflect the current stage of the system. Check the SQL limits page for detailed compatibility notes.

DB9 is a good fit for:

  • Agent and automation workloads — instant provisioning, per-task isolation, and built-in capabilities (embeddings, files, HTTP, cron) reduce the number of external services.
  • Multi-tenant SaaS — keyspace isolation provides strong data separation without managing separate database clusters.
  • Development and CI — branching and disposable databases are first-class operations, not workarounds.
  • Workloads that fit in a single node — the SQL engine is mature and the optimizer is cost-based, but there is no distributed execution layer today.
Consider a different system when:

  • You need distributed query execution across multiple nodes for analytical workloads.
  • You depend on PostgreSQL extensions that DB9 does not support — check Extensions and SQL limits.
  • You need logical replication, foreign data wrappers, or custom C extensions.