Skip to content
Discord Get Started

Observability

DB9 includes built-in observability that is always on. Query performance is sampled automatically, slow queries are always captured, and the data is available through the CLI, REST API, and direct SQL — no extensions to install.

This page covers what you can observe, how to access it, and where the current boundaries are compared to standard PostgreSQL.

MetricAvailableAccess
QPS and TPSYesCLI, API, SQL
Latency (avg, p99)YesCLI, API, SQL
Active connections (count)YesCLI, API, SQL
Query samples with latencyYesCLI, API, SQL
Slow queries (p99-sorted)YesCLI
Error count and failed queriesYesCLI, API, SQL
Write-conflict retries (TiKV)YesSQL
HNSW index build metricsYesSQL
EXPLAIN query plansYesSQL
Schema, tables, indexesYesCLI, SQL
Per-connection detailsNo
Index usage statsNo
Memory/cache statsNo
Prometheus/OpenTelemetry exportNo

The primary observability tool is db9 db inspect:

Terminal
# Summary dashboard (QPS, TPS, latency, connections, errors)
db9 db inspect <database>
# Query samples with latency breakdown
db9 db inspect <database> queries
# Combined summary + queries
db9 db inspect <database> report
# Top slow queries sorted by p99 latency
db9 db inspect <database> slow-queries
Terminal
# List schemas, tables, and indexes
db9 db inspect <database> schemas
db9 db inspect <database> tables
db9 db inspect <database> indexes

All commands support --json and --output csv for programmatic use.

Terminal
db9 db inspect mydb
Output
Summary (60-minute window)
QPS: 41.7
TPS: 20.8
Latency avg: 12.5 ms
Latency p99: 45.2 ms
Connections: 8
Statements: 150,000
Commits: 75,000
Errors: 3

Two built-in table functions provide observability data directly in SQL:

SQL
SELECT qps, tps, latency_avg_ms, latency_p99_ms, active_connections,
statement_count, txn_commit_count, error_count
FROM _db9_sys_observability();

Returns a single row with the rolling 60-minute summary. Additional columns include write-conflict retry counts (retry_attempts, retry_budget_exhausted, retry_timeout_aborts) and HNSW index metrics (hnsw_graph_bytes_written, hnsw_serialize_duration_us).

SQL
SELECT query, sample_count, error_count,
latency_avg_ms, latency_p99_ms, latency_max_ms,
last_seen_ms_ago
FROM _db9_sys_query_samples()
ORDER BY latency_p99_ms DESC
LIMIT 10;

Returns per-query aggregates for sampled queries in the current window. Up to 50 unique query groups are tracked.

SQL
SELECT query, error_count, sample_count
FROM _db9_sys_query_samples()
WHERE error_count > 0
ORDER BY error_count DESC;

For automation, call the observability endpoint directly:

Terminal
curl -s "https://api.db9.ai/customer/databases/<database-id>/observability" \
-H "Authorization: Bearer $TOKEN" | jq .

Returns JSON with summary (QPS, TPS, latency, connections) and samples (per-query metrics).

DB9 supports EXPLAIN for query plan inspection:

SQL
EXPLAIN SELECT * FROM users WHERE email = 'test@example.com';
EXPLAIN (FORMAT JSON) SELECT * FROM users WHERE id = 1;

Plan nodes include SeqScan, IndexScan, HnswScan, NestedLoop, HashJoin, Sort, Limit, Aggregate, and more.

EXPLAIN ANALYZE is accepted but currently shows estimated costs only — per-operator runtime statistics are not yet available.

DB9 collects observability data using in-memory sampling:

  • Window: 60-minute rolling window
  • Default rate: 1 in 1,000 queries sampled (0.1%)
  • Always captured: errors and queries exceeding the slow threshold (200 ms default)
  • Max query groups: 50 unique query fingerprints
  • Max sample events: 20,000 in memory
  • SQL normalization: whitespace reduced, text truncated to 512 characters
  • Redaction: PASSWORD literals replaced with '***'

This is process-level, in-memory data. It is not persisted to TiKV and resets when the server restarts.

For database metadata (not performance metrics), use:

Terminal
db9 db status <database>

Shows database name, ID, state (ACTIVE, CLONING, CREATE_FAILED), region, creation time, endpoints, and connection string.

DB9 implements some pg_catalog views as stubs for tool compatibility:

ViewStatusNotes
information_schema.tablesFunctionalStandard schema introspection
pg_indexesFunctionalIndex definitions
pg_stat_user_tablesStub (zeros)All counters return 0
pg_statistic_extStub (empty)No extended statistics
pg_stat_statementsNot availableUse _db9_sys_query_samples()
pg_stat_activityNot availableUse _db9_sys_observability() for connection count

Tools like Prisma, Drizzle, and psql that query information_schema and pg_indexes work normally. Tools that depend on pg_stat_statements or pg_stat_activity for monitoring need to use DB9’s native functions instead.

  • Per-connection activity — only aggregate connection count, no per-session details
  • Index usage statisticspg_stat_user_tables.idx_scan returns 0
  • Memory and cache metrics — shared buffers, work memory usage not exposed
  • Slow query log to file — observability data is query-based only, not logged to disk
  • Prometheus/OpenTelemetry export — no /metrics endpoint or trace export
  • Full query text with parameters — only normalized SQL stored (bind values not captured)
  • EXPLAIN ANALYZE runtime stats — plan shows estimated costs, not actual per-operator timing