dbx/AGENTS.md
Aleksey Shakhmatov 7d25e1b73e
Add CI workflow, README, CLAUDE.md, AGENTS.md, and .cursorrules
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 00:01:27 +03:00

AGENTS.md — dbx

Universal guide for AI coding agents working with this codebase.

Overview

git.codelab.vc/pkg/dbx is a Go PostgreSQL cluster library built on pgx/v5. It provides master/replica routing, automatic retries, load balancing, background health checking, panic-safe transactions, and context-based Querier injection.

Package map

dbx/                      Root — Cluster, Node, Balancer, retry, health, errors, tx, config
├── dbx.go                Interfaces: Querier, DB, Logger, MetricsHook
├── cluster.go            Cluster — routing, write/read operations
├── node.go               Node — pgxpool.Pool wrapper with health state
├── balancer.go           Balancer interface + RoundRobinBalancer
├── retry.go              retrier — exponential backoff with jitter and node fallback
├── health.go             healthChecker — background goroutine pinging nodes
├── tx.go                 RunTx, RunTxOptions, InjectQuerier, ExtractQuerier
├── errors.go             Error classification (IsRetryable, IsConnectionError, etc.)
├── config.go             Config, NodeConfig, PoolConfig, RetryConfig, HealthCheckConfig
├── options.go            Functional options (WithLogger, WithMetrics, WithRetry, etc.)
└── dbxtest/
    └── dbxtest.go        Test helpers: NewTestCluster, TestLogger

Routing architecture

                          ┌──────────────┐
                          │   Cluster    │
                          └──────┬───────┘
                                 │
                 ┌───────────────┴───────────────┐
                 │                               │
            Write ops                       Read ops
     Exec, Query, QueryRow            ReadQuery, ReadQueryRow
     Begin, BeginTx, RunTx
     CopyFrom, SendBatch
                 │                               │
                 ▼                               ▼
          ┌──────────┐              ┌────────────────────────┐
          │  Master  │              │  Balancer → Replicas   │
          └──────────┘              │  fallback → Master     │
                                    └────────────────────────┘

Retry loop (retrier.do):
  For each attempt (up to MaxAttempts):
    For each node in [target nodes]:
      if healthy → execute → on retryable error → continue
    Backoff (exponential + jitter)

Common tasks

Add a new node type (e.g., analytics replica)

  1. Add a field to Cluster struct (e.g., analytics []*Node)
  2. Add corresponding config to Config struct
  3. Connect nodes in NewCluster, add to all slice for health checking
  4. Add routing methods (e.g., AnalyticsQuery)

Customize retry logic

  1. Provide RetryConfig.RetryableErrors — custom func(error) bool classifier
  2. Or modify IsRetryable() in errors.go to add new PG error codes
  3. Adjust MaxAttempts, BaseDelay, MaxDelay in RetryConfig
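
As a rough sketch of step 1, a custom classifier is just a `func(error) bool`. To stay stdlib-only, the example below uses a local `pgError` type as a stand-in for `*pgconn.PgError` from pgx/v5 (which carries the SQLSTATE in its `Code` field); the choice of retryable codes is an assumption for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// pgError is a local stand-in for *pgconn.PgError from pgx/v5.
// Real code would match on the pgconn type instead.
type pgError struct{ Code string }

func (e *pgError) Error() string { return "SQLSTATE " + e.Code }

// retryableCodes lists the SQLSTATE values this hypothetical service
// treats as transient.
var retryableCodes = map[string]bool{
	"40001": true, // serialization_failure
	"40P01": true, // deadlock_detected
}

// isRetryable has the func(error) bool shape expected of a classifier.
// errors.As also matches errors wrapped with %w.
func isRetryable(err error) bool {
	var pgErr *pgError
	if errors.As(err, &pgErr) {
		return retryableCodes[pgErr.Code]
	}
	return false
}

func main() {
	wrapped := fmt.Errorf("update balance: %w", &pgError{Code: "40001"})
	fmt.Println(isRetryable(wrapped))                 // serialization failure
	fmt.Println(isRetryable(&pgError{Code: "23505"})) // unique_violation
	fmt.Println(isRetryable(errors.New("boom")))      // not a PG error
}
```

A function like this would be assigned to RetryConfig.RetryableErrors; how it combines with the built-in IsRetryable() is not shown here and should be checked in retry.go.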

Add a metrics hook

  1. Add a new callback field to MetricsHook struct in dbx.go
  2. Call it at the appropriate point (nil-check the hook and the field)
  3. See existing hooks in cluster.go (queryStart/queryEnd) and health.go (OnNodeDown/OnNodeUp)
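
The double nil-check in step 2 might look like the following. The `metricsHook` and `cluster` types are stand-ins, and the `OnRetry` callback and both signatures are hypothetical; only the guard pattern is the point.

```go
package main

import "fmt"

// metricsHook is a hypothetical stand-in for dbx's MetricsHook: a struct of
// optional callbacks. Field signatures here are assumptions.
type metricsHook struct {
	OnNodeDown func(node string, err error)
	OnRetry    func(node string, attempt int) // the newly added callback
}

type cluster struct{ metrics *metricsHook }

// notifyRetry guards both levels: the hook pointer may be nil (no metrics
// configured) and the individual callback may be nil (hook set, field unset).
func (c *cluster) notifyRetry(node string, attempt int) {
	if c.metrics != nil && c.metrics.OnRetry != nil {
		c.metrics.OnRetry(node, attempt)
	}
}

func main() {
	(&cluster{}).notifyRetry("replica-1", 1)                        // no hook: no-op
	(&cluster{metrics: &metricsHook{}}).notifyRetry("replica-1", 1) // no callback: no-op

	c := &cluster{metrics: &metricsHook{
		OnRetry: func(node string, attempt int) { fmt.Println("retry", node, attempt) },
	}}
	c.notifyRetry("replica-1", 2)
}
```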

Add a new balancer strategy

  1. Implement the Balancer interface: Next(nodes []*Node) *Node
  2. Must return nil if no suitable node is available
  3. Must check node.IsHealthy() to skip down nodes
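
A minimal sketch of a balancer that follows the three rules above, choosing a uniformly random healthy node. `Node` and `Balancer` are redeclared locally so the example compiles on its own; `RandomBalancer` and `newNode`'s signature are illustrative assumptions, not part of the library.

```go
package main

import (
	"fmt"
	"math/rand"
	"sync/atomic"
)

// Node is a local stand-in exposing only the health flag.
type Node struct {
	name    string
	healthy atomic.Bool
}

func (n *Node) IsHealthy() bool { return n.healthy.Load() }

func newNode(name string, healthy bool) *Node {
	n := &Node{name: name}
	n.healthy.Store(healthy)
	return n
}

// Balancer mirrors the interface shape described above.
type Balancer interface {
	Next(nodes []*Node) *Node
}

// RandomBalancer picks a uniformly random healthy node in one pass using
// reservoir sampling, so it needs no extra slice.
type RandomBalancer struct{}

func (RandomBalancer) Next(nodes []*Node) *Node {
	var picked *Node
	seen := 0
	for _, n := range nodes {
		if !n.IsHealthy() {
			continue // rule 3: skip down nodes
		}
		seen++
		// Replace the current pick with probability 1/seen.
		if rand.Intn(seen) == 0 {
			picked = n
		}
	}
	return picked // rule 2: nil when no healthy node was seen
}

func main() {
	var b Balancer = RandomBalancer{}
	nodes := []*Node{newNode("replica-1", false), newNode("replica-2", true)}
	fmt.Println(b.Next(nodes).name)
	fmt.Println(b.Next(nil) == nil)
}
```

Because the balancer holds no state, it is safe for concurrent use without a mutex; a strategy that tracks a counter or weights would need its own synchronization.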

Gotchas

  • Close() is required: Cluster.Close() stops the health checker goroutine and closes all pools. Leaking a Cluster leaks goroutines and connections
  • RunTx panic safety: runTx uses defer with recover() — it rolls back on panic, then re-panics. Do not catch panics outside RunTx expecting the tx to be committed
  • Context-based Querier injection: ExtractQuerier returns the fallback if no Querier is in context. Always pass the cluster/pool as fallback so code works both inside and outside transactions
  • Health checker goroutine: Starts immediately in NewCluster. Uses time.NewTicker — the first check happens after one interval, not immediately. Nodes start as healthy (healthy.Store(true) in newNode)
  • readNodes ordering: readNodes() returns [replicas..., master] — the retrier tries replicas first, master is the last fallback
  • errRow for closed cluster: When cluster is closed, QueryRow/ReadQueryRow return errRow{err: ErrClusterClosed} — the error surfaces on Scan()
  • No SQL parsing: Routing is purely method-based. If you call Exec with a SELECT, it still goes to master

Commands

go build ./...                          # compile
go test ./...                           # all tests
go test -race ./...                     # tests with race detector
go test -v -run TestName ./...          # single test
go vet ./...                            # static analysis

Conventions

  • Struct-based Config with defaults() method for zero-value defaults
  • Functional options (Option func(*Config)) used via ApplyOptions (primarily in dbxtest)
  • stdlib-only testing: no testify, no gomock
  • Thread safety: atomic.Bool for Node.healthy and Cluster.closed
  • dbxtest helpers: NewTestCluster skips on unreachable DB, auto-closes via t.Cleanup; TestLogger routes to testing.T
  • Sentinel errors: ErrNoHealthyNode, ErrClusterClosed, ErrRetryExhausted
  • retryError uses multi-unwrap (Unwrap() []error) so both ErrRetryExhausted and the last error can be matched with errors.Is