# AGENTS.md — dbx

Universal guide for AI coding agents working with this codebase.

## Overview

`git.codelab.vc/pkg/dbx` is a Go PostgreSQL cluster library built on **pgx/v5**. It provides master/replica routing, automatic retries, load balancing, background health checking, panic-safe transactions, and context-based Querier injection.

## Package map

```
dbx/                  Root: Cluster, Node, Balancer, retry, health, errors, tx, config
├── dbx.go            Interfaces: Querier, DB, Logger, MetricsHook
├── cluster.go        Cluster: routing, write/read operations, slow query logging
├── node.go           Node: pgxpool.Pool wrapper with health state, tracer passthrough
├── balancer.go       Balancer interface + RoundRobinBalancer
├── retry.go          retrier: exponential backoff with jitter and node fallback
├── health.go         healthChecker: background goroutine pinging nodes
├── tx.go             RunTx, RunTxOptions, InjectQuerier, ExtractQuerier
├── errors.go         Error classification (IsRetryable, IsConnectionError, etc.)
├── config.go         Config, NodeConfig, PoolConfig, RetryConfig, HealthCheckConfig
├── options.go        Functional options (WithLogger, WithMetrics, WithRetry, WithTracer, etc.)
├── slog.go           SlogLogger: adapts *slog.Logger to dbx.Logger
├── scan.go           Collect[T], CollectOne[T]: generic row scan helpers
├── stats.go          PoolStats: aggregate pool statistics via Cluster.Stats()
└── dbxtest/
    ├── dbxtest.go    Test helpers: NewTestCluster, TestLogger
    └── tx.go         RunInTx: test transaction isolation (always rolled back)
```

## Routing architecture

```
                    ┌──────────────┐
                    │   Cluster    │
                    └──────┬───────┘
                           │
           ┌───────────────┴───────────────┐
           │                               │
       Write ops                       Read ops
 Exec, Query, QueryRow          ReadQuery, ReadQueryRow
 Begin, BeginTx, RunTx
  CopyFrom, SendBatch
           │                               │
           ▼                               ▼
      ┌──────────┐            ┌────────────────────────┐
      │  Master  │            │  Balancer → Replicas   │
      └──────────┘            │  fallback → Master     │
                              └────────────────────────┘

Retry loop (retrier.do):
  For each attempt (up to MaxAttempts):
    For each node in [target nodes]:
      if healthy → execute → on retryable error → continue
    Backoff (exponential + jitter)
```
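The retry loop in the diagram can be sketched in plain Go. This is a minimal illustration, not the actual `retrier.do`: the `node` type, `errTransient`, `backoff`, and the jitter range of `[d/2, d]` are assumptions made for the example.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"math/rand"
	"time"
)

// node is a hypothetical stand-in for dbx's Node; only the health flag matters here.
type node struct {
	name    string
	healthy bool
}

var (
	errTransient     = errors.New("transient failure")
	errNoHealthyNode = errors.New("no healthy node")
)

// backoff doubles base once per attempt, caps the result at max, and returns
// a random delay in [d/2, d] so concurrent callers do not retry in lockstep.
func backoff(attempt int, base, max time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt && d < max; i++ {
		d *= 2
	}
	if d > max {
		d = max
	}
	if d <= 0 {
		return 0
	}
	half := d / 2
	return half + time.Duration(rand.Int63n(int64(half)+1))
}

// do runs fn against each healthy node in order, once per attempt.
// A non-retryable error returns immediately; a retryable one moves to the next node.
func do(ctx context.Context, nodes []*node, maxAttempts int, base, max time.Duration, fn func(*node) error) error {
	lastErr := errNoHealthyNode
	for attempt := 0; attempt < maxAttempts; attempt++ {
		for _, n := range nodes {
			if !n.healthy {
				continue
			}
			err := fn(n)
			if err == nil {
				return nil
			}
			if !errors.Is(err, errTransient) {
				return err
			}
			lastErr = err
		}
		if attempt == maxAttempts-1 {
			break // no point sleeping after the final attempt
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(backoff(attempt, base, max)):
		}
	}
	return fmt.Errorf("retries exhausted: %w", lastErr)
}

// demo simulates a read where replica-1 is down and replica-2 keeps failing.
func demo() (tried []string, err error) {
	nodes := []*node{
		{name: "replica-1", healthy: false},
		{name: "replica-2", healthy: true},
		{name: "master", healthy: true},
	}
	err = do(context.Background(), nodes, 3, time.Millisecond, 10*time.Millisecond, func(n *node) error {
		tried = append(tried, n.name)
		if n.name == "replica-2" {
			return errTransient
		}
		return nil
	})
	return tried, err
}

func main() {
	tried, err := demo()
	fmt.Println(tried, err) // [replica-2 master] <nil>
}
```

In this sketch the unhealthy replica is skipped, the failing replica is tried once, and the master serves the call within the same attempt, so no backoff sleep is needed.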

## Common tasks

### Add a new node type (e.g., analytics replica)

1. Add a field to the `Cluster` struct (e.g., `analytics []*Node`)
2. Add the corresponding config to the `Config` struct
3. Connect the nodes in `NewCluster` and add them to the `all` slice for health checking
4. Add routing methods (e.g., `AnalyticsQuery`)
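The four steps above might look roughly like this. Everything here is a trimmed-down stand-in: the real `Cluster`, `Config`, and `NewCluster` have more fields and a different signature, and falling back to the master for analytics reads is an assumption borrowed from how read routing is described elsewhere in this guide.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Node, Config and Cluster are hypothetical, simplified stand-ins for the dbx types.
type Node struct {
	name    string
	healthy atomic.Bool
}

func newNode(name string) *Node {
	n := &Node{name: name}
	n.healthy.Store(true) // nodes start out healthy
	return n
}

func (n *Node) IsHealthy() bool { return n.healthy.Load() }

type Config struct {
	Master    string
	Replicas  []string
	Analytics []string // step 2: config for the new node type
}

type Cluster struct {
	master    *Node
	replicas  []*Node
	analytics []*Node // step 1: new field
	all       []*Node // every node; the health checker iterates over this
}

// NewCluster builds the nodes and registers each one in all (step 3).
func NewCluster(cfg Config) *Cluster {
	c := &Cluster{master: newNode(cfg.Master)}
	c.all = append(c.all, c.master)
	for _, name := range cfg.Replicas {
		n := newNode(name)
		c.replicas = append(c.replicas, n)
		c.all = append(c.all, n)
	}
	for _, name := range cfg.Analytics {
		n := newNode(name)
		c.analytics = append(c.analytics, n)
		c.all = append(c.all, n)
	}
	return c
}

// analyticsNodes lists candidates for an AnalyticsQuery-style method (step 4):
// analytics nodes first, master as the last fallback.
func (c *Cluster) analyticsNodes() []*Node {
	nodes := make([]*Node, 0, len(c.analytics)+1)
	nodes = append(nodes, c.analytics...)
	return append(nodes, c.master)
}

func main() {
	c := NewCluster(Config{Master: "master", Replicas: []string{"r1"}, Analytics: []string{"a1", "a2"}})
	for _, n := range c.analyticsNodes() {
		fmt.Println(n.name, n.IsHealthy())
	}
}
```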

### Customize retry logic

1. Provide `RetryConfig.RetryableErrors`, a custom `func(error) bool` classifier
2. Or modify `IsRetryable()` in `errors.go` to add new PG error codes
3. Adjust `MaxAttempts`, `BaseDelay`, `MaxDelay` in `RetryConfig`
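A custom classifier for step 1 could be sketched as follows. To stay dependency-free, `pgError` stands in for pgx's `*pgconn.PgError`, and the `RetryConfig` field types are assumed from their names; the chosen SQLSTATE codes are only an example policy.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// pgError is a hypothetical stand-in for *pgconn.PgError so the sketch has no dependencies.
type pgError struct{ Code string }

func (e *pgError) Error() string { return "SQLSTATE " + e.Code }

// RetryConfig mirrors the field names listed above; the field types are assumed.
type RetryConfig struct {
	MaxAttempts     int
	BaseDelay       time.Duration
	MaxDelay        time.Duration
	RetryableErrors func(error) bool
}

// retryOnContention treats serialization failures, deadlocks and lock
// timeouts as retryable and everything else as permanent.
func retryOnContention(err error) bool {
	var pgErr *pgError
	if !errors.As(err, &pgErr) {
		return false
	}
	switch pgErr.Code {
	case "40001", // serialization_failure
		"40P01", // deadlock_detected
		"55P03": // lock_not_available
		return true
	}
	return false
}

// wrap adds context the way calling code usually does; errors.As still finds the cause.
func wrap(err error) error { return fmt.Errorf("exec failed: %w", err) }

func main() {
	cfg := RetryConfig{
		MaxAttempts:     5,
		BaseDelay:       50 * time.Millisecond,
		MaxDelay:        2 * time.Second,
		RetryableErrors: retryOnContention,
	}
	fmt.Println(cfg.RetryableErrors(wrap(&pgError{Code: "40P01"}))) // true
	fmt.Println(cfg.RetryableErrors(wrap(&pgError{Code: "23505"}))) // false
}
```

Using `errors.As` rather than a type assertion keeps the classifier working when callers wrap the driver error.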

### Add a metrics hook

1. Add a new callback field to the `MetricsHook` struct in `dbx.go`
2. Call it at the appropriate point (nil-check the hook and the field)
3. See existing hooks in `cluster.go` (queryStart/queryEnd) and `health.go` (OnNodeDown/OnNodeUp)
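The double nil-check from step 2 is the part that is easy to get wrong, so here is a small sketch of it. `OnSlowQuery` is a hypothetical new callback, and the signatures of `OnNodeDown`/`OnNodeUp` and the shape of `queryEnd` are assumed for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// MetricsHook is a hypothetical stand-in; the callback signatures are assumed.
type MetricsHook struct {
	OnNodeDown  func(node string, err error)
	OnNodeUp    func(node string)
	OnSlowQuery func(node, sql string, d time.Duration) // step 1: the new callback
}

type Cluster struct {
	metrics            *MetricsHook
	slowQueryThreshold time.Duration
}

// queryEnd fires the new callback (step 2). Both the hook pointer and the
// individual field may be nil, so each is checked before the call.
func (c *Cluster) queryEnd(node, sql string, d time.Duration) {
	if c.slowQueryThreshold <= 0 || d < c.slowQueryThreshold {
		return
	}
	if c.metrics != nil && c.metrics.OnSlowQuery != nil {
		c.metrics.OnSlowQuery(node, sql, d)
	}
}

// countingHook returns a hook that only counts slow queries.
func countingHook(calls *int) *MetricsHook {
	return &MetricsHook{OnSlowQuery: func(string, string, time.Duration) { *calls++ }}
}

func main() {
	var calls int
	c := &Cluster{metrics: countingHook(&calls), slowQueryThreshold: 100 * time.Millisecond}
	c.queryEnd("master", "SELECT 1", 5*time.Millisecond)    // fast: ignored
	c.queryEnd("master", "SELECT pg_sleep(1)", time.Second) // slow: counted

	// A cluster without any hook must not panic.
	bare := &Cluster{slowQueryThreshold: time.Millisecond}
	bare.queryEnd("master", "SELECT 1", time.Second)

	fmt.Println(calls) // 1
}
```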

### Add a new balancer strategy

1. Implement the `Balancer` interface: `Next(nodes []*Node) *Node`
2. Return `nil` if no suitable node is available
3. Check `node.IsHealthy()` to skip nodes that are down

## Gotchas

- **Close() is required**: `Cluster.Close()` stops the health checker goroutine and closes all pools. Leaking a Cluster leaks goroutines and connections.
- **RunTx panic safety**: `runTx` uses `defer` with `recover()`: it rolls back on panic, then re-panics. If you recover a panic outside `RunTx`, the transaction has been rolled back, not committed.
- **Context-based Querier injection**: `ExtractQuerier` returns the fallback if no Querier is in the context. Always pass the cluster/pool as the fallback so code works both inside and outside transactions.
- **Health checker goroutine**: Starts immediately in `NewCluster`. It uses `time.NewTicker`, so the first check happens after one interval, not immediately. Nodes start as healthy (`healthy.Store(true)` in `newNode`).
- **readNodes ordering**: `readNodes()` returns `[replicas..., master]`, so the retrier tries replicas first and the master is the last fallback.
- **errRow for closed cluster**: When the cluster is closed, `QueryRow`/`ReadQueryRow` return `errRow{err: ErrClusterClosed}`, so the error surfaces only on `Scan()`.
- **No SQL parsing**: Routing is purely method-based. If you call `Exec` with a SELECT, it still goes to the master.
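The panic-safety behaviour of `RunTx` described in the list can be sketched with a context-free stand-in. The `tx` interface, `fakeTx`, and `catchPanic` below are hypothetical; the real code works with pgx transactions whose `Commit` and `Rollback` take a context.

```go
package main

import (
	"errors"
	"fmt"
)

// tx is a hypothetical, context-free stand-in for a pgx transaction.
type tx interface {
	Commit() error
	Rollback() error
}

// fakeTx records which of the two calls happened.
type fakeTx struct{ committed, rolledBack bool }

func (t *fakeTx) Commit() error   { t.committed = true; return nil }
func (t *fakeTx) Rollback() error { t.rolledBack = true; return nil }

var errFailed = errors.New("constraint violation")

// runTx commits when fn succeeds, rolls back when fn returns an error, and
// rolls back and then re-panics when fn panics.
func runTx(t tx, fn func() error) (err error) {
	defer func() {
		if p := recover(); p != nil {
			_ = t.Rollback()
			panic(p) // hand the panic back to the caller
		}
		if err != nil {
			_ = t.Rollback()
			return
		}
		err = t.Commit()
	}()
	return fn()
}

// catchPanic runs f and returns the recovered panic value, or nil if f did not panic.
func catchPanic(f func()) (p any) {
	defer func() { p = recover() }()
	f()
	return nil
}

func main() {
	ok, failed, panicked := &fakeTx{}, &fakeTx{}, &fakeTx{}
	_ = runTx(ok, func() error { return nil })
	_ = runTx(failed, func() error { return errFailed })
	p := catchPanic(func() { _ = runTx(panicked, func() error { panic("boom") }) })
	fmt.Println(ok.committed, failed.rolledBack, panicked.rolledBack, panicked.committed, p)
	// true true true false boom
}
```

The last line shows the gotcha: by the time the outer code recovers `boom`, the transaction has already been rolled back.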
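The context-based Querier injection mentioned among the gotchas follows a common Go pattern, sketched here under assumptions: the one-method `Querier`, the `named` type, `userRepo`, and the exact signatures of `InjectQuerier` and `ExtractQuerier` are illustrative and may differ from the real ones in `tx.go`.

```go
package main

import (
	"context"
	"fmt"
)

// Querier is a one-method, hypothetical stand-in for dbx.Querier.
type Querier interface {
	Exec(sql string) string
}

// named is a fake Querier that reports which handle ran the statement.
type named string

func (n named) Exec(sql string) string { return string(n) + ": " + sql }

// ctxKey is unexported so no other package can collide with it.
type ctxKey struct{}

// InjectQuerier stores q in the context (assumed signature).
func InjectQuerier(ctx context.Context, q Querier) context.Context {
	return context.WithValue(ctx, ctxKey{}, q)
}

// ExtractQuerier returns the injected Querier, or fallback when none is present.
func ExtractQuerier(ctx context.Context, fallback Querier) Querier {
	if q, ok := ctx.Value(ctxKey{}).(Querier); ok {
		return q
	}
	return fallback
}

// userRepo always passes its own handle as the fallback.
type userRepo struct{ db Querier }

func (r userRepo) Touch(ctx context.Context) string {
	return ExtractQuerier(ctx, r.db).Exec("UPDATE users SET seen_at = now()")
}

// demo calls the same repository method outside and inside a "transaction".
func demo() (outside, inside string) {
	repo := userRepo{db: named("cluster")}
	ctx := context.Background()
	outside = repo.Touch(ctx)
	inside = repo.Touch(InjectQuerier(ctx, named("tx")))
	return outside, inside
}

func main() {
	outside, inside := demo()
	fmt.Println(outside) // cluster: UPDATE users SET seen_at = now()
	fmt.Println(inside)  // tx: UPDATE users SET seen_at = now()
}
```

Because the repository supplies the cluster as the fallback, the same method runs unchanged whether or not a transaction is active.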
## Commands

```bash
go build ./...                   # compile
go test ./...                    # all tests
go test -race ./...              # tests with race detector
go test -v -run TestName ./...   # single test
go vet ./...                     # static analysis
```

## Conventions

- **Struct-based Config** with a `defaults()` method that fills in zero-valued fields
- **Functional options** (`Option func(*Config)`) applied via `ApplyOptions` (primarily in dbxtest)
- **stdlib-only testing**: no testify, no gomock
- **Thread safety**: `atomic.Bool` for `Node.healthy` and `Cluster.closed`
- **dbxtest helpers**: `NewTestCluster` skips the test when the DB is unreachable and auto-closes via `t.Cleanup`; `TestLogger` routes output to `testing.T`
- **Sentinel errors**: `ErrNoHealthyNode`, `ErrClusterClosed`, `ErrRetryExhausted`
- **retryError** uses multi-unwrap (`Unwrap() []error`) so both `ErrRetryExhausted` and the last error can be matched with `errors.Is`
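The multi-unwrap convention is easiest to see in a small example. This is a sketch that assumes Go 1.20 or newer (where `errors.Is` follows `Unwrap() []error`); the fields of `retryError`, the error messages, and `errConnRefused` are invented for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrRetryExhausted mirrors the sentinel named above; the message text is assumed.
var ErrRetryExhausted = errors.New("dbx: retry attempts exhausted")

// errConnRefused plays the role of the last underlying failure.
var errConnRefused = errors.New("connection refused")

// retryError is a hypothetical reconstruction: it carries the last error and
// exposes both it and the sentinel through multi-unwrap.
type retryError struct {
	attempts int
	last     error
}

func (e *retryError) Error() string {
	return fmt.Sprintf("%v after %d attempts: %v", ErrRetryExhausted, e.attempts, e.last)
}

// Unwrap returns several errors, so errors.Is and errors.As inspect each of them.
func (e *retryError) Unwrap() []error { return []error{ErrRetryExhausted, e.last} }

// fail returns a retryError wrapped once more, as calling code often does.
func fail() error {
	return fmt.Errorf("query users: %w", &retryError{attempts: 3, last: errConnRefused})
}

// matches reports whether err matches the sentinel and the underlying cause.
func matches(err error) (exhausted, cause bool) {
	return errors.Is(err, ErrRetryExhausted), errors.Is(err, errConnRefused)
}

func main() {
	exhausted, cause := matches(fail())
	fmt.Println(exhausted, cause) // true true
}
```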
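The first two conventions (struct-based config with `defaults()` and functional options) combine as in the sketch below. The `Config` fields, the default of 3 attempts, `WithMaxAttempts`, and the `ApplyOptions` signature are assumptions for illustration; only the `Option func(*Config)` shape comes from this guide.

```go
package main

import (
	"fmt"
	"time"
)

// Config is a hypothetical two-field stand-in for dbx.Config.
type Config struct {
	SlowQueryThreshold time.Duration
	MaxAttempts        int
}

// defaults fills in zero-valued fields; the value 3 is an assumed default.
func (c *Config) defaults() {
	if c.MaxAttempts == 0 {
		c.MaxAttempts = 3
	}
}

// Option mutates a Config, matching the shape named in the conventions.
type Option func(*Config)

func WithSlowQueryThreshold(d time.Duration) Option {
	return func(c *Config) { c.SlowQueryThreshold = d }
}

// WithMaxAttempts is a made-up option used only in this sketch.
func WithMaxAttempts(n int) Option {
	return func(c *Config) { c.MaxAttempts = n }
}

// ApplyOptions applies opts in order and then fills remaining zero values (assumed signature).
func ApplyOptions(cfg *Config, opts ...Option) {
	for _, opt := range opts {
		opt(cfg)
	}
	cfg.defaults()
}

func main() {
	var cfg Config
	ApplyOptions(&cfg, WithSlowQueryThreshold(200*time.Millisecond))
	fmt.Println(cfg.SlowQueryThreshold, cfg.MaxAttempts) // 200ms 3
}
```

Running `defaults()` after the options means an explicit option always wins, while untouched fields still get a usable value.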