Add CI workflow, README, CLAUDE.md, AGENTS.md, and .cursorrules
All checks were successful
CI / test (push) Successful in 51s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
109
AGENTS.md
Normal file
@@ -0,0 +1,109 @@
# AGENTS.md — dbx

Universal guide for AI coding agents working with this codebase.

## Overview

`git.codelab.vc/pkg/dbx` is a Go PostgreSQL cluster library built on **pgx/v5**. It provides master/replica routing, automatic retries, load balancing, background health checking, panic-safe transactions, and context-based Querier injection.

## Package map

```
dbx/                  Root — Cluster, Node, Balancer, retry, health, errors, tx, config
├── dbx.go            Interfaces: Querier, DB, Logger, MetricsHook
├── cluster.go        Cluster — routing, write/read operations
├── node.go           Node — pgxpool.Pool wrapper with health state
├── balancer.go       Balancer interface + RoundRobinBalancer
├── retry.go          retrier — exponential backoff with jitter and node fallback
├── health.go         healthChecker — background goroutine pinging nodes
├── tx.go             RunTx, RunTxOptions, InjectQuerier, ExtractQuerier
├── errors.go         Error classification (IsRetryable, IsConnectionError, etc.)
├── config.go         Config, NodeConfig, PoolConfig, RetryConfig, HealthCheckConfig
├── options.go        Functional options (WithLogger, WithMetrics, WithRetry, etc.)
└── dbxtest/
    └── dbxtest.go    Test helpers: NewTestCluster, TestLogger
```

## Routing architecture

```
                    ┌──────────────┐
                    │   Cluster    │
                    └──────┬───────┘
                           │
           ┌───────────────┴───────────────┐
           │                               │
       Write ops                       Read ops
 Exec, Query, QueryRow          ReadQuery, ReadQueryRow
 Begin, BeginTx, RunTx
 CopyFrom, SendBatch
           │                               │
           ▼                               ▼
      ┌──────────┐            ┌────────────────────────┐
      │  Master  │            │  Balancer → Replicas   │
      └──────────┘            │  fallback → Master     │
                              └────────────────────────┘

Retry loop (retrier.do):
  For each attempt (up to MaxAttempts):
    For each node in [target nodes]:
      if healthy → execute → on retryable error → continue
    Backoff (exponential + jitter)
```

## Common tasks

### Add a new node type (e.g., analytics replica)

1. Add a field to `Cluster` struct (e.g., `analytics []*Node`)
2. Add corresponding config to `Config` struct
3. Connect nodes in `NewCluster`, add to `all` slice for health checking
4. Add routing methods (e.g., `AnalyticsQuery`)

### Customize retry logic

1. Provide `RetryConfig.RetryableErrors` — custom `func(error) bool` classifier
2. Or modify `IsRetryable()` in `errors.go` to add new PG error codes
3. Adjust `MaxAttempts`, `BaseDelay`, `MaxDelay` in `RetryConfig`

### Add a metrics hook

1. Add a new callback field to `MetricsHook` struct in `dbx.go`
2. Call it at the appropriate point (nil-check the hook and the field)
3. See existing hooks in `cluster.go` (queryStart/queryEnd) and `health.go` (OnNodeDown/OnNodeUp)

### Add a new balancer strategy

1. Implement the `Balancer` interface: `Next(nodes []*Node) *Node`
2. Must return `nil` if no suitable node is available
3. Must check `node.IsHealthy()` to skip down nodes
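
As a sketch of these three rules, here is a random-choice balancer. The `Balancer` signature and `IsHealthy()` come from the list above; the `Node` struct, `newNode` helper, and `RandomBalancer` name are hypothetical stand-ins so the example compiles on its own.

```go
package main

import (
	"fmt"
	"math/rand"
	"sync/atomic"
)

// Node is a hypothetical stand-in exposing only the health flag.
type Node struct {
	name    string
	healthy atomic.Bool
}

func (n *Node) IsHealthy() bool { return n.healthy.Load() }

func newNode(name string, healthy bool) *Node {
	n := &Node{name: name}
	n.healthy.Store(healthy)
	return n
}

// Balancer mirrors the interface shape described above.
type Balancer interface {
	Next(nodes []*Node) *Node
}

// RandomBalancer picks uniformly among the currently healthy nodes.
type RandomBalancer struct{}

func (RandomBalancer) Next(nodes []*Node) *Node {
	healthy := make([]*Node, 0, len(nodes))
	for _, n := range nodes {
		if n.IsHealthy() { // rule 3: skip down nodes
			healthy = append(healthy, n)
		}
	}
	if len(healthy) == 0 {
		return nil // rule 2: nothing suitable
	}
	return healthy[rand.Intn(len(healthy))]
}

func main() {
	var b Balancer = RandomBalancer{}
	nodes := []*Node{newNode("replica-1", false), newNode("replica-2", true)}
	fmt.Println(b.Next(nodes).name)

	nodes[1].healthy.Store(false)
	fmt.Println(b.Next(nodes) == nil)
}
```

This version allocates a slice on every call. For a hot path, a strategy that scans from a random starting index would avoid the allocation at the cost of slightly less uniform selection.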

## Gotchas

- **Close() is required**: `Cluster.Close()` stops the health checker goroutine and closes all pools. Leaking a Cluster leaks goroutines and connections
- **RunTx panic safety**: `runTx` uses `defer` with `recover()` — it rolls back on panic, then re-panics. Do not catch panics outside `RunTx` expecting the tx to be committed
- **Context-based Querier injection**: `ExtractQuerier` returns the fallback if no Querier is in context. Always pass the cluster/pool as fallback so code works both inside and outside transactions
- **Health checker goroutine**: Starts immediately in `NewCluster`. Uses `time.NewTicker` — the first check happens after one interval, not immediately. Nodes start as healthy (`healthy.Store(true)` in `newNode`)
- **readNodes ordering**: `readNodes()` returns `[replicas..., master]` — the retrier tries replicas first, master is the last fallback
- **errRow for closed cluster**: When cluster is closed, `QueryRow`/`ReadQueryRow` return `errRow{err: ErrClusterClosed}` — the error surfaces on `Scan()`
- **No SQL parsing**: Routing is purely method-based. If you call `Exec` with a SELECT, it still goes to master

## Commands

```bash
go build ./...                   # compile
go test ./...                    # all tests
go test -race ./...              # tests with race detector
go test -v -run TestName ./...   # single test
go vet ./...                     # static analysis
```

## Conventions

- **Struct-based Config** with `defaults()` method for zero-value defaults
- **Functional options** (`Option func(*Config)`) used via `ApplyOptions` (primarily in dbxtest)
- **stdlib only** testing — no testify, no gomock
- **Thread safety** — `atomic.Bool` for `Node.healthy` and `Cluster.closed`
- **dbxtest helpers** — `NewTestCluster` skips on unreachable DB, auto-closes via `t.Cleanup`; `TestLogger` routes to `testing.T`
- **Sentinel errors** — `ErrNoHealthyNode`, `ErrClusterClosed`, `ErrRetryExhausted`
- **retryError** uses multi-unwrap (`Unwrap() []error`) so both `ErrRetryExhausted` and the last error can be matched with `errors.Is`