Add CI workflow, README, CLAUDE.md, AGENTS.md, and .cursorrules
CI / test (push): successful in 51s

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 00:01:27 +03:00
parent 62df3a2eb3
commit 7d25e1b73e
5 changed files with 409 additions and 0 deletions

AGENTS.md — new file, 109 lines

# AGENTS.md — dbx
Universal guide for AI coding agents working with this codebase.
## Overview
`git.codelab.vc/pkg/dbx` is a Go PostgreSQL cluster library built on **pgx/v5**. It provides master/replica routing, automatic retries, load balancing, background health checking, panic-safe transactions, and context-based Querier injection.
## Package map
```
dbx/ Root — Cluster, Node, Balancer, retry, health, errors, tx, config
├── dbx.go Interfaces: Querier, DB, Logger, MetricsHook
├── cluster.go Cluster — routing, write/read operations
├── node.go Node — pgxpool.Pool wrapper with health state
├── balancer.go Balancer interface + RoundRobinBalancer
├── retry.go retrier — exponential backoff with jitter and node fallback
├── health.go healthChecker — background goroutine pinging nodes
├── tx.go RunTx, RunTxOptions, InjectQuerier, ExtractQuerier
├── errors.go Error classification (IsRetryable, IsConnectionError, etc.)
├── config.go Config, NodeConfig, PoolConfig, RetryConfig, HealthCheckConfig
├── options.go Functional options (WithLogger, WithMetrics, WithRetry, etc.)
└── dbxtest/
└── dbxtest.go Test helpers: NewTestCluster, TestLogger
```
## Routing architecture
```
                ┌──────────────┐
                │   Cluster    │
                └──────┬───────┘
        ┌──────────────┴──────────────┐
        │                             │
    Write ops                     Read ops
    Exec, Query, QueryRow         ReadQuery, ReadQueryRow
    Begin, BeginTx, RunTx
    CopyFrom, SendBatch
        │                             │
        ▼                             ▼
  ┌──────────┐            ┌────────────────────────┐
  │  Master  │            │  Balancer → Replicas   │
  └──────────┘            │  fallback → Master     │
                          └────────────────────────┘

Retry loop (retrier.do):
  For each attempt (up to MaxAttempts):
    For each node in [target nodes]:
      if healthy → execute → on retryable error → continue
    Backoff (exponential + jitter)
```
## Common tasks
### Add a new node type (e.g., analytics replica)
1. Add a field to `Cluster` struct (e.g., `analytics []*Node`)
2. Add corresponding config to `Config` struct
3. Connect nodes in `NewCluster`, add to `all` slice for health checking
4. Add routing methods (e.g., `AnalyticsQuery`)
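The shape of those four steps, with stand-in types (the real `Node` wraps `*pgxpool.Pool`, and `analytics`/`analyticsNodes` are hypothetical names for illustration):

```go
package main

import "fmt"

// Node is a stand-in for dbx's Node.
type Node struct{ name string }

type Cluster struct {
	master    *Node
	replicas  []*Node
	analytics []*Node // step 1: new node slice on Cluster
	all       []*Node // step 3: the health checker walks this slice
}

// step 4: routing method; like readNodes, the dedicated nodes come
// first and master is the last fallback.
func (c *Cluster) analyticsNodes() []*Node {
	out := append([]*Node{}, c.analytics...)
	return append(out, c.master)
}

func main() {
	c := &Cluster{
		master:    &Node{name: "master"},
		analytics: []*Node{{name: "analytics-1"}},
	}
	// step 3: include the new nodes in `all` so they get health-checked
	c.all = append(append([]*Node{c.master}, c.replicas...), c.analytics...)
	for _, n := range c.analyticsNodes() {
		fmt.Println(n.name) // analytics-1, then master
	}
}
```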
### Customize retry logic
1. Provide `RetryConfig.RetryableErrors` — custom `func(error) bool` classifier
2. Or modify `IsRetryable()` in `errors.go` to add new PG error codes
3. Adjust `MaxAttempts`, `BaseDelay`, `MaxDelay` in `RetryConfig`
### Add a metrics hook
1. Add a new callback field to `MetricsHook` struct in `dbx.go`
2. Call it at the appropriate point (nil-check the hook and the field)
3. See existing hooks in `cluster.go` (queryStart/queryEnd) and `health.go` (OnNodeDown/OnNodeUp)
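The double nil-check from step 2 looks like this. Field names (`OnQueryStart`/`OnQueryEnd`) and the `query` method are hypothetical; only the nil-check pattern is the point:

```go
package main

import (
	"fmt"
	"time"
)

// MetricsHook with illustrative callback fields.
type MetricsHook struct {
	OnQueryStart func(node string)
	OnQueryEnd   func(node string, d time.Duration, err error)
}

type Cluster struct{ metrics *MetricsHook }

func (c *Cluster) query(node string, fn func() error) error {
	// nil-check both the hook struct and the individual field,
	// so partially populated hooks (and no hook at all) are safe.
	if c.metrics != nil && c.metrics.OnQueryStart != nil {
		c.metrics.OnQueryStart(node)
	}
	start := time.Now()
	err := fn()
	if c.metrics != nil && c.metrics.OnQueryEnd != nil {
		c.metrics.OnQueryEnd(node, time.Since(start), err)
	}
	return err
}

func main() {
	var calls []string
	c := &Cluster{metrics: &MetricsHook{
		OnQueryStart: func(n string) { calls = append(calls, "start:" + n) },
		// OnQueryEnd left nil on purpose: it is safely skipped
	}}
	_ = c.query("master", func() error { return nil })
	fmt.Println(calls) // → [start:master]
}
```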
### Add a new balancer strategy
1. Implement the `Balancer` interface: `Next(nodes []*Node) *Node`
2. Must return `nil` if no suitable node is available
3. Must check `node.IsHealthy()` to skip down nodes
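A minimal strategy satisfying all three rules. `RandomBalancer` is not part of dbx; the `Node` here is a stand-in exposing `IsHealthy()` backed by `atomic.Bool` as the conventions section describes:

```go
package main

import (
	"fmt"
	"math/rand"
	"sync/atomic"
)

type Node struct {
	name    string
	healthy atomic.Bool
}

func (n *Node) IsHealthy() bool { return n.healthy.Load() }

// Balancer is the interface described above.
type Balancer interface {
	Next(nodes []*Node) *Node
}

// RandomBalancer picks a random healthy node.
type RandomBalancer struct{}

func (RandomBalancer) Next(nodes []*Node) *Node {
	healthy := make([]*Node, 0, len(nodes))
	for _, n := range nodes {
		if n.IsHealthy() { // rule 3: skip down nodes
			healthy = append(healthy, n)
		}
	}
	if len(healthy) == 0 {
		return nil // rule 2: no suitable node available
	}
	return healthy[rand.Intn(len(healthy))]
}

func main() {
	down, up := &Node{name: "replica-1"}, &Node{name: "replica-2"}
	up.healthy.Store(true)
	var b Balancer = RandomBalancer{}
	fmt.Println(b.Next([]*Node{down, up}).name) // → replica-2: the only healthy node
}
```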
## Gotchas
- **Close() is required**: `Cluster.Close()` stops the health checker goroutine and closes all pools. Leaking a Cluster leaks goroutines and connections
- **RunTx panic safety**: `runTx` uses `defer` with `recover()` — it rolls back on panic, then re-panics. Do not catch panics outside `RunTx` expecting the tx to be committed
- **Context-based Querier injection**: `ExtractQuerier` returns the fallback if no Querier is in context. Always pass the cluster/pool as fallback so code works both inside and outside transactions
- **Health checker goroutine**: Starts immediately in `NewCluster`. Uses `time.NewTicker` — the first check happens after one interval, not immediately. Nodes start as healthy (`healthy.Store(true)` in `newNode`)
- **readNodes ordering**: `readNodes()` returns `[replicas..., master]` — the retrier tries replicas first, master is the last fallback
- **errRow for closed cluster**: When cluster is closed, `QueryRow`/`ReadQueryRow` return `errRow{err: ErrClusterClosed}` — the error surfaces on `Scan()`
- **No SQL parsing**: Routing is purely method-based. If you call `Exec` with a SELECT, it still goes to master
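The `RunTx` panic-safety gotcha is worth seeing concretely. This is a sketch with a stand-in `tx` type (the real code works on pgx transactions), showing the rollback-then-re-panic pattern:

```go
package main

import "fmt"

// tx is a stand-in transaction recording what happened to it.
type tx struct{ rolledBack, committed bool }

func (t *tx) Rollback() { t.rolledBack = true }
func (t *tx) Commit()   { t.committed = true }

// runTx sketches the pattern: deferred recover rolls back on panic,
// then re-panics so the caller still sees the original panic value.
func runTx(t *tx, fn func(*tx) error) (err error) {
	defer func() {
		if p := recover(); p != nil {
			t.Rollback()
			panic(p) // re-panic after rollback
		}
	}()
	if err = fn(t); err != nil {
		t.Rollback()
		return err
	}
	t.Commit()
	return nil
}

func main() {
	t := &tx{}
	func() {
		defer func() { recover() }() // catch the re-panic for the demo only
		_ = runTx(t, func(*tx) error { panic("boom") })
	}()
	fmt.Println(t.rolledBack, t.committed) // → true false: rolled back, never committed
}
```

This is why recovering outside `RunTx` and expecting a committed transaction is wrong: the rollback has already happened by the time the panic reaches you.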
## Commands
```bash
go build ./... # compile
go test ./... # all tests
go test -race ./... # tests with race detector
go test -v -run TestName ./... # single test
go vet ./... # static analysis
```
## Conventions
- **Struct-based Config** with `defaults()` method for zero-value defaults
- **Functional options** (`Option func(*Config)`) used via `ApplyOptions` (primarily in dbxtest)
- **stdlib only** testing — no testify, no gomock
- **Thread safety** — `atomic.Bool` for `Node.healthy` and `Cluster.closed`
- **dbxtest helpers** — `NewTestCluster` skips on unreachable DB, auto-closes via `t.Cleanup`; `TestLogger` routes to `testing.T`
- **Sentinel errors** — `ErrNoHealthyNode`, `ErrClusterClosed`, `ErrRetryExhausted`
- **retryError** uses multi-unwrap (`Unwrap() []error`) so both `ErrRetryExhausted` and the last error can be matched with `errors.Is`