The chat usage badge was hardcoded to ~8K-token Ollama defaults
(`CONTEXT_BUDGET_CHARS = 24_000`), which made every Fireworks session
look 150%+ full after a few hops even though models like Kimi-K2 carry
256K context windows. Now the budget is selected per-provider:
- Ollama → 24K chars (~8K tok), unchanged
- Fireworks → 384K chars (~128K tok), a safe floor for the smallest
Fireworks chat models (qwen2.5-coder 32K) while not stuffing the bar
for the larger ones
Auto-compact thresholds and the % badge both read this back from the
backend, so they now scale correctly when the user switches providers.
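A minimal sketch of the per-provider selection described above, using only the two budgets quoted in this message. The function names `context_budget_chars` and `usage_percent` are hypothetical, and the enum is reduced to the two providers shown in the UI.

```rust
/// Providers the chat backend can dispatch to (reduced to the two shown in the UI).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AiProvider {
    Ollama,
    Fireworks,
}

/// Character budget used by the usage badge and the auto-compact threshold.
/// The values are the ones quoted in the commit text; the name is hypothetical.
pub fn context_budget_chars(provider: AiProvider) -> usize {
    match provider {
        AiProvider::Ollama => 24_000,     // ~8K tokens
        AiProvider::Fireworks => 384_000, // ~128K tokens
    }
}

/// Usage as a whole-number percentage of the provider's budget.
pub fn usage_percent(used_chars: usize, provider: AiProvider) -> usize {
    used_chars * 100 / context_budget_chars(provider)
}
```

With these numbers, 36,000 characters of history read as 150% on Ollama but only 9% on Fireworks, which is the mismatch the old hardcoded constant produced.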
Routes chat-completions through a managed OpenAI-compatible inference
endpoint as an alternative to local Ollama, useful when the agent needs
fast multi-hop reasoning that local hardware can't sustain.
- backend: rename `call_ollama_chat_messages` → `call_chat_messages`,
dispatch by provider; add `call_fireworks` branch (Bearer auth,
`response_format: json_object` mapped from internal `format="json"`)
and `list_fireworks_models` Tauri command
- settings: extend `AiProvider` enum + `AiSettings.fireworks_api_key`
(serde-default for legacy config compat); Fireworks base URL hardcoded
- UI: provider selector in both popover and AppSettingsSheet (only
ollama+fireworks shown; legacy openai/anthropic kept for serde-compat
but normalized to ollama in UI); password input + dynamic model list
for Fireworks; switching provider clears stale model selection
- 4 unit tests: serde round-trip, legacy settings deserialization,
Fireworks chat-completions parsing, models-list parsing
Adds inline data visualisation to the chat agent. After a successful
run_query, the agent can call make_chart(chart_type, x, y, [group,
title, orientation]) and the result is rendered as a bar / line / area
/ pie chart inline in the chat thread, sourced from the previous query
result.
Backend (commands/chat.rs, models/chat.rs)
- New ChartConfig{chart_type, x, y, group?, title?, orientation?} model.
- New AgentAction::MakeChart{config} variant. Parser accepts both
`chart_type` and the alternative `type` field name (qwen3 sometimes
emits the latter). Validates chart_type is one of bar/line/area/pie.
- last_successful_query_result helper finds the most recent successful
run_query in the working thread.
- MakeChart dispatcher: validates that x/y/group columns exist in the
attached query result, emits a tool_result with the same QueryResult
in `result` and the chart_config JSON in `text`. Mismatches surface
as a clear error ("y column `name` is not in the last result.
Available: company_name, legal_name, …").
- build_history compression unchanged: make_chart's tool_result text
field (the small chart_config JSON) is included in LLM history; the
large QueryResult.rows are NOT, since the per-tool branch only emits
text for non-run_query tools.
- System prompt: documents make_chart with concrete usage hints
(top-N → bar, time series → line/area, proportions → pie; skip for
≤2 or >500 rows). 7 new parser/dispatcher tests.
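The column check in the MakeChart dispatcher can be sketched roughly as follows. This is an illustration only: `ChartConfig` is trimmed to the fields that matter for validation, `validate_chart` is a hypothetical name, and the column list is passed in as plain strings rather than taken from a QueryResult.

```rust
/// Trimmed stand-in for the ChartConfig model (title/orientation omitted).
pub struct ChartConfig {
    pub chart_type: String,
    pub x: String,
    pub y: String,
    pub group: Option<String>,
}

const CHART_TYPES: [&str; 4] = ["bar", "line", "area", "pie"];

/// Checks the chart type and that every referenced column exists in the
/// last query result. `columns` holds the result's column names.
pub fn validate_chart(config: &ChartConfig, columns: &[&str]) -> Result<(), String> {
    if !CHART_TYPES.contains(&config.chart_type.as_str()) {
        return Err(format!("unknown chart_type `{}`", config.chart_type));
    }
    let mut roles: Vec<(&str, &str)> =
        vec![("x", config.x.as_str()), ("y", config.y.as_str())];
    if let Some(group) = &config.group {
        roles.push(("group", group.as_str()));
    }
    for (role, name) in roles {
        if !columns.contains(&name) {
            // Same shape as the error quoted in the commit text.
            return Err(format!(
                "{role} column `{name}` is not in the last result. Available: {}",
                columns.join(", ")
            ));
        }
    }
    Ok(())
}
```

Listing the available columns in the error gives the model the exact names to retry with, instead of another guess.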
Frontend (src/components/chat/)
- recharts ^3.8 added.
- New ChartPreview component renders bar (vertical+horizontal), line,
area, pie. Supports grouped series via the `group` config field by
pivoting rows into a wide format. Y values are coerced to numbers
(strings are parsed, nulls → 0). Rendering is capped at 500 points so
huge results stay responsive.
- ChatMessageView routes tool=="make_chart" tool_result through a new
ChartToolResult that parses the config JSON from the message text
and feeds the embedded QueryResult into ChartPreview.
- New labels/icons (BarChart3) and preview-extraction for make_chart
in tool-call collapsed headers (`bar: carrier_name → trip_count`).
Verification: cargo test --lib 77 pass (+7), tsc clean, vitest 20
pass.
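The grouped-series pivot and the Y coercion in ChartPreview live in the frontend; the sketch below only illustrates the data transformation, written in Rust for consistency with the backend. `coerce_y` and `pivot_wide` are hypothetical names, rows are simplified to `(x, group, y)` tuples, and how the real component picks its 500 points is not specified here (this version simply takes the first ones).

```rust
use std::collections::BTreeMap;

/// Parse a raw cell into a number; missing or unparseable values become 0.
pub fn coerce_y(raw: Option<&str>) -> f64 {
    raw.and_then(|s| s.trim().parse::<f64>().ok()).unwrap_or(0.0)
}

/// Pivot long rows (x, group, y) into one wide row per x value:
/// x -> { group -> y }. Only the first `max_points` rows are considered.
pub fn pivot_wide(
    rows: &[(&str, &str, Option<&str>)],
    max_points: usize,
) -> BTreeMap<String, BTreeMap<String, f64>> {
    let mut wide: BTreeMap<String, BTreeMap<String, f64>> = BTreeMap::new();
    for (x, group, y) in rows.iter().take(max_points) {
        wide.entry(x.to_string())
            .or_default()
            .insert(group.to_string(), coerce_y(*y));
    }
    wide
}
```

Each inner map corresponds to one chart row with one numeric field per group, which is the wide shape a multi-series bar or line chart consumes.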
The previous symptom: the agent succeeded on its 8th run_query (got 30
rows), but the loop ended without a `final` action because that was the
last allowed hop. Result: "Stopped after 8 tool calls", and the data was
wasted. Also, the agent kept assuming `legal_entities.name` existed even
after get_columns showed it didn't.
Backend (commands/chat.rs)
- MAX_HOPS 8 -> 10. With list_databases / list_tables / get_columns /
switch_database / run_query / remember / save_query / find_queries
available, complex investigations need a bit more headroom.
- New force_final_synthesis: when the loop falls through MAX_HOPS,
one extra LLM call is made WITHOUT the JSON action protocol,
asking the model to write a plain-text answer based on whatever
data was already collected. This rescues cases where the agent
succeeded on the last hop but had no budget for a final. Output
goes through clean_summary so any stray JSON or fences are stripped.
- Stronger RULES in system prompt:
* Explicit ban on guessing column names: "After get_columns, your
next run_query must use ONLY column names that appear verbatim
in that output."
* Concrete example of how to read PG's "column le.name does not
exist" — the alias `le` tells you which table is missing it.
* Mention the new hop budget (10) so the model spends it
deliberately.
Verification: cargo test --lib 70 pass, tsc clean.
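The fall-through behaviour can be sketched as below. This is a simplified model, not the actual loop: `Step` and `run_agent` are hypothetical, and the LLM calls are replaced by closures so the control flow is visible in isolation.

```rust
pub const MAX_HOPS: usize = 10;

/// What one hop of the agent loop produced (simplified).
pub enum Step {
    /// A tool ran; keep looping.
    ToolCall,
    /// The model produced its final answer.
    Final(String),
}

/// Runs up to MAX_HOPS steps. If no step yields a final answer, makes one
/// extra plain-text call so the data collected so far is not wasted.
pub fn run_agent<S, F>(mut step: S, force_final_synthesis: F) -> String
where
    S: FnMut(usize) -> Step,
    F: FnOnce() -> String,
{
    for hop in 0..MAX_HOPS {
        if let Step::Final(text) = step(hop) {
            return text;
        }
    }
    // Budget exhausted without a final: rescue with a synthesis call.
    force_final_synthesis()
}
```

The synthesis closure runs only on the exhausted-budget path, so a thread that finishes normally costs no extra LLM call.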
The previous loop burned all 8 hops re-running the same broken query
("operator does not exist: character varying = uuid") because (a) the
agent never saw PostgreSQL's HINT — only the bare error message — and
(b) the prompt's "retry once" rule was advisory, not enforced.
Backend (commands/chat.rs)
- New format_db_error helper. When the error is sqlx::Error::Database
with a PostgreSQL backend, downcast to PgDatabaseError and append
DETAIL and HINT lines. Common PG hints are exactly the spelled-out
fix the agent needs ("You might need to add explicit type casts").
- New last_run_query_error helper to fish the most recent failing SQL
text out of working history for the give-up message.
- Hard server-side guard: track consecutive_query_errors. After 2 or
more consecutive run_query failures, force-emit a `final` message
that quotes the last error and suggests next steps (cast hints,
open the table in sidebar, switch to Advanced mode). The model
cannot loop past this regardless of how many hops remain.
- Counter resets to 0 when the model takes any non-RunQuery action
(get_columns, list_tables, etc.) — investigation buys a fresh
error budget.
- Stronger prompt RULES section: explicitly walks through three of
the most common PG error classes ("operator does not exist",
"column does not exist", "relation does not exist") and the
matching fixes. Tells the model the harness force-stops after 2
consecutive failures.
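A rough sketch of the consecutive-failure guard, under two assumptions: the type and method names (`QueryErrorGuard`, `record`, `Outcome`) are hypothetical, and a successful run_query is assumed to reset the counter as well, which the text above implies but does not state.

```rust
/// Outcome of one hop, reduced to what the guard cares about.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    QueryOk,
    QueryFailed,
    /// Any non-RunQuery action: get_columns, list_tables, ...
    OtherAction,
}

#[derive(Default)]
pub struct QueryErrorGuard {
    consecutive_query_errors: u32,
}

impl QueryErrorGuard {
    /// Records a hop outcome and returns true when the loop must stop
    /// and force-emit a `final` message.
    pub fn record(&mut self, outcome: &Outcome) -> bool {
        match outcome {
            Outcome::QueryFailed => self.consecutive_query_errors += 1,
            // Investigation buys a fresh error budget. Resetting on a
            // successful query too is an assumption of this sketch.
            Outcome::QueryOk | Outcome::OtherAction => self.consecutive_query_errors = 0,
        }
        self.consecutive_query_errors >= 2
    }
}
```

Because the check lives in the harness rather than the prompt, the model cannot talk its way past it: fail, investigate, fail is allowed, while fail, fail stops the loop.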
Tests (4 new): format_db_error fallback, last_run_query_error finds
most recent / handles empty / handles no-errors thread.
Verification: cargo test --lib 70 pass (+4), tsc clean, vitest 20
pass.
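The string assembly inside format_db_error might look like the sketch below. The real helper downcasts the sqlx error first; here the message, DETAIL and HINT are passed in directly, and the exact line prefixes are an assumption.

```rust
/// Appends PostgreSQL's DETAIL and HINT (when present) to the bare error
/// message, so the agent sees the suggested fix and not just the symptom.
pub fn format_db_error(message: &str, detail: Option<&str>, hint: Option<&str>) -> String {
    let mut out = message.to_string();
    if let Some(detail) = detail {
        out.push_str("\nDETAIL: ");
        out.push_str(detail);
    }
    if let Some(hint) = hint {
        out.push_str("\nHINT: ");
        out.push_str(hint);
    }
    out
}
```

When neither field is present the message passes through unchanged, which matches the "fallback" case named in the tests above.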
INTERVAL handling
- pg_value_to_json now decodes PG INTERVAL via PgInterval and renders
it psql-style: `1 year 2 mons 3 days 04:05:06`. Previously
AVG(timestamp - timestamp) and similar interval-returning queries
showed `<unsupported type: INTERVAL>` in chat results.
- 7 unit tests covering zero, days-only, mixed, negative, microsecond
fraction, and the singular/plural unit rules.
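A simplified sketch of the psql-style rendering, taking the three components that sqlx's PgInterval carries (months, days, microseconds) as plain integers. `format_interval` is a hypothetical name, and this version does not reproduce every PostgreSQL sign rule (for example the explicit `+` that PostgreSQL prints on a positive field following a negative one).

```rust
/// "1 year", "2 years", "-1 days": plural unless the value is exactly 1.
fn unit(value: i64, singular: &str) -> String {
    if value == 1 {
        format!("{value} {singular}")
    } else {
        format!("{value} {singular}s")
    }
}

/// Renders an interval roughly the way psql does, e.g.
/// `1 year 2 mons 3 days 04:05:06`.
pub fn format_interval(months: i32, days: i32, microseconds: i64) -> String {
    let mut parts: Vec<String> = Vec::new();
    let (years, mons) = (i64::from(months / 12), i64::from(months % 12));
    if years != 0 {
        parts.push(unit(years, "year"));
    }
    if mons != 0 {
        parts.push(unit(mons, "mon"));
    }
    if days != 0 {
        parts.push(unit(i64::from(days), "day"));
    }
    // The time part is printed when non-zero, or when nothing else was.
    if microseconds != 0 || parts.is_empty() {
        let sign = if microseconds < 0 { "-" } else { "" };
        let abs = microseconds.unsigned_abs();
        let (secs, frac) = (abs / 1_000_000, abs % 1_000_000);
        let mut time = format!("{sign}{:02}:{:02}:{:02}", secs / 3600, secs / 60 % 60, secs % 60);
        if frac != 0 {
            // Fractional seconds without trailing zeros: 500000 -> ".5"
            let digits = format!("{frac:06}");
            time.push('.');
            time.push_str(digits.trim_end_matches('0'));
        }
        parts.push(time);
    }
    parts.join(" ")
}
```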
Compact reliability
- Sharper system prompt: explicitly instructs plain text starting with
`-`, no JSON, no fences, no field names. qwen3-coder is heavily
trained on the agent JSON protocol and was sometimes returning
`{"action":"final","text":"..."}` even for the compact prompt.
- New clean_summary helper strips ``` fences (with or without lang
identifier) and extracts the underlying string from a JSON envelope
if the model still wraps the answer (looks for text/summary/content/
answer/output keys). 6 unit tests.
- Frontend useChat.compact: success/no-op/error toasts via sonner so
the user sees what happened. "Nothing to compact" appears when there
is no older history beyond the last user turn (previously silent).
Verification: cargo test --lib 66 pass (+13), tsc clean, vitest 20
pass.
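The fence-stripping half of clean_summary can be sketched as follows; the JSON-envelope extraction is left out because it needs a JSON parser. `strip_fences` is a hypothetical name, and the fence marker is built with `repeat` only so that this example does not contain a literal code fence.

```rust
/// Removes a surrounding code fence (with or without a language
/// identifier) and returns the trimmed inner text.
pub fn strip_fences(raw: &str) -> &str {
    let fence = "`".repeat(3); // three backticks
    let trimmed = raw.trim();
    let rest = match trimmed.strip_prefix(fence.as_str()) {
        Some(rest) => rest,
        None => return trimmed, // not fenced: nothing to strip
    };
    // Drop the remainder of the opening line (an optional language tag).
    let body = match rest.split_once('\n') {
        Some((_lang, body)) => body,
        None => rest,
    };
    body.trim_end()
        .strip_suffix(fence.as_str())
        .unwrap_or(body)
        .trim()
}
```

A missing closing fence is tolerated: the opening line is still dropped and the rest is returned as-is.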
Adds visibility into how much of the model context window the chat agent
is using and a way to free space when it fills up.
Backend
- New ContextUsage{used_chars, budget_chars} returned from chat_send
alongside messages (return type ChatTurnResult). Computed by running
build_history once at end of turn and counting char bytes — same data
path as the actual LLM call, so the count is exact for the chosen
budget unit.
- CONTEXT_BUDGET_CHARS = 24,000 (~6-8K tokens). Tuned for Ollama
defaults; can be exposed via AiSettings later.
- New chat_compact Tauri command. Splits the thread at the last user
turn, LLM-summarises everything before it (3-6 bullet points,
language-aware, < 800 chars), and returns a thread of
[Assistant("📋 Compacted N messages: …"), <last_user_turn?>]. The
recent user turn is preserved untouched so the agent can keep
answering it.
- render_thread_for_summary skips QueryResult.rows entirely so a single
large run_query can't blow the summariser's context.
- 3 new unit tests (last_user_turn_index, render skipping rows, empty
thread no-op).
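The split-and-summarise shape of chat_compact, sketched with the LLM call replaced by a closure. `Message` and `compact` are simplified stand-ins; only `last_user_turn_index` is a name taken from the text above, and dropping anything that follows the last user turn is an assumption of this sketch.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Message {
    User(String),
    Assistant(String),
}

/// Index of the most recent user turn, if any.
pub fn last_user_turn_index(thread: &[Message]) -> Option<usize> {
    thread.iter().rposition(|m| matches!(m, Message::User(_)))
}

/// Summarises everything before the last user turn and returns
/// [Assistant(summary), last_user_turn?]. `summarise` stands in for the LLM.
pub fn compact<F>(thread: &[Message], summarise: F) -> Vec<Message>
where
    F: FnOnce(&[Message]) -> String,
{
    let split = last_user_turn_index(thread).unwrap_or(thread.len());
    let (older, recent) = thread.split_at(split);
    if older.is_empty() {
        return thread.to_vec(); // nothing to compact: no-op
    }
    let mut out = vec![Message::Assistant(format!(
        "📋 Compacted {} messages: {}",
        older.len(),
        summarise(older)
    ))];
    // Preserve the last user turn untouched so the agent can keep answering it.
    out.extend(recent.first().cloned());
    out
}
```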
Frontend
- ChatPanel header gets a usage badge: progress bar + `Xk / Yk tok ·
P%`, color-coded green (<30%) / muted (<60%) / amber (<85%) / red
(≥85%). Tooltip explains the figures and suggests /compact at ≥60%.
- Compact button next to Clear in the header.
- Slash commands in ChatComposer: /compact, /clear.
- Empty-state shows the slash-command hint.
- Auto-compact: if the previous turn pushed usage past 85% AND the
thread has more than one message, the next user turn first runs
chat_compact transparently before chat_send. The compaction surfaces
as a visible Assistant("📋 Compacted …") message so the user can see
what the agent kept.
- app-store gets chatUsage map per tab + replaceChatThread + setChatUsage
actions; closeTab and clearChatThread clean up usage too.
Verification: cargo check clean, cargo test --lib 53 pass (+3),
tsc --noEmit clean, vitest run 20 pass.
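The badge tiers and the auto-compact condition are frontend logic; the sketch below restates them in Rust purely to make the thresholds concrete. `BadgeColor`, `badge_color` and `should_auto_compact` are hypothetical names; only `ContextUsage` and its two fields come from the text above.

```rust
pub struct ContextUsage {
    pub used_chars: usize,
    pub budget_chars: usize,
}

#[derive(Debug, PartialEq)]
pub enum BadgeColor {
    Green,
    Muted,
    Amber,
    Red,
}

impl ContextUsage {
    /// Usage as a percentage of the budget (budget assumed non-zero).
    pub fn percent(&self) -> f64 {
        self.used_chars as f64 / self.budget_chars as f64 * 100.0
    }

    /// green (<30%) / muted (<60%) / amber (<85%) / red (>=85%)
    pub fn badge_color(&self) -> BadgeColor {
        match self.percent() {
            p if p < 30.0 => BadgeColor::Green,
            p if p < 60.0 => BadgeColor::Muted,
            p if p < 85.0 => BadgeColor::Amber,
            _ => BadgeColor::Red,
        }
    }

    /// Compact before the next send if the previous turn pushed usage
    /// past 85% and the thread has more than one message.
    pub fn should_auto_compact(&self, message_count: usize) -> bool {
        self.percent() > 85.0 && message_count > 1
    }
}
```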
Removes enterprise/DBA features and replaces the marginal AI bar with a
central chat agent that has progressive-discovery tools, cross-session
memory, saved-query reuse, and inline result actions. Adds ClickHouse
support alongside PostgreSQL/Greenplum.
Cleanup
- Drop ~10k LOC of advanced features: Docker, Snapshots, Validation,
Index Advisor, Role/User Management, Data Generator, ERD, Lookup.
- Trim deps: drop @xyflow/react, dagre, @types/dagre; cut tokio features
to rt-multi-thread/sync/time/net/macros.
- Remove unused TuskError variants and dead helpers (topological_sort,
invalidate_schema_cache).
Multi-DB (PostgreSQL + ClickHouse)
- New src-tauri/src/db/ module: ChClient (HTTP-based, reuses reqwest),
sql_guard (cross-flavor read-only whitelist with 8 tests).
- ConnectionConfig gains db_flavor and secure fields with serde defaults
for backwards-compatible connections.json.
- All connection/query/schema/data commands dispatch by flavor; CH
covers connect, execute_query, list_databases/schemas/tables/views/
columns/completion_schema, paginated table fetch.
- Frontend: dbCapabilities matrix, ConnectionDialog engine selector
with port auto-swap and HTTPS toggle, SqlEditor switches to
StandardSQL dialect for CH, TableDataView surfaces CH connections as
read-only.
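For orientation, a first-keyword whitelist of the kind sql_guard describes could look like the sketch below. The keyword list, the `is_read_only` name and the multi-statement rule are all assumptions, and a check this simple is not sufficient on its own: it ignores leading comments and forms such as SELECT ... INTO, which a real guard has to handle.

```rust
/// Keywords treated as read-only in this sketch (an assumed list).
const READ_ONLY_KEYWORDS: [&str; 3] = ["SELECT", "SHOW", "DESCRIBE"];

/// Returns true if the statement starts with a whitelisted keyword and
/// contains no second statement.
pub fn is_read_only(sql: &str) -> bool {
    // Allow trailing semicolons, but reject anything after an inner one.
    // This also rejects ';' inside string literals, erring on the safe side.
    let stmt = sql.trim().trim_end_matches(';').trim_end();
    if stmt.contains(';') {
        return false;
    }
    let first = stmt
        .split(|c: char| !c.is_ascii_alphabetic())
        .next()
        .unwrap_or("");
    READ_ONLY_KEYWORDS
        .iter()
        .any(|kw| kw.eq_ignore_ascii_case(first))
}
```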
AI-first chat agent
- New src/components/chat/ panel with composer, message rendering,
collapsible tool-call/result blocks, top-level ErrorBoundary.
- Backend agent loop in commands/chat.rs with strict-JSON tool
protocol. Nine tools: list_databases, list_tables, get_columns,
switch_database, run_query, remember, save_query, find_queries, final.
Forgiving parser accepts both flat and nested-input shapes.
- Compressed history: only the last 4 run_query results carry sample
rows (≤10, cells truncated to 200 chars) into LLM context; older
results marked omitted.
- System prompt uses lite OVERVIEW (DB list + active-DB tables only)
instead of full DDL — schema details are loaded on demand via
get_columns. CH OVERVIEW shows cross-DB tables since CH allows
db.table queries.
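The history compression rule (last 4 results keep up to 10 sample rows, cells cut to 200 chars, older results omitted) can be sketched as below. The function and constant names are hypothetical, and a query result is reduced to a grid of strings; `None` stands for a result marked as omitted.

```rust
pub const KEEP_RESULTS: usize = 4;
pub const MAX_SAMPLE_ROWS: usize = 10;
pub const MAX_CELL_CHARS: usize = 200;

/// Truncates a cell to MAX_CELL_CHARS characters (not bytes).
pub fn truncate_cell(cell: &str) -> String {
    cell.chars().take(MAX_CELL_CHARS).collect()
}

/// `results` is ordered oldest-first. Returns one entry per result:
/// `None` for older results, `Some(sample rows)` for the most recent
/// KEEP_RESULTS.
pub fn compress_results(results: &[Vec<Vec<String>>]) -> Vec<Option<Vec<Vec<String>>>> {
    let first_kept = results.len().saturating_sub(KEEP_RESULTS);
    results
        .iter()
        .enumerate()
        .map(|(i, rows)| {
            if i < first_kept {
                None
            } else {
                let sample: Vec<Vec<String>> = rows
                    .iter()
                    .take(MAX_SAMPLE_ROWS)
                    .map(|row| row.iter().map(|c| truncate_cell(c)).collect())
                    .collect();
                Some(sample)
            }
        })
        .collect()
}
```

This bounds what query results can add to the prompt at 4 × 10 rows of at most 200 characters per cell, however large the underlying results are.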
Cross-session memory (F1)
- Per-connection markdown file at app_data_dir/memory/<connection_id>.md,
16KB cap with oldest-block eviction. Agent appends via remember()
tool; the file is injected into LEARNED NOTES section of every system
prompt.
- New Memory sidebar tab with editable textarea, badge for note count,
empty-state with template. Edits picked up on the next agent turn.
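Oldest-block eviction under the 16KB cap might work along these lines. The block format is an assumption (blocks separated by a blank line, oldest first), as are the names `evict_oldest_blocks` and `joined_len` and the choice to always keep the newest block.

```rust
pub const MEMORY_CAP_BYTES: usize = 16 * 1024;

/// Byte length of the blocks once re-joined with blank-line separators.
fn joined_len(blocks: &[&str]) -> usize {
    blocks.iter().map(|b| b.len()).sum::<usize>() + 2 * blocks.len().saturating_sub(1)
}

/// Drops the oldest blocks until the file fits within `cap` bytes.
/// The newest block is always kept, even if it alone exceeds the cap.
pub fn evict_oldest_blocks(memory: &str, cap: usize) -> String {
    let mut blocks: Vec<&str> = memory
        .split("\n\n")
        .filter(|b| !b.trim().is_empty())
        .collect();
    while blocks.len() > 1 && joined_len(&blocks) > cap {
        blocks.remove(0); // oldest block sits at the top of the file
    }
    blocks.join("\n\n")
}
```

Called as `evict_oldest_blocks(&contents, MEMORY_CAP_BYTES)` after each append, this keeps the injected LEARNED NOTES section bounded while preferring the most recent notes.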
Saved-query reuse (F2)
- Tools save_query and find_queries scoped to current connection.
save_query attaches a UUID + timestamp; find_queries returns top 10
matches with SQL preview ≤500 chars.
- Storage shared with the sidebar Saved panel.
Inline result actions (F3)
- run_query result block in chat gets Open-full (90vw × 80vh modal with
full ResultsTable, no row cap) and Export (reuses ExportDialog for
CSV/JSON via existing exportCsv/exportJson commands).
Verification
- cargo check clean, zero warnings.
- cargo test --lib: 50 pass (20 chat parser + 4 memory + 8 sql_guard +
6 clean_sql + 12 escape_ident).
- npx tsc --noEmit clean.
- npx vitest run: 20 pass.