fix: surface PG error HINT/DETAIL and stop the agent after repeated SQL failures

The previous loop burned all 8 hops re-running the same broken query ("operator does not exist: character varying = uuid") because (a) the agent never saw PostgreSQL's HINT — only the bare error message — and (b) the prompt's "retry once" rule was advisory, not enforced. Backend (commands/chat.rs) - New format_db_error helper. When the error is sqlx::Error::Database with a PostgreSQL backend, downcast to PgDatabaseError and append DETAIL and HINT lines. Common PG hints are exactly the spelled-out fix the agent needs ("You might need to add explicit type casts"). - New last_run_query_error helper to fish the most recent failing SQL text out of working history for the give-up message. - Hard server-side guard: track consecutive_query_errors. On consecutive run_query failures >= 2, force-emit a `final` message that quotes the last error and suggests next steps (cast hints, open the table in sidebar, switch to Advanced mode). The model cannot loop past this regardless of how many hops remain. - Counter resets to 0 when the model takes any non-RunQuery action (get_columns, list_tables, etc.) — investigation buys a fresh error budget. - Stronger prompt RULES section: explicitly walks through three of the most common PG error classes ("operator does not exist", "column does not exist", "relation does not exist") and the matching fixes. Tells the model the harness force-stops after 2 consecutive failures. Tests (4 new): format_db_error fallback, last_run_query_error finds most recent / handles empty / handles no-errors thread. Verification: cargo test --lib 70 pass (+4), tsc clean, vitest 20 pass.
2026-05-06 20:11:11 +03:00
parent 83f204816a
commit 5e72a80376
1 changed files with 153 additions and 12 deletions
--- a/src-tauri/src/commands/chat.rs
+++ b/src-tauri/src/commands/chat.rs
@@ -30,6 +30,10 @@ const TEXT_TOOL_CHAR_CAP: usize = 10_000;
 /// is nudged to /compact. Tuned for Ollama defaults (~4-8K tokens).
 /// Token estimate ≈ chars / 3 for mixed Cyrillic/ASCII content.
 const CONTEXT_BUDGET_CHARS: u64 = 24_000;
 /// Stop the loop when the model fails the same SQL hurdle this many times in a
 /// row. Beyond this, additional hops almost always burn the rest of the budget
 /// on identical retries; a definitive `final` with the error is more useful.
 const MAX_CONSECUTIVE_QUERY_ERRORS: usize = 2;
 // ---------------------------------------------------------------------------
 // Action protocol
@@ -243,7 +247,11 @@ WORKFLOW
 RULES
  - Use ONLY identifiers visible to you (overview / list_tables / get_columns output). Don't pluralize, translate, or guess.
  - LIMIT on ad-hoc SELECTs unless aggregating.
-  - On SQL error retry once with a fix; on the second failure respond with `final` explaining what's missing.
+  - When run_query fails, READ the error carefully — especially any `HINT:` line, which often spells out the fix. Common PostgreSQL fixes:
      * `operator does not exist: X = Y` (e.g. `character varying = uuid`) → cast one side, e.g. `a.id::uuid = b.id` or `a.id = b.id::text`. If unsure of types, call get_columns on both tables.
      * `column "X" does not exist` → call get_columns on the table you're querying; the column is named differently.
      * `relation "X" does not exist` → check the OVERVIEW table list; the table may be in a different schema or database.
  - On SQL error retry at most ONCE with a corrected query. On the second consecutive failure, STOP and respond with `final` explaining what's missing — do not loop. The harness will force-stop after 2 consecutive errors regardless.
  - `remember` is for durable facts, not transient observations. Don't memorise query results — only insights about the schema/data model that aren't already in the OVERVIEW.
 ═══════════════════════════════════════════════════════════════
@@ -456,8 +464,31 @@ pub async fn chat_send(
 ) -> TuskResult<ChatTurnResult> {
    let mut new_messages: Vec<ChatMessage> = Vec::new();
    let mut working: Vec<ChatMessage> = messages;
    let mut consecutive_query_errors: usize = 0;
    for _hop in 0..MAX_HOPS {
        // Hard guard: stop once the model has hit the same SQL hurdle multiple
        // times in a row. Beyond this point further hops virtually always
        // re-run the same broken query against the model's better judgment.
        if consecutive_query_errors >= MAX_CONSECUTIVE_QUERY_ERRORS {
            let last_err = last_run_query_error(&working).unwrap_or_default();
            let msg = ChatMessage::Assistant {
                id: new_id("asst"),
                text: format!(
                    "I tried {} times and kept hitting the same SQL error:\n\n{}\n\nThis usually means a schema mismatch — wrong column type, missing cast (often `::uuid` or `::text`), or a join key that doesn't exist as I assumed. Could you double-check the question, or open the table in the sidebar to verify column types? You can also write the SQL manually in Advanced mode.",
                    consecutive_query_errors, last_err
                ),
                created_at: now_ms(),
            };
            new_messages.push(msg.clone());
            working.push(msg);
            let usage = compute_usage(&state, &app, &connection_id, &working).await;
            return Ok(ChatTurnResult {
                messages: new_messages,
                usage,
            });
        }
        // Overview is rebuilt per turn — cheap (cached) and reflects the active DB
        // even if the user (or the agent) just switched it.
        let overview_text = build_overview_context(&state, &connection_id)
@@ -493,6 +524,8 @@ pub async fn chat_send(
            }
        };
        let is_run_query = matches!(&action, AgentAction::RunQuery { .. });
        match action {
            AgentAction::Final { text } => {
                let msg = ChatMessage::Assistant {
@@ -516,24 +549,30 @@ pub async fn chat_send(
                    serde_json::json!({ "sql": sql }).to_string(),
                );
                let result = match execute_query_core(&state, &connection_id, &sql).await {
-                    Ok(qr) => ChatMessage::ToolResult {
+                    Ok(qr) => {
                        consecutive_query_errors = 0;
                        ChatMessage::ToolResult {
                            id: new_id("res"),
                            tool: "run_query".to_string(),
                            is_error: false,
                            text: None,
                            result: Some(qr),
                            created_at: now_ms(),
-                    },
+                        }
                    }
                    Err(e) => {
-                        let hint = match e {
+                        consecutive_query_errors += 1;
-                            TuskError::ReadOnly => "\n\nRead-only mode is on. Toggle it off in the toolbar to allow writes.",
+                        let suffix = match &e {
                            TuskError::ReadOnly => {
                                "\n\nRead-only mode is on. Toggle it off in the toolbar to allow writes."
                            }
                            _ => "",
                        };
                        ChatMessage::ToolResult {
                            id: new_id("res"),
                            tool: "run_query".to_string(),
                            is_error: true,
-                            text: Some(format!("{}{}", e, hint)),
+                            text: Some(format!("{}{}", format_db_error(&e), suffix)),
                            result: None,
                            created_at: now_ms(),
                        }
@@ -631,6 +670,12 @@ pub async fn chat_send(
                push_tool_result(&mut new_messages, &mut working, result);
            }
        }
        // Any non-RunQuery, non-Final action means the model is investigating
        // (e.g. get_columns to verify a type). Give it a fresh error budget.
        if !is_run_query {
            consecutive_query_errors = 0;
        }
    }
    let msg = ChatMessage::Assistant {
@@ -748,6 +793,47 @@ fn render_thread_for_summary(messages: &[ChatMessage]) -> String {
    out
 }
 /// Format a database error with PostgreSQL HINT/DETAIL extracted when present.
 /// Plain `e.to_string()` shows just the bare error message, which makes
 /// errors like `operator does not exist: character varying = uuid` impossible
 /// to act on. PG's own HINT (`You might need to add explicit type casts`) is
 /// often the difference between the agent finding the fix on retry and
 /// looping forever.
 fn format_db_error(e: &TuskError) -> String {
    if let TuskError::Database(sqlx::Error::Database(db_err)) = e {
        let mut out = format!("Database error: {}", db_err.message());
        if let Some(pg) = db_err.try_downcast_ref::<sqlx::postgres::PgDatabaseError>() {
            if let Some(detail) = pg.detail() {
                out.push_str(&format!("\nDETAIL: {}", detail));
            }
            if let Some(hint) = pg.hint() {
                out.push_str(&format!("\nHINT: {}", hint));
            }
        }
        return out;
    }
    e.to_string()
 }
 /// Pull the most recent run_query error text from the working thread, so the
 /// post-loop "I gave up" summary can quote concrete errors back to the user.
 fn last_run_query_error(messages: &[ChatMessage]) -> Option<String> {
    for m in messages.iter().rev() {
        if let ChatMessage::ToolResult {
            tool,
            is_error: true,
            text,
            ..
        } = m
        {
            if tool == "run_query" {
                return text.clone();
            }
        }
    }
    None
 }
 /// Strip JSON envelopes, markdown fences, and known field-extraction patterns
 /// that the agent-trained model tends to emit even for non-agent prompts.
 /// Returns the underlying summary text.
@@ -1107,6 +1193,61 @@ mod tests {
        assert_eq!(clean_summary(s), s);
    }
    #[test]
    fn format_db_error_falls_back_to_to_string_for_non_db_errors() {
        let e = TuskError::Custom("oops".into());
        assert_eq!(format_db_error(&e), e.to_string());
    }
    #[test]
    fn last_run_query_error_finds_most_recent() {
        let msgs = vec![
            ChatMessage::ToolResult {
                id: "r1".into(),
                tool: "run_query".into(),
                is_error: true,
                text: Some("first error".into()),
                result: None,
                created_at: 1,
            },
            ChatMessage::ToolResult {
                id: "r2".into(),
                tool: "get_columns".into(),
                is_error: true,
                text: Some("not run_query".into()),
                result: None,
                created_at: 2,
            },
            ChatMessage::ToolResult {
                id: "r3".into(),
                tool: "run_query".into(),
                is_error: true,
                text: Some("second error".into()),
                result: None,
                created_at: 3,
            },
        ];
        assert_eq!(
            last_run_query_error(&msgs).as_deref(),
            Some("second error")
        );
    }
    #[test]
    fn last_run_query_error_none_for_empty_thread() {
        assert!(last_run_query_error(&[]).is_none());
    }
    #[test]
    fn last_run_query_error_none_when_no_errors() {
        let msgs = vec![ChatMessage::User {
            id: "u1".into(),
            text: "hi".into(),
            created_at: 1,
        }];
        assert!(last_run_query_error(&msgs).is_none());
    }
    #[test]
    fn render_thread_for_summary_includes_roles_and_skips_rows() {
        let msgs = vec![