Routes chat completions through Fireworks, a managed OpenAI-compatible
inference endpoint, as an alternative to local Ollama. Useful when the
agent needs fast multi-hop reasoning that local hardware can't sustain.
- backend: rename `call_ollama_chat_messages` → `call_chat_messages` and
dispatch by provider; add `call_fireworks` branch (Bearer auth, internal
`format="json"` mapped to `response_format: {"type": "json_object"}`)
and `list_fireworks_models` Tauri command
- settings: extend `AiProvider` enum with a Fireworks variant; add
`AiSettings.fireworks_api_key` (serde default for legacy config compat);
Fireworks base URL is hardcoded
- UI: provider selector in both the popover and AppSettingsSheet (only
ollama and fireworks are shown; legacy openai/anthropic are kept for
serde compat but normalized to ollama in the UI); password input and
dynamic model list for Fireworks; switching provider clears the stale
model selection
- 4 unit tests: serde round-trip, legacy settings deserialization,
Fireworks chat-completions parsing, models-list parsing
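
For reviewers, a minimal std-only sketch of the UI normalization, the
`format="json"` mapping, and the Bearer header described above. The
variant names and the helpers `normalize_for_ui`, `response_format_type`,
and `bearer_header` are hypothetical and chosen for illustration; the
code in this change may be shaped differently.

```rust
/// Hypothetical mirror of the provider enum; variant names are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AiProvider {
    Ollama,
    Fireworks,
    OpenAi,    // legacy: kept only so old configs still deserialize
    Anthropic, // legacy: kept only so old configs still deserialize
}

/// Collapse legacy providers to Ollama so the selector only ever
/// shows Ollama or Fireworks.
fn normalize_for_ui(provider: AiProvider) -> AiProvider {
    match provider {
        AiProvider::Fireworks => AiProvider::Fireworks,
        AiProvider::Ollama | AiProvider::OpenAi | AiProvider::Anthropic => {
            AiProvider::Ollama
        }
    }
}

/// Map the internal `format` flag to the `type` field of an
/// OpenAI-style `response_format` object. Anything other than
/// "json" leaves `response_format` unset.
fn response_format_type(format: Option<&str>) -> Option<&'static str> {
    match format {
        Some("json") => Some("json_object"),
        _ => None,
    }
}

/// Build the value of the `Authorization` header for Bearer auth.
fn bearer_header(api_key: &str) -> String {
    format!("Bearer {}", api_key.trim())
}
```

Keeping the legacy variants in the enum while hiding them in the UI means
an old config still loads, and the first save from the settings sheet
rewrites it to a supported provider.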