Description
When a local provider (e.g. LM Studio) is configured as [default_model] and one or more
[[fallback_providers]] (e.g. Gemini, OpenRouter) are defined in config.toml, agents do not fail over to the
cloud fallbacks when the local provider goes offline. Instead, they keep retrying the offline endpoint until
the request fails.
Expected Behavior
The agent fails over to Gemini and responds successfully.
Actual Behavior
The agent keeps retrying LM Studio, then returns an error. Gemini is never called.
Steps to Reproduce
- Configure config.toml with a local LM Studio instance as [default_model] and Gemini as
[[fallback_providers]]:
[default_model]
provider = "lmstudio"
model = "google/gemma-4-26b-a4b"
base_url = "http://192.168.1.1:1234/v1"
[[fallback_providers]]
provider = "gemini"
model = "gemini-2.5-pro"
api_key_env = "GEMINI_API_KEY"
- Start the daemon and verify that agents respond normally via LM Studio.
- Shut down LM Studio.
- Send a message to any agent.
OpenFang Version
0.5.5
Operating System
Linux (x86_64)
Logs / Screenshots
Root Cause
At boot time, [[fallback_providers]] are added to self.default_driver, which is a FallbackDriver chain.
However, resolve_driver() in kernel.rs creates a fresh primary driver for each agent call (an HTTP client
whose construction always succeeds) and returns default_driver only if that creation fails, which
never happens for HTTP-based providers like LM Studio.
At runtime, when LM Studio is unreachable, the error is classified as Timeout / is_retryable = true, so the
same provider is retried. The fallback chain in default_driver is never reached.
Specifically, in kernel.rs resolve_driver():
// This branch is never taken for HTTP providers: create_driver() always succeeds
Err(e) => {
    if agent_provider == default_provider ... {
        Arc::clone(&self.default_driver) // ← FallbackDriver with Gemini
    }
}
// ...
Ok(primary) // ← always returns bare LM Studio driver, no fallbacks wired in
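To illustrate why the Err branch is dead code for HTTP providers, here is a minimal standalone sketch. It does not use OpenFang's real types: LlmDriver, HttpDriver, the reachable flag, and this create_driver signature are all hypothetical stand-ins. The point is only that constructing an HTTP-style driver does no network I/O, so the failure surfaces on the first call rather than at creation.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for the real driver trait.
trait LlmDriver {
    fn complete(&self, prompt: &str) -> Result<String, String>;
}

/// HTTP-style driver: building it only stores the URL, no connection is opened.
struct HttpDriver {
    base_url: String,
    reachable: bool, // simulates whether the endpoint is currently up
}

impl LlmDriver for HttpDriver {
    fn complete(&self, prompt: &str) -> Result<String, String> {
        if self.reachable {
            Ok(format!("[{}] reply to: {}", self.base_url, prompt))
        } else {
            Err(format!("timeout connecting to {}", self.base_url))
        }
    }
}

/// Construction never fails here, so a fallback keyed on `Err` can never run.
fn create_driver(base_url: &str, reachable: bool) -> Result<Arc<dyn LlmDriver>, String> {
    let driver: Arc<dyn LlmDriver> = Arc::new(HttpDriver {
        base_url: base_url.to_string(),
        reachable,
    });
    Ok(driver)
}

fn main() {
    // The endpoint is down, yet creation still succeeds.
    let driver = create_driver("http://192.168.1.1:1234/v1", false)
        .expect("construction succeeds even though the endpoint is down");
    // The failure only appears on the call, after the Err branch was skipped.
    assert!(driver.complete("hello").is_err());
}
```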
Fix
When an agent uses the default provider with no custom overrides and no per-agent [[fallback_models]], wire
the global [[fallback_providers]] into the agent's driver chain at resolution time, so any runtime failure
(not just init failure) triggers the fallback.
Added after the existing fallback_models block in resolve_driver():
let uses_global_defaults = agent_provider == default_provider
    && !has_custom_key
    && !has_custom_url
    && !self.config.fallback_providers.is_empty()
    && !Arc::ptr_eq(&primary, &self.default_driver);
if uses_global_defaults {
    let mut chain = vec![(primary, String::new())];
    for fb in &self.config.fallback_providers {
        if fb.provider == *agent_provider { continue; } // skip duplicate primary
        // ... create driver, push to chain
    }
    if chain.len() > 1 {
        return Ok(Arc::new(FallbackDriver::with_models(chain)));
    }
}
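The chain-building logic above can be sketched in isolation as follows. This is an illustration under stated assumptions, not the actual patch: StubDriver, FallbackChain, and build_chain are hypothetical names, and the real FallbackDriver::with_models may behave differently (for example around retries or error classification). The sketch shows the two properties the fix relies on: a fallback entry with the same provider as the primary is skipped, and a runtime failure of the primary moves on to the next driver.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for the real driver trait.
trait LlmDriver {
    fn complete(&self, prompt: &str) -> Result<String, String>;
}

/// Stub provider that either answers or fails, depending on `up`.
struct StubDriver {
    name: &'static str,
    up: bool,
}

impl LlmDriver for StubDriver {
    fn complete(&self, prompt: &str) -> Result<String, String> {
        if self.up {
            Ok(format!("{}: {}", self.name, prompt))
        } else {
            Err(format!("{} unreachable", self.name))
        }
    }
}

/// Simplified fallback chain: try each driver in order, return the first success.
struct FallbackChain {
    drivers: Vec<Arc<dyn LlmDriver>>,
}

impl LlmDriver for FallbackChain {
    fn complete(&self, prompt: &str) -> Result<String, String> {
        let mut last_err = String::from("empty chain");
        for driver in &self.drivers {
            match driver.complete(prompt) {
                Ok(reply) => return Ok(reply),
                Err(e) => last_err = e,
            }
        }
        Err(last_err)
    }
}

/// Primary first, then every fallback whose provider differs from the primary's.
fn build_chain(primary: (&'static str, bool), fallbacks: &[(&'static str, bool)]) -> FallbackChain {
    let mut drivers: Vec<Arc<dyn LlmDriver>> = Vec::new();
    drivers.push(Arc::new(StubDriver { name: primary.0, up: primary.1 }));
    for &(name, up) in fallbacks {
        if name == primary.0 {
            continue; // skip duplicate primary
        }
        drivers.push(Arc::new(StubDriver { name, up }));
    }
    FallbackChain { drivers }
}

fn main() {
    // LM Studio is down and also listed among the fallbacks; Gemini is up.
    let chain = build_chain(("lmstudio", false), &[("lmstudio", false), ("gemini", true)]);
    println!("{:?}", chain.complete("hi"));
}
```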
Notes
- Agents with explicit [[fallback_models]] in their manifest are unaffected; those already create a
FallbackDriver correctly.
- The duplicate-provider filter (fb.provider == *agent_provider) avoids adding a second unreachable LM Studio
entry to the chain, which would otherwise happen when [[fallback_providers]] includes the same provider as
[default_model] (a common configuration pattern).
- The Arc::ptr_eq guard prevents double-wrapping in the rare case where fresh driver creation failed and
primary is already default_driver.
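For readers unfamiliar with Arc::ptr_eq, the guard in the last note compares allocations, not values. The sketch below uses a plain Arc<String> in place of the driver type, and needs_wrapping is a hypothetical helper introduced only for this illustration.

```rust
use std::sync::Arc;

/// Returns true when `primary` is a different allocation from `default_driver`,
/// i.e. when wrapping it in a new fallback chain would not double-wrap.
fn needs_wrapping(primary: &Arc<String>, default_driver: &Arc<String>) -> bool {
    !Arc::ptr_eq(primary, default_driver)
}

fn main() {
    let default_driver = Arc::new(String::from("fallback chain"));

    // Cloning the Arc shares the allocation: this is the "creation failed" case.
    let reused = Arc::clone(&default_driver);
    assert!(!needs_wrapping(&reused, &default_driver));

    // A freshly built value is a separate allocation, even with equal contents.
    let fresh = Arc::new(String::from("fallback chain"));
    assert!(needs_wrapping(&fresh, &default_driver));
}
```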