Short description of current behavior
Hi, I have a big problem with AI agents and Ollama!
My setup:
MindsDB v26.1.0-cloud (OpenShift)
When an AI agent is called with any Ollama model, it throws one of these errors:
UnexpectedModelBehavior: Received no tool calls model response, expected a call to 'final_result'
or
pydantic_core.core_schema.validate_json: Expected JSON — got plain text
The agent never returns a response: the request either fails silently or throws a 500.
Possible cause:
strict: true in tool definitions (affects all models)
pydantic-ai 1.77.0 sends "strict": true inside every tool definition it posts to the Ollama OpenAI-compatible API. Ollama's constrained decoding (grammar enforcement) is incompatible with the chat templates of small/quantized models and causes the model to either hallucinate badly or refuse to call the tool.
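To make the suspected cause concrete, here is a minimal sketch of the kind of workaround I have in mind: removing the "strict" flag from the tool definitions before they are posted to Ollama. It assumes the payload follows the OpenAI tool format, where the flag sits under "function". The helper name strip_strict and the example schema are hypothetical; this is not an existing pydantic-ai or MindsDB function, and I have not verified where such a hook would belong.

```python
from copy import deepcopy
from typing import Any


def strip_strict(tools: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return a copy of OpenAI-style tool definitions without the "strict" flag.

    Hypothetical helper for illustration only.
    """
    cleaned = deepcopy(tools)  # leave the caller's definitions untouched
    for tool in cleaned:
        # Assumption: the flag lives under the "function" key of each tool.
        function = tool.get("function")
        if isinstance(function, dict):
            function.pop("strict", None)
    return cleaned


# Illustrative tool definition; the parameter schema is made up for this example.
tools = [
    {
        "type": "function",
        "function": {
            "name": "final_result",
            "strict": True,
            "parameters": {
                "type": "object",
                "properties": {"answer": {"type": "string"}},
            },
        },
    }
]

cleaned = strip_strict(tools)
print(cleaned[0]["function"])
```

If the "strict" flag really is the trigger, sending the cleaned definitions should let Ollama fall back to unconstrained tool calling for these models, but that is untested on my side.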
What effect this has:
Any MindsDB AI agent configured to use small/medium models via the on-prem Ollama endpoint will fail 100% of the time, regardless of the question asked. The planning step (which generates a query plan) fails before any SQL is even attempted.
Video or screenshots
No response
Expected behavior
When an AI agent is configured with an Ollama-hosted model (including Mistral variants), queries should complete successfully and return either a SQL result or a text response. The agent should be able to reach the SQL execution step; it should not fail during the internal planning step before any user-visible work begins.
How to reproduce the error
Any natural language query sent to a MindsDB AI agent configured with an Ollama model, e.g.:
SELECT answer FROM my_agent WHERE question = 'What are the distinct values in the database?';
Anything else?
Running MindsDB on:
MindsDB v26.1.0-cloud Docker image
Python 3.10, pydantic-ai 1.77.0 (pinned in requirements.txt)
Ollama server at a local network endpoint (http://:11434)
Models tested: mistral (7B), mistral-nemo:12b, llama3.1:8b, qwen2.5:7b
Models like llama3.3:70b have a higher chance of working (some simple questions work), since the problem is that JSON is always expected.