Switch to Mistral Small 4 as default LLM #168

Merged
sandragjacinto merged 18 commits into main from mistral-small-4 on May 5, 2026

Conversation

Collaborator

jmsevin commented Apr 30, 2026

This pull request migrates the backend from Azure-based LLMs to Mistral models, updates the related environment variables and configuration, and upgrades several dependencies. It also removes unused code and adapts the agent and LLM proxy logic to the new Mistral integration. The most important changes are summarized below.

Mistral LLM Integration:

  • Updated .env.example, k8s/welearn-api/secrets.dev.yaml, k8s/welearn-api/values.yaml, and pytest.ini to add the MISTRAL_API_KEY and MISTRAL_LLM_MODEL_NAME variables for Mistral LLM configuration; removed the obsolete Azure Mistral Sweden variables.
  • Changed src/app/core/config.py and src/app/core/lifespan.py to use the Mistral LLM as the default chat model, updating the initialization logic accordingly.
  • Refactored src/app/shared/infra/llm_proxy.py to support Mistral as a backend, adding a Mistral client and a new completion method, and removing the previous LiteLLM-based logic.
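
As an illustration of the configuration change, here is a minimal sketch of loading the two new variables and failing early when one is missing. Only the names MISTRAL_API_KEY and MISTRAL_LLM_MODEL_NAME come from this PR; the MistralSettings class and load_mistral_settings helper are hypothetical and are not the code in src/app/core/config.py.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class MistralSettings:
    """Hypothetical container, not the settings class used in the PR."""

    api_key: str
    model_name: str


def load_mistral_settings(env=None) -> MistralSettings:
    """Build settings from a mapping, defaulting to the process environment."""
    env = os.environ if env is None else env
    required = ("MISTRAL_API_KEY", "MISTRAL_LLM_MODEL_NAME")
    # Collect every missing or empty variable so the error lists them all at once.
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return MistralSettings(
        api_key=env["MISTRAL_API_KEY"],
        model_name=env["MISTRAL_LLM_MODEL_NAME"],
    )
```

Passing a plain dict instead of relying on os.environ keeps the helper easy to exercise in tests, which is roughly the role the pytest.ini entries play in the PR.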

Agent and Chat Logic Updates:

  • Updated agent creation in src/app/shared/infra/abst_chat.py to use langchain-mistralai's ChatMistralAI, switched from create_react_agent to create_agent, and replaced message handling with the new format.
  • Removed the unused trim_conversation_history function and its references from src/app/services/agent.py and related tests.

Dependency Upgrades:

  • Upgraded langchain-core, langchain-azure-ai, and langgraph to newer versions; added the langchain-mistralai and mistralai dependencies; removed huggingface-hub and langchain-huggingface.

Testing and Miscellaneous:

  • Refactored src/app/tests/services/test_llm_proxy.py to mock Mistral completions instead of LiteLLM, and removed obsolete test code.
  • Added a pip install --upgrade pip step in the Dockerfile to ensure the latest pip is used.
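
To make the testing change concrete, the following is a minimal sketch of mocking an async completion method with the standard library's AsyncMock. FakeProxy is a hypothetical stand-in, not the class in src/app/shared/infra/llm_proxy.py; only the method name mistral_completion and the sample message and response strings appear in this PR.

```python
import asyncio
from unittest.mock import AsyncMock


class FakeProxy:
    """Hypothetical stand-in for the real proxy class."""

    async def mistral_completion(self, messages):
        raise NotImplementedError("the real method would call the Mistral API")

    async def completion(self, messages):
        return await self.mistral_completion(messages)


async def run_mocked_completion():
    proxy = FakeProxy()
    # Swap the network-bound method for a mock that returns a canned string.
    proxy.mistral_completion = AsyncMock(return_value='{"key": "value"}')
    response = await proxy.completion(
        messages=[{"role": "user", "content": "Hello"}]
    )
    # Check that the proxy delegated to the Mistral method exactly once.
    proxy.mistral_completion.assert_awaited_once()
    return response


if __name__ == "__main__":
    print(asyncio.run(run_mocked_completion()))
```

Mocking at the method boundary keeps the test independent of the Mistral client library, at the cost of not exercising request construction.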

Infrastructure and Housekeeping:

  • Updated SOPS metadata in secrets to reflect the latest version and modification time.
  • Removed unused environment variables and cleaned up .env.example.

These changes collectively modernize the backend to use Mistral LLMs, simplify agent logic, and keep dependencies up to date.

Collaborator Author

jmsevin commented Apr 30, 2026

Yes, I know, I would squash and merge... ;)

Contributor

Copilot AI left a comment

Pull request overview

This PR switches the app’s primary LLM integration from Azure-backed chat models to Mistral, updating runtime wiring, agent construction, config, and supporting dependencies. It mainly affects the chat/tutor stack and the infra layer that brokers LLM calls.

Changes:

  • Replaced Azure/LiteLLM-backed chat initialization with Mistral-backed clients for tutor, chat, and agent flows.
  • Refactored agent execution to use newer LangChain/LangGraph agent APIs and added a chat history endpoint.
  • Updated config, tests, deployment values/secrets, and Python dependencies to support the new Mistral setup.

Reviewed changes

Copilot reviewed 19 out of 20 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| src/app/tutor/service/tutor.py | Swaps tutor model initialization from Azure chat to ChatMistralAI. |
| src/app/tests/services/tutor/test_utils.py | Updates tutor utility tests for async cases and current UploadFile construction. |
| src/app/tests/services/test_llm_proxy.py | Rewrites proxy unit tests around Mistral completions. |
| src/app/tests/services/test_agent.py | Removes tests tied to deleted conversation trimming logic. |
| src/app/shared/infra/llm_proxy.py | Replaces LiteLLM completion logic with direct Mistral client calls. |
| src/app/shared/infra/abst_chat.py | Migrates agent creation/execution to newer LangChain APIs and adds history retrieval. |
| src/app/services/agent.py | Removes the old message-trimming hook. |
| src/app/models/chat.py | Extends agent response payloads with thread_id. |
| src/app/core/lifespan.py | Initializes the app LLM proxy with Mistral settings at startup. |
| src/app/core/config.py | Adds Mistral env vars to settings. |
| src/app/api/api_v1/endpoints/chat.py | Adds chat history endpoint and thread-id handling for agent conversations. |
| pytest.ini | Adds Mistral test environment variables. |
| pyproject.toml | Upgrades LangChain/LangGraph stack and adds Mistral dependencies. |
| poetry.lock | Locks the updated dependency graph for the migration. |
| k8s/welearn-api/values.yaml | Adds default Mistral model config to deployment values. |
| k8s/welearn-api/secrets.staging.yaml | Adds staging Mistral API secret and refreshes SOPS metadata. |
| k8s/welearn-api/secrets.prod.yaml | Adds production Mistral API secret and refreshes SOPS metadata. |
| k8s/welearn-api/secrets.dev.yaml | Adds dev Mistral API secret and refreshes SOPS metadata. |
| Dockerfile | Upgrades pip before dependency installation. |
| .env.example | Documents new Mistral env vars and cleans obsolete examples. |

Comment on lines +77 to +79
else:
    # We assume that if it's not an Azure model, it's a Mistral model for now. This can be extended in the future to support other types of models.
    return await self.mistral_completion(messages)
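
The code comment above notes that the fallback could later be extended to other model types. One way that might look is sketched below: an explicit provider-to-handler mapping that rejects unknown providers instead of silently treating them as Mistral. This is an assumption-laden illustration, not the PR's implementation; ProxySketch, its provider attribute, and azure_completion are hypothetical names, and both handlers are stubs that make no network calls.

```python
import asyncio


class ProxySketch:
    """Hypothetical proxy; only the name mistral_completion comes from the PR."""

    def __init__(self, provider: str):
        self.provider = provider

    async def azure_completion(self, messages):
        # Stub: echoes the last message instead of calling a real backend.
        return "azure:" + messages[-1]["content"]

    async def mistral_completion(self, messages):
        # Stub: echoes the last message instead of calling a real backend.
        return "mistral:" + messages[-1]["content"]

    async def completion(self, messages):
        # Map each known provider to its handler rather than using a catch-all else.
        handlers = {
            "azure": self.azure_completion,
            "mistral": self.mistral_completion,
        }
        handler = handlers.get(self.provider)
        if handler is None:
            raise ValueError(f"Unsupported LLM provider: {self.provider!r}")
        return await handler(messages)


if __name__ == "__main__":
    proxy = ProxySketch("mistral")
    print(asyncio.run(proxy.completion([{"role": "user", "content": "Hello"}])))
```

The trade-off is that adding a new backend requires registering it in the mapping, but a misconfigured provider name then fails loudly rather than being routed to Mistral.
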
Comment thread src/app/core/config.py
Comment on lines +39 to +41
# MISTRAL ENV VARS
MISTRAL_API_KEY: str
MISTRAL_LLM_MODEL_NAME: str
Comment thread src/app/core/lifespan.py
Comment on lines 28 to 30
yield
await app.state.qdrant.close()
await app.state.llm.close_client()
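
For context on the shutdown lines above, here is a minimal standard-library sketch of the same open/yield/close pattern, written with try/finally so the cleanup also runs when the body raises. It is an illustration under stated assumptions, not the code in src/app/core/lifespan.py: FakeClient and the dict-based state are hypothetical, and whether the PR's lifespan uses try/finally is not shown in the quoted lines.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeClient:
    """Hypothetical resource exposing an async close method."""

    def __init__(self):
        self.closed = False

    async def close(self):
        self.closed = True


@asynccontextmanager
async def lifespan(state: dict):
    # Startup: create the client and store it on the shared state.
    state["llm"] = FakeClient()
    try:
        yield
    finally:
        # Shutdown: runs on normal exit and when the body raised.
        await state["llm"].close()


async def demo() -> bool:
    state = {}
    async with lifespan(state):
        pass  # the application would serve requests here
    return state["llm"].closed


if __name__ == "__main__":
    print(asyncio.run(demo()))
```
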
messages=[{"role": "user", "content": "Hello"}],
)
self.assertIsInstance(response, str)
self.assertEqual(response, '{"key": "value"}')
Comment on lines +335 to +351
@router.get("/chat/history")
async def get_chat_history(
    thread_id: UUID,
    chatfactory=Depends(get_chat_service),
) -> list[Dict[str, str | list[Dict[str, str]] | None]]:
    if thread_id:
        async with await psycopg.AsyncConnection.connect(
            DB_URI, autocommit=True, prepare_threshold=0, row_factory=dict_row
        ) as conn:
            await conn.execute("SET SEARCH_PATH to agent_related")
            await conn.commit()

            memory = AsyncPostgresSaver(conn)
            res = await chatfactory.agent_get_history(
                thread_id=thread_id, memory=memory
            )
            return res
sandragjacinto merged commit 7305423 into main on May 5, 2026
7 checks passed
sandragjacinto deleted the mistral-small-4 branch on May 5, 2026 at 17:30