Conversation

@Haile-12 Haile-12 commented Dec 5, 2025

Pull Request

Description

This PR adds streaming support to the chat flow; the main changes are summarized below.

Backend

LLM Client --> llm_clients.py

  • Added streaming with retry and fallback using LangChain for Gemini and OpenAI.
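
Roughly, the retry-plus-fallback streaming could take the shape sketched below. The helper name stream_with_fallback, the retry count, and the idea of passing in pre-built Gemini/OpenAI clients are illustrative assumptions, not the actual implementation.

```python
from typing import Iterator, Sequence

from langchain_core.language_models import BaseChatModel


def stream_with_fallback(models: Sequence[BaseChatModel], messages,
                         retries: int = 2) -> Iterator[str]:
    """Yield token deltas from the first model that manages to stream successfully."""
    last_error: Exception | None = None
    for model in models:                        # e.g. [gemini_client, openai_client]
        for _ in range(retries):
            try:
                for chunk in model.stream(messages):
                    yield chunk.content or ""   # partial token delta
                return                          # stream finished cleanly, stop here
            except Exception as err:            # retry, then fall through to the next model
                last_error = err
    raise RuntimeError("All LLM clients failed to stream") from last_error
```

A real implementation would also have to avoid re-emitting tokens that were already yielded before a mid-stream failure triggers a retry.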

RAG Generator --> rag_generator.py

  • Added streaming RAG generation that yields partial token deltas and a final formatted response.
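
A rough sketch of the shape such a generator might take; the build_prompt helper, the event dict keys, and the delta/final event names are assumptions for illustration, not the actual code.

```python
from typing import AsyncIterator


def build_prompt(question: str, docs) -> str:
    # Hypothetical helper: join retrieved chunks into a single context block.
    context = "\n\n".join(doc.page_content for doc in docs)
    return f"Answer using the context below.\n\nContext:\n{context}\n\nQuestion: {question}"


async def generate_rag_stream(question: str, llm, retriever) -> AsyncIterator[dict]:
    """Stream a RAG answer: partial token deltas first, then one final formatted payload."""
    docs = await retriever.ainvoke(question)        # retrieve context documents
    prompt = build_prompt(question, docs)
    answer = ""
    async for chunk in llm.astream(prompt):         # LangChain async token streaming
        delta = chunk.content or ""
        answer += delta
        yield {"type": "delta", "content": delta}   # partial token delta
    yield {"type": "final", "content": answer,      # final formatted response
           "sources": [doc.metadata for doc in docs]}
```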

Chat Endpoint --> chat.py

  • Replaced single-response POST chat handler with SSE streaming endpoint using an event generator.
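
Assuming a FastAPI backend (which this sketch does; the ChatRequest schema and the generate_rag_stream call are illustrative, not the actual code), the SSE endpoint could be shaped like this:

```python
import json

from fastapi import APIRouter
from fastapi.responses import StreamingResponse
from pydantic import BaseModel

router = APIRouter()


class ChatRequest(BaseModel):
    message: str


@router.post("/api/chat")
async def chat(request: ChatRequest) -> StreamingResponse:
    async def event_stream():
        # generate_rag_stream is the hypothetical generator sketched above;
        # llm and retriever are assumed to be initialized elsewhere in the module.
        async for event in generate_rag_stream(request.message, llm, retriever):
            yield f"data: {json.dumps(event)}\n\n"   # one SSE frame per event
        yield "data: [DONE]\n\n"                     # sentinel so the client knows the stream ended
    return StreamingResponse(event_stream(), media_type="text/event-stream")
```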

EventGenerator --> event_generator.py

  • Added a new class-based EventGenerator.
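
The class itself isn't shown in this description; one plausible shape for a class-based SSE event generator (all names here are illustrative) is:

```python
import json
from typing import AsyncIterator


class EventGenerator:
    """Sketch of a class that wraps a token/event stream and emits SSE frames."""

    def __init__(self, events: AsyncIterator[dict]):
        self.events = events

    @staticmethod
    def _frame(payload: dict) -> str:
        return f"data: {json.dumps(payload)}\n\n"   # SSE wire format: "data: ...\n\n"

    async def __aiter__(self):
        async for event in self.events:
            yield self._frame(event)
        yield "data: [DONE]\n\n"                    # end-of-stream sentinel
```

Because the class is async-iterable, StreamingResponse should be able to consume an instance of it directly.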

Frontend

MessageList.ts

  • Scroll once on each new message.
  • Stop auto-scroll if the user scrolls up during streaming.
  • Continue following assistant tokens only while streaming and while the user remains at the bottom.

Chat.tsx

  • Pass isSendingMessage down to MessageList as isStreaming to manage scroll during streaming.

MessageBubble.tsx

  • Show the “Thinking…” animation only for the initial placeholder, not during streamed text arrival.

chatService.ts

  • Added a streaming client for /api/chat that reads SSE chunks over fetch.

useChatStore.ts

  • Made sendMessage streaming-first, updating the thinking bubble in real-time as chunks arrive.
  • Implemented a fallback to the previous non-streaming request.

Checklist

  • I have read the contributing guidelines
  • I have added or updated tests where applicable
  • I have added or updated documentation where applicable
  • I have reviewed and tested my changes, and I have run the code to confirm it works properly

How Has This Been Tested?

The changes were tested end-to-end by sending messages through the chat and verifying that:

  • streaming responses display correctly with partial token updates;
  • updated UI elements such as auto-scroll and the “Thinking…” animation behave as intended;
  • the streaming chat functionality works reliably alongside existing features.

@birukabza birukabza self-requested a review December 6, 2025 08:15
```python
from enum import Enum
from abc import ABC, abstractmethod
from langchain.chat_models import init_chat_model
from langchain_openai import ChatOpenAI
```
Collaborator

  • You can stream without importing a specific model library; utilize the already existing init_chat_model. Refer here.
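
For reference, a minimal streaming sketch using init_chat_model alone (the model names are placeholders):

```python
from langchain.chat_models import init_chat_model

# Both providers go through the same factory, so ChatOpenAI never needs to be imported directly.
llm = init_chat_model("gemini-1.5-flash", model_provider="google_genai")
# llm = init_chat_model("gpt-4o-mini", model_provider="openai")   # same call covers OpenAI

for chunk in llm.stream("Hello there"):
    print(chunk.content, end="", flush=True)   # partial token deltas
```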

Collaborator

  • The location of this file is not logical.
  • Make the post-streaming handling a background task (see the sketch below).
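
One way to do that with FastAPI's BackgroundTasks; save_chat_history and the ChatRequest fields below are hypothetical stand-ins, not the PR's actual code:

```python
from fastapi import APIRouter, BackgroundTasks
from fastapi.responses import StreamingResponse
from pydantic import BaseModel

router = APIRouter()


class ChatRequest(BaseModel):
    session_id: str
    message: str


def save_chat_history(session_id: str) -> None:
    ...  # hypothetical post-streaming work, e.g. persisting the finished conversation


@router.post("/api/chat")
async def chat(request: ChatRequest, background_tasks: BackgroundTasks) -> StreamingResponse:
    # Tasks registered here run only after the response has finished sending,
    # so the post-streaming work never blocks the stream itself.
    background_tasks.add_task(save_chat_history, request.session_id)
    # event_stream is assumed to be the SSE generator from the endpoint sketch above.
    return StreamingResponse(event_stream(request), media_type="text/event-stream")
```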

Collaborator

  • Ensure the generator is wrapped in a try/finally block that guarantees the message is saved even if the connection drops.
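
A sketch of what that could look like; generate_rag_stream, save_message, and the request fields are hypothetical names used only for illustration:

```python
import json


async def event_stream(request):
    answer_parts: list[str] = []
    try:
        # llm and retriever are assumed to be initialized elsewhere.
        async for event in generate_rag_stream(request.message, llm, retriever):
            if event.get("type") == "delta":
                answer_parts.append(event.get("content", ""))
            yield f"data: {json.dumps(event)}\n\n"
    finally:
        # Runs even if the client disconnects mid-stream and the generator is closed,
        # so whatever was produced so far still gets persisted.
        await save_message(request.session_id, "".join(answer_parts))
```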

@Haile-12 Haile-12 force-pushed the feat/chat-streaming branch from 70fd56f to 3f1af96 Compare December 22, 2025 15:55
@Haile-12 Haile-12 force-pushed the feat/chat-streaming branch from de67f37 to b76d941 Compare January 3, 2026 10:27