This repo contains a minimal example that demonstrates how to preserve token-by-token streaming when calling a LangGraph subgraph. The script contrasts a "broken" pattern with the "updated" pattern that uses stream_mode="messages".
new_streaming_example.py: runnable demo comparing broken vs fixed streaming behavior
requirements.txt: Python dependencies (LangChain, LangGraph, OpenAI SDK, dotenv)
- Python 3.10+ recommended
- An OpenAI API key with access to the model used in the script (defaults to gpt-4o-mini)
- Create and activate a virtual environment:
  - macOS/Linux:
      python3 -m venv .venv
      source .venv/bin/activate
  - Windows (PowerShell):
      py -m venv .venv
      .venv\Scripts\Activate.ps1
- Install dependencies:
    pip install -r requirements.txt
- Provide environment variables
The script loads environment variables from a local .env file if present. Create a file named .env in the repository root with your keys:
# Required
OPENAI_API_KEY=your-openai-api-key
# Optional LangSmith (for tracing/observability)
LANGSMITH_PROJECT=default
LANGSMITH_API_KEY=your-langsmith-api-key
LANGCHAIN_TRACING_V2=true

Never commit real secrets. If you're using git, ensure .env is ignored.
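Before running the demo, it can help to confirm that the required key is actually visible to Python. The following is a minimal sketch using only the standard library; the helper name missing_env_keys is hypothetical and is not part of new_streaming_example.py. It inspects the process environment only, so it assumes the variables were either exported in your shell or already loaded from .env.

```python
import os

# The only key the setup notes above mark as required.
REQUIRED_KEYS = ["OPENAI_API_KEY"]


def missing_env_keys(required, environ=None):
    """Return the required keys that are unset or blank in the environment."""
    env = os.environ if environ is None else environ
    return [key for key in required if not env.get(key, "").strip()]


if __name__ == "__main__":
    missing = missing_env_keys(REQUIRED_KEYS)
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

The optional LangSmith variables are deliberately left out of REQUIRED_KEYS, since the script should run without them.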
python new_streaming_example.py
You should see two sections:
- BROKEN: shows the non-streaming behavior when invoking the subgraph directly
- FIXED: shows chunked streaming when using stream_mode="messages"
- "Module not found": Ensure your virtual environment is activated and dependencies are installed.
- No streaming observed in the FIXED path: Confirm your langgraph version is >= 0.6.4.
    python -c "import langgraph; print(langgraph.__version__)"
- Authentication errors: Verify OPENAI_API_KEY is set in .env or your shell environment and that your key has access to the chosen model.
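As an alternative way to run the version check, the sketch below reads the installed package metadata through the standard library's importlib.metadata, so it does not depend on the package exposing a __version__ attribute. The helpers parse_version and meets_minimum are hypothetical names introduced here, and the parsing is a simplification: it compares only the first three numeric components and treats a pre-release such as 0.6.4rc1 as 0.6.4.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum langgraph version named in the troubleshooting note above.
MINIMUM = (0, 6, 4)


def parse_version(text):
    """Turn '0.6.4' into (0, 6, 4), ignoring non-numeric suffixes such as 'rc1'."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)  # pad short versions like '0.6' to three components
    return tuple(parts)


def meets_minimum(installed, minimum=MINIMUM):
    """Return True when the installed version is at least the minimum."""
    return parse_version(installed) >= minimum


if __name__ == "__main__":
    try:
        installed = version("langgraph")
    except PackageNotFoundError:
        print("langgraph is not installed in this environment.")
    else:
        status = "OK" if meets_minimum(installed) else "too old"
        print(f"langgraph {installed}: {status}")
```

Run it inside the activated virtual environment so that it reports the version the demo will actually import.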
- This example mirrors the fix described by LangGraph maintainers for preserving streaming from subgraphs by explicitly requesting messages streaming: subgraph.stream(state, stream_mode="messages").
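To illustrate how output from that call might be consumed, here is a small sketch. It assumes each streamed item is a (message_chunk, metadata) pair, which is how LangGraph documents stream_mode="messages". The helper iter_tokens is a hypothetical name, and fake_stream is a stand-in that lets the snippet run without an API key; its chunks and metadata are illustrative, not real LangGraph output.

```python
from types import SimpleNamespace


def iter_tokens(stream):
    """Yield the non-empty text of each chunk from a (chunk, metadata) stream."""
    for chunk, _metadata in stream:
        text = getattr(chunk, "content", "")
        if isinstance(text, str) and text:
            yield text


def fake_stream():
    """Stand-in for subgraph.stream(state, stream_mode="messages")."""
    for piece in ["Hel", "lo", "", " world"]:
        # Illustrative metadata only; real metadata carries more fields.
        yield SimpleNamespace(content=piece), {"langgraph_node": "demo"}


if __name__ == "__main__":
    # Print tokens as they arrive, without waiting for the full message.
    for token in iter_tokens(fake_stream()):
        print(token, end="", flush=True)
    print()
```

With a real compiled subgraph, the same loop would presumably be fed by iter_tokens(subgraph.stream(state, stream_mode="messages")) in place of fake_stream().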