Add Semantic Router integration and improve worker/app lifecycle sync#4

Merged
ricky-chaoju merged 16 commits into main from dev on Jan 19, 2026

@ricky-chaoju
Contributor

Summary

  • Integrate Semantic Router for automatic model routing
  • Add MoM model exposure and usage tracking
  • Improve worker reconnection and offline/online handling
  • Sync deployment and app status on startup, shutdown, and reconnection
  • Add monitoring support for Semantic Router (dashboard/metrics)

- Add SEMANTIC_ROUTER app type for deploying vllm-sr container
- Create semantic router config generator service
- Add config hot-reload support when deployments change
- Add API Gateway support for model='MoM'/'auto' semantic routing
- Add /api/semantic-router endpoints for status and config management
- Add worker API for writing files to Docker volumes
- Override entrypoint to create symlink for config file
- Add entrypoint support in app deployment
- Fix VRAM display showing too many decimal places
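
To illustrate the `model='MoM'`/`'auto'` item above, here is a minimal sketch of how a gateway might decide where to forward a request. It is not the code from this PR: `Upstream`, `pick_upstream`, the case-insensitive alias match, and the placeholder URLs are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Assumption: model names that mean "let the semantic router choose".
# The PR mentions 'MoM' and 'auto'; matching them case-insensitively is our choice.
SEMANTIC_ALIASES = {"mom", "auto"}


@dataclass(frozen=True)
class Upstream:
    """Hypothetical description of a backend the gateway can forward to."""
    name: str
    base_url: str


def pick_upstream(
    model: str,
    router: Optional[Upstream],
    deployments: Dict[str, Upstream],
) -> Upstream:
    """Return the backend that should receive a request for `model`."""
    if model.strip().lower() in SEMANTIC_ALIASES:
        # Alias requested: hand the request to the semantic router, if deployed.
        if router is None:
            raise LookupError("semantic router is not deployed")
        return router
    try:
        # Explicit model name: forward to the deployment registered under it.
        return deployments[model]
    except KeyError:
        raise LookupError(f"no deployment serves model {model!r}") from None


if __name__ == "__main__":
    sr = Upstream("semantic-router", "http://sr.invalid")        # placeholder URL
    deps = {"llama": Upstream("llama", "http://llama.invalid")}  # placeholder URL
    print(pick_upstream("MoM", sr, deps).name)
    print(pick_upstream("llama", sr, deps).name)
```

Keeping the alias check separate from the deployment lookup means explicit model names behave the same whether or not a semantic router is deployed.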
@ricky-chaoju ricky-chaoju merged commit e66dc45 into main Jan 19, 2026
3 checks passed