# Agentic

- [ ] Agentic Rollout: training real agent apps with strict app–RL-framework separation and lightweight API integration
- [ ] Recipes
  - [ ] DeepEyes v2: multi-modal visual reasoning agent
  - [ ] R2E-Gym (DeepSWE): repo-level software engineering agent training

# High Performance

- [ ] Compact Deployment: ref/actor/actor_fwd colocated with async & sync support
- [ ] Megatron Dynamic CP: dynamic context parallelism for variable-length sequences
- [ ] Low-precision training (FP8 / NVFP4 / INT4): FP8 and INT4 quantized training for memory and throughput gains
- [ ] Dynamo as rollout model router (with @nvidia)

# Model

- [ ] Megatron-Bridge upgrade
- [ ] New model support
  - [ ] GLM-5, GLM-5.1
  - [ ] Qwen3.5-Omni
  - [ ] Qwen3.6 dense, Qwen3.6-MoE (30B and larger)

# Algorithms

- [ ] OPD algorithm exploration: explore more On-Policy Distillation variants (multi-teacher, asynchronous teacher serving, weighted distillation objectives) beyond baseline validation

# Ecosystem

- [ ] Ascend (Huawei NPU) https://github.com/redai-infra/Relax/issues/4
  - Qwen3.5 Dense/MoE DAPO-math, Open-R1
- [ ] AMD (ROCm / MI-series GPUs) https://github.com/redai-infra/Relax/issues/12

# Misc

...