Closed
Labels
- documentation: Improvements or additions to documentation
- enhancement: New feature or request
- refactor: Cleanup, formatting, or restructuring of existing code.
- style: Code or comments formatting
Description
🗺️ Roadmap for LightRFT v0.1.1
Expected Release: January 2026
✨ New Features
- Multimodal Support
  - Add video support for reinforcement finetuning (feat(wzn): add video support for reinforcement finetuning #4)
- Training & Evaluation
  - Implement and optimize evaluation for SRM (Step-wise Reward Model) and GRM (Generative Reward Model) trainers (feat(wzn): implement and optimize evaluation for SRM and GRM trainers #12)
  - Add a high-entropy token selection mechanism (feature(sunjx): add high entropy token selection #6)
- Analysis
  - Add analysis metrics to saved trajectories (feature(sunjx): add some analysis metrics to saved trajectories #5)
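The roadmap does not describe how high-entropy token selection works in LightRFT. As a rough illustration of the general idea only, the sketch below computes the Shannon entropy of each position's next-token distribution and marks the highest-entropy fraction of positions. The function names, the `top_fraction` parameter, and the use of plain Python lists are all assumptions made for this example, not LightRFT's API.

```python
import math
from typing import List, Sequence


def token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def high_entropy_mask(
    dists: Sequence[Sequence[float]], top_fraction: float = 0.2
) -> List[bool]:
    """Mark the `top_fraction` of positions with the highest entropy.

    `top_fraction` is a hypothetical knob chosen for this sketch.
    """
    if not 0.0 < top_fraction <= 1.0:
        raise ValueError("top_fraction must be in (0, 1]")
    entropies = [token_entropy(d) for d in dists]
    # Keep at least one position whenever there is any input.
    keep = max(1, math.ceil(top_fraction * len(entropies)))
    # Rank positions by entropy, highest first; ties keep the earlier position.
    ranked = sorted(range(len(entropies)), key=lambda i: -entropies[i])
    selected = set(ranked[:keep])
    return [i in selected for i in range(len(entropies))]
```

A mask like this could then be used, for example, to restrict a loss or an analysis metric to the selected positions; whether LightRFT does so is not stated in this roadmap.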
⚙️ Compatibility & Dependencies
- Library Updates
  - [WIP] Adapt LightRFT to the latest versions of `sglang`, `vllm`, and `deepspeed` (polish(pu): adapt lightrft to latest versions of sglang #24)
- Framework Compatibility
  - Rename `dtype` to `torch_dtype` for better `transformers` compatibility (fix(wzn): rename dtype to torch_dtype in for transformers compatibility #7)
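To make the `dtype` to `torch_dtype` rename concrete, here is a minimal sketch of a keyword-argument shim that forwards a legacy `dtype` key under the new name. The helper `normalize_dtype_kwarg` is hypothetical and written only for illustration; the roadmap does not say how the rename was actually implemented.

```python
from typing import Any, Dict


def normalize_dtype_kwarg(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `kwargs` with a legacy `dtype` key renamed to `torch_dtype`.

    Hypothetical helper for illustration; not part of LightRFT or transformers.
    """
    out = dict(kwargs)  # copy so the caller's dict is left untouched
    if "dtype" in out:
        value = out.pop("dtype")
        # Refuse to guess if both spellings are given with different values.
        if "torch_dtype" in out and out["torch_dtype"] != value:
            raise ValueError("conflicting 'dtype' and 'torch_dtype' values")
        out["torch_dtype"] = value
    return out
```

The normalized dictionary could then be passed on to a model-loading call that expects `torch_dtype`.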
🐛 Bug Fixes & Maintenance
- Fixes
  - Fix bug in GRM dataset message formatting and evaluation logic (fix(wzn): fix bug in GRM dataset message formatting and evaluation logic #8)
  - Remove redundant tuple nesting in the `prepare_reward_model` return value when using FSDP (fix(wzn): remove redundant tuple nesting in prepare_reward_model return when using FSDP #15)
- Code Style
  - Fix `make fcheck` in `lightrft/datasets` for linting errors (style(wzn): fix make fcheck in lightrft/datasets for linting errors #10)
📚 Documentation
- Deployment
  - Set up documentation deploy actions (doc(nyz): setup doc deploy actions #18)
- Content Updates
  - Update Python typing lint (style(nyz): polish typing lint #26)
  - Polish API comment doc details (polish(pu): polish api doc #21)
  - Update GRM on T2I benchmark results and analysis in best practices (docs(wzn): update GRM on T2I benchmark results and analysis in best practice #9)
  - Update general documentation and README for v0.1.1 (doc(nyz): update doc and README for v0.1.1 #2)