Releases: inclusionAI/asystem-awex
v0.7.0
What's Changed
- fix: improve vLLM plugin compatibility and NCCL receive handling by @garrett4wade in #109
New Contributors
- @garrett4wade made their first contribution in #109
Full Changelog: v0.6.0...v0.7.0
v0.6.0
v0.5.0
What's Changed
- feat: support infer slice tensor at dim > 0 and optimize memory by @PrometheusComing in #106
- add validate() method to InferenceConfig by @alekseevpavel04 in #105
New Contributors
- @PrometheusComing made their first contribution in #106
- @alekseevpavel04 made their first contribution in #105
Full Changelog: v0.4.0...v0.5.0
v0.4.0
What's Changed
- feat: support mla and linear attention by @chaokunyang in #107
Full Changelog: v0.3.0...v0.4.0
v0.3.0
What's Changed
- feat: add vLLM/NPU integration and improve public integration readiness by @HwVanICI in #102
- Fix vllm engine name propagation in infer conf by @qianderekwang in #103
- feat: add auto release by @chaokunyang in #104
New Contributors
- @HwVanICI made their first contribution in #102
- @qianderekwang made their first contribution in #103
Full Changelog: v0.2.0...v0.3.0
v0.2.0
What's Changed
- doc: add RDMA picture by @chaokunyang in #85
- fix: fix unfinished_ranks order by @chaokunyang in #93
- fix sglang expert sharding by @chaokunyang in #95
- fix: fix clone in group_tensors_by_shape_and_dtype by @chaokunyang in #98
- feat: check infer/train tp size by @chaokunyang in #99
- fix: fix badge height by @chaokunyang in #101
Full Changelog: v0.1.0...v0.2.0
v0.1.0
What's Changed
- feat: add basic util for meta server by @chaokunyang in #1
- feat: add meta server for weights meta exchange by @chaokunyang in #2
- test: add http meta server tests by @chaokunyang in #3
- feat: add util to create extra weight exchange nccl process group by @chaokunyang in #4
- feat: add util to check weights diff for weights validation by @chaokunyang in #5
- feat: add weights exchange statistics compute util by @chaokunyang in #6
- feat: add util to check train-infer weight meta consistency by @chaokunyang in #7
- feat: add setup_batch_isend_irecv util to setup nccl batch send/recv by @chaokunyang in #8
- feat: add rank info struct by @chaokunyang in #9
- feat: add sharding strategy by @chaokunyang in #10
- feat: add sglang sharding strategy by @chaokunyang in #11
- feat: add megatron sharding strategy by @chaokunyang in #12
- feat: add weight meta struct def by @chaokunyang in #13
- feat: add weight meta resolver to collect and build meta info for weights f… by @chaokunyang in #14
- feat: add infer meta resolver by @chaokunyang in #15
- feat: add training meta resolver by @chaokunyang in #16
- feat: add more utils for awex by @chaokunyang in #17
- feat: add megatron weight converter by @chaokunyang in #18
- feat: add sglang weight converter by @chaokunyang in #19
- feat: refine param sharding by @chaokunyang in #20
- refactor: move meta resolver to meta package by @chaokunyang in #21
- feat: add customize process group support by @chaokunyang in #22
- refactor: move common util to util package by @chaokunyang in #23
- fix: fix meta server test by @chaokunyang in #24
- fix: fix imports for meta package by @chaokunyang in #25
- fix: fix weights_converter by @chaokunyang in #26
- feat: add model registry by @chaokunyang in #27
- feat: add basic gpu utils by @chaokunyang in #28
- feat: add basic system utils by @chaokunyang in #29
- chore: add transfer plan tests by @chaokunyang in #30
- feat: add weights writer by @chaokunyang in #31
- feat: add nccl weights writer by @chaokunyang in #32
- feat: add weights reader by @chaokunyang in #33
- feat: add nccl weights reader for separate mode by @chaokunyang in #34
- feat: add nccl weights reader for colocate mode by @chaokunyang in #35
- feat: add astate weights reader/writer by @chaokunyang in #36
- feat: add ling model support by @chaokunyang in #37
- refactor: make nccl transport stateful for future cuda graph opt by @chaokunyang in #38
- feat: add nccl concurrent batch transport with multi cuda stream by @chaokunyang in #39
- fix: fix cuda stream and clean code by @chaokunyang in #40
- feat: decouple train and infer dependency to run on different image by @chaokunyang in #41
- chore: add tensor utils test by @chaokunyang in #42
- chore: add weight utils test by @chaokunyang in #43
- feat: add infer config and use it instead of server args by @chaokunyang in #44
- feat: add tensor util by @chaokunyang in #46
- feat: add transfer plan by @chaokunyang in #47
- feat: add customized logging since sglang/mcore override logging by @chaokunyang in #48
- feat: add train/infer engine abstraction by @chaokunyang in #49
- feat: add sglang engine integration by @chaokunyang in #50
- feat: add megatron engine integration by @chaokunyang in #51
- chore: add more checks for execute_task_in_model_worker by @chaokunyang in #52
- chore: lint code by ruff by @chaokunyang in #53
- feat: add license header by @chaokunyang in #54
- refactor: rename backend to engine by @chaokunyang in #55
- refactor: refine config and engine API by @chaokunyang in #56
- chore: add meta resolver test by @chaokunyang in #57
- fix: fix rename hg_config by @chaokunyang in #58
- chore: add SimpleGPT and hf config for test by @chaokunyang in #59
- chore: add weight writer test by @chaokunyang in #60
- chore: add weight exchange integration test by @chaokunyang in #61
- chore: add missing license header by @chaokunyang in #62
- chore: convert huggingface model to megatron for tests by @chaokunyang in #63
- feat: use interleaved execution to optimize nccl colocate transport by @chaokunyang in #64
- chore: remove batch_isend_irecv for colocate mode by @chaokunyang in #65
- feat: support async isend/irecv for colocate mode by @chaokunyang in #66
- feat: add pull request template by @chaokunyang in #67
- chore: add code owner for pr review notify by @chaokunyang in #68
- feat: add github issue template by @chaokunyang in #69
- feat: add pyproject.toml build by @chaokunyang in #70
- refactor: move nccl ops build to nccl_comm by @chaokunyang in #71
- feat: refactor nccl comm and support qwen model by @chaokunyang in #72
- feat: add AGENTS.md to make AI coding easier by @chaokunyang in #73
- doc: update readme with architecture picture by @chaokunyang in #74
- doc: add design doc by @chaokunyang in #75
- fix: fix meta resolver test by @chaokunyang in #76
- chore: lint code by ruff by @chaokunyang in #77
- chore: fix ruff error by @chaokunyang in #78
- chore: format markdown docs by @chaokunyang in #79
- chore: add lint ci by @chaokunyang in #80
- refactor: refine API and unify name style by @chaokunyang in #81
- fix: fix ci error by @chaokunyang in #82
- doc: add detailed design doc by @chaokunyang in #83
- doc: add sglang execute_task_in_model_worker patch doc by @chaokunyang in #84
New Contributors
- @chaokunyang made their first contribution in #1
Full Changelog: https://github.com/inclusionAI/asystem-awex/commits/v0.1.0