Releases · inclusionAI/asystem-awex

22 Apr 09:13

chaokunyang

v0.7.0 Latest

What's Changed

fix: improve vLLM plugin compatibility and NCCL receive handling by @garrett4wade in #109

New Contributors

@garrett4wade made their first contribution in #109

Full Changelog: v0.6.0...v0.7.0

Contributors

garrett4wade

18 Apr 08:45

chaokunyang

v0.6.0

What's Changed

fix: improve validate by @PrometheusComing in #108

Full Changelog: v0.5.0...v0.6.0

Contributors

PrometheusComing

13 Apr 10:10

chaokunyang

v0.5.0

What's Changed

feat: support infer slice tensor at dim > 0 and optimize memory by @PrometheusComing in #106
add validate() method to InferenceConfig by @alekseevpavel04 in #105

New Contributors

@PrometheusComing made their first contribution in #106
@alekseevpavel04 made their first contribution in #105

Full Changelog: v0.4.0...v0.5.0

Contributors

PrometheusComing and alekseevpavel04

10 Apr 13:40

chaokunyang

v0.4.0

What's Changed

feat: support mla and linear attention by @chaokunyang in #107

Full Changelog: v0.3.0...v0.4.0

Contributors

chaokunyang

23 Mar 06:55

chaokunyang

v0.3.0

What's Changed

feat: add vLLM/NPU integration and improve public integration readiness by @HwVanICI in #102
Fix vllm engine name propagation in infer conf by @qianderekwang in #103
feat: add auto release by @chaokunyang in #104

New Contributors

@HwVanICI made their first contribution in #102
@qianderekwang made their first contribution in #103

Full Changelog: v0.2.0...v0.3.0

Contributors

chaokunyang, qianderekwang, and HwVanICI

22 Dec 02:55

chaokunyang

v0.2.0

What's Changed

doc: add rdma picture by @chaokunyang in #85
fix: fix unfinished_ranks order by @chaokunyang in #93
fix sglang expert sharding by @chaokunyang in #95
fix: fix clone in group_tensors_by_shape_and_dtype by @chaokunyang in #98
feat: check infer/train tp size by @chaokunyang in #99
fix: fix badge height by @chaokunyang in #101

Full Changelog: v0.1.0...v0.2.0

Contributors

chaokunyang

20 Nov 03:44

chaokunyang

v0.1.0

What's Changed

feat: add basic util for meta server by @chaokunyang in #1
feat: add meta server for weights meta exchange by @chaokunyang in #2
test: add http meta server tests by @chaokunyang in #3
feat: add util to create extra weight exchange nccl process group by @chaokunyang in #4
feat: add util to check weights diff for weights validation by @chaokunyang in #5
feat: add weights exchange statistics compute util by @chaokunyang in #6
feat: add util to check train-infer weight meta consistency by @chaokunyang in #7
feat: add setup_batch_isend_irecv util to setup nccl batch send/recv by @chaokunyang in #8
feat: add rank info struct by @chaokunyang in #9
feat: add sharding strategy by @chaokunyang in #10
feat: add sglang sharding strategy by @chaokunyang in #11
feat: add megatron sharding strategy by @chaokunyang in #12
feat: add weight meta struct def by @chaokunyang in #13
feat: add weight meta resolver to collect and build meta info for weights f… by @chaokunyang in #14
feat: add infer meta resolver by @chaokunyang in #15
feat: add training meta resolver by @chaokunyang in #16
feat: add more utils for awex by @chaokunyang in #17
feat: add megatron weight converter by @chaokunyang in #18
feat: add sglang weight converter by @chaokunyang in #19
feat: refine param sharding by @chaokunyang in #20
refactor: move meta resolver to meta package by @chaokunyang in #21
feat: add customize process group support by @chaokunyang in #22
refactor: move common util to util package by @chaokunyang in #23
fix: fix meta server test by @chaokunyang in #24
fix: fix imports for meta package by @chaokunyang in #25
fix: fix weights_converter by @chaokunyang in #26
reaf: add model registry by @chaokunyang in #27
feat: add basic gpu utils by @chaokunyang in #28
feat: add basic system utils by @chaokunyang in #29
chore: add transfer plan tests by @chaokunyang in #30
feat: add weights writer by @chaokunyang in #31
feat: add nccl weights writer by @chaokunyang in #32
feat: add weights reader by @chaokunyang in #33
feat: add nccl weights reader for separate mode by @chaokunyang in #34
feat: add nccl weights reader for colocate mode by @chaokunyang in #35
feat: add astate weights reader/writer by @chaokunyang in #36
feat: add ling model support by @chaokunyang in #37
refactor: make nccl transport stateful for future cuda graph opt by @chaokunyang in #38
feat: add nccl concurrent batch transport with multi cuda stream by @chaokunyang in #39
fix: fix cuda stream and clean code by @chaokunyang in #40
feat: decouple train and infer dependency to run on different image by @chaokunyang in #41
chore: add tensor utils test by @chaokunyang in #42
chore: add weight utils test by @chaokunyang in #43
feat: add infer config and use it instead of server args by @chaokunyang in #44
feat: add tensor util by @chaokunyang in #46
feat: add transfer plan by @chaokunyang in #47
feat: add customized logging since sglang/mcore override logging by @chaokunyang in #48
feat: add train/infer engine abstraction by @chaokunyang in #49
feat: add sglang engine integration by @chaokunyang in #50
feat: add megatron engine integration by @chaokunyang in #51
chore: add more checks for execute_task_in_model_worker by @chaokunyang in #52
chore: lint code by ruff by @chaokunyang in #53
feat: add license header by @chaokunyang in #54
refactor: rename backend to engine by @chaokunyang in #55
refactor: refine config and engine API by @chaokunyang in #56
chore: add meta resolver test by @chaokunyang in #57
fix: fix rename hg_config by @chaokunyang in #58
chore: add SimpleGPT and hf config for test by @chaokunyang in #59
chore: add weight writer test by @chaokunyang in #60
chore: add weight exchange integration test by @chaokunyang in #61
chore: add missing license header by @chaokunyang in #62
chore: convert huggingface model to megatron for tests by @chaokunyang in #63
feat: use interleaved execution to optimize nccl colocate transport by @chaokunyang in #64
chore: remove batch_isend_irecv for colocate mode by @chaokunyang in #65
feat: support async isend/irecv for colocate mode by @chaokunyang in #66
feat: add pull request template by @chaokunyang in #67
chore: add code owner for pr review notify by @chaokunyang in #68
feat: add github issue template by @chaokunyang in #69
feat: add pyproject.toml build by @chaokunyang in #70
refactor: move nccl ops build to nccl_comm by @chaokunyang in #71
feat: refactor nccl comm and support qwen model by @chaokunyang in #72
feat: add AGENTS.md to make AI coding easier by @chaokunyang in #73
doc: update readme with architecture picture by @chaokunyang in #74
doc: add design doc by @chaokunyang in #75
fix: fix meta resolver test by @chaokunyang in #76
chore: lint code by ruff by @chaokunyang in #77
chore: fix ruff error by @chaokunyang in #78
chore: format markdown docs by @chaokunyang in #79
chore: add lint ci by @chaokunyang in #80
refactor: refine API and unify name style by @chaokunyang in #81
fix: fix ci error by @chaokunyang in #82
doc: add detailed design doc by @chaokunyang in #83
doc: add sglang execute_task_in_model_worker patch doc by @chaokunyang in #84

New Contributors

@chaokunyang made their first contribution in #1

Full Changelog: https://github.com/inclusionAI/asystem-awex/commits/v0.1.0

Contributors

chaokunyang
