@Gary828 Gary828 commented Jan 2, 2026

Implementation of the sum, topk, var, var_mean, and all operators.
The Iluvatar (天数) platform is having problems, so nothing was tested on it.

cpu

[Screenshots: cpu1 to cpu7]

nvidia

[Screenshots: nvidia1 to nvidia7]

moore

[Screenshots: moore1 to moore7]

metax

[Screenshots: metax1 to metax7]

Honor code

HONOR_CODE.md

Reference

REFERENCE.md

Other

topk operator:

  • In the PyTorch implementation, when values are equal, I could not find a fixed rule that decides which index comes first.
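
Because the order of indices among equal values is not fixed, a test could compare the top-k values exactly and check the indices only for validity. Below is a minimal pure-Python sketch of that idea for a 1-D input. The helper names `topk_reference` and `topk_matches` are hypothetical and are not taken from this PR or from PyTorch. The "lower index first" tie-break in the reference is an arbitrary choice made for illustration.

```python
def topk_reference(values, k, largest=True):
    """Return (top-k values, top-k indices) for a 1-D list.

    Ties are broken by the lower index first. This rule is an assumption
    made for the sketch; it is not claimed to be what PyTorch does.
    """
    order = sorted(
        range(len(values)),
        key=lambda i: (-values[i] if largest else values[i], i),
    )
    picked = order[:k]
    return [values[i] for i in picked], picked


def topk_matches(values, k, got_values, got_indices, largest=True):
    """Tie-tolerant check of a top-k result.

    The values must equal the reference values exactly. The indices only
    need to be distinct, in range, and point at the reported values.
    """
    expected_values, _ = topk_reference(values, k, largest)
    if list(got_values) != expected_values:
        return False
    if len(set(got_indices)) != len(got_indices):
        return False
    return all(
        0 <= i < len(values) and values[i] == v
        for i, v in zip(got_indices, got_values)
    )


data = [3, 1, 3, 2]
# Both index orders are accepted for the tied value 3.
print(topk_matches(data, 2, [3, 3], [0, 2]))  # True
print(topk_matches(data, 2, [3, 3], [2, 0]))  # True
# A repeated index is rejected.
print(topk_matches(data, 2, [3, 3], [0, 0]))  # False
```

With a check of this shape, an implementation passes regardless of which tied index it returns first, as long as every returned index really holds the returned value.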

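all operator:

  • PyTorch has a problem with out-of-place computation into a strided output: results get overwritten (the value at one offset is updated twice), so the out-of-place tests were removed. In the example below, the out-of-place result does not match the in-place result: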
>>> torch.all(input, dim=2, keepdim=True, out=output)
tensor([[[ True],
         [ True],
         [ True],
         [ True],
         [ True]],

        [[ True],
         [ True],
         [ True],
         [ True],
         [False]],

        [[ True],
         [False],
         [ True],
         [ True],
         [ True]],

        [[ True],
         [ True],
         [ True],
         [ True],
         [ True]]])
>>> torch.all(input, dim=2, keepdim=True).bool()
tensor([[[ True],
         [ True],
         [ True],
         [ True],
         [ True]],

        [[ True],
         [ True],
         [ True],
         [ True],
         [False]],

        [[ True],
         [ True],
         [ True],
         [ True],
         [ True]],

        [[ True],
         [ True],
         [ True],
         [ True],
         [ True]]])
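
The description above says that one offset of the strided output is written twice. The PR does not state the strides of `output`, so the pure-Python sketch below only illustrates how that can happen in general: it enumerates the element offset of every logical index for a given shape and strides and reports offsets that are reached more than once. The function names and the overlapping strides `(4, 1, 1)` are assumptions chosen for illustration; only the shape `(4, 5, 1)` matches the printed output.

```python
from collections import Counter
from itertools import product


def element_offsets(shape, strides):
    """Yield the memory offset (in elements) of every logical index."""
    for index in product(*(range(n) for n in shape)):
        yield sum(i * s for i, s in zip(index, strides))


def repeated_offsets(shape, strides):
    """Return {offset: count} for offsets reached by more than one index."""
    counts = Counter(element_offsets(shape, strides))
    return {offset: n for offset, n in counts.items() if n > 1}


shape = (4, 5, 1)

# Contiguous layout: every index has its own offset.
print(repeated_offsets(shape, (5, 1, 1)))  # {}

# Hypothetical overlapping layout: rows are 4 elements apart but 5 long,
# so the last element of one row shares an offset with the first of the next.
print(repeated_offsets(shape, (4, 1, 1)))  # {4: 2, 8: 2, 12: 2}
```

A test harness could run a check like this on each output descriptor and skip, or separately flag, cases where any offset repeats, since the final value at such an offset depends on write order.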


gary and others added 3 commits January 3, 2026 00:44
todo:parameter

add sum cpu impl

kernel.cuh

support sum nvidia

modified sum_infiniop.cc

fix ambiguous zero value for iluvatar

fix iluvatar nan value for sum kernel

add support for moore and metax

fix moore and metax include path

fix moore and metax include path

fix moore bug in sum/operator.cc

fix moore bug in sum_moore.mu

fix moore bug in kernel.cuh

fix dtype bug in kernel.cuh

fix dtype bug in kernel.cuh 1

fix sum test in moore

add var_mean kernel

add var_mean kernel

rename var_mean/moore kernel files

bug fix1224

Remove accidentally committed topk files

support nvidia var_mean

support ops::var cpu

ops::var moore metax v0

ops::var/var_mean moore metax v1

bug fix1224

topk cuda v0

pass cpu test for topk

support topk cuda

to test moore & metax

v0

support all

support ops::all

support moore/metax ops::all v0

fix ops::all kernel

support cpu & nvidia

headfile name fix

fix headfile inclue

fix nv_bfloat16 in metax&moore

fix typo

fix typo1

fix device type

fix filename typo

fix unused variable in iluvatar

ignore ref code

reformat

delete redundant files
@Gary828 Gary828 requested a review from a team January 2, 2026 16:52