Some code logic in xpu-perf/projects/micro_perf/op_defs/llm_ops/store_kv_cache.py needs confirmation. #195

@Riverstrider

Description

Code location:
Lines 273 to 288 of xpu-perf-main/projects/micro_perf/op_defs/llm_ops/store_kv_cache.py

Problem description:

  1. Given that the input packed QKV tensor has shape [num_tokens, total_dim], line 274, src_k_data = packed_qkv[token_start:token_end, self.k_dim_start:self.k_dim_end], produces a src_k_data that is also two-dimensional, with shape [token_num, total_dim];
  2. Line 275, src_k_data = src_k_data.contiguous().transpose(0, 1), transposes it, so src_k_data then has shape [total_dim, token_num];
  3. Line 276, dst_k_cache = k_cache[kv_slot_id, :, cache_start:cache_end, :], selects dst_k_cache, whose underlying storage has shape [slot_id, head_num, token_num, head_dim];
  4. Line 280, dst_k_cache.copy_(src_k_data), cannot copy a src_k_data of shape [total_dim, token_num] into a dst_k_cache whose storage shape is [slot_id, head_num, token_num, head_dim].
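
The mismatch in the four steps above can be checked at the shape level without running the operator. The following is a minimal sketch in plain Python that applies the broadcasting rule `torch.Tensor.copy_` requires of its source (each trailing source dimension must equal the destination dimension or be 1). The helper `can_broadcast_to` and the concrete sizes are hypothetical and not taken from store_kv_cache.py, and it assumes `kv_slot_id` is a single integer, so the destination view is three-dimensional.

```python
def can_broadcast_to(src_shape, dst_shape):
    """Return True if a tensor of src_shape can be broadcast to dst_shape."""
    if len(src_shape) > len(dst_shape):
        return False
    # Compare dimensions right-aligned, as broadcasting does.
    for s, d in zip(reversed(src_shape), reversed(dst_shape)):
        if s != d and s != 1:
            return False
    return True


# Hypothetical sizes, chosen only for illustration.
token_num, head_num, head_dim = 8, 4, 16
k_width = head_num * head_dim  # width of the 2-D K slice (the issue calls it total_dim)

# Shape after line 275: the 2-D slice transposed.
src_after_transpose = (k_width, token_num)   # (64, 8)

# Shape of k_cache[kv_slot_id, :, cache_start:cache_end, :],
# assuming kv_slot_id is an integer index.
dst_view = (head_num, token_num, head_dim)   # (4, 8, 16)

mismatch = not can_broadcast_to(src_after_transpose, dst_view)
print(mismatch)  # True
```

Under these assumptions the last dimensions already disagree (8 versus 16), so the copy on line 280 would be rejected regardless of the actual sizes, unless they happen to coincide.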

Should this be changed as follows?
Change the input packed QKV shape to three dimensions, [num_tokens, head_num, head_dim], and change line 274 to src_k_data = packed_qkv[token_start:token_end, k_head_start:k_head_end, :]
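
As a rough check of the proposed layout, the sketch below tracks only shapes, again in plain Python. It assumes the transpose on line 275 is kept, that heads are packed in Q, K, V order, and that `kv_slot_id` is an integer; the helper `transpose_shape`, the head counts, and the other sizes are hypothetical.

```python
def transpose_shape(shape, dim0, dim1):
    """Shape after swapping two dimensions, as Tensor.transpose would."""
    out = list(shape)
    out[dim0], out[dim1] = out[dim1], out[dim0]
    return tuple(out)


# Hypothetical sizes, chosen only for illustration.
token_num, q_head_num, kv_head_num, head_dim = 8, 8, 4, 16
total_head_num = q_head_num + 2 * kv_head_num

# Proposed 3-D input: [num_tokens, head_num, head_dim].
packed_qkv_shape = (token_num, total_head_num, head_dim)

# Assumed head layout Q | K | V, so K heads follow the Q heads.
k_head_start = q_head_num
k_head_end = q_head_num + kv_head_num

# Shape of packed_qkv[token_start:token_end, k_head_start:k_head_end, :].
src_k_shape = (token_num, k_head_end - k_head_start, head_dim)

# Shape after the existing transpose(0, 1).
src_k_transposed = transpose_shape(src_k_shape, 0, 1)

# Shape of k_cache[kv_slot_id, :, cache_start:cache_end, :].
dst_k_view = (kv_head_num, token_num, head_dim)

shapes_match = src_k_transposed == dst_k_view
print(src_k_transposed, shapes_match)  # (4, 8, 16) True
```

With the three-dimensional slice, the transposed source and the destination view agree dimension by dimension, which is what `copy_` needs. Whether the surrounding code expects this layout is something only the maintainers can confirm.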
