feat(vops): restore VF/VOPS vtile implementation (from session logs)#201
zhoubot wants to merge 1 commit into zhangstevenunity:main
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8381b3d708
//===----------------------------------------------------------------------===//

// pto.vf.scope { ... }
def VFScopeOp : PTO_Op<"vf.scope", [IsolatedFromAbove, NoRegionArguments, SingleBlockImplicitTerminator<"YieldOp">]> {
Allow vf.scope to capture surrounding SSA values
Removing data flow in and out of pto.vf.scope makes the op effectively unusable for real kernels: with IsolatedFromAbove, no region arguments, and no yielded values, the body cannot read function arguments or previously computed tiles/pointers, so typical vload/vstore sequences cannot operate on surrounding state. This is a behavioral regression for any lowering that expects vf.scope to wrap vectorized code over existing operands.
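To make the concern concrete, here is a minimal sketch of the rule that IsolatedFromAbove enforces, written as a toy model rather than against the MLIR API. The `ToyRegion` type, the integer value ids, and the `satisfiesIsolatedFromAbove` helper are all hypothetical names introduced for illustration. The model assumes the region has no block arguments, matching NoRegionArguments, so a value can only be legal inside the region if an op in the region defines it.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy stand-in for a region body. This is NOT the MLIR API: values are
// plain integer ids, and the region records only what it defines and uses.
struct ToyRegion {
  std::set<int> definedInside;  // ids of values produced by ops in the region
  std::vector<int> uses;        // ids of values consumed by ops in the region
};

// Models the IsolatedFromAbove constraint for a region without block
// arguments: every used value must be defined inside the region itself.
bool satisfiesIsolatedFromAbove(const ToyRegion &region) {
  for (int id : region.uses) {
    if (region.definedInside.count(id) == 0)
      return false;  // use of a value defined outside the region
  }
  return true;
}
```

In this model, a body that loads from a pointer passed as a function argument (say value id 0, defined outside the region) fails the check, while a body that only consumes values it produced itself passes. That is the situation the comment describes: with no captures, no region arguments, and no results, the body has no way to reach surrounding operands.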
for (auto &op : *rewriter.getInsertionBlock()) {
  if (auto all = dyn_cast<mlir::pto::VPredAllOp>(&op))
    return all.getPred();
Reuse only dominating vpred.all in rewrite helper
getOrCreatePredAll scans the whole block and returns the first pto.vpred.all it finds, even if that op is located after the rewrite insertion point. When that happens, VLoadTailToVLoad/VStoreTailToVStore build new ops that use a value defined later in the block. The result is an SSA dominance violation, which causes a verifier/pass failure on any block where vpred.all is emitted below the rewritten op.
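The difference between the two lookup strategies can be sketched with a toy model of a single block. This is not the MLIR API and not the PR's code: `ToyOp`, `findPredAllAnywhere`, and `findDominatingPredAll` are hypothetical names, and the block is modeled as a vector of op names with the insertion point as an index. The sketch relies on the fact that, within one block, an op dominates only the ops that come after it.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Toy stand-in for an operation in one basic block (NOT the MLIR API).
struct ToyOp {
  std::string name;  // e.g. "pto.vpred.all"
};

// Mirrors the flagged behavior: scans the whole block and returns the
// first pto.vpred.all, regardless of where the rewrite will insert.
std::optional<std::size_t> findPredAllAnywhere(const std::vector<ToyOp> &block) {
  for (std::size_t i = 0; i < block.size(); ++i) {
    if (block[i].name == "pto.vpred.all")
      return i;
  }
  return std::nullopt;
}

// Dominance-respecting variant: only considers ops strictly before the
// insertion point, so any hit is safe to use from the new op.
std::optional<std::size_t> findDominatingPredAll(const std::vector<ToyOp> &block,
                                                 std::size_t insertionPoint) {
  assert(insertionPoint <= block.size());
  for (std::size_t i = 0; i < insertionPoint; ++i) {
    if (block[i].name == "pto.vpred.all")
      return i;
  }
  return std::nullopt;  // caller should create a fresh vpred.all here
}
```

For a block ordered as vload_tail, vpred.all, vstore_tail, a rewrite at index 0 gets no hit from the dominance-respecting lookup and would create a new predicate at the insertion point, whereas the whole-block scan returns index 1, a value defined after its would-be user. A rewrite at index 2 can safely reuse the existing predicate in both variants.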
This PR restores the VF/VOPS (vector tile) implementation that was previously discussed/implemented but not landed, based on the OpenClaw session log:
What
- !pto.preg, !pto.uscalar<elem>, and !pto.vtile<lanes x elem>, with stable custom assembly formats.
- #pto.target_config<arch=..., isa=..., variant=..., repeat_bytes=..., block_bytes=..., caps={...}>
- pto.vf.scope
- pto.vpred.all, pto.vpred.tail
- pto.vload/vstore and pto.vload_tail/vstore_tail
- pto.vdup, pto.uload_row, pto.vload_block, pto.vlane_adapt
- vadd/vsub/vmul/vmin/vmax/vand/vor/vxor
- -pto-canonicalize-vops pass: count == lanes simplification
- docs/ir/PTO-IR-vf-vops-design.md

Files
- include/PTO/IR/PTOTypeDefs.td
- include/PTO/IR/PTOAttrs.td
- include/PTO/IR/PTOOps.td
- include/PTO/Transforms/Passes.td
- lib/PTO/IR/PTO.cpp
- lib/PTO/Transforms/PTOCanonicalizeVops.cpp
- lib/PTO/Transforms/CMakeLists.txt
- docs/ir/PTO-IR-vf-vops-design.md

Notes
This is a restored baseline; follow-up PRs can tighten verifiers (target_config requirement/capability gating) and add FileCheck coverage.