@LeiWang1999

This pull request includes several changes to the tilelang language, focusing on improving the parser and adding licensing information. The most important changes include modifications to the parser imports, the addition of Apache License headers, and improvements to device handling in the profiler utility.

Parser and import improvements:

  • tilelang/language/__init__.py: Replaced tvm.script.parser.tir imports with local parser imports and updated the use_swizzle, annotate_layout, and import_source functions to use block_attr instead of T.block_attr. [1] [2]

Licensing and documentation:

  • tilelang/language/ast/__init__.py, tilelang/language/ast/_ffi_api.py, tilelang/language/parser/__init__.py, tilelang/language/parser/entry.py, tilelang/language/parser/operation.py: Added Apache License headers to these files, indicating modifications from the original TVM project. [1] [2] [3] [4] [5]

Device handling improvement:

  • tilelang/utils/profiler.py: Modified the func function to use the device of the first input tensor if available, improving device handling.
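The device-selection rule described above can be sketched as follows. The profiler's actual code is not shown in this conversation, so `pick_device`, `FakeTensor`, and the `"cuda"` fallback are hypothetical names chosen for illustration; only the rule "use the device of the first input tensor if available" comes from the description.

```python
from dataclasses import dataclass
from typing import Any, Sequence


@dataclass
class FakeTensor:
    """Hypothetical stand-in for a tensor; only the `device` attribute matters here."""
    device: str


def pick_device(inputs: Sequence[Any], default: str = "cuda") -> str:
    """Return the device of the first input if it exposes one, else `default`.

    Assumption: the helper name and the fallback value are illustrative and
    are not taken from tilelang/utils/profiler.py.
    """
    if inputs:
        device = getattr(inputs[0], "device", None)
        if device is not None:
            return device
    return default


print(pick_device([FakeTensor("cuda:1"), FakeTensor("cuda:0")]))  # cuda:1
print(pick_device([]))       # cuda (no inputs, so the fallback is used)
print(pick_device([3.14]))   # cuda (a plain float has no `device` attribute)
```

Falling back to a default keeps kernels without tensor inputs working, while multi-GPU callers get outputs on the same device as their inputs.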

Other changes:

  • README.md: Updated the image format from SVG to PNG for MatmulExample.
  • tilelang/language/kernel.py: Enhanced the assertion message in the __enter__ method to provide more detailed information.
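As an illustration of what a more detailed assertion message in an `__enter__` method can look like, here is a hedged sketch. `LaunchFrame` and its `extents` and `expected` fields are hypothetical and do not reproduce the class in tilelang/language/kernel.py; the point is only that the message reports both the expected and the actual values.

```python
class LaunchFrame:
    """Hypothetical context manager; not the actual class from kernel.py."""

    def __init__(self, extents, expected=3):
        self.extents = list(extents)
        self.expected = expected

    def __enter__(self):
        # A bare `assert cond` fails without context; including the expected
        # and actual values makes the failure diagnosable from the traceback.
        assert len(self.extents) == self.expected, (
            f"LaunchFrame expects {self.expected} extents, "
            f"got {len(self.extents)}: {self.extents}"
        )
        return tuple(self.extents)

    def __exit__(self, exc_type, exc, tb):
        return False  # never suppress exceptions raised inside the block


with LaunchFrame([4, 2, 1]) as (bx, by, bz):
    print(bx, by, bz)

try:
    with LaunchFrame([4, 2]):
        pass
except AssertionError as err:
    print(err)
```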

@LeiWang1999 merged commit 8e217c9 into main on Jan 18, 2025
3 of 4 checks passed
uv-xiao pushed a commit to uv-xiao/tilelang that referenced this pull request Jan 1, 2026
…atomic operations (tile-ai#9)

* Add utility function for ensuring p2p atomic support

* Implement draft API of barrier blocks in a grid

* lint

* fix typo

* Migrate to `sync.h` and fix the absence of barrier for producers in warp-specialized codegen.

* Refactor for better language interface

* draft implementation of intranode barrierall

* set role of intranode barrier to both in ws rewriter pass

* Fix typo & adapt to TVM's new version

* lint

* improve ipc robustness by returning allocated structure

* fix typo; add readme for ipc extension

* Move barrier intrinsics to `sync.{h,cc}`

* - Modified barrier synchronization implementations in `sync.h` and `sync.cc` to replace `__syncthreads()` with `__threadfence()` for improved GPU synchronization.
- Disable warp specialization and TMA in `test_barrier_gpu.py`
- lint

* remove redundant code in `tensor_from_ptr.cpp` and add support for uint32 and uint64

* remove legacy codegen for `copy_unrolled` (no longer an intrinsic but an extern now)

* Refactor barrier synchronization and introduce new `BarrierAllBlocksSysOp` class

- Updated `get_offset` method to be a member function of `PushWarpOp` and `PullWarpOp` classes.
- Added `get_offset` method to `BarrierAllBlocksSysOp` for calculating offsets.
- Implemented `BarrierAllBlocksSysOp` class in `sync.h` and `sync.cc` for system-level barrier synchronization.
- Modified `barrier_all_blocks_sys` function in `builtin.py` to accept a single barrier argument.
- Added a new test script `test_barrierall_sys.py` to validate the functionality of the new barrier synchronization implementation.
- Lint

* Refactor CUDA synchronization handling

- Removed the `need_sync_` flag from the `CodeGenTileLangCUDA` class to simplify synchronization logic.
- Moved the inclusion of `sync.h` to be conditional on `use_distributed_` to ensure proper synchronization handling in distributed contexts.

* fix typo

* Enhance barrier synchronization in CUDA

- Introduced a macro for computing barrier pointers in `barrier_all_blocks_sys` to improve code readability and maintainability.
- Compute barrier pointers just in time to save registers

* lint

* Remove redundant import

* Update tilelang/distributed/testing/sync/test_barrierall_sys.py

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* Reorder memory fence and syncthreads

---------

Co-authored-by: Rachmanino <188805904201@163.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>