forked from iree-org/iree
Tensorcore specialization for unaligned k #1
Open
okkwon wants to merge 27 commits into main from tensorcore-specialization-for-unaligned-K
There are cases where it can fail, so skip reporting it as a pass failure.
The function is always used for reduction cases, so let's remove the flag. In addition, add `peel` to peel the serial loop when the warp-level tiling is requested.
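As a rough illustration of what peeling a serial reduction loop means when K is not a multiple of the tile size, here is a minimal sketch in plain C. It is not the compiler pass from this pull request: `TILE_K`, `dot_peeled`, and `dot_reference` are hypothetical names, and the tile size of 16 is an assumption chosen only for the example.

```c
#include <stddef.h>

/* Hypothetical tile size along K; real tensor-core shapes depend on the target. */
enum { TILE_K = 16 };

/* Dot product with the serial K loop peeled into two parts:
 * a main loop that only ever sees full TILE_K-sized tiles, and a
 * remainder loop that handles the unaligned tail. */
float dot_peeled(const float *a, const float *b, size_t k) {
  size_t k_main = k - (k % TILE_K); /* largest multiple of TILE_K <= k */
  float acc = 0.0f;

  /* Main loop: every iteration covers exactly TILE_K elements, so a
   * specialized (e.g. tensor-core) kernel could be used here. */
  for (size_t k0 = 0; k0 < k_main; k0 += TILE_K) {
    for (size_t i = 0; i < TILE_K; ++i) {
      acc += a[k0 + i] * b[k0 + i];
    }
  }

  /* Peeled remainder: fewer than TILE_K elements, handled generically. */
  for (size_t i = k_main; i < k; ++i) {
    acc += a[i] * b[i];
  }
  return acc;
}

/* Unpeeled reference used to check that peeling preserves the result. */
float dot_reference(const float *a, const float *b, size_t k) {
  float acc = 0.0f;
  for (size_t i = 0; i < k; ++i) {
    acc += a[i] * b[i];
  }
  return acc;
}
```

With `k = 37`, the main loop covers elements 0 through 31 in two full tiles and the remainder loop covers the last five; both functions accumulate in the same order, so they return the same value.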
4f6ff80 to cf2beec
cf2beec to 1f8c8c9
okkwon pushed a commit that referenced this pull request Aug 30, 2023
Caught by ASan:
```
370: =================================================================
370: ==3911909==ERROR: LeakSanitizer: detected memory leaks
370:
370: Direct leak of 376 byte(s) in 1 object(s) allocated from:
370:     #0 0x6a9b022 in calloc (iree-build/tools/iree-run-mlir+0x6a9b022)
370:     #1 0x6ad5d47 in iree_allocator_system_alloc iree/runtime/src/iree/base/allocator.c:104:17
370:     #2 0x6ad5d47 in iree_allocator_system_ctl iree/runtime/src/iree/base/allocator.c:144:14
370:     #3 0x6ad56ad in iree_allocator_issue_alloc iree/runtime/src/iree/base/allocator.c:27:10
370:     #4 0x6ad56ad in iree_allocator_malloc iree/runtime/src/iree/base/allocator.c:32:10
370:     #5 0x1acf2486 in iree_vm_bytecode_module_create iree/runtime/src/iree/vm/bytecode/module.c:836:3
370:     #6 0x6afdf31 in iree_tooling_create_run_context iree/runtime/src/iree/tooling/run_module.c:107:9
370:     #7 0x6afdf31 in iree_tooling_run_module_with_data iree/runtime/src/iree/tooling/run_module.c:340:3
370:     #8 0x6ad2a24 in iree::(anonymous namespace)::CompileAndRunFile(iree_compiler_session_t*, char const*) iree/tools/iree-run-mlir-main.cc:359:3
370:     #9 0x6ad2a24 in main iree/tools/iree-run-mlir-main.cc:520:20
370:     #10 0x7fce3bc456c9 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
```
okkwon pushed a commit that referenced this pull request Sep 27, 2023
* Initial checkout and CI setup.
okkwon pushed a commit that referenced this pull request Nov 20, 2023
…e_thread_request_affinity` (iree-org#15499)

TSan report:
```
WARNING: ThreadSanitizer: data race (pid=45817)
  Read of size 4 at 0x0001084004e0 by thread T2:
    #0 iree_thread_request_affinity threading_darwin.c:230 (local-task_vmvx_semaphore_submission_test:arm64+0x100078f40)
    #1 iree_task_worker_main worker.c:385 (local-task_vmvx_semaphore_submission_test:arm64+0x100071594)
    #2 iree_thread_start_routine threading_darwin.c:72 (local-task_vmvx_semaphore_submission_test:arm64+0x100078e3c)

  Previous write of size 4 at 0x0001084004e0 by main thread:
    #0 iree_thread_create threading_darwin.c:140 (local-task_vmvx_semaphore_submission_test:arm64+0x100078ca4)
    #1 iree_task_worker_initialize worker.c:66 (local-task_vmvx_semaphore_submission_test:arm64+0x1000714f8)
    #2 iree_task_executor_create executor.c:161 (local-task_vmvx_semaphore_submission_test:arm64+0x10006b2b0)
```

The read of `thread->mach_port` at https://github.com/openxla/iree/blob/ccc4c3719cea467477a783f1c9e9f1fc06b4c508/runtime/src/iree/base/internal/threading_darwin.c#L230 is not ordered relative to the write of that variable in the parent thread after `pthread_mach_thread_np` returns: https://github.com/openxla/iree/blob/ccc4c3719cea467477a783f1c9e9f1fc06b4c508/runtime/src/iree/base/internal/threading_darwin.c#L140

The proposed fix is that the worker thread does not need to access its own `thread->mach_port`; it can equivalently call `mach_task_self()`.
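The same pattern can be sketched with portable POSIX threads. This is an analogy, not the Darwin code from the commit: `worker_t`, `worker_main`, and `run_worker` are hypothetical names, and `pthread_self()` stands in for the Mach call. The point is only that the worker asks the system for its own identity instead of reading a field that the parent thread may still be writing.

```c
#include <pthread.h>

/* Hypothetical worker record, loosely modeled on a thread wrapper struct. */
typedef struct {
  pthread_t handle;  /* written by the parent when pthread_create stores the id */
  pthread_t self_id; /* written by the worker, read by the parent after join */
} worker_t;

static void *worker_main(void *arg) {
  worker_t *w = (worker_t *)arg;
  /* Racy alternative (deliberately NOT done here): reading w->handle.
   * The parent's store to that field is not ordered with respect to the
   * start of this thread, so the read could observe an unset value. */
  w->self_id = pthread_self(); /* race-free: no shared field is read */
  return NULL;
}

/* Returns 1 if the id the worker obtained for itself matches the handle
 * the parent holds, 0 if not, and -1 on thread creation/join failure. */
int run_worker(void) {
  worker_t w;
  if (pthread_create(&w.handle, NULL, worker_main, &w) != 0) return -1;
  /* pthread_join orders the worker's write to self_id before this read. */
  if (pthread_join(w.handle, NULL) != 0) return -1;
  return pthread_equal(w.handle, w.self_id) ? 1 : 0;
}
```

The design choice mirrors the commit message: removing the cross-thread read eliminates the race without adding a lock or an atomic.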
No description provided.