Restrict native shuffle ops to Float32, Int32, and UInt32 for Metal #7

Merged
epilliat merged 1 commit into epilliat:main from WilliBee:metal_backend_squashed on May 6, 2026

Conversation

@WilliBee
Contributor

Continuing from #5 (review)

  • restricted native shuffle ops to Float32, Int32, and UInt32 for Metal
  • left the Float64 Shuffle Up test deactivated, since Metal arrays (MtlArray) do not support Float64
  • reactivated ComplexType @shfl test for Metal by creating a ComplexTypeMetal with Float16 instead of Float64

@WilliBee WilliBee mentioned this pull request Apr 22, 2026
Owner

@epilliat epilliat left a comment

Looks good — small, scoped, and CI-green. Removing the 8/16-bit Metal fast paths is the right call: Apple's AIR shuffle intrinsics are 32-bit only, so the previous code was emitting ccalls to non-existent symbols. The recursive widening in src/warp.jl:122-137 transparently routes narrow types through UInt32, so every bitstype that worked before still works — just one extra convert/reinterpret hop. The new ComplexTypeMetal test (Float16 in place of Float64, since MtlArray doesn't support FP64) is exactly the right coverage for the nested-struct path. Thanks!
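
For readers following along, the widening idea described above can be sketched roughly as follows. This is not the code in src/warp.jl; `native_shfl`, `shfl_any`, `Inner`, and `Outer` are hypothetical names, and the "native" shuffle is a CPU stand-in that simply returns its input, so the example only demonstrates how narrow types and nested structs could be routed through 32-bit values without losing bits.

```julia
# Hypothetical stand-in for a backend shuffle that only exists for 32-bit types.
# It returns its input unchanged so the sketch runs on the CPU.
native_shfl(x::Union{Float32,Int32,UInt32}) = x

# 32-bit types take the native path directly.
shfl_any(x::Union{Float32,Int32,UInt32}) = native_shfl(x)

# Narrow unsigned integers: zero-extend to UInt32, shuffle, truncate back.
shfl_any(x::T) where {T<:Union{UInt8,UInt16}} = shfl_any(UInt32(x)) % T

# Other narrow primitives: reinterpret to the same-width unsigned type and recurse.
shfl_any(x::Int8)    = reinterpret(Int8,    shfl_any(reinterpret(UInt8,  x)))
shfl_any(x::Int16)   = reinterpret(Int16,   shfl_any(reinterpret(UInt16, x)))
shfl_any(x::Float16) = reinterpret(Float16, shfl_any(reinterpret(UInt16, x)))

# Structs: shuffle each field recursively and rebuild the value.
# Types with no fields and no method above (e.g. Float64) are rejected.
function shfl_any(x::T) where {T}
    n = fieldcount(T)
    n > 0 || throw(ArgumentError("no shuffle path for $T in this sketch"))
    return T(ntuple(i -> shfl_any(getfield(x, i)), n)...)
end

# Hypothetical nested struct, in the spirit of a Float16-based test type.
struct Inner
    re::Float16
    im::Float16
end

struct Outer
    z::Inner
    tag::UInt8
end
```

Under these assumptions, `shfl_any(Outer(Inner(Float16(1), Float16(-2)), 0x07))` walks the nested fields, sends each one through a 32-bit value, and rebuilds the struct, while a narrow type passed straight to `native_shfl` would fail with a `MethodError`.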

@epilliat epilliat merged commit dc56d6a into epilliat:main May 6, 2026
1 check passed
@epilliat epilliat mentioned this pull request May 7, 2026