Consider maximizing grid utilization #1321

@maleadt

We currently maximize block utilization (taking the maximum number of threads per block), which may leave SMs underutilized. We should consider first selecting an optimal number of blocks, before maximizing the thread count:

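    config = launch_configuration(kernel.fun)
    # start from the usual choice: as many threads per block as allowed
    threads = min(length(ps), config.threads)
    # XXX: this kernel performs much better with all blocks active
    blocks = max(cld(length(ps), threads), config.blocks)
    # spread the elements over those blocks, shrinking the block size
    threads = cld(length(ps), blocks)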

I'm sure this will lead to some kernels performing worse, but it's probably worth testing.
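
To make the difference concrete, here is a minimal sketch of just the arithmetic in plain Julia, with no GPU required. The function names `threads_first` and `blocks_first` are hypothetical, `max_threads` and `min_blocks` stand in for `config.threads` and `config.blocks`, and the numbers are made up for illustration rather than taken from a real device. The `threads_first` variant is only my reading of the "take the max threads" behaviour described above.

```julia
# Hypothetical helpers; not part of CUDA.jl.

# Assumed current behaviour: fill each block with as many threads as allowed,
# then launch however many blocks are needed to cover `n` elements.
function threads_first(n, max_threads)
    threads = min(n, max_threads)
    blocks = cld(n, threads)
    return (; threads, blocks)
end

# Proposed behaviour: launch at least `min_blocks` blocks,
# then shrink the block size so the blocks still cover `n` elements.
function blocks_first(n, max_threads, min_blocks)
    threads = min(n, max_threads)
    blocks = max(cld(n, threads), min_blocks)
    threads = cld(n, blocks)
    return (; threads, blocks)
end

# Made-up values for illustration only.
n, max_threads, min_blocks = 10_000, 1024, 80

a = threads_first(n, max_threads)             # (threads = 1024, blocks = 10)
b = blocks_first(n, max_threads, min_blocks)  # (threads = 125, blocks = 80)
```

With these numbers the first variant launches only 10 blocks, while the second launches 80 blocks of 125 threads. Note that for very small inputs (say `n = 10`) the second variant ends up with 80 single-thread blocks, most of which have no work, so that may be one of the cases that performs worse.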
