Make sure TopK is deterministic when needed #312

@janimo

Description

The TopK operation relies on the Candle backend's asort() method, which is unstable because it uses partial_cmp().
This means inputs containing NaN/Inf may be sorted nondeterministically. While TopK in MoE router layers should only receive regular float-valued logits, we should consider reimplementing it with a stable sort if/when this becomes necessary.
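
For illustration, here is a minimal sketch of what a deterministic top-k selection could look like on a plain slice. It is not Candle code: `topk_indices` is a hypothetical helper, and it assumes the values are available as `&[f32]` on the CPU. It uses `f32::total_cmp`, which gives every float (including NaN and infinities) a fixed position, and breaks ties by index so equal logits always come out in the same order.

```rust
/// Hypothetical helper, not part of Candle: returns the indices of the `k`
/// largest values, ordered from largest to smallest.
pub fn topk_indices(values: &[f32], k: usize) -> Vec<usize> {
    let mut idx: Vec<usize> = (0..values.len()).collect();
    idx.sort_by(|&a, &b| {
        // `total_cmp` is a total order over all f32 bit patterns, so the
        // comparator never has to guess how to rank a NaN.
        values[b]
            .total_cmp(&values[a])
            // Tie-break on the original index: equal values keep input order.
            .then_with(|| a.cmp(&b))
    });
    // If `k` exceeds the input length, all indices are returned.
    idx.truncate(k);
    idx
}
```

With this ordering, `topk_indices(&[0.1, 0.7, 0.7, 0.3], 2)` yields `[1, 2]` on every run. Note that under `total_cmp` a NaN with its sign bit clear ranks above `+inf`, so a router that must never select NaN would still need to filter or mask such values first.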
