Why the use of stochastic rounding in forward pass? #2

@TheTinyTeddy

Many thanks for the great work!

My understanding is that deterministic round-to-nearest-even is applied in the forward pass for the best accuracy, while stochastic rounding (SR) is applied in the backward pass to avoid quantization bias. However, in your paper and implementation, SR is applied in both the forward and backward passes. Is there a reason for that?
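
To make the distinction concrete, here is a minimal sketch of the two rounding modes as I understand them. It is not taken from the paper or the repository: the function names are hypothetical, and it assumes a uniform grid with spacing `step`, whereas real low-precision floating-point formats have non-uniform spacing.

```python
import math
import random


def round_nearest_even(x: float, step: float) -> float:
    """Deterministic rounding to the nearest grid point, ties to even."""
    # Python's built-in round() resolves exact ties to the even integer.
    return round(x / step) * step


def round_stochastic(x: float, step: float, rng: random.Random) -> float:
    """Round up with probability equal to the distance above the lower grid point."""
    scaled = x / step
    lower = math.floor(scaled)
    frac = scaled - lower  # in [0, 1)
    round_up = rng.random() < frac
    return (lower + (1 if round_up else 0)) * step


def mean_stochastic(x: float, step: float, n: int, rng: random.Random) -> float:
    """Average of n independent stochastic roundings of the same value."""
    return sum(round_stochastic(x, step, rng) for _ in range(n)) / n


rng = random.Random(0)  # fixed seed so the sketch is reproducible
x, step = 0.3, 1.0

# Nearest-even always returns the same grid point, so the error is the same every time.
rne = round_nearest_even(x, step)

# Stochastic rounding has a larger error per sample, but its average approaches x.
sr_mean = mean_stochastic(x, step, 100_000, rng)

print(f"nearest-even: {rne}, mean of stochastic roundings: {sr_mean:.3f}")
```

In this sketch, nearest-even gives the smaller error on any single value, while the stochastic version is unbiased in expectation, which is why I expected the former in the forward pass and the latter only in the backward pass.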

Kind regards
