Open
Description
The current PrepareB function combines quantization and rearrangement. The rearrangement depends on register length. We're going to want to distribute int8 models in an architecture-independent format (probably row-major) and then rearrange them at load time. The Quantize function already converts to int8 without rearranging, so what's needed is an int8 rearrangement function.
Possibly with a preprocessing template, though that sounds complicated.
Also worth considering whether this should be done in place or by copying.
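To make the request concrete, here is a minimal sketch of what a copying int8 rearrangement might look like. Everything in it is an assumption for illustration: the name `RearrangeQuantizedB`, the `RegisterBytes` template parameter, and above all the target layout, which is a simplified stand-in and not the layout PrepareB actually produces. It only shows the shape of the idea: the input is already-quantized row-major int8, and the register length is the sole architecture-dependent knob.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch, NOT the real PrepareB layout.
// Input: already-quantized int8 matrix B, row-major, rows x cols.
// Output: rows are grouped into blocks of RegisterBytes; inside each block,
// the RegisterBytes values of one column are stored contiguously, so that a
// single aligned load would fetch one column chunk.
template <std::size_t RegisterBytes>
std::vector<std::int8_t> RearrangeQuantizedB(const std::int8_t *input,
                                             std::size_t rows,
                                             std::size_t cols) {
  // Assumed precondition: rows is a multiple of the register length.
  assert(rows % RegisterBytes == 0);
  std::vector<std::int8_t> output(rows * cols);
  std::size_t out = 0;
  for (std::size_t block = 0; block < rows; block += RegisterBytes) {
    for (std::size_t c = 0; c < cols; ++c) {
      for (std::size_t r = 0; r < RegisterBytes; ++r) {
        // Gather one column chunk from the row-major source.
        output[out++] = input[(block + r) * cols + c];
      }
    }
  }
  return output;
}
```

This version copies, which keeps the loop trivial at the cost of a second buffer. An in-place variant would have to follow the permutation's cycles or use scratch space per block, which is the trade-off raised above.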