PrepareB but take integers instead of float #42

@kpu

Description

The current PrepareB function combines quantization and rearrangement. The rearrangement depends on register length. We're going to want to distribute int8 models in an architecture-independent fashion (probably as row major), then have them rearranged at load. The Quantize function already converts to int8 format without rearranging, so what's needed is an int8 rearrangement function.

Possibly with a preprocessing template, though that sounds complicated.

Also worth considering whether this should be done in place or by copying.
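
To make the request concrete, here is a minimal sketch of what a copying int8 rearrangement could look like. Everything in it is an assumption for illustration: the name `RearrangeInt8`, the `kRegisterBytes` template parameter, and especially the tiled layout, which is a simple stand-in and not the layout PrepareB actually produces. It only shows the shape of the idea: take an already-quantized row-major matrix and reorder it according to a register width chosen at compile time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of the existing API.
// Input:  row-major int8 matrix with rows x cols entries (already quantized).
// Output: tiles of kRegisterBytes consecutive rows taken from one column,
//         stored contiguously. Tiles are ordered by row chunk, then by column.
// The layout is an assumed example of a register-length-dependent ordering.
template <std::size_t kRegisterBytes>
std::vector<std::int8_t> RearrangeInt8(const std::vector<std::int8_t>& input,
                                       std::size_t rows, std::size_t cols) {
  assert(input.size() == rows * cols);
  assert(rows % kRegisterBytes == 0);  // sketch assumes no padding is needed
  std::vector<std::int8_t> output(input.size());
  for (std::size_t r = 0; r < rows; ++r) {
    const std::size_t chunk = r / kRegisterBytes;  // which block of rows
    const std::size_t lane = r % kRegisterBytes;   // position inside the tile
    for (std::size_t c = 0; c < cols; ++c) {
      output[(chunk * cols + c) * kRegisterBytes + lane] = input[r * cols + c];
    }
  }
  return output;
}
```

This is the copying variant, which needs a second buffer the size of the matrix. An in-place version would have to apply the same index mapping as a permutation (for example by following cycles), which saves memory at load time but is more involved to write and verify.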
