Some doubts regarding quantization in academia and industry.

I have a few questions. I previously quantized some models with torch.fx, and both the model size and the inference speed changed. However, I noticed that these tools are rarely used in academic work. So I used your code to learn how quantization is done in an academic context, but the model's size and speed do not seem to change, and the stored data (the saved weights) is unchanged as well. Is this normal? And is it unnecessary to fuse operators or perform similar steps?
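One possible explanation, sketched under an assumption that is not confirmed for this repository: research code often uses simulated ("fake") quantization, where values are rounded to the integer grid and immediately converted back to floating point. In that case the accuracy effect of quantization is reproduced, but tensors are still stored as float32, so size and speed stay the same. The helper names `quantize_int8` and `fake_quantize` below are hypothetical and use only the Python standard library.

```python
from array import array


def quantize_int8(values, scale):
    """Map floats to int8 codes: round(v / scale), clamped to [-128, 127]."""
    return array("b", (max(-128, min(127, round(v / scale))) for v in values))


def fake_quantize(values, scale):
    """Simulated quantization: quantize, then dequantize back to float32."""
    codes = quantize_int8(values, scale)
    return array("f", (c * scale for c in codes))


# Toy "weights" stored as float32 (illustrative values, not from any model).
weights = array("f", [0.11, -0.52, 0.98, -1.27, 0.33, 0.0, 0.75, -0.06])

# Symmetric per-tensor scale: the largest magnitude maps to code 127.
scale = max(abs(w) for w in weights) / 127

simulated = fake_quantize(weights, scale)  # float32 storage, values on the int8 grid
stored = quantize_int8(weights, scale)     # actual int8 storage

fp32_bytes = len(weights) * weights.itemsize
simulated_bytes = len(simulated) * simulated.itemsize
int8_bytes = len(stored) * stored.itemsize

print("float32 weights:  ", fp32_bytes, "bytes")
print("fake-quantized:   ", simulated_bytes, "bytes")
print("int8 codes:       ", int8_bytes, "bytes")
```

If this assumption holds, the unchanged file size is expected: only the int8 codes (plus the scale) take less space, and a speedup additionally needs integer kernels, which is where steps such as operator fusion in deployment toolchains come in.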
