After int4 quantization with basic_quant_mix.py, the qwen2-32b model only shrank from 62 GB to 52 GB, which is still too large.
My understanding is that the quantized model was saved in EETQ format.
How can we save only the quantized model, without the EETQ part?