
[TASK] Add INT8 Quantization Support in the Frontend #605

@zhanghb97

Description

Deliverables

  • A pull request (PR) enabling end-to-end DeepSeek inference with INT8 quantization

Task Description

  • Run the DeepSeek R1 inference pipeline to become familiar with the workflow.
  • Apply PyTorch INT8 quantization to DeepSeek R1 and evaluate inference performance on an AVX512-VNNI machine.
  • Extend the buddy-mlir frontend to support PyTorch INT8-quantized models.
  • After generating MLIR for the full model, build a complete end-to-end inference example modeled on the existing DeepSeek R1 example.
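Before wiring quantization into the frontend, it helps to be clear about the arithmetic involved. The sketch below shows symmetric per-tensor INT8 quantization, the scheme commonly used for weights in PyTorch's INT8 backends (including the VNNI-accelerated paths): floats are mapped to integer codes in [-127, 127] via a single scale factor. This is an illustrative minimal implementation, not the buddy-mlir or PyTorch code; the helper names are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor INT8 quantization: map floats to [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the INT8 range.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float values from the INT8 codes."""
    return [qi * scale for qi in q]

weights = [0.5, -1.2, 0.03, 2.4, -0.8]
q, scale = quantize_int8(weights)
recovered = dequantize_int8(q, scale)
# Per-element quantization error is bounded by half a step (scale / 2).
```

In a real pipeline the scale (and, for asymmetric schemes, a zero point) becomes part of the lowered MLIR, and the INT8 matrix multiplies are what AVX512-VNNI accelerates.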

Timeline

Phase               Time
------------------  ---------------------------
Coding Phase        Oct 30, 2025 – Nov 6, 2025
Code Review Begins  Nov 7, 2025

If finished ahead of schedule, the review process may begin earlier.
