## Deliverables

* A PR that adds separate **tokens/s metrics** for **prefilling** and **decoding** in the current `buddy-deepseek-r1-run`.
* A PR that adds **buddy-deepseek-r1-cli**.

## Task Description

### Task 1: Compute and Display Tokens/s

* Implement the calculation of tokens-per-second.
* Display separate metrics for prefilling and decoding.

<img width="1834" height="1572" alt="Image" src="https://github.com/user-attachments/assets/31d21eeb-9274-4796-a7e6-3bcfb024a4ef" />

### Task 2: Add a CLI Tool

* Add a tool that, when executed, **does not print each token and its timestamp**, but instead **streams only the final generated text** in real time.
* Use the `llama.cpp` CLI tool as a reference.

## Timeline

* **Coding phase:** 2025.11.17 – 2025.11.18
* **Code review:** Begins on 2025.11.19
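For Task 1, the tokens/s calculation could look roughly like the sketch below. It is only an illustration, not the existing code in `buddy-deepseek-r1-run`: the names `tokensPerSecond`, `ThroughputReport`, and `makeReport` are hypothetical, and it assumes the prompt length, the number of generated tokens, and the wall-clock time of each phase are already measured (for example with `std::chrono::steady_clock`).

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper: throughput in tokens per second.
// Returns 0 when no time has elapsed, to avoid dividing by zero.
inline double tokensPerSecond(std::size_t tokens,
                              std::chrono::duration<double> elapsed) {
  const double seconds = elapsed.count();
  return seconds > 0.0 ? static_cast<double>(tokens) / seconds : 0.0;
}

// Hypothetical container holding one metric per phase.
struct ThroughputReport {
  double prefillTokensPerSec; // prompt tokens / prefill time
  double decodeTokensPerSec;  // generated tokens / decode time
};

// Builds the report from per-phase token counts and durations.
inline ThroughputReport makeReport(std::size_t promptTokens,
                                   std::chrono::duration<double> prefillTime,
                                   std::size_t generatedTokens,
                                   std::chrono::duration<double> decodeTime) {
  return {tokensPerSecond(promptTokens, prefillTime),
          tokensPerSecond(generatedTokens, decodeTime)};
}
```

Whether the first generated token is counted toward prefilling or decoding is a convention the PR should state explicitly, since it shifts both numbers slightly.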