[quantization][draft] Prefill-decode logic #570

Draft
stamalakhov wants to merge 1 commit into Samsung:main from stamalakhov:model_cache_br


stamalakhov (Contributor) commented Mar 23, 2026

This PR outputs key-value (KV) tuples when `use_cache` is set.

Related: #586
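
For readers unfamiliar with the prefill-decode pattern, here is a minimal pure-Python sketch of what returning KV tuples under `use_cache` enables. It is not the code in this PR: `toy_attention`, `attend`, and the `past_kv` parameter are hypothetical names, and the layer uses identity projections purely for illustration.

```python
import math


def attend(q, keys, values):
    """Softmax attention of one query vector over the given keys/values."""
    scale = math.sqrt(len(q))
    scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]


def toy_attention(xs, past_kv=None, use_cache=False):
    """Hypothetical causal attention layer with identity projections.

    xs: token vectors for the new positions only.
    past_kv: optional (keys, values) tuple returned by an earlier call.
    Returns the outputs, or (outputs, (keys, values)) when use_cache is set.
    """
    if past_kv is not None:
        keys, values = list(past_kv[0]), list(past_kv[1])
    else:
        keys, values = [], []
    outputs = []
    for x in xs:
        keys.append(x)    # identity "key projection" (assumption)
        values.append(x)  # identity "value projection" (assumption)
        outputs.append(attend(x, keys, values))
    if use_cache:
        return outputs, (keys, values)
    return outputs


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Prefill: process the prompt once and keep the KV tuple.
prefill_out, kv = toy_attention(tokens[:2], use_cache=True)
# Decode: feed only the new token together with the cached KV tuple.
decode_out, kv = toy_attention(tokens[2:], past_kv=kv, use_cache=True)
# Reference: recompute every position without a cache.
full_out = toy_attention(tokens)
```

In this sketch the prefill and decode outputs together match the uncached run position by position, which is the property a real prefill-decode split is expected to preserve.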

TICO-DCO-1.0-Signed-off-by: s.malakhov <s.malakhov@partner.samsung.com>

stamalakhov self-assigned this Mar 23, 2026
stamalakhov force-pushed the model_cache_br branch 3 times, most recently from 23e4c61 to 192349a, Mar 23, 2026 14:31
stamalakhov force-pushed the model_cache_br branch 4 times, most recently from e20b7aa to 317079e, Mar 31, 2026 07:35
stamalakhov changed the title from "[quantization][draft] Ouput kv-tuples" to "[quantization][draft] Prefill-decode logic" Mar 31, 2026
stamalakhov force-pushed the model_cache_br branch 5 times, most recently from f1e56d0 to 6255e18, April 2, 2026 09:23
