Bug: vLLM multimodal caches are not reset after weight updates
Description
Currently, verl resets the vLLM prefix/KV cache after rollout weight updates, but it does not reset the multimodal cache or encoder cache.
For multimodal rollouts, vLLM may cache multimodal inputs and encoder outputs, and these cached values can depend on the current model weights. After update_weights, reusing entries that were cached under the previous weights may cause stale features to be used during rollout.
Expected behavior
After updating rollout model weights, all relevant vLLM caches should be invalidated:
- prefix/KV cache
- multimodal cache
- encoder cache, when supported by the installed vLLM version
Proposed solution
Reset all available vLLM rollout caches after weight updates.
I opened a PR with the proposed fix here:
#6522
The PR adds a clear_all_caches helper and calls it after update_weights, replacing the current prefix-only cache reset.
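For illustration, a helper along these lines could look like the minimal sketch below. It is not the PR's actual implementation. The method names reset_prefix_cache, reset_mm_cache, and reset_encoder_cache are assumptions about what the installed vLLM engine exposes and should be checked against the vLLM version in use. FakeOldEngine is a hypothetical stand-in used only to show the fallback when a reset method is missing.

```python
from typing import Any, List

# Assumed method names; verify them against the installed vLLM version.
CACHE_RESET_METHODS = (
    "reset_prefix_cache",   # prefix/KV cache
    "reset_mm_cache",       # multimodal cache
    "reset_encoder_cache",  # encoder cache, only on versions that support it
)


def clear_all_caches(engine: Any) -> List[str]:
    """Call every cache-reset method the engine exposes.

    Returns the names of the methods that were actually called, so the
    caller can log which caches were reset.
    """
    called = []
    for name in CACHE_RESET_METHODS:
        method = getattr(engine, name, None)
        if callable(method):
            method()
            called.append(name)
    return called


class FakeOldEngine:
    """Hypothetical engine without an encoder-cache reset method."""

    def __init__(self) -> None:
        self.resets: List[str] = []

    def reset_prefix_cache(self) -> None:
        self.resets.append("prefix")

    def reset_mm_cache(self) -> None:
        self.resets.append("mm")
```

Probing with getattr means a missing method is skipped rather than raising an error, which matches the "when supported by the installed vLLM version" condition above. The sketch is synchronous; if the engine's reset methods are coroutines, the calls would need to be awaited instead.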