Skip to content

Port SHM non-GPU transport to cpu_context_pickle branch and fix correctness gaps from #278 review#280

Merged
hlin99 merged 5 commits into
copilot/add-nongpucontext-shm-implementationfrom
copilot/sub-pr-278
May 20, 2026
Merged

Port SHM non-GPU transport to cpu_context_pickle branch and fix correctness gaps from #278 review#280
hlin99 merged 5 commits into
copilot/add-nongpucontext-shm-implementationfrom
copilot/sub-pr-278

Conversation

Copy link
Copy Markdown

Copilot AI commented May 20, 2026

  • Identify merge conflicts between copilot/sub-pr-278 and the base branch\n- [x] Resolve conflict in server.py commit_store (locked + None-check semantics)\n- [x] Resolve conflicts in test_non_cuda_context.py (keep both tests with correct semantics)\n- [x] Verify all 20 tests pass\n- [x] Commit the merge resolution

Copy link
Copy Markdown
Author

Copilot AI commented May 20, 2026

Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • download-r2.pytorch.org
    • Triggering command: /usr/bin/python python -m pip install numpy torch --index-url REDACTED (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Copilot AI changed the title [WIP] [WIP] Address feedback on SHM-based non-GPU KV transport implementation Port SHM non-GPU transport to cpu_context_pickle branch and fix correctness gaps from #278 review May 20, 2026
Copilot finished work on behalf of hlin99 May 20, 2026 06:49
Copilot AI requested a review from hlin99 May 20, 2026 06:49
…//github.com/hlin99/LMCache into copilot/sub-pr-278

# Conflicts:
#	lmcache/v1/multiprocess/server.py
#	tests/v1/multiprocess/test_non_cuda_context.py

Co-authored-by: hlin99 <73271530+hlin99@users.noreply.github.com>
Copilot finished work on behalf of hlin99 May 20, 2026 07:01
Copy link
Copy Markdown
Owner

@hlin99 hlin99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: PR #278 + #280 Combined

整体质量不错,#280 修复了大部分 #278 review 中的问题。还有两个残留需要处理:


🟡 Issue 1: prepare_store 空 reserved 时仍存 entry 到 _pending_shm_writes,导致微量内存泄漏

reserve_write 返回空 dict(key 全部已存在),reserved_keys = []slots = []

当前行为链:

  1. Server prepare_store → 存 _pending_shm_writes[transfer_key] = [],返回 {"slots": []}
  2. Worker prepare_store → 检测到空 slots,返回 None(✅ #280 已修复)
  3. Worker 不调用 commit_store
  4. Server 侧 _pending_shm_writes 中的空 entry 永远不会被 pop

虽然每个泄漏 entry 很小,但长时间运行 + 大量重复 prompt 会持续累积。

建议修复(server.py prepare_store):

# 在 "transfer_key = ..." 之前加:
if not reserved_keys:
    return PrepareStoreResponse(context={})
这样 worker 收到空 context "slots" key),直接返回 Noneserver 也不存空 entry同时避免了多余的 reserve_write dict遍历的开销。

🟡 Issue 2: MixedMemoryAllocator(shm_name=...) 参数兼容性未验证
memory_manager.py  create_memory_allocator 调用Python
return MixedMemoryAllocator(
    config.size_in_bytes,
    align_bytes=config.align_bytes,
    shm_name=shm_name,
)
如果 MixedMemoryAllocator.__init__ 没有 shm_name 参数SHM 模式初始化时会直接 TypeError crash请确认 MixedMemoryAllocator 是否已支持该参数如果不支持需要在本 PR 中添加

When reserve_write returns empty (all object keys already cached),
return PrepareStoreResponse(context={}) immediately without storing
an entry in _pending_shm_writes. This prevents leaked entries that
would never be popped since the worker won't call commit_store.

Agent-Logs-Url: https://github.com/hlin99/LMCache/sessions/182111d5-1737-49c0-be65-0287d5b9d6c5

Co-authored-by: hlin99 <73271530+hlin99@users.noreply.github.com>
Copy link
Copy Markdown
Author

Copilot AI commented May 20, 2026

Review: PR #278 + #280 Combined

整体质量不错,#280 修复了大部分 #278 review 中的问题。还有两个残留需要处理:...

Fixed both issues in 338abb7:

Issue 1: Restored the early-return in prepare_store when reserved_keys is empty — now returns PrepareStoreResponse(context={}) without storing an entry in _pending_shm_writes. This prevents the memory leak since the worker (seeing no "slots" key) won't call commit_store, and no orphan entry accumulates.

Issue 2: Verified — MixedMemoryAllocator.__init__ already accepts shm_name via **kwargs (line 2092 of memory_management.py), so the call in memory_manager.py works correctly.

Copilot finished work on behalf of hlin99 May 20, 2026 07:11
Copilot AI requested a review from hlin99 May 20, 2026 07:11
@hlin99 hlin99 marked this pull request as ready for review May 20, 2026 07:13
@hlin99 hlin99 merged commit 2a02acb into copilot/add-nongpucontext-shm-implementation May 20, 2026
@hlin99 hlin99 deleted the copilot/sub-pr-278 branch May 28, 2026 12:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants