Pass MPServerConfig.shm_name into NonGPUTransferModule #308
Closed
Copilot wants to merge 4 commits into
Copilot AI changed the title from "[WIP] Pass mp_config to NonGPUTransferModule to access shm_name directly" to "Pass MPServerConfig.shm_name into NonGPUTransferModule" on May 27, 2026
…Manager

The allocator creates the SHM segment using the name from `L1MemoryManagerConfig.shm_name`. Without this override the allocator uses the default pid-based name while `_compute_shm_pool_info` returns the user-specified name, causing workers to fail with `FileNotFoundError`.
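The failure mode described in this commit message can be reproduced with Python's standard `multiprocessing.shared_memory` module. The following is a minimal sketch, not LMCache code: the segment names `demo_default_pool_*` and `demo_user_pool_*` and the helper `try_attach` are hypothetical, chosen only to show what happens when the name a worker is told to open differs from the name the segment was created under.

```python
import os
from multiprocessing import shared_memory

# Hypothetical names for illustration only; LMCache's real naming scheme differs.
default_name = f"demo_default_pool_{os.getpid()}"  # name the allocator created
user_name = f"demo_user_pool_{os.getpid()}"        # name advertised to workers


def try_attach(name: str) -> bool:
    """Return True if an existing segment called `name` can be attached."""
    try:
        seg = shared_memory.SharedMemory(name=name)
    except FileNotFoundError:
        # No segment exists under this name, which is the error workers hit.
        return False
    seg.close()
    return True


# The "allocator" creates the segment under the default name.
segment = shared_memory.SharedMemory(name=default_name, create=True, size=4096)
try:
    mismatch_attach = try_attach(user_name)     # advertised name != created name
    matching_attach = try_attach(default_name)  # both sides agree on the name
finally:
    segment.close()
    segment.unlink()
```

Attaching succeeds only when creator and worker agree on the name, which is why the override has to reach the allocator as well as the metadata returned to workers.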
`--shm-name` was defined on `MPServerConfig` but never reached the non-GPU transfer path, so `--shm-name ""` could not disable SHM and force pickle mode. This change keeps `run_cache_server()` unchanged and routes the override directly into `NonGPUTransferModule`, where the SHM decision is made.

Server wiring
- `_build_modules()` in `lmcache/v1/multiprocess/server.py` now constructs `NonGPUTransferModule(ctx, mp_config)` for `transfer_mode="non_gpu"`.

Non-GPU SHM override handling
- `NonGPUTransferModule.__init__` now accepts an optional `MPServerConfig`.
- `mp_config.shm_name` is kept as a module-local override instead of mutating shared storage config.
- `_compute_shm_pool_info()` now prefers the override when present:
  - `None` → preserve existing auto-detect behavior from storage config
  - `""` → disable SHM and return pickle-mode metadata
  - non-empty name → … `lmcache_l1_pool_*` and return the configured pool size

Regression coverage
- Covers `_build_modules()` passing the MP config into `NonGPUTransferModule`.

Example of the new flow:
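A minimal sketch of the three-way override decision described above. `ShmPoolInfo`, `compute_shm_pool_info`, and its parameters are hypothetical stand-ins rather than LMCache's actual API, and the auto-detect branch is assumed here to simply fall back to the name held in the storage config.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ShmPoolInfo:
    """Hypothetical metadata handed to workers."""
    shm_name: str   # "" signals pickle mode (SHM disabled)
    pool_size: int  # 0 when SHM is disabled


def compute_shm_pool_info(
    override: Optional[str],
    storage_shm_name: Optional[str],
    storage_pool_size: int,
) -> ShmPoolInfo:
    """Prefer the module-local override; fall back to the storage config."""
    if override is None:
        # No override given: keep the existing behavior (assumed here to be
        # "use whatever name the storage config carries").
        name = storage_shm_name
    else:
        # Override given: "" disables SHM, anything else names the segment.
        name = override

    if not name:
        return ShmPoolInfo(shm_name="", pool_size=0)
    return ShmPoolInfo(shm_name=name, pool_size=storage_pool_size)


auto = compute_shm_pool_info(None, "lmcache_l1_pool_123", 1024)
disabled = compute_shm_pool_info("", "lmcache_l1_pool_123", 1024)
custom = compute_shm_pool_info("my_pool", "lmcache_l1_pool_123", 1024)
```

The key point is that `None` and `""` are distinct: `None` means "no opinion", while `""` is an explicit request to disable SHM.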
Original prompt
Pass `mp_config` to `NonGPUTransferModule` so it can read `shm_name` directly, without changing `run_cache_server()` logic.

Changes needed:

1. `lmcache/v1/multiprocess/server.py`, `_build_modules()`

Change:
To:

2. `lmcache/v1/multiprocess/modules/non_gpu_transfer.py`

Update `__init__` to accept `mp_config`:
Update `_compute_shm_pool_info` to use the override:

3. Ensure `MPServerConfig` has `shm_name` field

Check `lmcache/v1/multiprocess/config.py`: it should already have a `shm_name` field and `--shm-name` CLI arg. If not, add:
And in `add_mp_server_args`:

Why: `--shm-name ""` should disable SHM and force pickle mode. Currently the override never reaches `_compute_shm_pool_info` because nothing passes it from `mp_config` to the non-GPU module. This fix keeps the arg in `MPServerConfig` and lets `NonGPUTransferModule` consume it directly.

The following is the prior conversation context from the user's chat exploration (may be truncated):
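User: Which file, and which line?

A: File: `lmcache/v1/multiprocess/server.py`. Insert after line 193 and before line 194:
That is, between `maybe_initialize_trace_recorder(...)` and `ctx = MPCacheEngineContext(...)`.

User: Can we avoid changing the server file?

A: Yes. There are two options.

Option A: change `engine_context.py` and pass the `shm_name` override into the context.
Then the call site in `server.py` only needs one extra argument: `ctx = MPCacheEngineContext(..., shm_name_override=mp_config.shm_name)`. That still changes one line of server.py, though.

Option B (no change to server.py at all): change `parse_args_to_config` in `config.py`.
Move `--shm-name` from the MP server args to the storage manager args, so that `parse_args_to_config()` sets `memory_config.shm_name` directly. That way `storage_manager_config` already holds the correct value when it is built, and server.py does not need to change at all.
However, this changes the ownership of `--shm-name` from "MP server argument" to "storage manager argument". Semantically that is also defensible, because shm_name is L1 memory configuration to begin with.

The least effort is still adding those 3 lines in server.py. What is your reason for not wanting to change server.py? If the goal is to reduce the diff or avoid conflicts, option B is actually the larger change.

User: This SHM is only useful in the non-CUDA context, and server.py is a higher-level abstraction.

A: Right, then putting it in `NonGPUTransferModule.__init__` makes the most sense:
Then in `_build_modules`:
This way `run_cache_serve...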
This pull request was created from Copilot chat.
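
The wiring settled on in the conversation above (CLI flag, then `MPServerConfig`, then `NonGPUTransferModule`) can be sketched as follows. This is a simplified illustration under stated assumptions, not the PR's actual diff: the classes below are stand-ins that share names with the real ones, and `parse_mp_args`, `build_modules`, and `_shm_name_override` are hypothetical names.

```python
import argparse
from dataclasses import dataclass
from typing import Any, List, Optional, Sequence


@dataclass
class MPServerConfig:
    """Stand-in for the real config; only the field discussed here."""
    shm_name: Optional[str] = None


def parse_mp_args(argv: Sequence[str]) -> MPServerConfig:
    """Hypothetical parser: an omitted flag yields None, `--shm-name ""` yields ""."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--shm-name", type=str, default=None)
    ns = parser.parse_args(argv)
    return MPServerConfig(shm_name=ns.shm_name)


class NonGPUTransferModule:
    """Stand-in module that keeps the override locally."""

    def __init__(self, ctx: Any, mp_config: Optional[MPServerConfig] = None):
        self.ctx = ctx
        # Module-local override; the shared storage config is left untouched.
        self._shm_name_override = (
            mp_config.shm_name if mp_config is not None else None
        )


def build_modules(ctx: Any, mp_config: MPServerConfig, transfer_mode: str) -> List[Any]:
    """Hypothetical builder: only the non-GPU path receives the MP config."""
    if transfer_mode == "non_gpu":
        return [NonGPUTransferModule(ctx, mp_config)]
    return []


omitted = parse_mp_args([])
disabled = parse_mp_args(["--shm-name", ""])
modules = build_modules(object(), disabled, "non_gpu")
```

Keeping the override on the module means the higher-level server entry point never needs to know about SHM, which matches the layering argument made in the chat.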