Fix four RCE / path-traversal vulnerabilities in Model Maker (Fixes #6267) #1
Open
kennethkcox wants to merge 2 commits into
Finding 1 (pytorch_converter.py): add weights_only=True to torch.load() to prevent arbitrary code execution via __reduce__ payloads in .pt files.

Finding 2 (model_util.py, image_classifier.py, text_classifier.py):
- Add safe_mode=True to load_keras_model() as defense-in-depth for .keras format (safe_mode does not protect SavedModel format).
- Replace tf.keras.models.load_model() with model.load_weights() in load_image_classifier() and load_bert_classifier(), where the model architecture is already built by _create_model(). This eliminates Lambda-layer deserialization entirely for the user-supplied model restore paths that were the described attack surface.

Finding 3 (file_util.py): use tarfile filter='data' (Python 3.11.4+) to block path-traversal (tar slip) attacks during archive extraction, with a manual path-validation fallback for older Python versions.

Finding 4 (cache_files.py): replace yaml.load(..., Loader=yaml.FullLoader) with yaml.safe_load() to prevent YAML deserialization RCE. FullLoader still allows construction of arbitrary Python objects (e.g. subprocess.Popen) via !!python/object/new tags; SafeLoader does not.

https://claude.ai/code/session_01GqJo5CJ7UF2Toyr2PcjziK
- file_util.py: add `import sys` (stdlib), move `import requests` to the third-party block (blank-line separated), replace the try/except TypeError version detection with an explicit sys.version_info >= (3, 11, 4) check, and update the Raises docstring to document the new RuntimeError on unsafe paths.
- model_util.py: add a Note section to the load_keras_model() docstring documenting the scope of safe_mode and the recommended load_weights() pattern for SavedModel-format callers.

https://claude.ai/code/session_01GqJo5CJ7UF2Toyr2PcjziK
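The version-gated extraction described in these commit messages can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code from the PR: the function name `safe_extract` and the error message are hypothetical, and the fallback shown here checks member paths only (it does not inspect symlink targets the way `filter='data'` does).

```python
import os
import sys
import tarfile


def safe_extract(tar_path: str, dest_dir: str) -> None:
    """Extract tar_path into dest_dir, refusing members that escape it.

    Hypothetical helper for illustration; not the PR's actual function.
    """
    with tarfile.open(tar_path) as tarf:
        if sys.version_info >= (3, 11, 4):
            # The built-in 'data' filter rejects absolute paths, '..'
            # traversal, links pointing outside dest_dir, and device files.
            tarf.extractall(dest_dir, filter="data")
            return

        # Fallback for older interpreters: resolve every member path and
        # make sure it stays inside the extraction root before extracting.
        root = os.path.realpath(dest_dir)
        for member in tarf.getmembers():
            target = os.path.realpath(os.path.join(root, member.name))
            if os.path.commonpath([root, target]) != root:
                raise RuntimeError(f"Unsafe path in archive: {member.name}")
        tarf.extractall(dest_dir)
```

On interpreters that take the `filter="data"` branch, a rejected member surfaces as a `tarfile.FilterError` subclass rather than `RuntimeError`, so callers that want uniform handling would need to catch both. The Python documentation also suggests `hasattr(tarfile, "data_filter")` as a feature check, which additionally covers security backports to older release lines.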
cc @google-ai-edge/mediapipe-team (if it exists) / maintainers. Ready for review: fixes the google-ai-edge#6267 RCEs with zero regressions. CLA ready if needed. PoCs are in the issue.
Fixes google-ai-edge#6267
Summary
Four security vulnerabilities identified by static analysis and manually confirmed against MediaPipe source are addressed here. All fixes are minimal and scoped to the vulnerable call sites.
Finding 1: Arbitrary code execution via `torch.load()` (`pytorch_converter.py:36`)

`torch.load()` was called without `weights_only=True`, allowing a malicious `.pt` checkpoint to execute arbitrary Python via `__reduce__` payloads during deserialization.

PyTorch 2.6 changed the default to `weights_only=True`, but making it explicit is correct defence-in-depth regardless of installed version.

Finding 2: RCE via Keras Lambda layer deserialization (`image_classifier.py:233`, `text_classifier.py:489`, `model_util.py:71`)

`tf.keras.models.load_model()` was called with a user-supplied `saved_model_path`. A SavedModel containing a Lambda layer executes the Lambda body unconditionally during loading. `safe_mode=True` does not block this for the SavedModel format; it only applies to the Keras-native `.keras` config format.

Fix for the user-supplied restore paths (the actual attack surface): since `_create_model()` already builds the model architecture before the load call, replace `load_model()` with `load_weights()`, which loads tensor values only and never reconstructs graph nodes or Lambda functions.

Defence-in-depth for the internal utility: `load_keras_model()` in `model_util.py` is only called with hardcoded MediaPipe-controlled URLs (VGG19 perceptual loss model, gesture embedder), never with user input. `safe_mode=True` is added there for the `.keras`-format protection it provides, and the docstring documents its scope limitation.

Finding 3: Path traversal (tar slip) during archive extraction (`file_util.py:77`)

`tarf.extractall(tmpdir)` was called without a filter, allowing a crafted archive to write files outside the extraction directory (e.g. `../model_maker_cache/<hash>_metadata.yaml`).

`filter='data'` (Python 3.11.4+) blocks absolute paths, `..` traversal, symlinks escaping the tree, and dangerous member types. The fallback covers older Python versions with an explicit path-boundary check.

Finding 4: YAML deserialization RCE via chained tar slip (`cache_files.py:106`)

`yaml.load(f, Loader=yaml.FullLoader)` was used to read cache metadata. `FullLoader` permits Python-specific tags (e.g. `!!python/object/new:subprocess.Popen`) in older PyYAML versions and is exploitable as the landing target for a chained tar-slip attack (Finding 3 delivers the payload; Finding 4 executes it).

`yaml.safe_load()` restricts output to plain Python primitives and accepts no Python object tags.

Files changed

| File | Change |
| --- | --- |
| `mediapipe/tasks/python/genai/converter/pytorch_converter.py` | `weights_only=True` |
| `mediapipe/model_maker/python/core/utils/model_util.py` | `safe_mode=True` + docstring |
| `mediapipe/model_maker/python/vision/image_classifier/image_classifier.py` | `load_weights()` |
| `mediapipe/model_maker/python/text/text_classifier/text_classifier.py` | `load_weights()` |
| `mediapipe/model_maker/python/core/utils/file_util.py` | `import sys` + docstring |
| `mediapipe/model_maker/python/core/data/cache_files.py` | `yaml.safe_load()` |

Changes identified and implemented with Claude Code (claude-sonnet-4-6).
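To make the `__reduce__` mechanism from Finding 1 concrete, here is a small standard-library sketch. It is only an analogy for what a restricted loader such as `weights_only=True` does, not PyTorch's implementation: the `Payload` class, the `ALLOWED_GLOBALS` allow-list, and `restricted_loads` are all hypothetical names, and the "attack" is a harmless call to `os.getcwd`.

```python
import io
import os
import pickle


class Payload:
    """Stand-in for a malicious object (hypothetical, harmless)."""

    def __reduce__(self):
        # On load, pickle calls the returned callable with these arguments.
        # A real attacker would name something far more dangerous here.
        return (os.getcwd, ())


# Illustrative allow-list: only these globals may be resolved while loading.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Every global a pickle stream references passes through this hook.
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Like pickle.loads, but refuses globals outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


malicious_blob = pickle.dumps(Payload())

# Plain pickle.loads(malicious_blob) would run os.getcwd() during loading.
# The restricted loader refuses to resolve the callable at all.
try:
    restricted_loads(malicious_blob)
except pickle.UnpicklingError as err:
    print("rejected:", err)
```

Plain containers and primitives (dicts, lists, numbers, strings) never reach `find_class`, so ordinary data still loads; only streams that try to resolve a callable outside the allow-list are rejected.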