Skip to content

🛡️ Sentinel: [CRITICAL] Fix XXE Injection Vulnerability#692

Open
badMade wants to merge 2 commits into
mainfrom
sentinel/xxe-vulnerability-fix-4659841705943269996
Open

🛡️ Sentinel: [CRITICAL] Fix XXE Injection Vulnerability#692
badMade wants to merge 2 commits into
mainfrom
sentinel/xxe-vulnerability-fix-4659841705943269996

Conversation

@badMade
Copy link
Copy Markdown
Owner

@badMade badMade commented Jun 2, 2026

  • 🚨 Severity: CRITICAL
  • 💡 Vulnerability: Multiple scripts and webhooks (including watch_rss.py, wecom_callback.py, and search_arxiv.py) were parsing untrusted XML streams using Python's native xml.etree.ElementTree module. The native standard library is vulnerable to XML External Entity (XXE) injection attacks.
  • 🎯 Impact: By crafting malicious XML payloads (e.g. RSS feeds, callback data), an attacker could force the server to read arbitrary local files, perform internal network requests (SSRF), or cause a Denial of Service (Billion Laughs attack).
  • 🔧 Fix: Added the defusedxml dependency to pyproject.toml and updated uv.lock. Replaced imports of xml.etree.ElementTree with defusedxml.ElementTree to securely parse untrusted XML. Updated the .jules/sentinel.md journal with this critical learning.
  • ✅ Verification: Ran pytest tests/gateway/test_wecom_callback.py and manually tested the watch_rss.py and search_arxiv.py tools to ensure normal XML parsing behavior remains functionally identical. Verified that attempting to parse malformed XML triggers the correct ParseError.

PR created automatically by Jules for task 4659841705943269996 started by @badMade

Multiple modules were parsing untrusted external XML data (RSS feeds, WeCom API callbacks, Arxiv feeds) using the native `xml.etree.ElementTree` parser, which is known to be vulnerable to XML External Entity (XXE) injection attacks, including arbitrary file reading, denial of service (DoS), and Server-Side Request Forgery (SSRF).

Fixed by replacing `xml.etree.ElementTree` with `defusedxml.ElementTree` across `optional-skills/devops/watchers/scripts/watch_rss.py`, `gateway/platforms/wecom_callback.py`, and `skills/research/arxiv/scripts/search_arxiv.py`. Added `defusedxml` to `pyproject.toml` dependencies and recorded the finding in Sentinel's journal.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
@google-labs-jules
Copy link
Copy Markdown

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 2, 2026

🔎 Lint report: sentinel/xxe-vulnerability-fix-4659841705943269996 vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 8244 on HEAD, 8253 on base (✅ -9)

🆕 New issues (6):

Rule Count
unresolved-import 3
invalid-argument-type 3
First entries
optional-skills/devops/watchers/scripts/watch_rss.py:22: [unresolved-import] unresolved-import: Cannot resolve imported module `defusedxml.ElementTree`
run_agent.py:13576: [invalid-argument-type] invalid-argument-type: Argument to function `len` is incorrect: Expected `Sized`, found `(str & ~AlwaysFalsy) | (dict[Unknown | str, Unknown | str | dict[str, str]] & ~AlwaysFalsy) | (Any & ~AlwaysFalsy) | ... omitted 3 union elements`
gateway/platforms/wecom_callback.py:20: [unresolved-import] unresolved-import: Cannot resolve imported module `defusedxml.ElementTree`
skills/research/arxiv/scripts/search_arxiv.py:16: [unresolved-import] unresolved-import: Cannot resolve imported module `defusedxml.ElementTree`
run_agent.py:7317: [invalid-argument-type] invalid-argument-type: Argument to function `build_anthropic_client` is incorrect: Expected `str`, found `str | dict[Unknown | str, Unknown | str | dict[str, str]] | Any | ... omitted 3 union elements`
run_agent.py:13573: [invalid-argument-type] invalid-argument-type: Argument to function `_is_oauth_token` is incorrect: Expected `str`, found `str | dict[Unknown | str, Unknown | str | dict[str, str]] | Any | ... omitted 3 union elements`

✅ Fixed issues (7):

Rule Count
invalid-argument-type 3
unresolved-attribute 2
no-matching-overload 1
not-subscriptable 1
First entries
skills/research/arxiv/scripts/search_arxiv.py:70: [no-matching-overload] no-matching-overload: No overload of bound method `str.join` matches arguments
skills/research/arxiv/scripts/search_arxiv.py:69: [unresolved-attribute] unresolved-attribute: Attribute `text` is not defined on `None` in union `Element[Unknown] | None`
run_agent.py:7317: [invalid-argument-type] invalid-argument-type: Argument to function `build_anthropic_client` is incorrect: Expected `str`, found `str | dict[Unknown, Unknown] | Any | ... omitted 3 union elements`
run_agent.py:13573: [invalid-argument-type] invalid-argument-type: Argument to function `_is_oauth_token` is incorrect: Expected `str`, found `str | dict[Unknown, Unknown] | Any | ... omitted 3 union elements`
run_agent.py:13576: [invalid-argument-type] invalid-argument-type: Argument to function `len` is incorrect: Expected `Sized`, found `(str & ~AlwaysFalsy) | (dict[Unknown, Unknown] & ~AlwaysFalsy) | (Any & ~AlwaysFalsy) | ... omitted 3 union elements`
skills/research/arxiv/scripts/search_arxiv.py:67: [not-subscriptable] not-subscriptable: Cannot subscript object of type `None` with no `__getitem__` method
skills/research/arxiv/scripts/search_arxiv.py:69: [unresolved-attribute] unresolved-attribute: Attribute `strip` is not defined on `None` in union `str | None`

Unchanged: 4350 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request mitigates XXE injection vulnerabilities by replacing the native xml.etree.ElementTree library with defusedxml.ElementTree in several modules, adding defusedxml to the project dependencies, and documenting the vulnerability prevention guidelines. However, in watch_rss.py, replacing the import causes an issue because defusedxml.ElementTree does not expose ParseError, which will lead to an AttributeError at runtime when handling invalid XML. A code suggestion is provided to import and bind ParseError to ET to resolve this.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread optional-skills/devops/watchers/scripts/watch_rss.py
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@badMade badMade marked this pull request as ready for review June 3, 2026 00:21
Copilot AI review requested due to automatic review settings June 3, 2026 00:21
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR mitigates XXE-style XML parsing risks by switching several entrypoints that parse untrusted XML (RSS watcher, WeCom callback adapter, arXiv search script) from Python’s standard xml.etree.ElementTree to defusedxml, and by adding defusedxml as a dependency so the safer parser is available at runtime.

Changes:

  • Add defusedxml to core dependencies (pyproject.toml) and update uv.lock.
  • Replace xml.etree.ElementTree usage with defusedxml.ElementTree in the identified XML-ingesting scripts/adapters.
  • Record the security learning in .jules/sentinel.md.

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
pyproject.toml Adds defusedxml to core dependencies.
uv.lock Lockfile update reflecting the new dependency set.
optional-skills/devops/watchers/scripts/watch_rss.py Switches RSS/Atom XML parsing to defusedxml.
gateway/platforms/wecom_callback.py Switches WeCom callback XML parsing to defusedxml.
skills/research/arxiv/scripts/search_arxiv.py Switches arXiv API response XML parsing to defusedxml.
.jules/sentinel.md Documents the XXE prevention learning and guidance.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread optional-skills/devops/watchers/scripts/watch_rss.py
Comment thread pyproject.toml
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 3, 2026

Auto-merge: checks failing

The following checks did not pass:

  • copilot (failure)
  • copilot (failure)
  • copilot (failure)

Please fix the failing checks before this PR can be merged.

View workflow run

@badMade
Copy link
Copy Markdown
Owner Author

badMade commented Jun 3, 2026

@copilot fix:

failing checks
auto-merge
auto-merge — 2 CI check(s) failing.
Auto-merge / Auto-merge on review + passing checks (pull_request_review)
Auto-merge / Auto-merge on review + passing checks (pull_request_review)Cancelled after 1m
skipped checks
Docker Build and Publish / merge (pull_request)
Docker Build and Publish / merge (pull_request)Skipped 27 minutes ago
Docker Build and Publish / move-latest (pull_request)
Docker Build and Publish / move-latest (pull_request)Skipped 27 minutes ago

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants