🛡️ Sentinel: [HIGH] Fix XXE vulnerabilities in XML parsing #668

Open
badMade wants to merge 2 commits into main from sentinel-xxe-prevention-1172194492217024626

@badMade
Owner

@badMade badMade commented May 31, 2026

🚨 Severity: HIGH
💡 Vulnerability: XML External Entity (XXE) vulnerabilities when parsing untrusted XML responses and payloads using Python's native xml.etree.ElementTree.
🎯 Impact: Malicious XML payloads could exploit XXE to read local files on the server, perform Server-Side Request Forgery (SSRF), or cause Denial of Service (Billion Laughs attack).
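🔧 Fix: Replaced xml.etree.ElementTree with defusedxml.ElementTree in search_arxiv.py, watch_rss.py, and wecom_callback.py. defusedxml.ElementTree is a near drop-in replacement for the parsing entry points (parse, fromstring, iterparse): by default it raises an exception on documents that declare entities or reference external entities, which blocks both XXE and Billion Laughs expansion.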
✅ Verification: Ran pytest locally on gateway/platforms and optional-skills tests. Validated that defusedxml is available in uv.lock. Included entry in .jules/sentinel.md to prevent future regressions.
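
As a rough illustration of the swap described above (not the code from this PR), the sketch below parses a benign feed and a payload that declares an external entity. The `first_title` helper and both sample payloads are hypothetical. It assumes the `defusedxml` package is installed and relies on its default settings, under which an entity declaration raises `EntitiesForbidden`, a subclass of `DefusedXmlException`.

```python
import defusedxml.ElementTree as ET
from defusedxml.common import DefusedXmlException

# Hypothetical payloads for illustration only.
BENIGN = b"<feed><entry><title>Hello</title></entry></feed>"
HOSTILE = (
    b'<?xml version="1.0"?>'
    b'<!DOCTYPE feed [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
    b"<feed><entry><title>&xxe;</title></entry></feed>"
)


def first_title(xml_bytes):
    """Return the first entry title, or None if the document is rejected."""
    try:
        root = ET.fromstring(xml_bytes)
    except DefusedXmlException:
        # defusedxml refused the document (entity declaration, external reference, ...).
        return None
    except ET.ParseError:
        # The document is not well-formed XML.
        return None
    node = root.find("./entry/title")
    return node.text if node is not None else None


benign_title = first_title(BENIGN)
hostile_title = first_title(HOSTILE)
```

Callers that previously caught only `ParseError` may also want to handle `DefusedXmlException`, since a rejected payload surfaces as that exception rather than as a parse error.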


PR created automatically by Jules for task 1172194492217024626 started by @badMade

Replaced native `xml.etree.ElementTree` with `defusedxml.ElementTree` in XML parsing logic for arXiv, RSS watchers, and WeCom callbacks. Native XML parsers in Python are vulnerable to XML External Entity (XXE) attacks when parsing untrusted external data. This change mitigates these attacks.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
@google-labs-jules

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

@github-actions

Auto-merge: pending review

All CI checks have passed, but no qualifying review activity has been detected yet.

Auto-merge requires one of the following, plus the reviewed label:

  • an approved review,
  • an inline review comment from someone other than the PR author, or
  • a comment or review from a recognised code-review bot (e.g. @gemini-code-assist, @copilot-pull-request-reviewer, @coderabbitai) — quota / rate-limit responses still count as an attempt.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request implements a security enhancement to prevent XML External Entity (XXE) vulnerabilities by replacing the standard xml.etree.ElementTree with defusedxml.ElementTree across several scripts, including wecom_callback.py, watch_rss.py, and search_arxiv.py. Additionally, the .jules/sentinel.md file has been updated to document this vulnerability and its prevention. There are no review comments, and I have no feedback to provide.

@badMade
Owner Author

badMade commented May 31, 2026

@claude code review

@badMade
Owner Author

badMade commented May 31, 2026

@jules code review

@github-actions

github-actions Bot commented May 31, 2026

Auto-merge: review received — action required

Review activity detected (1 comment(s)/review(s) from review bot(s): gemini-code-assist[bot]).

To confirm you have reviewed and accepted the feedback, please add the reviewed label to this PR. Auto-merge will then proceed.

@google-labs-jules

@jules code review

Code review requested. A code review has been initiated, and no issues or errors were found during testing and review.

Replaced native `xml.etree.ElementTree` with `defusedxml.ElementTree` in XML parsing logic for arXiv, RSS watchers, and WeCom callbacks. Native XML parsers in Python are vulnerable to XML External Entity (XXE) attacks when parsing untrusted external data. This change mitigates these attacks.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
@badMade
Owner Author

badMade commented May 31, 2026

@copilot, resolve the merge conflicts in this pull request.

@github-actions

Auto-merge: no CI detected

No CI check runs were found for commit 3807002.
Auto-merge will not proceed until CI is configured and running.

@badMade badMade marked this pull request as ready for review May 31, 2026 23:27
Copilot AI review requested due to automatic review settings May 31, 2026 23:27

Copilot AI left a comment

Pull request overview

This PR aims to mitigate XML External Entity (XXE) and related XML parser abuse risks by switching XML parsing from Python’s standard xml.etree.ElementTree to defusedxml.ElementTree in a few places that parse untrusted XML (network responses and inbound callbacks).

Changes:

  • Swap xml.etree.ElementTree imports to defusedxml.ElementTree in the arXiv search script and RSS watcher script.
  • Swap xml.etree.ElementTree to defusedxml.ElementTree in the WeCom callback gateway adapter.
  • Add a Sentinel entry documenting the XXE prevention guidance.
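
For readers unfamiliar with how a hardened parser refuses such input, here is a standard-library-only sketch of the general idea: pre-scan the document with expat and abort as soon as an entity declaration appears. This is a swapped-in illustration, not defusedxml's implementation and not code from this PR. `EntityDeclarationForbidden` and `parse_untrusted` are hypothetical names.

```python
import xml.etree.ElementTree as StdET
from xml.parsers import expat


class EntityDeclarationForbidden(ValueError):
    """Raised when an untrusted document declares an entity."""


def _reject_entity_decl(name, *rest):
    # expat calls this for every <!ENTITY ...> declaration in the DTD.
    raise EntityDeclarationForbidden(f"entity declaration not allowed: {name!r}")


def parse_untrusted(xml_bytes: bytes) -> StdET.Element:
    """Pre-scan for entity declarations, then build the element tree."""
    scanner = expat.ParserCreate()
    scanner.EntityDeclHandler = _reject_entity_decl
    # An exception raised inside a handler stops parsing and propagates to the caller.
    scanner.Parse(xml_bytes, True)
    return StdET.fromstring(xml_bytes)
```

The document is parsed twice here, which keeps the sketch short at the cost of extra work. Malformed input fails during the pre-scan with `expat.ExpatError`, so a caller would need to handle that alongside the custom exception.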

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| skills/research/arxiv/scripts/search_arxiv.py | Uses defusedxml.ElementTree for parsing arXiv API XML responses. |
| optional-skills/devops/watchers/scripts/watch_rss.py | Uses defusedxml.ElementTree for parsing RSS/Atom XML feeds. |
| gateway/platforms/wecom_callback.py | Uses defusedxml.ElementTree for parsing inbound WeCom callback XML. |
| .jules/sentinel.md | Documents the XXE prevention learning/prevention guidance. |

 import time
 from typing import Any, Dict, List, Optional
-from xml.etree import ElementTree as ET
+import defusedxml.ElementTree as ET

 import urllib.request
 import urllib.parse
-import xml.etree.ElementTree as ET
+import defusedxml.ElementTree as ET

 import urllib.request
 from pathlib import Path
-from xml.etree import ElementTree as ET
+import defusedxml.ElementTree as ET

Comment thread .jules/sentinel.md
**Learning:** SQLite does not natively support parameterization for the FROM clause (e.g., subqueries or table names). Attempting to string-interpolate user input into a subquery creates an injection vector, especially when trying to enforce a LIMIT clause on user-provided queries.
**Prevention:** To prevent SQL injection when applying limits to user-provided SQL queries, execute the raw user query directly and restrict the output rows in Python using `cursor.fetchmany(limit)` instead of trying to wrap the query in another SELECT with a LIMIT clause.
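
A minimal sketch of the `cursor.fetchmany(limit)` approach described in this entry, using an in-memory database. The `fetch_limited` helper, the `events` table, and the sample query are hypothetical and exist only for illustration.

```python
import sqlite3


def fetch_limited(conn: sqlite3.Connection, user_sql: str, limit: int) -> list:
    """Run user_sql unchanged and return at most `limit` rows."""
    cursor = conn.execute(user_sql)
    try:
        # The cap is applied in Python, so the query text is never rewritten.
        return cursor.fetchmany(limit)
    finally:
        cursor.close()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO events (name) VALUES (?)",
    [(f"event-{i}",) for i in range(10)],
)
rows = fetch_limited(conn, "SELECT id, name FROM events ORDER BY id", 3)
```

This only caps the number of rows returned. It does not by itself restrict which statements the user may run, so any such restriction would have to be enforced separately.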

## 2024-05-26 - Security Enhancement: XXE Prevention

Copilot AI commented May 31, 2026

@copilot, resolve the merge conflicts in this pull request.

Resolved in 9883911.

3 participants