Harden link download and shortening#21

Merged
Rajrooter merged 1 commit into main from codex/review-codebase-for-improvements-and-fixes on Feb 3, 2026

@Rajrooter Rajrooter commented Feb 3, 2026

User description

Motivation

  • Reduce risk of resource exhaustion and unexpected scheme handling when downloading user-provided URLs by validating schemes and size before reading content.
  • Improve network security for link shortening by using an encrypted endpoint to reduce MITM tampering risk.

Description

  • In download_bytes, added an early scheme check that rejects non-http/https URLs, and a Content-Length gate that rejects responses declaring more than MAX_DOWNLOAD_BYTES before the body is read.
  • Kept the existing post-read size check as a fallback, so the actual payload is still verified against MAX_DOWNLOAD_BYTES.
  • Updated the TinyURL API call in shorten_link to use https://tinyurl.com/api-create.php instead of the plain-http endpoint.
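
The two pre-read checks described above can be sketched as standalone helpers. This is a minimal sketch, not the code in main.py: the helper names and the 8 MiB value for MAX_DOWNLOAD_BYTES are assumptions, since the PR shows neither.

```python
from typing import Optional
from urllib.parse import urlparse

# Assumed value; the PR does not state the real limit.
MAX_DOWNLOAD_BYTES = 8 * 1024 * 1024
ALLOWED_SCHEMES = {"http", "https"}


def has_allowed_scheme(url: str) -> bool:
    """Return True only for http:// and https:// URLs."""
    # urlparse lowercases the scheme, so "HTTPS://..." is accepted too.
    return urlparse(url).scheme in ALLOWED_SCHEMES


def declared_size_too_large(content_length: Optional[str],
                            limit: int = MAX_DOWNLOAD_BYTES) -> bool:
    """Return True when Content-Length declares a body above the limit.

    A missing or non-numeric header returns False, mirroring the PR:
    in that case the post-read size check is the only guard.
    """
    if not content_length:
        return False
    try:
        return int(content_length) > limit
    except ValueError:
        return False
```

A caller would return None as soon as has_allowed_scheme is False or declared_size_too_large is True, and only then read the body.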

Testing

  • No automated tests were executed for this change (review-only patch).


PR Type

Enhancement, Bug fix


Description

  • Add URL scheme validation to reject non-HTTP/HTTPS downloads

  • Check Content-Length header before downloading to prevent resource exhaustion

  • Upgrade TinyURL API endpoint from HTTP to HTTPS for security

  • Maintain fallback size check after download for defense-in-depth


Diagram Walkthrough

flowchart LR
  A["download_bytes function"] --> B["Validate URL scheme"]
  B --> C["Check Content-Length header"]
  C --> D["Download with size limit"]
  D --> E["Verify actual payload size"]
  E --> F["Return bytes or None"]
  G["shorten_link function"] --> H["Use HTTPS endpoint"]
  H --> I["Secure TinyURL API call"]

File Walkthrough

Relevant files
Security enhancement
main.py: Add URL validation and upgrade to HTTPS

  • Added URL scheme validation in download_bytes to reject non-HTTP/HTTPS
    URLs early
  • Implemented Content-Length header check to prevent downloading
    oversized files
  • Kept existing post-download size validation as fallback safeguard
  • Upgraded shorten_link TinyURL API endpoint from HTTP to HTTPS
+11/-1   
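
For the shorten_link change, the sketch below shows one way the HTTPS request URL might be built. Only the endpoint comes from the PR; the helper name and the `url` query parameter are assumptions, because the body of shorten_link is not shown.

```python
from urllib.parse import urlencode

# Endpoint taken from the PR description.
TINYURL_ENDPOINT = "https://tinyurl.com/api-create.php"


def tinyurl_request_url(long_url: str) -> str:
    """Build the request URL, percent-encoding the link to shorten.

    The `url` parameter name is an assumption about the API call.
    """
    return f"{TINYURL_ENDPOINT}?{urlencode({'url': long_url})}"
```

Encoding the long URL keeps any `&` or `?` inside it from being read as extra parameters of the TinyURL request.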

vercel Bot commented Feb 3, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Actions | Updated (UTC)
discord-link-bot | Ready | Preview, Comment | Feb 3, 2026 1:08pm
discord-link-bot-wfz6 | Ready | Preview, Comment | Feb 3, 2026 1:08pm

@qodo-code-review

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🟢
No security concerns identified: no security vulnerabilities detected by AI analysis. Human verification advised for critical code.
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status: Passed

🔴
Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Silent failure paths: New failure/edge-case branches return None or swallow ValueError without any
logging/context, making it hard to diagnose why a download was rejected (scheme, invalid
Content-Length, or oversized payload).

Referred Code
parsed = urlparse(url)
if parsed.scheme not in {"http", "https"}:
    return None
async with aiohttp.ClientSession() as session:
    async with session.get(url, timeout=aiohttp.ClientTimeout(total=30)) as resp:
        if resp.status == 200:
            content_length = resp.headers.get("Content-Length")
            if content_length:
                try:
                    if int(content_length) > MAX_DOWNLOAD_BYTES:
                        return None
                except ValueError:
                    pass
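
One way to address the silent failure paths is to compute a rejection reason and log it before returning None. The following is a sketch under assumptions: the function name and the logger name are hypothetical, and an invalid Content-Length is logged but not treated as fatal, matching the referred code.

```python
import logging
from typing import Optional
from urllib.parse import urlparse

logger = logging.getLogger("download")  # hypothetical logger name


def rejection_reason(url: str, content_length: Optional[str],
                     limit: int) -> Optional[str]:
    """Return a short reason if the download should be refused, else None."""
    scheme = urlparse(url).scheme
    if scheme not in {"http", "https"}:
        reason = f"unsupported scheme {scheme!r}"
        logger.warning("download rejected: %s", reason)
        return reason
    if not content_length:
        return None  # no declared size; the post-read check still applies
    try:
        declared = int(content_length)
    except ValueError:
        # Logged instead of silently swallowed; the download continues.
        logger.warning("ignoring invalid Content-Length %r", content_length)
        return None
    if declared > limit:
        reason = f"declared size {declared} exceeds limit {limit}"
        logger.warning("download rejected: %s", reason)
        return reason
    return None
```

The caller keeps its existing contract (return None on rejection) while the log records which branch fired.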

Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
Missing security logs: The PR adds new early-return paths for URL downloads (invalid scheme / oversized
Content-Length) but does not emit any audit/security log entry to record the blocked
attempt and outcome.

Referred Code
parsed = urlparse(url)
if parsed.scheme not in {"http", "https"}:
    return None
async with aiohttp.ClientSession() as session:
    async with session.get(url, timeout=aiohttp.ClientTimeout(total=30)) as resp:
        if resp.status == 200:
            content_length = resp.headers.get("Content-Length")
            if content_length:
                try:
                    if int(content_length) > MAX_DOWNLOAD_BYTES:
                        return None
                except ValueError:
                    pass
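
An audit entry for a blocked attempt could look like the sketch below. The logger name and both helper names are hypothetical. The URL is redacted before logging so that credentials or tokens in user-provided links do not end up in the audit trail.

```python
import logging
from urllib.parse import urlsplit, urlunsplit

audit_log = logging.getLogger("audit.download")  # hypothetical logger name


def redact_url(url: str) -> str:
    """Drop userinfo, query string, and fragment, which may carry secrets."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if ":" in host:
        host = f"[{host}]"  # restore brackets around IPv6 literals
    try:
        port = parts.port
    except ValueError:
        port = None  # malformed port; leave it out of the log entry
    if port:
        host = f"{host}:{port}"
    return urlunsplit((parts.scheme, host, parts.path, "", ""))


def audit_blocked_download(url: str, reason: str) -> None:
    """Record one blocked download attempt together with its outcome."""
    audit_log.warning("blocked download url=%s reason=%s",
                      redact_url(url), reason)
```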

Compliance status legend
🟢 - Fully Compliant
🟡 - Partially Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@qodo-code-review

PR Code Suggestions ✨

Explore these optional code suggestions:

Category: Possible issue
Suggestion: Stream chunks to enforce size limit

Modify download_bytes to read the response in chunks, checking the cumulative
size against MAX_DOWNLOAD_BYTES during the download to prevent memory exhaustion
from large files.

main.py [213-215]

-data = await resp.read()
-if len(data) <= MAX_DOWNLOAD_BYTES:
-    return data
+data = bytearray()
+async for chunk in resp.content.iter_chunked(1024):
+    data.extend(chunk)
+    if len(data) > MAX_DOWNLOAD_BYTES:
+        return None
+return bytes(data)
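
The streaming approach in the diff above can be tried without a network connection by writing it against any async iterator of byte chunks. This is a sketch: read_capped and fake_body are hypothetical names, and fake_body only stands in for a response stream.

```python
import asyncio
from typing import AsyncIterator, Optional


async def read_capped(chunks: AsyncIterator[bytes],
                      limit: int) -> Optional[bytes]:
    """Accumulate chunks, giving up as soon as the total exceeds `limit`."""
    data = bytearray()
    async for chunk in chunks:
        data.extend(chunk)
        if len(data) > limit:
            # Stop reading; at most one chunk beyond the limit was buffered.
            return None
    return bytes(data)


async def fake_body(total: int, chunk_size: int = 1024) -> AsyncIterator[bytes]:
    """Stand-in for a network stream, used only to exercise read_capped."""
    sent = 0
    while sent < total:
        n = min(chunk_size, total - sent)
        yield b"x" * n
        sent += n


if __name__ == "__main__":
    print(asyncio.run(read_capped(fake_body(5000), 4096)))  # None
```

With aiohttp, the same helper could be fed resp.content.iter_chunked(n). Memory use is then bounded by the limit plus one chunk, rather than by whatever size the server chooses to send.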
Suggestion importance[1-10]: 9

Why: This suggestion provides a robust solution to prevent memory exhaustion by streaming the download, which is superior to the current implementation that reads the entire file into memory before checking its size.

Impact: High

Suggestion: Abort download on invalid Content-Length header

In the download_bytes function, abort the download if the Content-Length header
is invalid by returning None inside the except ValueError block instead of using
pass.

main.py [206-210]

 try:
     if int(content_length) > MAX_DOWNLOAD_BYTES:
         return None
 except ValueError:
-    pass
+    return None
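
The stricter policy from this suggestion can be isolated in a small predicate. This is a sketch with a hypothetical function name. It assumes a missing header is still allowed, since the suggestion only changes the handling of a header that is present but malformed.

```python
from typing import Optional


def content_length_allows_download(header: Optional[str], limit: int) -> bool:
    """Strict variant: a present but malformed header aborts the download."""
    if header is None:
        return True  # no declared size; rely on the size check while reading
    value = header.strip()
    # isascii() guards against Unicode digits that int() would reject.
    if not (value.isascii() and value.isdigit()):
        return False  # covers "abc", "-5", "1.5", and empty strings
    return int(value) <= limit
```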
Suggestion importance[1-10]: 8

Why: The suggestion correctly identifies a flaw in the new error handling logic that would bypass the Content-Length check, defeating one of the PR's security enhancements and potentially leading to resource exhaustion.

Impact: Medium

Category: Security
Suggestion: Require a valid Content-Type header

In download_bytes, enforce the presence of the Content-Type header by changing
the conditional from (not content_type) or ... to content_type and ....

main.py [212]

-if (not content_type) or any(ct in content_type for ct in ALLOWED_CONTENT_TYPES):
+if content_type and any(ct in content_type for ct in ALLOWED_CONTENT_TYPES):
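
The effect of the changed conditional is easiest to see as a standalone predicate. In this sketch the function name is hypothetical and the allow-list values are placeholders, since the PR does not show the real ALLOWED_CONTENT_TYPES. Lowercasing the header is an addition that is not part of the suggestion.

```python
from typing import Iterable, Optional

# Placeholder allow-list; the real ALLOWED_CONTENT_TYPES is not shown in the PR.
ALLOWED_CONTENT_TYPES = ("text/plain", "application/pdf", "image/")


def content_type_allowed(content_type: Optional[str],
                         allowed: Iterable[str] = ALLOWED_CONTENT_TYPES) -> bool:
    """Stricter check from the suggestion: a missing header is rejected."""
    if not content_type:
        return False  # the original condition accepted this case
    lowered = content_type.lower()  # media types are case-insensitive
    return any(ct in lowered for ct in allowed)
```

Note that substring matching, kept here from the original condition, also accepts values that merely contain an allowed type somewhere in the header.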
Suggestion importance[1-10]: 6

Why: This is a good suggestion for stricter validation, as it correctly points out that allowing a missing Content-Type header could lead to processing unintended file types.

Impact: Low

@Rajrooter Rajrooter merged commit 1aaea7e into main Feb 3, 2026
6 checks passed