Skip to content

Comments

fix: prevent path prefix collision in _validate_file_paths#2346

Open
ashnaaseth2325-oss wants to merge 2 commits intoOWASP:masterfrom
ashnaaseth2325-oss:fix/path-prefix-collision-v2
Open

fix: prevent path prefix collision in _validate_file_paths#2346
ashnaaseth2325-oss wants to merge 2 commits intoOWASP:masterfrom
ashnaaseth2325-oss:fix/path-prefix-collision-v2

Conversation

@ashnaaseth2325-oss
Copy link

Summary

This PR fixes a path-prefix collision vulnerability in scripts/convert.py, specifically in _validate_file_paths().

The function is intended to enforce that both source_path and output_dir remain confined within convert_vars.BASE_PATH before being passed to the LibreOffice subprocess. However, the containment check used a naïve string prefix comparison:

base_path = os.path.abspath(convert_vars.BASE_PATH)

if not source_path.startswith(base_path):
    return False, f"Source path outside base directory: {source_path}", ""

if not output_dir.startswith(base_path):
    return False, f"Output directory outside base directory: {output_dir}", ""

This implementation incorrectly treats string prefix matching as filesystem containment. As a result, sibling directories such as:

/opt/cornucopia_extra/
/opt/cornucopia2/

would pass validation if BASE_PATH = /opt/cornucopia.

Notably, the same file already contains the correct containment pattern in _safe_extractall() using + os.sep, but that safeguard was not applied here.

This creates a bypass of the security boundary that protects the LibreOffice subprocess.


Steps to Reproduce

Assume:

BASE_PATH = /opt/cornucopia
  1. Create a sibling directory:

    mkdir -p /opt/cornucopia_extra
  2. Place a valid document inside:

    cp sensitive.odt /opt/cornucopia_extra/
  3. Run:

    python scripts/convert.py \
        --inputfile /opt/cornucopia_extra/sensitive.odt \
        --outputfile output/test.pdf \
        --pdf
  4. The buggy check evaluates:

    "/opt/cornucopia_extra/sensitive.odt".startswith("/opt/cornucopia")

    True (incorrectly allowed)

LibreOffice then processes the file and writes output without error.


Impact

This vulnerability allows:

  • Reading arbitrary documents located in sibling directories
  • Writing output PDFs outside the intended project subtree
  • Bypassing the intended filesystem boundary before invoking an external parser (LibreOffice)
  • Abuse in CI/CD environments where predictable sibling directories exist

This maps to:

  • OWASP A01:2021 — Broken Access Control
  • CWE-22 — Improper Limitation of a Pathname to a Restricted Directory

The issue is particularly critical because _validate_file_paths() is the only explicit security gate before spawning an external process that parses untrusted document formats.


Root Cause

The containment logic uses:

path.startswith(base_path)

instead of:

path.startswith(base_path + os.sep)

This confuses string prefix comparison with filesystem boundary enforcement.

A correct implementation already exists in _safe_extractall():

if not member_path.startswith(abs_target + os.sep):
    raise ValueError(...)

However, _validate_file_paths() was not updated when that fix was introduced, leading to inconsistent security enforcement within the same file.


Fix

The fix mirrors the already-correct _safe_extractall() implementation and ensures consistency:

base_path = os.path.realpath(convert_vars.BASE_PATH)

if not source_path.startswith(base_path + os.sep):
    return False, ...

if not output_dir.startswith(base_path + os.sep):
    return False, ...

Changes include:

  • Appending os.sep to prevent prefix collisions
  • Using os.path.realpath() to resolve symlinks
  • Aligning behavior with _safe_extractall()
  • Adding regression test coverage to block sibling-prefix bypass

This is a minimal, contained fix with no behavioral change for legitimate paths.


Result

After this fix:

  • LibreOffice subprocess access is strictly confined to BASE_PATH/
  • Prefix-collision sibling paths are correctly rejected
  • Security logic is consistent across the file
  • CI/CD attack surface is eliminated
  • Path containment checks are filesystem-accurate

The fix restores the intended security boundary with minimal code changes and clear regression protection.

@ashnaaseth2325-oss
Copy link
Author

Hi @sydseter ,@rewtd @cw-owasp , I noticed _validate_file_paths() was still using a raw startswith(base_path) check, which allows sibling prefix collisions (e.g., /base_evil). This PR aligns it with _safe_extractall() using base_path + os.sep and realpath() for proper containment.

@sydseter
Copy link
Collaborator

@ashnaaseth2325-oss, ok, fine, I guess. Remember to make sure your commits are verified.

Replace abspath with realpath and append os.sep to base_path checks
to block sibling-directory traversal (e.g. /base_evil bypassing /base).
Mirrors the correct pattern already used in _safe_extractall (CWE-22).

Signed-off-by: ashnaaseth2325-oss <ashnaaseth2325@gmail.com>
@ashnaaseth2325-oss ashnaaseth2325-oss force-pushed the fix/path-prefix-collision-v2 branch from 7cac812 to 2d4abf2 Compare February 23, 2026 11:06
@ashnaaseth2325-oss
Copy link
Author

Hi @sydseter
I’ve re-signed the commits and they are now verified. Please feel free to take another look when convenient. Thanks for the guidance!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants