Skip to content

Fix(findrive): reject empty/whitespace content in upload_file before DB write#472

Closed
Jean-Regis-M wants to merge 1 commit intoGenAI-Security-Project:mainfrom
Jean-Regis-M:patch-55
Closed

Fix(findrive): reject empty/whitespace content in upload_file before DB write#472
Jean-Regis-M wants to merge 1 commit intoGenAI-Security-Project:mainfrom
Jean-Regis-M:patch-55

Conversation

@Jean-Regis-M
Copy link
Copy Markdown
Contributor

@Jean-Regis-M Jean-Regis-M commented Apr 10, 2026

Summary

Fixes #367 upload_file accepted empty and whitespace-only content, persisting
a record with content_text="" and file_size=0, causing silent failures when
agents read the file back.


Problem

The only guard in upload_file is an upper-bound size check:

if len(content.encode("utf-8")) > max_size:
    return {"error": "..."}

For content="", len("".encode()) evaluates to 0, so 0 > max_size is
always False. The call falls through to repo.create_file() with blank
content_text and file_size=0 written to the database with no error raised.

When an agent later calls get_file, it receives extracted_text: "" no
error, no signal, just empty content entering the LLM context window silently.


Root Cause

len("".encode("utf-8")) > max_size0 > 512000False.

The size guard is an upper-bound check only. There is no lower-bound presence
check, so empty and whitespace-only strings pass validation entirely and reach
repo.create_file() unconditionally.

Classification: Validation gap missing content presence check before the size guard.


Fix

Two lines added immediately before the max_size guard in upload_file:

# ADDED  reject before size check and any DB interaction
if not content or not content.strip():
    return {"error": "File content must not be empty"}

max_size = config.get("max_file_size_kb", 500) * 1024
if len(content.encode("utf-8")) > max_size:
    return {"error": f"File exceeds maximum size of {config.get('max_file_size_kb', 500)}KB"}
  • not content catches "" and None
  • not content.strip() catches whitespace-only strings (" ", "\n\t")
  • Returns before db_session is opened no DB write occurs on invalid input
  • Return shape matches every other early-return guard in the same function

Zero other lines touched.


Behaviour

Input Before After
content="" Stored with file_size=0 {"error": "File content must not be empty"}
content=" \n" Stored with file_size=0 {"error": "File content must not be empty"}
content="invoice text..." ✅ Stored normally ✅ Stored normally path unchanged
content exceeds 500 KB Size error returned Size error returned unchanged

Testing

# Regression being fixed  must now pass
pytest tests/unit/mcp/test_findrive.py::TestUploadFile::test_fd_upload_008_empty_content_accepted_without_validation -v

# Existing happy path  must continue to pass
pytest tests/unit/mcp/test_findrive.py::TestUploadFile::test_fd_upload_001_upload_returns_file_id_and_metadata -v

Tasks

  • Traced execution path confirming 0 > max_size always evaluates False for empty input
  • Confirmed no presence check existed anywhere before repo.create_file() was reached
  • Added not content or not content.strip() guard covering both empty and whitespace-only cases
  • Placed guard before db_session open, no DB interaction occurs on invalid input
  • Verified return shape is consistent with existing early-return guards in upload_file
  • Confirmed no other functions, files, or call sites were modified
  • test_fd_upload_008 passes empty content now correctly rejected
  • test_fd_upload_001 passes valid upload path fully intact

…_file

Root cause:
The max_size guard evaluated len(''.encode()) > max_size as False,
allowing empty content to pass through to repo.create_file() with
file_size=0 and blank content_text.

Solution:
Added presence check (not content or not content.strip()) immediately
before the size guard, returning an error dict on empty/whitespace input.

Impact:
Early return prevents any DB write; no existing valid-content paths
are affected; error response shape is consistent with existing guards.

Signed-off-by: JEAN REGIS <240509606@firat.edu.tr>
@stealthwhizz
Copy link
Copy Markdown
Contributor

@Jean-Regis-M Hey, just wanted to flag that this PR references Fixes #268 but that issue is about InvoiceCountEvaluator sharing badge progress across users - unrelated to this fix. Based on the description and the test name test_fd_upload_008, this actually fixes #367 (FD-UPLOAD-008: Empty content accepted; file_size stored as 0). Could you update the PR description to reference the correct issue?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug_158_MUST_FIX: FD-UPLOAD-008 — Empty content accepted; file_size stored as 0

2 participants