[Bug] Case-sensitive file suffix filtering in collect_files() silently skips capitalized/mixed-case extensions #1671

Description

@raman118

Version: graphifyy 0.9.6, Windows / Cross-platform.

In graphify/extract.py, the new collect_files() implementation uses p.suffix in _EXTENSIONS to check if a file should be collected for extraction.
Because _EXTENSIONS contains lowercase suffixes (like .py, .js), any file with a capitalized or mixed-case extension (such as app.PY or index.JS) is silently skipped.

On case-insensitive filesystems (the default on Windows and macOS), files with capitalized extensions are valid and should be collected and processed. Previously, the _legacy_collect_files implementation used target.rglob(f"*{ext}"), which matched these extensions case-insensitively on Windows. The optimization in #1261 introduced this regression.

Reproduction

Create a repository containing a Python file named app.PY (capitalized extension), then either call collect_files() on it or run graphify extract . in that directory.

import tempfile
from pathlib import Path
from graphify.extract import collect_files

with tempfile.TemporaryDirectory() as tmpdir:
    tmp_path = Path(tmpdir)
    py_file = tmp_path / "app.PY"
    py_file.write_text("print('hello')", encoding="utf-8")
    
    collected = collect_files(tmp_path)
    print("Collected:", collected) # Returns [] instead of [app.PY]

Expected Behavior

Files with capitalized or mixed-case extensions (like .PY or .JS) should be collected and parsed.

Root Cause & Location

In graphify/extract.py:

if p.suffix in _EXTENSIONS and not _ignored(p) and _resolves_under_root(p, containment_root):

This performs a case-sensitive check against _EXTENSIONS. Suffix checks should be case-insensitive (e.g., using p.suffix.lower()).
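
A minimal sketch of the suggested fix, not the actual graphify code. The _EXTENSIONS set here is a small hypothetical stand-in, and has_supported_suffix and collect_supported are illustrative helpers that leave out the _ignored and _resolves_under_root checks. It assumes every entry in _EXTENSIONS is already lowercase; if that does not hold, both sides of the comparison would need normalizing.

```python
import tempfile
from pathlib import Path

# Hypothetical stand-in for graphify's _EXTENSIONS; the real set is larger.
# Assumption: every entry is lowercase.
_EXTENSIONS = frozenset({".py", ".js"})


def has_supported_suffix(path: Path, extensions=_EXTENSIONS) -> bool:
    """Return True if the path's suffix is a known extension, ignoring case."""
    return path.suffix.lower() in extensions


def collect_supported(root: Path) -> list[Path]:
    """Simplified walker: omits the _ignored and _resolves_under_root checks."""
    return sorted(
        p for p in root.rglob("*") if p.is_file() and has_supported_suffix(p)
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        root = Path(tmpdir)
        for name in ("app.PY", "index.Js", "notes.TXT"):
            (root / name).write_text("print('hello')", encoding="utf-8")
        # app.PY and index.Js are collected; notes.TXT is not a known extension.
        print("Collected:", [p.name for p in collect_supported(root)])
```

Lowercasing only the suffix leaves the rest of the path untouched, so the collected paths keep their on-disk spelling (app.PY stays app.PY).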
