Skip to content

Fix ignore pattern handling#3

Open
NiveditJain wants to merge 3 commits into
LarsenCundric:mainfrom
NiveditJain:fix-ignore-pattern-handling
Open

Fix ignore pattern handling#3
NiveditJain wants to merge 3 commits into
LarsenCundric:mainfrom
NiveditJain:fix-ignore-pattern-handling

Conversation

@NiveditJain

Copy link
Copy Markdown

Summary

  • fix --ignore glob handling so patterns like ~/.vscode-server/** are parsed safely
  • support absolute paths, scan-root-relative paths, bare directory names, ~, *, ?, and **
  • apply ignore patterns in watch mode as well as normal scans
  • skip ignored child directories before adding them to the traversal queue
  • safely ignore invalid config entries instead of crashing
  • document --ignore usage and config support in the README
  • preserve root-project bloat detection when running dev-purge inside a project

Why this is needed

I run dev-purge from a cron job, so ignore rules need to be reliable and non-interactive-safe. If an ignore pattern from CLI/config is malformed, unsupported, or interpreted too narrowly, the scheduled cleanup can either scan paths that should be excluded or fail unexpectedly. This makes folder exclusion predictable for automated runs.

Ignore examples now supported

dev-purge --ignore node_modules
dev-purge --ignore ./legacy-app
dev-purge --ignore '~/Library/Caches/**'

Config file:

{
  ignore: [~/.vscode-server/**, ./vendor]
}

Checks

  • node --check src/scanner.js
  • node --check src/index.js
  • fixture-tested no ignore, absolute glob, relative dir, relative exact path, relative glob, bare directory name, and relative scan root argument
  • verified npm start -- --dry-run --depth 0 -s 0 --ignore node_modules excludes root node_modules

NiveditJain and others added 2 commits May 5, 2026 11:33
Adds a repeatable --ignore CLI option and support for ~/.config/dev-purge/config.json with an ignore array. Patterns are glob-like (supports ~ and **). Scanner respects ignore patterns when discovering projects and bloat dirs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@LarsenCundric

Copy link
Copy Markdown
Owner

Nice fix, ship after one tweak. Notes:

P1 — getFlagValues skips values starting with - (src/index.js:45)
--ignore -tmp is silently dropped and --ignore --legacy swallows the next flag. The single-value getFlagValue doesn't do this — should be consistent. Either accept any next token, or only skip when the next token is in flagsWithValues.

P2 — Move config read inside run()
Top-level await readFile(cfgPath, ...) runs before any flag parsing. If XDG_CONFIG_HOME points at a slow/hung mount, --help hangs. Defer to runtime.

P2 — Guard empty pattern after expandHome
Add if (!expanded) return null; so a stray ~ that expands to "" (rare embedded envs) can't become an accidental match.

P3 — Warn on invalid config entries
cfg.ignore = ["~/.cache", 42] silently drops the 42. For cron-driven use you wanted predictable behavior — a console.warn on rejected patterns helps debug bad configs.

P3 — Note case-sensitivity in README
Bare-name matcher is case-sensitive, which is correct but surprising on macOS HFS+. One-line README note.

Nice catches in this PR

  • **/-prefix glob handling is correct (consumes a path segment, not just chars)
  • The trailing ** "base regex" so foo/** also matches foo — thoughtful
  • The bundled bugfix for root-of-project bloat (canCollectBloatAtDir) is a real fix

LGTM after P1.

- Make getFlagValue/getFlagValues/positional parsing consistent via a
  known-flag check, so an ignore value that starts with "-" (e.g.
  `--ignore -skip`) is captured instead of silently dropped, and a real
  flag after `--ignore` (e.g. `--ignore --json`) is not swallowed. (P1)
- Defer the config.json read into a runtime loadIgnorePatterns() so a slow
  or hung $XDG_CONFIG_HOME mount can no longer hang `dev-purge --help`. (P2)
- Warn on invalid (non-string/empty) config "ignore" entries instead of
  dropping them silently — predictable behavior for cron-driven runs. (P3)
- Guard against a bare "~" expanding to "" in envs with no HOME, which
  would otherwise compile to a match-everything pattern. (P2)
- README: note case-sensitivity of bare-name matching and config validation. (P3)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@NiveditJain

Copy link
Copy Markdown
Author

Thanks for the thorough review — all points addressed in 4bc346a.

P1 — getFlagValues dropping --prefixed values
Replaced the startsWith("-") heuristic with a shared known-flag check (KNOWN_FLAGS / isValueToken) used by getFlagValue, getFlagValues, and the positional parser, so they're now consistent. A token after a flag is consumed as a value unless it's a flag we actually recognize. So:

  • --ignore -skip now captures -skip (verified: ignores only the -skip/ subtree, the sibling project still scans).
  • --ignore --json no longer swallows --json--json is recognized, so it's left alone and still takes effect.
  • A genuine path like -cache/** is treated as a value, not a flag.

P2 — Defer config read out of module top-level
Moved into loadIgnorePatterns(), called inside run()/runWatch() (after the --help short-circuit). Verified --help returns instantly and reads no config even when $XDG_CONFIG_HOME points at a config file.

P2 — Guard empty pattern after expandHome
compileIgnorePattern now returns null when expansion yields "" (bare ~ with no HOME), so it can't become a match-everything pattern.

P3 — Warn on invalid config entries
Non-string / empty ignore entries are now skipped with a console.warn naming the offending value and config path (goes to stderr, so --json stdout stays clean). Verified ["-skip", 42, "", " ", "node_modules"] warns on 42/""/" " and applies the two valid patterns.

P3 — README case-sensitivity note
Added a note that matching is case-sensitive (with the macOS HFS+/APFS caveat), plus a line documenting the config-validation behavior.

Re #2: I'm rebasing that PR on top of this one and dropping its duplicated glob/ignore logic so there's a single implementation — details to follow on that PR. Ready for another look here.

NiveditJain added a commit to NiveditJain/dev-purge that referenced this pull request May 26, 2026
Adds best-effort cleanup of Docker leftovers alongside the filesystem scan:
exited containers (status=exited) and dangling images (dangling=true) only —
running containers and tagged images are never targeted. Docker is invoked via
execFile with an argv array (no shell) with a 15s timeout.

New surface:
- `--containers-only` / `--images-only` (mutually exclusive) and
  `--category containers` / `--category images` to scope to runtime artifacts;
  runtime-only modes skip the filesystem walk entirely.
- `--older-than` is applied to dated artifacts; artifacts with no parseable
  creation date are kept (safe choice) with a warning explaining why.
- Per-item delete with separate success/failure tracking.

Rebased on top of LarsenCundric#3 (ignore-pattern handling) so there is a single glob/ignore
implementation: the duplicated `globToRegExp`, `--ignore`/config loading, and
`canCollectBloatAtDir` root-bloat fix that an earlier version of this branch
carried are dropped and inherited from LarsenCundric#3 instead.

Review fixes:
- error when --containers-only and --images-only are combined
- warn (don't silently drop) on unparseable Docker JSON output lines
- warn when --older-than keeps an artifact due to an unparseable date
- show the real Docker error dimmed when runtime cleanup is skipped
- combined early-return so runtime-only runs print/act correctly

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@NiveditJain

Copy link
Copy Markdown
Author

@LarsenCundricthis is step 1 in the order below.

Ready to merge + ship 🚀

Both PRs are MERGEABLE / CLEAN and all review feedback is addressed. Since I only have read access here, the merge + publish are yours to do. Suggested order:

1. Merge #3 (fix-ignore-pattern-handling)   ← first
2. Merge #2 (feature/docker-purge)           ← stacked on #3; its diff collapses to just the Docker changes once #3 lands
3. On main:  npm version minor               # 1.0.0 -> 1.1.0
4.           npm publish                      # publishes the new features to npm

No release CI is configured, so step 3–4 are a manual publish by the npm package owner. Happy to prep the version bump as a separate PR if you’d prefer that to be in git first.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants