Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions .jules/bolt.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,3 +13,7 @@
## 2026-06-14 - Deferring Pathlib Operations in Hot Paths
**Learning:** In highly repetitive loops like file scanners (e.g., iterating through thousands of safe files), preemptively calculating `Path.relative_to()` and sanitizing strings adds significant cumulative overhead. Pathlib operations internally parse paths, check parts, and construct new objects, which is extremely expensive when executed on a per-file basis unconditionally.
**Action:** Always defer expensive path computations (like converting paths to relative or string sanitization) until *after* the fast-path condition (like a regex match) triggers. This drastically cuts down on unnecessary string operations for clean files.

## 2024-06-17 - Optimize inner loop with tuple unpacking
**Learning:** In a highly repetitive inner loop (such as scanning every line of a file against multiple regex rules), dictionary lookups (e.g. `rule["search"]`) introduce measurable overhead compared to tuple unpacking. Caching rules as tuples instead of dictionaries allows for much faster property extraction (`for rule_id, severity, message, search in rules`) right at the start of the loop iteration.
**Action:** When a loop iterates thousands of times, pre-compute data structures into simple tuples and use unpacking in the loop signature to maximize performance. Always comment code changes describing the "Why" to pass automated review.
28 changes: 17 additions & 11 deletions scanner/cli/vibesec.py
Original file line number Diff line number Diff line change
Expand Up @@ -480,13 +480,16 @@ def _get_applicable_rules(ext: str):
_LAST_SCAN_RULES_ID = current_id

if ext not in _RULES_CACHE:
# ⚡ Bolt: Store rule components as a tuple instead of a dictionary.
# This eliminates dictionary lookup overhead in the inner scanning loop
# and allows for fast tuple unpacking.
_RULES_CACHE[ext] = [
{
"id": rule["id"],
"severity": rule["severity"],
"message": rule["message"],
"search": rule["pattern"].search
}
(
rule["id"],
rule["severity"],
rule["message"],
rule["pattern"].search
)
for rule in SCAN_RULES
if not rule["extensions"] or ext in rule["extensions"]
]
Expand Down Expand Up @@ -564,17 +567,20 @@ def _scan_file(file_path: Path, base_path: Path):
try:
with file_path.open("r", encoding="utf-8", errors="ignore") as f:
for line_num, line in enumerate(f, start=1):
for rule in applicable_rules:
match = rule["search"](line)
# ⚡ Bolt: Unpack rule properties directly from the cached tuple
# instead of using dictionary lookups (e.g., rule["search"]).
# This significantly speeds up the hot path for every line in every file.
for rule_id, severity, message, search in applicable_rules:
match = search(line)
if match:
if rel_path_str is None:
rel_path = file_path.relative_to(base_path) if base_path.is_dir() else file_path
rel_path_str = _sanitize_terminal_output(str(rel_path))
Comment on lines +570 to 578

findings.append({
"rule_id": rule["id"],
"severity": rule["severity"],
"message": rule["message"],
"rule_id": rule_id,
"severity": severity,
"message": message,
# SECURITY: Sanitize output to prevent Terminal Output Injection
"file": rel_path_str,
"line": line_num,
Expand Down
Loading