fix(security): SEC010 reads function name + walks past inner blocks#11
Merged
Conversation
…r blocks
Round 9 production validation surfaced two related bugs in
enclosing_token_context that hid SEC010 from real Java code:
1. **Inner blocks terminated the upward walk too early.**
`for (...) { int idx = new Random().nextInt(...) }` walks up
from the call through `variable_declarator` →
`local_variable_declaration` → `block` (for body). The previous
code matched any `block` and called `break` after running the
opaque-id heuristic, never reaching the surrounding
`method_declaration`. Fix: only `break` at function-shape
nodes; inner blocks (for / if / while bodies) just contribute
their text to the opaque-id check and let the walk continue.
2. **Function-name needle check was missing.** Once the walk
reaches the function-shape node, read its `name` field and
match against the same security-needle list used on
assignments. `generateSessionToken` contains `session` and
`token`; `make_short_code` contains `short_code` (already
covered via the assignment text in many cases, but now
reliable when the call uses a generic loop variable like
`idx`).
3. **Java/C# `method_declaration` and `constructor_declaration`
were not in the function-shape list.** Added.
Tests: 3 new — Java `generateSessionToken`-style production case
(was the FN that triggered this fix); Python `make_session_token`
parallel; `roll_dice` negative case to confirm we don't FP on
non-security functions.
156 / 156 tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
5 tasks
wei9072
added a commit
that referenced
this pull request
May 6, 2026
Round 9 validation surfaced that SEC009's `last == "md5" | "sha1"` matcher only fired on Python `hashlib.md5(...)` and never on Go, Java, C#, PHP, or Node — exactly the same multi-language coverage gap PR #10 fixed for SEC010. Per-language matchers, mirroring PR #10's architecture: - **Python**: `hashlib.md5` / `hashlib.sha1`. Receiver-anchored, excludes `Crypto.Hash.MD5.new` from PyCryptodome (out of scope for now). - **Node / JS**: `crypto.createHash('md5'|'sha1'|'sha-1')`. The algorithm lives in the first string arg; new `first_arg_is_weak_alg_string()` helper inspects the literal. - **Go**: `md5.Sum` / `md5.New` / `sha1.Sum` / `sha1.New` with Layer 1 import resolution against `crypto/md5` and `crypto/sha1`. Round 9's `h := md5.Sum([]byte(password))` case. - **Java / Kotlin**: `MessageDigest.getInstance("MD5")` (string arg), Apache Commons `DigestUtils.md5Hex` / `sha1Hex`. Round 9's `MessageDigest.getInstance("MD5")` case. - **C#**: `MD5.Create()` / `SHA1.Create()` / `MD5CryptoServiceProvider` / `MD5Managed` and SHA1 equivalents. Receiver-anchored to avoid matching `SHA1024` / `SomeMD5Field`. - **PHP**: global `md5(...)` / `sha1(...)`, plus `hash('md5', ...)` / `hash('sha1', ...)` (string arg). `hash('sha256', ...)` stays silent. Also fixes `enclosing_security_context` with the same improvements PR #11 applied to `enclosing_token_context`: - Multi-language assignment node kinds (Go `short_var_declaration`, Java `local_variable_declaration`, etc.). - Function-name needle check at function-shape level. Round 9 case: `func HashPassword(password string)` calls `md5.Sum` via local `h :=`; the function name carries the `password` needle even though the local assignment doesn't. - Walks past inner blocks (don't break at for / if body, only at function shape). Tests: 8 new — Node createHash md5 (positive) / sha256 (negative), Go md5.Sum + crypto/md5 import, Java MessageDigest.getInstance MD5 (positive) / SHA-256 (negative), PHP md5 (positive) / hash sha256 (negative), C# MD5.Create. 164 / 164 tests pass. Live MCP confirmation: Round 9's starting-go/auth.go and starting-java/Auth.java now both fire all three planted findings (SEC009, SEC010, SEC012) instead of 2 / 1 respectively. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
5 tasks
wei9072
added a commit
that referenced
this pull request
May 7, 2026
Adds the empirical validation work that drove PRs #6 — #12 into the repo as a reproducibility archive under `experiments/`. Contents: - `comparison-report.md` (1199 lines, rolling Round 1 → Round 9 analysis). Documents the three Aegis ROI mechanisms surfaced from data: 1. Rule-hit → fix (brownfield Plan B: 0/3 → 3/3 across 3 models) 2. Structural guardrail (cycle / public_symbol_removed — 0/14 hits, dead weight on clean architectures) 3. Anti-paralysis ritual (weak models complete tasks they would otherwise abandon) - 4 starting-code fixtures (Python brownfield, Go brownfield, Java brownfield, Python multi-module). - 11 prompt files. Each task has paired `-a.txt` (no Aegis) and `-b.txt` (with Aegis MCP + REQUIRED-workflow ritual instruction). - 52 round directories: per-model deliverables + `run.log` for codex-driven rounds. Naming: <model>-<task>-<a|b>. - `aegis_validate.py` — Python wrapper around aegis-mcp stdio JSON-RPC, used by agents to run validation after each write. - 3 eval scripts that diff each round's deliverables against the planted SEC bugs. Excluded via `.gitignore` and rsync filter (would have been ~970MB of bloat): venv / .venv / __pycache__ / .pytest_cache / .toolchain (Go toolchain copies codex downloads) / compiled binaries / git metadata of nested repos. Final archive: 17MB. Direct lineage from this archive into Aegis code: | Round | Surfaced | Fixed in | |---|---|---| | Round 8 codex | SEC010 FP on `secrets.choice` | PR #9 | | Round 9 Go / Java | SEC009 multi-language coverage = 0 | PR #12 | | Round 9 Java | SEC010 inner-block `break` hid production case | PR #11 | Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
5 tasks
wei9072
added a commit
that referenced
this pull request
May 7, 2026
#14) - Main README.md and README.zh-TW.md: new "Experiments archive" / 「實驗資料」 section linking to experiments/README.md - experiments/README.md: replaces the prose-only intro with four analysis charts: 1. Round structure overview (52 dirs, 26 paired, 11 models) 2. Plan B brownfield 0/3 → 3/3 fix-rate matrix with per-model remaining-bug counts (real numbers from re-validating the archive against current rule library) 3. Plan C multi-module task-completion matrix surfacing the anti-paralysis ROI mechanism (g54mini-mc-a abandoned vs g54mini-mc-b completed same task) 4. Mermaid flowchart of the three Aegis ROI mechanisms (rule-hit → fix; structural guardrail; anti-paralysis ritual) - Plus a "Direct lineage" chart connecting each experiment finding to the PR that fixed it (Round 8 → PR #9; Round 9 → PR #11/#12). Also re-imports `experiments/31flashlite-amb-b/` as plain files — the previous commit accidentally captured it as a gitlink because codex had created a nested `.git/` inside the round dir during the agent run. Removed the nested `.git/` and re-added the 6 actual deliverable files. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
5 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Round 9's production validation (gpt-5.4-mini × Go/Java brownfield)
caught two related bugs in `enclosing_token_context` that hid
SEC010 from real Java code, even though my unit tests passed:
The Java production case that didn't fire:
```java
public static String generateSessionToken() {
String chars = "abcdef0123456789";
StringBuilder token = new StringBuilder();
for (int i = 0; i < 16; i++) {
int idx = new Random().nextInt(chars.length()); // ← weak, but silent
token.append(chars.charAt(idx));
}
return token.toString();
}
```
Walking up from `new Random().nextInt(...)`:
Three fixes
terminate the upward walk. Inner `block` / `function_body` nodes
contribute text to the opaque-id heuristic, then keep climbing.
function-shape node, run the same needle list against its
`name` field. `generateSessionToken` matches `session` /
`token`. `make_short_code` matches `short_code`.
the function-shape match list. Java tree-sitter uses these,
not `method_definition`.
Tests
the Round 9 production case. Was the FN that triggered this PR.
parallel using `make_session_token` to verify the same fix
benefits Python.
name `roll_dice` has no needle, opaque-id heuristic doesn't
fire either, must stay silent.
`cargo test --workspace`: 156 / 156 pass (was 153; +3 new).
Live MCP confirmation
Pre-fix:
```
$ aegis_validate.py starting-java/Auth.java
FINDINGS ...: 18 total (security=1, signal=16, workspace=1)
[security] SEC012 @line 20 ...
```
Post-fix:
```
$ aegis_validate.py starting-java/Auth.java
FINDINGS ...: 19 total (security=2, signal=16, workspace=1)
[security] SEC012 @line 20 ...
[security] SEC010 @line 28 weak RNG used for security token ...
```
Test plan
`os.urandom`, `SecureRandom`, `crypto/rand` all still silent)
silent
🤖 Generated with Claude Code