Skip to content

fix(security): SEC010 reads function name + walks past inner blocks#11

Merged
wei9072 merged 1 commit into
mainfrom
fix/sec010-fn-name-needle
May 6, 2026
Merged

fix(security): SEC010 reads function name + walks past inner blocks#11
wei9072 merged 1 commit into
mainfrom
fix/sec010-fn-name-needle

Conversation

@wei9072

@wei9072 wei9072 commented May 6, 2026

Copy link
Copy Markdown
Owner

Summary

Round 9's production validation (gpt-5.4-mini × Go/Java brownfield)
caught two related bugs in `enclosing_token_context` that hid
SEC010 from real Java code, even though my unit tests passed:

The Java production case that didn't fire:

```java
public static String generateSessionToken() {
String chars = "abcdef0123456789";
StringBuilder token = new StringBuilder();
for (int i = 0; i < 16; i++) {
int idx = new Random().nextInt(chars.length()); // ← weak, but silent
token.append(chars.charAt(idx));
}
return token.toString();
}
```

Walking up from `new Random().nextInt(...)`:

  • variable_declarator (`idx = new Random().nextInt(...)`) — needle check on "idx" fails
  • local_variable_declaration — same
  • block (for body) — opaque-id heuristic on body text fails (no `string.ascii_letters`); previous code `break`d here
  • for_statement
  • block (method body)
  • method_declaration — name is `generateSessionToken`, contains `session` AND `token`, but we never reached it

Three fixes

  1. Don't `break` at inner blocks. Only function-shape nodes
    terminate the upward walk. Inner `block` / `function_body` nodes
    contribute text to the opaque-id heuristic, then keep climbing.
  2. Read function name field. When the walk lands at a
    function-shape node, run the same needle list against its
    `name` field. `generateSessionToken` matches `session` /
    `token`. `make_short_code` matches `short_code`.
  3. Add `method_declaration` + `constructor_declaration` to
    the function-shape match list. Java tree-sitter uses these,
    not `method_definition`.

Tests

  • `sec010_java_new_random_in_session_token_function_blocks` —
    the Round 9 production case. Was the FN that triggered this PR.
  • `sec010_python_random_in_token_function_blocks` — Python
    parallel using `make_session_token` to verify the same fix
    benefits Python.
  • `sec010_random_in_dice_function_does_not_block` — function
    name `roll_dice` has no needle, opaque-id heuristic doesn't
    fire either, must stay silent.

`cargo test --workspace`: 156 / 156 pass (was 153; +3 new).

Live MCP confirmation

Pre-fix:
```
$ aegis_validate.py starting-java/Auth.java
FINDINGS ...: 18 total (security=1, signal=16, workspace=1)
[security] SEC012 @line 20 ...
```

Post-fix:
```
$ aegis_validate.py starting-java/Auth.java
FINDINGS ...: 19 total (security=2, signal=16, workspace=1)
[security] SEC012 @line 20 ...
[security] SEC010 @line 28 weak RNG used for security token ...
```

Test plan

🤖 Generated with Claude Code

…r blocks

Round 9 production validation surfaced two related bugs in
enclosing_token_context that hid SEC010 from real Java code:

1. **Inner blocks terminated the upward walk too early.**
   `for (...) { int idx = new Random().nextInt(...) }` walks up
   from the call through `variable_declarator` →
   `local_variable_declaration` → `block` (for body). The previous
   code matched any `block` and called `break` after running the
   opaque-id heuristic, never reaching the surrounding
   `method_declaration`. Fix: only `break` at function-shape
   nodes; inner blocks (for / if / while bodies) just contribute
   their text to the opaque-id check and let the walk continue.

2. **Function-name needle check was missing.** Once the walk
   reaches the function-shape node, read its `name` field and
   match against the same security-needle list used on
   assignments. `generateSessionToken` contains `session` and
   `token`; `make_short_code` contains `short_code` (already
   covered via the assignment text in many cases, but now
   reliable when the call uses a generic loop variable like
   `idx`).

3. **Java/C# `method_declaration` and `constructor_declaration`
   were not in the function-shape list.** Added.

Tests: 3 new — Java `generateSessionToken`-style production case
(was the FN that triggered this fix); Python `make_session_token`
parallel; `roll_dice` negative case to confirm we don't FP on
non-security functions.

156 / 156 tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@wei9072 wei9072 merged commit 6ded2c0 into main May 6, 2026
1 check passed
@wei9072 wei9072 deleted the fix/sec010-fn-name-needle branch May 6, 2026 04:02
wei9072 added a commit that referenced this pull request May 6, 2026
Round 9 validation surfaced that SEC009's `last == "md5" | "sha1"`
matcher only fired on Python `hashlib.md5(...)` and never on Go,
Java, C#, PHP, or Node — exactly the same multi-language coverage
gap PR #10 fixed for SEC010.

Per-language matchers, mirroring PR #10's architecture:

- **Python**: `hashlib.md5` / `hashlib.sha1`. Receiver-anchored,
  excludes `Crypto.Hash.MD5.new` from PyCryptodome (out of scope
  for now).
- **Node / JS**: `crypto.createHash('md5'|'sha1'|'sha-1')`. The
  algorithm lives in the first string arg; new
  `first_arg_is_weak_alg_string()` helper inspects the literal.
- **Go**: `md5.Sum` / `md5.New` / `sha1.Sum` / `sha1.New` with
  Layer 1 import resolution against `crypto/md5` and `crypto/sha1`.
  Round 9's `h := md5.Sum([]byte(password))` case.
- **Java / Kotlin**: `MessageDigest.getInstance("MD5")` (string
  arg), Apache Commons `DigestUtils.md5Hex` / `sha1Hex`. Round 9's
  `MessageDigest.getInstance("MD5")` case.
- **C#**: `MD5.Create()` / `SHA1.Create()` / `MD5CryptoServiceProvider`
  / `MD5Managed` and SHA1 equivalents. Receiver-anchored to avoid
  matching `SHA1024` / `SomeMD5Field`.
- **PHP**: global `md5(...)` / `sha1(...)`, plus `hash('md5', ...)`
  / `hash('sha1', ...)` (string arg). `hash('sha256', ...)` stays
  silent.

Also fixes `enclosing_security_context` with the same improvements
PR #11 applied to `enclosing_token_context`:

- Multi-language assignment node kinds (Go `short_var_declaration`,
  Java `local_variable_declaration`, etc.).
- Function-name needle check at function-shape level. Round 9 case:
  `func HashPassword(password string)` calls `md5.Sum` via local
  `h :=`; the function name carries the `password` needle even
  though the local assignment doesn't.
- Walks past inner blocks (don't break at for / if body, only at
  function shape).

Tests: 8 new — Node createHash md5 (positive) / sha256 (negative),
Go md5.Sum + crypto/md5 import, Java MessageDigest.getInstance MD5
(positive) / SHA-256 (negative), PHP md5 (positive) / hash sha256
(negative), C# MD5.Create.

164 / 164 tests pass.

Live MCP confirmation: Round 9's starting-go/auth.go and
starting-java/Auth.java now both fire all three planted findings
(SEC009, SEC010, SEC012) instead of 2 / 1 respectively.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
wei9072 added a commit that referenced this pull request May 7, 2026
Adds the empirical validation work that drove PRs #6#12 into
the repo as a reproducibility archive under `experiments/`.

Contents:

- `comparison-report.md` (1199 lines, rolling Round 1 → Round 9
  analysis). Documents the three Aegis ROI mechanisms surfaced
  from data:
    1. Rule-hit → fix (brownfield Plan B: 0/3 → 3/3 across 3 models)
    2. Structural guardrail (cycle / public_symbol_removed —
       0/14 hits, dead weight on clean architectures)
    3. Anti-paralysis ritual (weak models complete tasks they
       would otherwise abandon)
- 4 starting-code fixtures (Python brownfield, Go brownfield,
  Java brownfield, Python multi-module).
- 11 prompt files. Each task has paired `-a.txt` (no Aegis) and
  `-b.txt` (with Aegis MCP + REQUIRED-workflow ritual instruction).
- 52 round directories: per-model deliverables + `run.log` for
  codex-driven rounds. Naming: <model>-<task>-<a|b>.
- `aegis_validate.py` — Python wrapper around aegis-mcp stdio
  JSON-RPC, used by agents to run validation after each write.
- 3 eval scripts that diff each round's deliverables against the
  planted SEC bugs.

Excluded via `.gitignore` and rsync filter (would have been ~970MB
of bloat): venv / .venv / __pycache__ / .pytest_cache / .toolchain
(Go toolchain copies codex downloads) / compiled binaries / git
metadata of nested repos. Final archive: 17MB.

Direct lineage from this archive into Aegis code:

| Round | Surfaced | Fixed in |
|---|---|---|
| Round 8 codex | SEC010 FP on `secrets.choice` | PR #9 |
| Round 9 Go / Java | SEC009 multi-language coverage = 0 | PR #12 |
| Round 9 Java | SEC010 inner-block `break` hid production case | PR #11 |

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
wei9072 added a commit that referenced this pull request May 7, 2026
#14)

- Main README.md and README.zh-TW.md: new "Experiments archive" /
  「實驗資料」 section linking to experiments/README.md
- experiments/README.md: replaces the prose-only intro with four
  analysis charts:
    1. Round structure overview (52 dirs, 26 paired, 11 models)
    2. Plan B brownfield 0/3 → 3/3 fix-rate matrix with per-model
       remaining-bug counts (real numbers from re-validating the
       archive against current rule library)
    3. Plan C multi-module task-completion matrix surfacing the
       anti-paralysis ROI mechanism (g54mini-mc-a abandoned vs
       g54mini-mc-b completed same task)
    4. Mermaid flowchart of the three Aegis ROI mechanisms
       (rule-hit → fix; structural guardrail; anti-paralysis ritual)
- Plus a "Direct lineage" chart connecting each experiment finding
  to the PR that fixed it (Round 8 → PR #9; Round 9 → PR #11/#12).

Also re-imports `experiments/31flashlite-amb-b/` as plain files —
the previous commit accidentally captured it as a gitlink because
codex had created a nested `.git/` inside the round dir during the
agent run. Removed the nested `.git/` and re-added the 6 actual
deliverable files.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant