Skip to content

feat: add multilingual batch scanner with parallel execution and LLM gap-fill#100

Open
WhereIs38 wants to merge 10 commits into
NVIDIA:mainfrom
WhereIs38:feature/multilingual-batch-scanner
Open

feat: add multilingual batch scanner with parallel execution and LLM gap-fill#100
WhereIs38 wants to merge 10 commits into
NVIDIA:mainfrom
WhereIs38:feature/multilingual-batch-scanner

Conversation

@WhereIs38

Copy link
Copy Markdown

Closes #98

Summary

Adds contrib/multilingual/ — a multilingual batch scanner that scans directories of AI agent skills in parallel, with automatic language detection and targeted LLM gap-fill for non-English skills.

Zero changes to src/skillspector/. All integration is via import-time patches that wrap upstream constructors without modifying any source file.

What It Does

  1. Discovery — recursively finds all SKILL.md directories under input root
  2. Language detection — Unicode script-ratio heuristic, extending support to Chinese, Japanese, and Korean
  3. Parallel scanThreadPoolExecutor runs graph.invoke() per skill, configurable --workers
  4. Gap-fill — targeted LLM pass for 8 rules with no semantic-analyzer equivalent (P5, P6-P8, MP1-MP3, RA1-RA2)
  5. Aggregated report — terminal / JSON / Markdown, sorted by risk score
  6. Multi-key API pool — rate-limit-aware scheduler with exponential backoff

Evidence (23 built-in fixtures, 8 workers)

Skill --no-llm LLM mode
ssd1_semantic_injection 0/100 100/100
ssd3_nl_exfiltration 0/100 60/100
ssd4_narrative_deception 10/100 100/100
sdi4_divergence 13/100 100/100
safe_skill 0/100 0/100 ✓
ssd_clean 0/100 0/100 ✓

LLM semantic analyzers catch entire vulnerability categories invisible to static patterns. Clean skills remain clean — zero false-positive inflation.

Testing

Manual verification against tests/fixtures/ confirms 23/23 skills scanned, clean skills remain clean, semantic analyzers catch what static patterns miss. Cross-platform validated on macOS and Windows. make lint passes on the upstream
codebase.

Automated tests are impractical for LLM-dependent output — it is inherently non-deterministic and requires live API keys. The static-vs-LLM comparison in README provides more meaningful evidence than any mock-based test could.

Compatibility Note

If upstream adds a native response_schema=None mode in the future, all patches become no-ops and can be safely removed.


🤖 Generated with Claude Code

Signed-off-by: WhereIs38 CinderellaDoyle@icloud.com
README.md
DESIGN.md
CONTRIBUTING.md

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Feature: multilingual batch scanner with parallel execution and LLM gap-fill

2 participants