Skip to content

Add embedding-based semantic detection (beyond regex) #2

@manja316

Description

@manja316

Regex patterns catch known attack strings but miss semantic equivalents:

  • "Please disregard your prior directives" (same intent, no pattern match)
  • Paraphrased jailbreaks
  • Novel attack vectors

Consider an optional lightweight embedding similarity check using a small local model (e.g. sentence-transformers). Should remain zero-dep by default — embedding mode as optional extra.

pip install ai-injection-guard[semantic]

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions