🪄 SAIST - Static AI-powered Scanning Tool

Scan anything with ✨ AI ✨ — spot vulnerabilities fast.


🚀 About

SAIST (Static AI-powered Scanning Tool) is an open-source project that scans codebases for vulnerabilities using AI.
It supports multiple LLMs, and can scan full codebases, diffs between commits, or even GitHub PRs automatically.

Bonus: It can even generate DevSecOps poems if you're feeling whimsical. 🎤

Lots of vendors are rushing to charge a crazy amount of money to simply throw your code through ChatGPT.

Well, now you can cut out the middleman and scan your code yourself using SAIST (and choose whichever LLM you like).

We support Ollama for local / offline code scanning.


✨ Features

  • AI-powered vulnerability scanning for entire codebases
  • Diff scanning: Git commits, branches, or PRs
  • Multi-LLM support: OpenAI, Anthropic, Azure AI Foundry, Bedrock, DeepSeek, Gemini, Ollama
  • Filesystem, Git, GitHub PR scanning modes
  • Pattern-based file inclusion/exclusion using saist.include and saist.ignore
  • Project-specific analysis skills loaded from Markdown files to teach SAIST app routing, authorization, framework conventions, and other local security context
  • LLM-generated analysis skills for bootstrapping those files in a separate run
  • Interactive chat with your findings
  • Web server UI to view results
  • CSV export of findings
  • PDF report: Generate PDF reports of SAIST findings
  • CI/CD pipeline friendly (--ci exits with code 1 when vulnerabilities are found)

🛠️ Installation

Run direct

git clone https://github.com/punk-security/saist.git
cd saist
pip install -r requirements.txt

Run via docker

docker pull punksecurity/saist

📦 Usage

saist/main.py --llm <llm_provider> [options] {filesystem | git | github | poem}
# or via docker
docker run punksecurity/saist --llm <llm_provider> [options] {filesystem | git | github | poem}

Set your LLM API key with an environment variable:

export SAIST_LLM_API_KEY=your-api-key
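For scripted runs, the key can be handed to a child process instead of being exported in the shell. The sketch below only assembles the argument list and environment described above; `build_saist_command` and `saist_env` are hypothetical helper names, not part of SAIST, and it assumes SAIST is started as `saist/main.py` from a clone of the repository.

```python
import os
import sys


def build_saist_command(llm: str, mode: str, target: str, *options: str) -> list[str]:
    """Assemble arguments in the order of the usage line: --llm <provider> [options] <mode> <target>."""
    return [sys.executable, "saist/main.py", "--llm", llm, *options, mode, target]


def saist_env(api_key: str) -> dict[str, str]:
    """Copy the current environment and add the API key variable named in this README."""
    env = dict(os.environ)
    env["SAIST_LLM_API_KEY"] = api_key
    return env
```

Passing the result to `subprocess.run(cmd, env=saist_env("your-api-key"))` from the repository root would then start the scan without changing the parent shell's environment.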

⚡ Examples

| Task | Command |
|---|---|
| Get a DevSecOps poem | saist/main.py --llm openai poem |
| Scan a local folder | saist/main.py --llm deepseek filesystem /path/to/code |
| Scan a local folder file-by-file | saist/main.py --llm deepseek --deep filesystem /path/to/code |
| Scan a local folder with ollama from within docker | docker run --network=host -v <folder_path>:/vulnerableapp -v $PWD/reporting:/app/reporting punksecurity/saist --llm ollama --llm-model gemma3:4b filesystem /vulnerableapp |
| Scan a local Git repo | saist/main.py --llm openai git /path/to/repo |
| Scan a local Git repo (branch diff) | saist/main.py --llm openai git /path/to/repo --ref-for-compare main --ref-to-compare feature-branch |
| Scan a GitHub PR (and update the PR) | saist/main.py --llm anthropic github yourorg/yourrepo 1234 --github-token your-token |
| Launch web server to view findings | saist/main.py --llm deepseek --web filesystem /path/to/code |
| Interactive shell after scanning | saist/main.py --llm ollama --interactive filesystem /path/to/code |
| Export findings as CSV | saist/main.py --llm openai --csv filesystem /path/to/code |
| Generate analysis skills | saist/main.py --llm openai --generate-skills filesystem /path/to/code |
| Scan with docker and export findings as PDF report | docker run -v <folder_path>:/vulnerableapp -v $PWD/reporting:/app/reporting punksecurity/saist --llm openai --pdf filesystem /vulnerableapp |
| Scan with docker and export findings as PDF report with a project title | docker run -v <folder_path>:/vulnerableapp -v $PWD/reporting:/app/reporting punksecurity/saist --llm openai --pdf --project-name "Project Name" filesystem /vulnerableapp |
| Scan with docker and retain cache for future runs | docker run -v <folder_path>:/vulnerableapp -v $PWD/SAISTCache:/app/SAISTCache punksecurity/saist --llm openai filesystem /vulnerableapp |
| Change caching folder | saist/main.py --llm openai --cache-folder /path/to/cache filesystem /path/to/code |
| Disable findings cache | saist/main.py --llm openai --disable-caching filesystem /path/to/code |

🗂️ File Filtering

saist respects file include/exclude rules via two optional files in the root of the project:

| File | Purpose |
|---|---|
| saist.include | List of .gitignore-style patterns to include |
| saist.ignore | List of .gitignore-style patterns to ignore |

  • Patterns follow .gitignore syntax.
  • If saist.include does not exist, default extensions are used (e.g., .py, .js, .java, .go).
  • Examples:
    • **/*.py includes all Python files
    • src/**/*.ts includes TypeScript files inside src
    • build/ will ignore the entire build folder
    • *.log will ignore all log files

You can also provide include/exclude patterns using the command-line arguments --include and --exclude.

  • Patterns provided via command-line arguments are appended to any patterns loaded from the rule files.
  • Examples:
    • --include '**/*.py' --include '**/*.ts' includes all Python and TypeScript files
    • --include '**' --exclude '*.log' includes all files except those ending in .log
    • --exclude 'node_modules/' excludes the entire node_modules directory
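The append behaviour described above can be pictured with a small sketch: patterns are read from the rule file if it exists, and command-line patterns are added after them. `load_patterns` is a hypothetical helper for illustration, and skipping blank lines and `#` comments is an assumption borrowed from .gitignore rather than confirmed SAIST behaviour.

```python
from pathlib import Path


def load_patterns(project_root: str, rule_file: str, cli_patterns: list[str]) -> list[str]:
    """Return patterns from the rule file (if present) followed by command-line patterns."""
    patterns: list[str] = []
    path = Path(project_root) / rule_file
    if path.is_file():
        for line in path.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            # Assumption: blank lines and '#' comments are skipped, as in .gitignore.
            if line and not line.startswith("#"):
                patterns.append(line)
    # Command-line patterns are appended after the patterns loaded from the file.
    return patterns + list(cli_patterns)
```

With a saist.ignore containing tests/ and docs/, calling `load_patterns(root, "saist.ignore", ["*.log"])` would yield those two patterns followed by `*.log`.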

📝 Example

saist.include

**/*.py
**/*.ts
src/**/*.js

saist.ignore

tests/
docs/

This setup will:

  • Only scan .py and .ts files, plus .js files under src/
  • Ignore anything under tests/ and docs/
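To see how the two files in the example above interact, here is a rough sketch that translates a small subset of .gitignore syntax into regular expressions. It is not SAIST's matcher: `gitignore_to_regex` and `should_scan` are hypothetical names, negation (`!`) and character classes are not handled, and it assumes that an ignore pattern always wins over an include pattern.

```python
import re


def gitignore_to_regex(pattern: str) -> re.Pattern:
    """Translate a small subset of .gitignore syntax (*, ?, **, trailing /) to a regex."""
    dir_only = pattern.endswith("/")
    core = pattern.rstrip("/")
    # In .gitignore, a pattern containing a slash is matched relative to the root.
    anchored = "/" in core
    core = core.lstrip("/")
    parts = []
    i = 0
    while i < len(core):
        if core.startswith("**/", i):
            parts.append("(?:.*/)?")  # zero or more leading directories
            i += 3
        elif core.startswith("**", i):
            parts.append(".*")
            i += 2
        elif core[i] == "*":
            parts.append("[^/]*")  # anything except a path separator
            i += 1
        elif core[i] == "?":
            parts.append("[^/]")
            i += 1
        else:
            parts.append(re.escape(core[i]))
            i += 1
    prefix = "^" if anchored else "^(?:.*/)?"
    # A trailing slash matches directories only, so a path must continue below it.
    suffix = "/.*$" if dir_only else "(?:/.*)?$"
    return re.compile(prefix + "".join(parts) + suffix)


def should_scan(path: str, include: list[str], ignore: list[str]) -> bool:
    """Assumption: scan a file if it matches an include pattern and no ignore pattern."""
    included = any(gitignore_to_regex(p).match(path) for p in include)
    ignored = any(gitignore_to_regex(p).match(path) for p in ignore)
    return included and not ignored


include = ["**/*.py", "**/*.ts", "src/**/*.js"]
ignore = ["tests/", "docs/"]
for candidate in ["app/main.py", "src/ui/widget.js", "lib/widget.js", "tests/test_main.py"]:
    print(candidate, should_scan(candidate, include, ignore))
```

Under these assumptions, lib/widget.js is skipped because it is outside src/, and tests/test_main.py is skipped even though it matches `**/*.py`. Running SAIST with --dry-run is the way to check what the real rules select.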

🧠 Analysis Skills

SAIST can load project-specific analysis skill files from .saist/skills/*.md. These files are added to the security review prompt so future scans understand application-specific details such as routing, authentication, authorization, framework conventions, data access, validation boundaries, dependencies, configuration, and security-sensitive workflows.

Generate an initial set of skill files as a separate run:

saist/main.py --llm openai --generate-skills filesystem /path/to/code

Then review or edit the generated Markdown files and run SAIST normally. Skill files are loaded automatically on future scans:

saist/main.py --llm openai filesystem /path/to/code
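As an illustration of what loading skill files with a size limit could look like, the sketch below concatenates the Markdown files in a skills folder until a byte budget is used up. `load_skills` and the default budget of 20,000 bytes are hypothetical; how SAIST actually orders, formats, or truncates skill content is not specified here.

```python
from pathlib import Path


def load_skills(project_root: str, skills_dir: str = ".saist/skills", max_bytes: int = 20_000) -> str:
    """Concatenate *.md skill files in name order, stopping at a total byte budget."""
    folder = Path(project_root) / skills_dir
    if not folder.is_dir():
        return ""
    chunks: list[str] = []
    remaining = max_bytes
    for md_file in sorted(folder.glob("*.md")):
        if remaining <= 0:
            break
        data = md_file.read_bytes()[:remaining]
        remaining -= len(data)
        # Drop a multi-byte character that the byte limit may have cut in half.
        text = data.decode("utf-8", errors="ignore")
        chunks.append(f"## {md_file.name}\n{text}")
    return "\n\n".join(chunks)
```

The budget here counts file content only, not the per-file headings the sketch adds.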

Useful options:

| Option | Description |
|---|---|
| --skills-path | Folder containing skill Markdown files. Defaults to .saist/skills under the scanned project. |
| --generate-skills | Ask the configured LLM to generate skill files and then exit. |
| --overwrite-skills | Replace existing skill files during generation. Without this, existing files are preserved. |
| --disable-skills | Do not load skill files during analysis. |
| --skills-max-bytes | Limit total skill guidance added to analysis prompts. |
| --skills-sample-files / --skills-sample-bytes | Control how much project context is sampled when generating skills. |

When skills are loaded, SAIST salts its findings cache with the skill content so updated guidance gets a fresh analysis run.
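Salting a cache key means mixing extra input into the hash so that a change in that input produces a different key. A minimal sketch of the idea, assuming a SHA-256 key over the skill text and the file content (the function name and key layout are illustrative, not SAIST's actual cache format):

```python
import hashlib


def cache_key(file_content: bytes, skills_text: str = "") -> str:
    """Hash the file content together with the skill text (the 'salt')."""
    digest = hashlib.sha256()
    digest.update(skills_text.encode("utf-8"))
    digest.update(b"\x00")  # separator so salt and content cannot run together
    digest.update(file_content)
    return digest.hexdigest()
```

With a scheme like this, editing a skill file changes every key, so previously cached findings are no longer hit and the files are analyzed again.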


📄 PDF report generation

saist allows you to generate PDF reports summarizing your findings, making it easier to share insights with your team.

To create a PDF report, use the --pdf flag when running the scan. By default, the report will be saved to reporting/report.pdf. You can customize the filename by using the --pdf-filename option followed by your desired filename.

To add a project name onto the title page of the PDF report, use the --project-name option followed by your desired title.

PDF reports are generated with the built-in ReportLab renderer, so no external document-rendering toolchain is required.

🐋 Example (Docker)

To run saist using Docker and access the generated PDF report, you can mount a volume to ensure that the report is accessible on your host machine. Below is an example command that demonstrates how to do this with the filesystem SCM adapter.

docker run -v $PWD/code:/code -v $PWD/reporting:/app/reporting punksecurity/saist --pdf --llm <llm_provider> [options] filesystem /code

| Volume | Description |
|---|---|
| -v $PWD/code:/code | Mounts the code directory from your host to the /code directory inside the container. This is where your codebase is located for scanning. |
| -v $PWD/reporting:/app/reporting | Mounts the reporting directory from your host to the /app/reporting directory inside the container. This is where the generated PDF report will be saved, making it accessible on your host machine. |

⚙️ Command Options

| Option | Description |
|---|---|
| --llm | Select LLM (anthropic, azure-foundry, bedrock, deepseek, gemini, ollama, openai) |
| --llm-api-key | API key for your LLM |
| --llm-model | (Optional) Specific model (e.g., gpt-4o) |
| --thinking | Pydantic AI thinking effort: minimal, low, medium, high, xhigh, or disabled |
| --openai-base-uri | Base URI for OpenAI-compatible services. Can also be set with SAIST_OPENAI_BASE_URI. |
| --azure-openai-endpoint | Azure AI Foundry or Azure OpenAI endpoint. Can also be set with AZURE_OPENAI_ENDPOINT; /openai/v1/ endpoints use the Responses API without api-version. |
| --azure-openai-api-version | Azure OpenAI API version for non-v1 endpoints. Can also be set with OPENAI_API_VERSION. |
| --interactive | Chat with the LLM after scan |
| --web | Launch a local web server |
| --disable-tools | Disable tool use during file analysis to reduce LLM token usage |
| --deep | For filesystem scans, analyze every file individually. Without this, filesystem scans send a file inventory and let the LLM inspect files with tools, then report file coverage. |
| --iterations | Number of tool-driven filesystem scan passes to run when --deep is not set. Defaults to 1; concurrency is capped by --llm-rate-limit. |
| --skills-path | Folder containing SAIST analysis skill Markdown files |
| --generate-skills | Generate SAIST analysis skill files and exit |
| --overwrite-skills | Replace existing skill files during skill generation |
| --disable-skills | Do not load skill files during analysis |
| --disable-caching | Disable finding caching during file analysis |
| --skip-line-length-check | Skip checking files for a maximum line length |
| --max-line-length | Maximum allowed line length; files with lines longer than this value will be skipped |
| --i, --include | Pattern to explicitly include |
| --e, --exclude | Pattern to explicitly ignore |
| --dry-run | Exit after parsing configuration and collecting files without performing any analysis; useful for validating rules |
| --cache-folder | Change the default cache folder |
| --csv | Output findings to findings.csv |
| --pdf | Output findings to PDF report (report.pdf) |
| --project-name | Set the project name for the PDF report's title page (e.g. "Project name") |
| --ci | Exit with code 1 if vulnerabilities found |
| -v, --verbose | Increase output verbosity |
| Git-specific: | |
| --ref-for-compare / --ref-to-compare | Compare Git refs |
| --commit-for-compare / --commit-to-compare | Compare Git commits |
| GitHub-specific: | |
| --github-token | GitHub token |
| repository / pr | Repo and Pull Request ID |
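Because --ci reports findings through the process exit code, a pipeline step can wrap the scan and fail on any non-zero result. The sketch below is one way to do that from Python; `run_gate` is a hypothetical helper, and it treats every non-zero exit code as a failure since other errors may also exit non-zero.

```python
import subprocess
import sys


def run_gate(command: list[str]) -> bool:
    """Run a scan command and return True only when it exits with code 0."""
    result = subprocess.run(command, check=False)
    if result.returncode != 0:
        print(
            f"scan exited with code {result.returncode}; failing the pipeline step",
            file=sys.stderr,
        )
    return result.returncode == 0
```

A pipeline script could then call `run_gate([sys.executable, "saist/main.py", "--llm", "openai", "--ci", "filesystem", "."])` from the repository root and exit with 1 when it returns False.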

🧩 Architecture

  • Pluggable SCM adapters (filesystem, git, GitHub)
  • Modular LLM connectors
  • Async scanning for performance
  • Fine-grained file selection with patterns
  • Diff parsing for precise code review
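Async scanning with a cap on concurrent LLM requests is commonly built with a semaphore. The following sketch shows that general pattern only; `analyze_all` and `fake_analyze` are hypothetical stand-ins, and SAIST's own scheduler and connectors may be structured differently.

```python
import asyncio


async def analyze_all(paths, analyze, limit: int = 4):
    """Run analyze(path) for every path with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(path):
        async with semaphore:
            return await analyze(path)

    # gather() returns results in the same order as the input paths.
    return await asyncio.gather(*(bounded(p) for p in paths))


async def fake_analyze(path: str) -> str:
    # Stand-in for an LLM request; a real connector would call the provider here.
    await asyncio.sleep(0.01)
    return f"analyzed {path}"


if __name__ == "__main__":
    print(asyncio.run(analyze_all(["a.py", "b.py", "c.py"], fake_analyze, limit=2)))
```

The limit plays the role of a rate cap: files beyond it wait for a slot instead of all requests being sent at once.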

🛣️ Roadmap

  • Ability to influence the prompts
  • Create a GitHub Action
  • Add additional LLM support
  • Add additional SCM sources
  • SaaS platform version (maybe 👀)

🤝 Contributing

Pull requests are welcome!


⭐ Final Note

If you like it — star it ⭐, use it, and share feedback!
AI-assisted code scanning just got a lot more magical. 🪄
