Skip to content

Security: missing permission checks and input validation on training & subcontractor APIs #1

@ReBoticsAI

Description

@ReBoticsAI

Summary

Internal audit found several security gaps in the training API and subcontractor service that should be addressed before production use.

Findings

1. Missing permission checks on training data endpoints (Critical)

Five @frappe.whitelist() endpoints in ai_employees/api/training.py lack frappe.only_for("System Manager") checks, meaning any authenticated user can call them:

  • export_training_data (line 11) — exports AI interaction logs as JSONL
  • detect_model_architecture (line 48) — reads model config
  • get_dataset_preview (line 68) — dataset statistics
  • generate_knowledge_training_data (line 85) — bulk knowledge export to disk
  • get_knowledge_stats (line 125) — knowledge generation stats

For comparison, start_training (line 157) and cancel_training (line 171) correctly use frappe.only_for("System Manager").

Fix: Add frappe.only_for("System Manager") as the first line of each function body.

2. Unsanitized Docker container ID in subprocess call (High)

api/training.py:191-197 passes job.docker_container_id directly to subprocess.run(["docker", "stop", ...]) without validation. While shell=False prevents shell injection, a crafted value could still cause argument injection.

Fix: Validate against a strict regex like ^[a-zA-Z0-9][a-zA-Z0-9_.-]+$ before passing to subprocess.

3. Jinja template injection in subcontractor service (Medium)

services/subcontractor_service.py:196 uses Environment(loader=BaseLoader(), autoescape=False) to render CLI arg templates. The prompt variable (user input) is passed into the Jinja render context, creating a server-side template injection (SSTI) vector if the prompt contains Jinja syntax.

Fix: Use jinja2.sandbox.SandboxedEnvironment instead of the base Environment.

4. f-string SQL pattern in setup_dev_workforce (Low)

api/setup_dev_workforce.py:763-832 uses frappe.db.sql(f"DELETE FROM \tab{dt}`")— thedt` values are currently hardcoded string literals so this isn't exploitable today, but the pattern is dangerous if future changes source values from user input.

Fix: Add a defensive comment and/or validate dt against frappe.db.exists("DocType", dt) before use.

Priority

Items 1 and 2 should be fixed before any multi-user or production deployment. Items 3 and 4 are lower priority but worth addressing.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions