Summary
Internal audit found several security gaps in the training API and subcontractor service that should be addressed before production use.
Findings
1. Missing permission checks on training data endpoints (Critical)
Five @frappe.whitelist() endpoints in ai_employees/api/training.py lack frappe.only_for("System Manager") checks, meaning any authenticated user can call them:
export_training_data (line 11) — exports AI interaction logs as JSONL
detect_model_architecture (line 48) — reads model config
get_dataset_preview (line 68) — dataset statistics
generate_knowledge_training_data (line 85) — bulk knowledge export to disk
get_knowledge_stats (line 125) — knowledge generation stats
For comparison, start_training (line 157) and cancel_training (line 171) correctly use frappe.only_for("System Manager").
Fix: Add frappe.only_for("System Manager") as the first line of each function body.
2. Unsanitized Docker container ID in subprocess call (High)
api/training.py:191-197 passes job.docker_container_id directly to subprocess.run(["docker", "stop", ...]) without validation. While shell=False prevents shell injection, a crafted value could still cause argument injection.
Fix: Validate against a strict regex like ^[a-zA-Z0-9][a-zA-Z0-9_.-]+$ before passing to subprocess.
3. Jinja template injection in subcontractor service (Medium)
services/subcontractor_service.py:196 uses Environment(loader=BaseLoader(), autoescape=False) to render CLI arg templates. The prompt variable (user input) is passed into the Jinja render context, creating a server-side template injection (SSTI) vector if the prompt contains Jinja syntax.
Fix: Use jinja2.sandbox.SandboxedEnvironment instead of the base Environment.
4. f-string SQL pattern in setup_dev_workforce (Low)
api/setup_dev_workforce.py:763-832 uses frappe.db.sql(f"DELETE FROM \tab{dt}`")— thedt` values are currently hardcoded string literals so this isn't exploitable today, but the pattern is dangerous if future changes source values from user input.
Fix: Add a defensive comment and/or validate dt against frappe.db.exists("DocType", dt) before use.
Priority
Items 1 and 2 should be fixed before any multi-user or production deployment. Items 3 and 4 are lower priority but worth addressing.
Summary
Internal audit found several security gaps in the training API and subcontractor service that should be addressed before production use.
Findings
1. Missing permission checks on training data endpoints (Critical)
Five
@frappe.whitelist()endpoints inai_employees/api/training.pylackfrappe.only_for("System Manager")checks, meaning any authenticated user can call them:export_training_data(line 11) — exports AI interaction logs as JSONLdetect_model_architecture(line 48) — reads model configget_dataset_preview(line 68) — dataset statisticsgenerate_knowledge_training_data(line 85) — bulk knowledge export to diskget_knowledge_stats(line 125) — knowledge generation statsFor comparison,
start_training(line 157) andcancel_training(line 171) correctly usefrappe.only_for("System Manager").Fix: Add
frappe.only_for("System Manager")as the first line of each function body.2. Unsanitized Docker container ID in subprocess call (High)
api/training.py:191-197passesjob.docker_container_iddirectly tosubprocess.run(["docker", "stop", ...])without validation. Whileshell=Falseprevents shell injection, a crafted value could still cause argument injection.Fix: Validate against a strict regex like
^[a-zA-Z0-9][a-zA-Z0-9_.-]+$before passing to subprocess.3. Jinja template injection in subcontractor service (Medium)
services/subcontractor_service.py:196usesEnvironment(loader=BaseLoader(), autoescape=False)to render CLI arg templates. Thepromptvariable (user input) is passed into the Jinja render context, creating a server-side template injection (SSTI) vector if the prompt contains Jinja syntax.Fix: Use
jinja2.sandbox.SandboxedEnvironmentinstead of the baseEnvironment.4. f-string SQL pattern in setup_dev_workforce (Low)
api/setup_dev_workforce.py:763-832usesfrappe.db.sql(f"DELETE FROM \tab{dt}`")— thedt` values are currently hardcoded string literals so this isn't exploitable today, but the pattern is dangerous if future changes source values from user input.Fix: Add a defensive comment and/or validate
dtagainstfrappe.db.exists("DocType", dt)before use.Priority
Items 1 and 2 should be fixed before any multi-user or production deployment. Items 3 and 4 are lower priority but worth addressing.