Security: missing permission checks and input validation on training & subcontractor APIs

## Summary

Internal audit found several security gaps in the training API and subcontractor service that should be addressed before production use.

## Findings

### 1. Missing permission checks on training data endpoints (Critical)

Five `@frappe.whitelist()` endpoints in `ai_employees/api/training.py` lack `frappe.only_for("System Manager")` checks, meaning any authenticated user can call them:

- `export_training_data` (line 11) — exports AI interaction logs as JSONL
- `detect_model_architecture` (line 48) — reads model config
- `get_dataset_preview` (line 68) — dataset statistics
- `generate_knowledge_training_data` (line 85) — bulk knowledge export to disk
- `get_knowledge_stats` (line 125) — knowledge generation stats

For comparison, `start_training` (line 157) and `cancel_training` (line 171) correctly use `frappe.only_for("System Manager")`.

**Fix:** Add `frappe.only_for("System Manager")` as the first line of each function body.

### 2. Unsanitized Docker container ID in subprocess call (High)

`api/training.py:191-197` passes `job.docker_container_id` directly to `subprocess.run(["docker", "stop", ...])` without validation. While `shell=False` prevents shell injection, a crafted value could still cause argument injection.

**Fix:** Validate against a strict regex like `^[a-zA-Z0-9][a-zA-Z0-9_.-]+$` before passing to subprocess.

### 3. Jinja template injection in subcontractor service (Medium)

`services/subcontractor_service.py:196` uses `Environment(loader=BaseLoader(), autoescape=False)` to render CLI arg templates. The `prompt` variable (user input) is passed into the Jinja render context, creating a server-side template injection (SSTI) vector if the prompt contains Jinja syntax.

**Fix:** Use `jinja2.sandbox.SandboxedEnvironment` instead of the base `Environment`.

### 4. f-string SQL pattern in setup_dev_workforce (Low)

`api/setup_dev_workforce.py:763-832` uses `frappe.db.sql(f"DELETE FROM \`tab{dt}\`")` — the `dt` values are currently hardcoded string literals so this isn't exploitable today, but the pattern is dangerous if future changes source values from user input.

**Fix:** Add a defensive comment and/or validate `dt` against `frappe.db.exists("DocType", dt)` before use.

## Priority

Items 1 and 2 should be fixed before any multi-user or production deployment. Items 3 and 4 are lower priority but worth addressing.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Security: missing permission checks and input validation on training & subcontractor APIs #1

Summary

Findings

1. Missing permission checks on training data endpoints (Critical)

2. Unsanitized Docker container ID in subprocess call (High)

3. Jinja template injection in subcontractor service (Medium)

4. f-string SQL pattern in setup_dev_workforce (Low)

Priority

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Security: missing permission checks and input validation on training & subcontractor APIs #1

Description

Summary

Findings

1. Missing permission checks on training data endpoints (Critical)

2. Unsanitized Docker container ID in subprocess call (High)

3. Jinja template injection in subcontractor service (Medium)

4. f-string SQL pattern in setup_dev_workforce (Low)

Priority

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions