140 changes: 140 additions & 0 deletions .cursor/plugins/firewatch-dev/skills/fw-debug-jira-auth/SKILL.md
@@ -0,0 +1,140 @@
---
name: fw-debug-jira-auth
description: Step-by-step diagnostic workflow for Jira authentication and token issues. Use when firewatch fails with JIRAError on connection, returns 401/403 from Jira, or when tokens need renewal or rotation.
---

# Debug Jira Auth

## Trigger

Firewatch fails during Jira authentication or token-dependent operations. Symptoms include `JIRAError` during `JIRA()` initialization in `jira_base.py`, 401 Unauthorized or 403 Forbidden responses from Jira REST API calls, or CI job failures at the `jira-config-gen` or `report` step with authentication-related errors.

## Diagnostic Workflow

### 1. Verify Config File

The `Jira` class reads credentials from a JSON file specified via `--jira-config-path`. Validate the file structure:

**Required fields**:

| Field | Type | Purpose |
|-------|------|---------|
| `url` | string | Jira server URL (e.g., `https://issues.redhat.com`) |
| `token` | string | API token or PAT |

**Optional fields**:

| Field | Type | Purpose |
|-------|------|---------|
| `email` | string | Presence selects Jira Cloud Basic Auth mode |
| `proxies` | object | HTTP/HTTPS proxy URLs for stage environments |

**Quick checks**:

- Confirm the file is valid JSON (`python -c "import json; json.load(open('<path>'))"`)
- Confirm `url` is a reachable Jira instance
- Confirm `token` is non-empty and does not contain leading/trailing whitespace or newlines
- If the URL contains `stage`, confirm `proxies` is set (the `jira-config-gen` template adds this automatically, but manually-created configs may omit it)
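
The quick checks above can be scripted. The following is a minimal sketch, not part of firewatch: `check_jira_config` is a hypothetical helper name, and it only inspects the file locally, so it does not test whether `url` is reachable.

```python
import json
from pathlib import Path


def check_jira_config(path: str) -> list[str]:
    """Return a list of problems found in a Jira config file (hypothetical helper)."""
    try:
        config = json.loads(Path(path).read_text())
    except (OSError, json.JSONDecodeError) as exc:
        return [f"cannot read config as JSON: {exc}"]
    if not isinstance(config, dict):
        return ["top-level JSON value must be an object"]

    problems = []
    # Both required fields must be non-empty strings without stray whitespace.
    for field in ("url", "token"):
        value = config.get(field)
        if not isinstance(value, str) or not value:
            problems.append(f"missing or empty required field: {field}")
        elif value != value.strip():
            problems.append(f"{field} has leading/trailing whitespace or newlines")

    # Stage instances are expected to carry a proxies entry.
    url = config.get("url")
    if isinstance(url, str) and "stage" in url and "proxies" not in config:
        problems.append("stage URL without proxies")
    return problems
```

An empty list means the file passed every local check; any other result names the field to fix before moving on to step 2.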

### 2. Identify Auth Mode and Validate Token

The auth mode is determined by the presence of the `email` field in the config file:

| Mode | Config | Auth Method | Token Source |
|------|--------|-------------|--------------|
| Jira Cloud | `email` present | `basic_auth = (email, token)` | Atlassian API token from id.atlassian.com |
| Jira Server/DC | `email` absent | `token_auth = token` | Personal Access Token from Jira user profile |
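
As an illustration of the table above, the mode selection can be sketched as a function that maps a config dict to keyword arguments accepted by `jira.JIRA()` (`server`, `basic_auth`, `token_auth`, `proxies`). This is an assumption about how the selection might look, not a copy of `jira_base.py`; `jira_auth_kwargs` is a hypothetical name, and the sketch treats an empty `email` value as absent.

```python
def jira_auth_kwargs(config: dict) -> dict:
    """Build jira.JIRA() keyword arguments from a config dict (hypothetical helper)."""
    token = config["token"].strip()
    kwargs = {"server": config["url"]}

    email = config.get("email")
    if email:
        # Jira Cloud: Basic Auth with the account email and an Atlassian API token.
        kwargs["basic_auth"] = (email, token)
    else:
        # Jira Server/DC: the Personal Access Token is sent as a Bearer token.
        kwargs["token_auth"] = token

    if "proxies" in config:
        kwargs["proxies"] = config["proxies"]
    return kwargs
```

Printing the keys of the result (never the values) is a quick way to confirm which mode a given config file selects.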

**Validation via curl**:

Jira Cloud:

```bash
curl -s -o /dev/null -w "%{http_code}" \
-u "<email>:<token>" \
"https://<instance>.atlassian.net/rest/api/3/myself"
```

Jira Server/DC:

```bash
curl -s -o /dev/null -w "%{http_code}" \
-H "Authorization: Bearer <token>" \
"https://<server>/rest/api/2/myself"
```

A `200` confirms the token is valid. A `401` means the token is expired, revoked, or incorrect. A `403` means the token is valid but lacks permissions.
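
The two curl commands differ only in the `Authorization` header they send. A minimal sketch of that header construction and of the status-code interpretation above follows; `auth_header` and `diagnose_status` are hypothetical helper names, and no request is made.

```python
import base64


def auth_header(config: dict) -> str:
    """Return the Authorization header value implied by a config dict."""
    token = config["token"].strip()
    email = config.get("email")
    if email:
        # Equivalent to curl -u "<email>:<token>".
        raw = f"{email}:{token}".encode()
        return "Basic " + base64.b64encode(raw).decode()
    # Equivalent to curl -H "Authorization: Bearer <token>".
    return f"Bearer {token}"


def diagnose_status(status_code: int) -> str:
    """Translate the HTTP status of the /myself call into a diagnosis."""
    if status_code == 200:
        return "token is valid"
    if status_code == 401:
        return "token is expired, revoked, or incorrect"
    if status_code == 403:
        return "token is valid but lacks permissions"
    return f"unexpected status {status_code}; check the URL and network path"
```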

**Common mistakes**:

- Using a Cloud API token without the `email` field (the token is sent as a Bearer PAT and Jira returns 401)
- Using a Server/DC PAT with the `email` field (the token is sent as Basic Auth and Jira returns 401)
- Copying the token with trailing newlines from `cat` output

### 3. Check CI Secret Mounting

In OpenShift CI (Prow), the Jira token is provisioned as a Kubernetes secret mounted into the pod.

| Variable | Default | Purpose |
|----------|---------|---------|
| `FIREWATCH_JIRA_API_TOKEN_PATH` | `/tmp/secrets/jira/access_token` | Path to the mounted token file |

The CI step script typically runs:

```bash
echo "${JIRA_TOKEN}" > /tmp/token
firewatch jira-config-gen --token-path /tmp/token --server-url "${JIRA_SERVER_URL}"
```

**Quick checks**:

- Verify the token file exists at the expected path and is non-empty
- Confirm the file contains the raw token string (not JSON, not base64-encoded)
- Check for trailing newlines: `wc -l < /tmp/secrets/jira/access_token` should print `0` (no newline) or `1` (a single trailing newline, which `JiraConfig.token()` strips via `.strip()`); a larger count means the file holds more than the raw token
- Verify the K8s secret is correctly bound in the pod spec for the CI step
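
The file-level checks above can be sketched in Python for use inside a debug pod. `check_token_file` is a hypothetical helper, not a firewatch function; it cannot reliably detect base64-encoded content, so that check still needs a manual look.

```python
from pathlib import Path

# Default mount path taken from the table above.
DEFAULT_TOKEN_PATH = "/tmp/secrets/jira/access_token"


def check_token_file(path: str = DEFAULT_TOKEN_PATH) -> list[str]:
    """Return a list of problems with a mounted token file (hypothetical helper)."""
    token_file = Path(path)
    if not token_file.is_file():
        return [f"token file not found: {path}"]

    # Surrounding whitespace is tolerated, mirroring the .strip() noted above.
    token = token_file.read_text().strip()
    problems = []
    if not token:
        problems.append("token file is empty")
    if any(ch.isspace() for ch in token):
        problems.append("token contains internal whitespace or newlines")
    if token.startswith("{"):
        problems.append("token file looks like JSON, expected the raw token")
    return problems
```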

### 4. Token Renewal Procedures

**Jira Cloud (Atlassian API tokens)**:

1. Log in to <https://id.atlassian.com/manage-profile/security/api-tokens>
2. Revoke the expired token if still listed
3. Create a new API token with a descriptive label
4. Note: Atlassian org admins can enforce token expiration policies; check with your org admin if tokens expire unexpectedly soon

**Jira Server/Data Center (PATs)**:

1. Log in to the Jira instance as the service account user (e.g., `firewatch-tool`)
2. Navigate to Profile > Personal Access Tokens
3. Create a new token, setting an appropriate expiration date
4. Note: Server admins can revoke PATs at any time; check with your Jira admin if a token stops working before its expiration date

### 5. CI Secret Rotation

After generating a new token, update the CI secret. No firewatch code changes are needed; the token file path remains the same.

1. Generate the new token in Jira (see step 4)
2. Update the Kubernetes secret in the CI cluster that mounts to `FIREWATCH_JIRA_API_TOKEN_PATH`
3. Trigger a new CI job run to verify the updated token works
4. The `jira-config-gen` command reads the token from the file path at runtime, so the rendered config will pick up the new value automatically

### 6. Common Failure Modes Reference

| Symptom | Likely Cause | Where to Look |
|---------|-------------|---------------|
| `JIRAError` during `Jira.__init__` | Expired or invalid token | Token validation (step 2) |
| `401 Unauthorized` on any Jira call | Token expired, revoked, or wrong auth mode | Auth mode (step 2), renewal (step 4) |
| `403 Forbidden` on specific operations | Token valid but user lacks project permissions | README "Jira User Permissions" section |
| `403 Forbidden` on all operations | Token valid but user not added to the project | Add user to the Jira project under the `Developer` role |
| Connection timeout with stage URL | Missing `proxies` in config | Config file (step 1); ensure `proxies` includes `http://squid.corp.redhat.com:3128` |
| `jira-config-gen` succeeds but operations fail | Wrong auth mode for server type (Cloud token used as PAT or vice versa) | Auth mode (step 2) |
| Empty or missing token file in CI | K8s secret not mounted or wrong path | CI secret mounting (step 3) |
| `click.Abort()` from `jira-config-gen` | Token file unreadable at `--token-path` | File permissions, path correctness |
| Retry exhaustion (3 retries then exception) | `@ignore_exceptions(retry=3)` does not distinguish auth errors from transient failures | Token validation (step 2); the codebase has no special 401 handling |
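
To illustrate the last row: a retry wrapper that does not separate auth errors from transient ones spends every attempt on a failure that retrying cannot fix. The sketch below shows one way such a distinction could look. It is not the codebase's `@ignore_exceptions` decorator; `AuthError` and `retry_unless_auth` are hypothetical names, and real code would more likely inspect `JIRAError.status_code` for 401 or 403.

```python
import functools
import time


class AuthError(Exception):
    """Hypothetical stand-in for a Jira error with status 401 or 403."""


def retry_unless_auth(retries: int = 3, delay: float = 0.0):
    """Retry transient failures, but re-raise auth errors on the first attempt."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            last_exc = None
            for _ in range(max(1, retries)):
                try:
                    return func(*args, **kwargs)
                except AuthError:
                    # Bad credentials will not recover on retry; fail fast.
                    raise
                except Exception as exc:
                    last_exc = exc
                    time.sleep(delay)
            raise last_exc

        return wrapper

    return decorator
```

With a wrapper like this, an expired token surfaces after one call instead of after three identical failures, which shortens the path to step 2.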

## Key Source Files

- `src/objects/jira_base.py`: `Jira` class, authentication setup in `__init__`, `_jira_request` for direct REST calls
- `src/jira_config_gen/jira_config_gen.py`: `JiraConfig` class, token file reading, Jinja2 template rendering
- `src/commands/jira_config_gen.py`: CLI options for `jira-config-gen` command
- `src/templates/jira.config.j2`: Jinja2 template with auto-proxy for stage URLs
1 change: 1 addition & 0 deletions .pre-commit-config.yaml
@@ -18,6 +18,7 @@ repos:
    hooks:
      - id: mypy
        exclude: ^(tests/|scripts/)
        additional_dependencies: [types-requests]
  - repo: https://github.com/PyCQA/flake8
    rev: "7.3.0"
    hooks:
195 changes: 195 additions & 0 deletions check-token-expiry.py
@@ -0,0 +1,195 @@
#!/usr/bin/env python3

from __future__ import annotations

import argparse
import os
import sys
from datetime import date, datetime, timezone
from typing import Any

import hvac
import requests

from src.objects.slack_base import SlackClient

DEFAULT_VAULT_ADDR = "https://vault.ci.openshift.org"
DEFAULT_VAULT_KV_PATH = "kv/selfservice/firewatch-tool/jira-credentials"
NOTIFY_THRESHOLDS_DAYS = (30, 14, 7, 3, 1)


def eprint(*args: Any, **kwargs: Any) -> None:
    print(*args, file=sys.stderr, **kwargs)


def split_kv_mount_and_path(full_path: str) -> tuple[str, str]:
    """Split "mount/rest/of/path" into the KV mount point and the secret path."""
    p = full_path.strip().strip("/")
    if "/" not in p:
        raise ValueError(f"invalid KV path (expected mount/path): {full_path!r}")
    mount, rest = p.split("/", 1)
    if not mount or not rest:
        raise ValueError(f"invalid KV path: {full_path!r}")
    return mount, rest


def read_secret_from_vault(
    vault_addr: str,
    vault_token: str,
    kv_path: str,
) -> dict[str, Any]:
    """Read the secret's key/value data from a Vault KV v2 mount."""
    mount, path = split_kv_mount_and_path(kv_path)
    client = hvac.Client(url=vault_addr, token=vault_token)
    if not client.is_authenticated():
        raise RuntimeError("Vault authentication failed")
    resp = client.secrets.kv.v2.read_secret_version(path=path, mount_point=mount)
    # KV v2 nests the payload under data.data.
    data = resp.get("data", {}).get("data")
    if not isinstance(data, dict):
        raise RuntimeError("Vault response missing secret data")
    return data


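def parse_expiry_date(expires_at: str) -> date:
    """Parse an ISO 8601 timestamp and return its calendar date (UTC if aware)."""
    s = expires_at.strip()
    if not s:
        raise ValueError("expires_at is empty")
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11.
    if s.endswith("Z"):
        s = s[:-1] + "+00:00"
    try:
        dt = datetime.fromisoformat(s)
    except ValueError as exc:
        raise ValueError(f"invalid expires_at ISO 8601: {expires_at!r}") from exc
    # Normalize timezone-aware values to UTC; naive values are used as-is.
    if dt.tzinfo is not None:
        dt = dt.astimezone(timezone.utc)
    return dt.date()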


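def days_until_expiry(expiry: date, today: date | None = None) -> int:
    # parse_expiry_date() normalizes to UTC, so compare against today's UTC date
    # rather than the local date to avoid an off-by-one near midnight.
    ref = today if today is not None else datetime.now(timezone.utc).date()
    return (expiry - ref).days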


def should_notify_approaching(days_left: int) -> bool:
    # Notifies on every day inside the window, not only on the listed thresholds.
    return 1 <= days_left <= max(NOTIFY_THRESHOLDS_DAYS)


def build_approaching_message(
    kv_path: str,
    days_left: int,
    expiry: date,
) -> str:
    return (
        f"Firewatch Jira API token is approaching expiration.\n"
        f"KV path: {kv_path}\n"
        f"Days remaining: {days_left}\n"
        f"Expiry date: {expiry.isoformat()}\n"
        f"Follow the token rotation runbook to renew before expiry."
    )


def build_expired_message(kv_path: str, expiry: date, days_overdue: int) -> str:
    return (
        f"URGENT: Firewatch Jira API token has expired.\n"
        f"KV path: {kv_path}\n"
        f"Expiry date: {expiry.isoformat()}\n"
        f"Expired {days_overdue} day(s) ago.\n"
        f"Rotate immediately using the token rotation runbook."
    )


def build_expires_today_message(kv_path: str, expiry: date) -> str:
    return (
        f"URGENT: Firewatch Jira API token expires today.\n"
        f"KV path: {kv_path}\n"
        f"Expiry date: {expiry.isoformat()}\n"
        f"Rotate immediately using the token rotation runbook."
    )


def parse_args(argv: list[str] | None = None) -> argparse.Namespace:
    p = argparse.ArgumentParser(description="Check Firewatch Jira token expiry and alert Slack.")
    p.add_argument(
        "--vault-addr",
        default=os.environ.get("VAULT_ADDR", DEFAULT_VAULT_ADDR),
        help=f"Vault address (default: env VAULT_ADDR or {DEFAULT_VAULT_ADDR})",
    )
    p.add_argument(
        "--vault-token",
        default=os.environ.get("VAULT_TOKEN"),
        help="Vault token (default: env VAULT_TOKEN)",
    )
    p.add_argument(
        "--vault-kv-path",
        default=os.environ.get("VAULT_KV_PATH", DEFAULT_VAULT_KV_PATH),
        help="KV secret path mount/rest (default: env VAULT_KV_PATH or built-in default)",
    )
    p.add_argument(
        "--slack-webhook-url",
        default=os.environ.get("SLACK_WEBHOOK_URL"),
        help="Slack incoming webhook URL (default: env SLACK_WEBHOOK_URL)",
    )
    return p.parse_args(argv)


def main(argv: list[str] | None = None) -> int:
    """Return 0 if no alert or only an approaching-expiry alert was sent,
    1 on errors, and 2 if the token expires today or has already expired."""
    args = parse_args(argv)
    if not args.vault_token:
        eprint("error: VAULT_TOKEN or --vault-token is required")
        return 1
    if not args.slack_webhook_url:
        eprint("error: SLACK_WEBHOOK_URL or --slack-webhook-url is required")
        return 1

    try:
        secret = read_secret_from_vault(
            args.vault_addr,
            args.vault_token,
            args.vault_kv_path,
        )
    except Exception as exc:
        eprint(f"error: Vault read failed: {exc}")
        return 1

    raw_expires = secret.get("expires_at")
    if raw_expires is None:
        eprint("error: secret missing expires_at")
        return 1
    if not isinstance(raw_expires, str):
        eprint("error: expires_at must be a string")
        return 1

    try:
        expiry = parse_expiry_date(raw_expires)
    except ValueError as exc:
        eprint(f"error: {exc}")
        return 1

    days_left = days_until_expiry(expiry)
    eprint(f"expires_at={expiry.isoformat()} days_remaining={days_left}")

    try:
        if days_left < 0:
            msg = build_expired_message(args.vault_kv_path, expiry, abs(days_left))
            SlackClient.post_webhook(args.slack_webhook_url, msg)
            eprint("sent Slack: expired token alert")
            return 2
        if days_left == 0:
            msg = build_expires_today_message(args.vault_kv_path, expiry)
            SlackClient.post_webhook(args.slack_webhook_url, msg)
            eprint("sent Slack: expires today alert")
            return 2
        if should_notify_approaching(days_left):
            msg = build_approaching_message(args.vault_kv_path, days_left, expiry)
            SlackClient.post_webhook(args.slack_webhook_url, msg)
            eprint("sent Slack: approaching expiry")
        else:
            eprint("no notification (outside alert window)")
    except requests.RequestException as exc:
        eprint(f"error: Slack webhook request failed: {exc}")
        return 1

    return 0


if __name__ == "__main__":
    raise SystemExit(main())