From 961e713cc45b211ffcb4234cdf0026a1c43a877f Mon Sep 17 00:00:00 2001
From: ananthkamath <akamath@endor.ai>
Date: Sat, 25 Apr 2026 01:31:57 +0530
Subject: [PATCH] feat: add export dependencies skill

---
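Reviewer note: the core of the main.py change is a projection of one complete row onto the column list selected by `--report-type`. The sketch below illustrates that idea in isolation. It is not the code in this patch: `render_csv` and the sample row are hypothetical, it writes to an in-memory buffer rather than the output file, and it uses `dict.get` with a blank default where main.py indexes `all_values` directly.

```python
import csv
import io

# Column sets mirror the _COLUMNS mapping added to main.py in this patch.
SCORE_COLUMNS = [
    "overall_score",
    "SCORE_CATEGORY_POPULARITY", "SCORE_CATEGORY_CODE_QUALITY",
    "SCORE_CATEGORY_SECURITY", "SCORE_CATEGORY_ACTIVITY",
]
COLUMNS = {
    "full": ["name", "package_version_uuid", "count", *SCORE_COLUMNS, "licenses"],
    "licenses": ["name", "count", "licenses"],
    "scores": ["name", "package_version_uuid", "count", *SCORE_COLUMNS],
}


def render_csv(report_type, rows):
    """Hypothetical helper: project each row dict onto the report's columns."""
    columns = COLUMNS[report_type]
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(columns)
    for row in rows:
        # Keys absent from the row become empty cells (an assumption of this sketch).
        writer.writerow([row.get(col, "") for col in columns])
    return buf.getvalue()


# Made-up sample row for illustration only; no category scores are supplied.
sample = {
    "name": "example-pkg",
    "package_version_uuid": "uuid-123",
    "count": 3,
    "overall_score": 7.5,
    "licenses": "MIT",
}

print(render_csv("licenses", [sample]))
```

Because the header and every data row are built from the same column list, the two cannot drift apart when a report type is added or changed.
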
 .claude/skills/export-dependencies/SKILL.md |  92 +++++++++++++++
 export_dependencies/README.md               | 120 +++++++++++++++++++-
 export_dependencies/main.py                 |  64 +++++++----
 3 files changed, 253 insertions(+), 23 deletions(-)
 create mode 100644 .claude/skills/export-dependencies/SKILL.md
diff --git a/.claude/skills/export-dependencies/SKILL.md b/.claude/skills/export-dependencies/SKILL.md
new file mode 100644
index 0000000..39fc54f
--- /dev/null
+++ b/.claude/skills/export-dependencies/SKILL.md
@@ -0,0 +1,92 @@
+---
+name: export-dependencies
+description: Export unique dependencies with scorecard metrics and/or license info for an Endor Labs namespace
+argument-hint: "[--namespace my-namespace] [--report-type licenses|scores|full]"
+---
+
+Help the user run the `export_dependencies/main.py` script. Follow these steps:
+
+## 1. Determine report type from user intent
+
+Map the user's natural language request to a `--report-type` value before asking any other questions:
+
+| If the user says… | Use |
+|---|---|
+| "fetch all licenses", "show license info", "what licenses am I using" | `--report-type licenses` |
+| "fetch endor scores", "show dependency scores", "scorecard for my dependencies" | `--report-type scores` |
+| "fetch all dependencies", "full report", "dependencies with licenses and scores", "export dependencies" (generic) | `--report-type full` (default) |
+
+If the intent is **vague or unclear**, ask before proceeding:
+
+> What would you like included in the report?
+> 1. **Licenses only** — dependency names and their license info
+> 2. **Scores only** — dependency names and Endor scorecard scores
+> 3. **Full report** — dependency names, scores, and licenses (default)
+
+## 2. Collect parameters
+
+If $ARGUMENTS contains the needed flags, parse them directly. Otherwise ask the user for:
+
+**Required:**
+- `--namespace` (or `-n`) — the Endor Labs namespace (or `ENDOR_NAMESPACE` env var)
+
+**Auth (one of two options):**
+- Option A — Bearer token: `--token` (or `ENDOR_TOKEN` env var)
+- Option B — API credentials: `--api-key` + `--api-secret` (or `ENDOR_API_CREDENTIALS_KEY` + `ENDOR_API_CREDENTIALS_SECRET` env vars)
+
+**Optional:**
+- `--workers N` — parallel workers for metric lookups (default: 20; increase for speed, decrease if rate-limited)
+- `--debug` — print per-page progress and diagnostic lines during aggregation
+
+## 3. Check environment setup
+
+Before running, make sure the dependencies are installed:
+
+```bash
+cd export_dependencies
+pip install -r requirements.txt
+```
+
+## 4. Run the script
+
+Run from the `export_dependencies/` directory:
+
+```bash
+# Licenses only
+python main.py --namespace <namespace> --token "$ENDOR_TOKEN" --report-type licenses
+
+# Scores only
+python main.py --namespace <namespace> --token "$ENDOR_TOKEN" --report-type scores
+
+# Full report (default — same as omitting --report-type)
+python main.py --namespace <namespace> --token "$ENDOR_TOKEN"
+
+# Using API credentials
+python main.py --namespace <namespace> \
+  --api-key "$ENDOR_API_CREDENTIALS_KEY" \
+  --api-secret "$ENDOR_API_CREDENTIALS_SECRET" \
+  --report-type licenses
+```
+
+> **Note:** This script can take several minutes for large namespaces. Warn the user before starting.
+
+## 5. Report results
+
+After the script finishes:
+- Confirm the output CSV path (printed as the last line by the script).
+- State the number of unique dependencies written and which columns were included.
+- If it fails, show the exact error and help diagnose: auth failure, namespace not found, rate limits (suggest lowering `--workers`), or network errors.
+
+**Columns by report type:**
+
+| Report type | Columns written |
+|---|---|
+| `full` (default) | name, package_version_uuid, count, overall_score, SCORE_CATEGORY_POPULARITY, SCORE_CATEGORY_CODE_QUALITY, SCORE_CATEGORY_SECURITY, SCORE_CATEGORY_ACTIVITY, licenses |
+| `licenses` | name, count, licenses |
+| `scores` | name, package_version_uuid, count, overall_score, SCORE_CATEGORY_POPULARITY, SCORE_CATEGORY_CODE_QUALITY, SCORE_CATEGORY_SECURITY, SCORE_CATEGORY_ACTIVITY |
+
+**Common issues:**
+- Blank metric columns: metrics may not be available for those specific package versions — expected for some entries.
+- Many "metrics query objects: 0": verify the dependency package version UUIDs correspond to OSS metric entries.
+- Intermittent network errors: lower `--workers`; the script has built-in retries with exponential backoff.
+- 401/403 mid-run: the script auto-refreshes the token and retries once when using API credentials.
diff --git a/export_dependencies/README.md b/export_dependencies/README.md
index bdf78f9..89984ce 100644
--- a/export_dependencies/README.md
+++ b/export_dependencies/README.md
@@ -30,7 +30,7 @@ Namespace is required:
 ### Usage
 
 ```bash
-# Using a token
+# Using a token (full report — default)
 python main.py --namespace my-namespace --token "$ENDOR_TOKEN"
 
 # Using API credentials
@@ -45,6 +45,27 @@ python main.py -n my-namespace --token "$ENDOR_TOKEN" --debug
 python main.py -n my-namespace --token "$ENDOR_TOKEN" --workers 40
 ```
 
+### Report types
+
+Use `--report-type` to control which columns are written to the CSV. The default (`full`) preserves existing behavior.
+
+| `--report-type` | Columns written |
+|---|---|
+| `full` *(default)* | name, package_version_uuid, count, overall_score, SCORE_CATEGORY_POPULARITY, SCORE_CATEGORY_CODE_QUALITY, SCORE_CATEGORY_SECURITY, SCORE_CATEGORY_ACTIVITY, licenses |
+| `licenses` | name, count, licenses |
+| `scores` | name, package_version_uuid, count, overall_score, SCORE_CATEGORY_POPULARITY, SCORE_CATEGORY_CODE_QUALITY, SCORE_CATEGORY_SECURITY, SCORE_CATEGORY_ACTIVITY |
+
+```bash
+# Licenses only
+python main.py --namespace my-namespace --token "$ENDOR_TOKEN" --report-type licenses
+
+# Scores only
+python main.py --namespace my-namespace --token "$ENDOR_TOKEN" --report-type scores
+
+# Full report (explicit — same as omitting --report-type)
+python main.py --namespace my-namespace --token "$ENDOR_TOKEN" --report-type full
+```
+
 Progress is printed on a single updating line (overwritten in place). With `--debug`, additional diagnostic lines appear during unique-dependency aggregation pagination. For example:
 ```
 Aggregating unique dependencies.  This make take a few minutes ...
@@ -109,4 +130,101 @@ pypi://urllib3@1.26.20,66d0988469c594feb187c89a,42,6.5,8,5,4,9,BSD-3-Clause:MIT:
 - Ensure the `ENDOR_NAMESPACE` is correct and that your token/credentials are valid.
 - If you encounter intermittent network errors (e.g., temporary DNS failures), try lowering `--workers` to reduce concurrent connections, or simply re-run; the built-in connection pooling and retries already handle many transient issues.
 
+## Claude Code Skill
+
+This script ships with a Claude Code skill (`/export-dependencies`) that lets you run the export interactively without memorizing flags.
+
+### Prerequisites
+
+The skill is available when Claude Code is opened from the `scripts/` directory (where the `.claude/skills/` folder lives). No extra installation is needed.
+
+### How to use
+
+Invoke the skill in Claude Code:
+
+```
+/export-dependencies
+```
+
+Claude will:
+1. Infer the report type from your natural language request, or ask if it's unclear.
+2. Ask for any required parameters you haven't provided (`--namespace`, auth credentials).
+3. Check that dependencies are installed before running.
+4. Warn you that the export can take several minutes for large namespaces, run the script, and confirm the path of the output CSV.
+
+The skill understands natural language — you don't need to remember flags:
+
+| Say… | Runs with |
+|---|---|
+| "fetch all licenses I use in my namespace" | `--report-type licenses` |
+| "fetch Endor scores for all my dependencies" | `--report-type scores` |
+| "fetch all dependencies" or "full report" | `--report-type full` (default) |
+| "fetch all dependencies with licenses and scores" | `--report-type full` |
+
+You can also pass flags directly to skip the prompts:
+
+```
+/export-dependencies --namespace my-namespace --token $ENDOR_TOKEN
+/export-dependencies --namespace my-namespace --token $ENDOR_TOKEN --report-type licenses
+/export-dependencies --namespace my-namespace --token $ENDOR_TOKEN --report-type scores
+/export-dependencies --namespace my-namespace --token $ENDOR_TOKEN --workers 40
+```
+
+### Example sessions
+
+**Licenses only:**
+```
+User: fetch all licenses that I use in my namespace using export dependencies
+
+Claude: I'll run an export with license information only. What's your namespace?
+
+User: acme-corp
+
+Claude: Checking environment…
+  ✓ requests and python-dotenv installed
+  ✓ ENDOR_TOKEN set
+
+Running:
+  python main.py --namespace acme-corp --token "$ENDOR_TOKEN" --report-type licenses
+
+Number of unique dependencies after de-duplication: 3 421 (removed 189 duplicates)
+Fetching dependency metrics in parallel with 20 workers ...
+completed 3421/3421
+
+Script completed.
+Output: generated_reports/unique_dependencies_acme-corp_20260424_091823.csv
+  3 421 unique dependencies written (columns: name, count, licenses).
+```
+
+**Scores only:**
+```
+User: fetch Endor scores for all dependencies in my namespace using export dependencies
+
+Claude: I'll run an export with scorecard scores only. What's your namespace?
+
+User: acme-corp
+
+Running:
+  python main.py --namespace acme-corp --token "$ENDOR_TOKEN" --report-type scores
+
+Script completed.
+Output: generated_reports/unique_dependencies_acme-corp_20260424_092541.csv
+  3 421 unique dependencies written (columns: name, package_version_uuid, count, overall_score, category scores).
+```
+
+**Full report (default):**
+```
+User: fetch all dependencies in my namespace using export dependencies
+
+Running:
+  python main.py --namespace acme-corp --token "$ENDOR_TOKEN"
+
+Script completed.
+Output: generated_reports/unique_dependencies_acme-corp_20260424_093012.csv
+  3 421 unique dependencies written with scorecard scores and license info.
+```
+
+## No Warranty
+
+This software is provided on an "as is" basis, without warranty of any kind. You are solely responsible for determining whether this software is suitable for your use.
 
diff --git a/export_dependencies/main.py b/export_dependencies/main.py
index f7cffa3..3c6fc50 100644
--- a/export_dependencies/main.py
+++ b/export_dependencies/main.py
@@ -500,8 +500,37 @@ def extract_scorecard_from_container(container: Dict[str, Any]) -> None:
     parser.add_argument("--token", default=os.getenv("ENDOR_TOKEN"), help="Bearer token (or set ENDOR_TOKEN)")
     parser.add_argument("--debug", action="store_true", help="Enable debug logging")
     parser.add_argument("--workers", type=int, default=20, help="Number of parallel workers for API calls")
+    parser.add_argument(
+        "--report-type",
+        choices=["full", "licenses", "scores"],
+        default="full",
+        help=(
+            "Controls which columns are written to the CSV. "
+            "'full' (default): all columns — name, package_version_uuid, count, scores, licenses. "
+            "'licenses': name, count, licenses only. "
+            "'scores': name, package_version_uuid, count, overall_score, category scores only."
+        ),
+    )
     args = parser.parse_args()
 
+    _COLUMNS = {
+        "full": [
+            "name", "package_version_uuid", "count",
+            "overall_score",
+            "SCORE_CATEGORY_POPULARITY", "SCORE_CATEGORY_CODE_QUALITY",
+            "SCORE_CATEGORY_SECURITY", "SCORE_CATEGORY_ACTIVITY",
+            "licenses",
+        ],
+        "licenses": ["name", "count", "licenses"],
+        "scores": [
+            "name", "package_version_uuid", "count",
+            "overall_score",
+            "SCORE_CATEGORY_POPULARITY", "SCORE_CATEGORY_CODE_QUALITY",
+            "SCORE_CATEGORY_SECURITY", "SCORE_CATEGORY_ACTIVITY",
+        ],
+    }
+    output_columns = _COLUMNS[args.report_type]
+
     # Validate namespace
     if not args.namespace:
         print("Error: --namespace or ENDOR_NAMESPACE is required.")
@@ -535,17 +564,7 @@ def extract_scorecard_from_container(container: Dict[str, Any]) -> None:
         # Write header once
         with open(output_path, "w", newline="", encoding="utf-8") as f_header:
             writer = csv.writer(f_header)
-            writer.writerow([
-                "name",
-                "package_version_uuid",
-                "count",
-                "overall_score",
-                "SCORE_CATEGORY_POPULARITY",
-                "SCORE_CATEGORY_CODE_QUALITY",
-                "SCORE_CATEGORY_SECURITY",
-                "SCORE_CATEGORY_ACTIVITY",
-                "licenses"
-            ])
+            writer.writerow(output_columns)
 
         # Lock for safe serialized writes
         write_lock = threading.Lock()
@@ -579,21 +598,22 @@ def process_and_write(row: Dict[str, Any]) -> int:
                     objects = metrics_resp.get("list", {}).get("objects", []) or []
                 metrics = extract_metrics_from_dependency_details(objects)
                 cat_scores = metrics.get("category_scores", {}) or {}
+                all_values = {
+                    "name": row["name"],
+                    "package_version_uuid": row["package_version_uuid"],
+                    "count": row["count"],
+                    "overall_score": metrics.get("overall_score", ""),
+                    "SCORE_CATEGORY_POPULARITY": cat_scores.get("SCORE_CATEGORY_POPULARITY", ""),
+                    "SCORE_CATEGORY_CODE_QUALITY": cat_scores.get("SCORE_CATEGORY_CODE_QUALITY", ""),
+                    "SCORE_CATEGORY_SECURITY": cat_scores.get("SCORE_CATEGORY_SECURITY", ""),
+                    "SCORE_CATEGORY_ACTIVITY": cat_scores.get("SCORE_CATEGORY_ACTIVITY", ""),
+                    "licenses": metrics.get("licenses", ""),
+                }
                 with write_lock:
                     # Append row safely
                     with open(output_path, "a", newline="", encoding="utf-8") as f_out:
                         writer = csv.writer(f_out)
-                        writer.writerow([
-                            row["name"],
-                            row["package_version_uuid"],
-                            row["count"],
-                            metrics.get("overall_score", ""),
-                            cat_scores.get("SCORE_CATEGORY_POPULARITY", ""),
-                            cat_scores.get("SCORE_CATEGORY_CODE_QUALITY", ""),
-                            cat_scores.get("SCORE_CATEGORY_SECURITY", ""),
-                            cat_scores.get("SCORE_CATEGORY_ACTIVITY", ""),
-                            metrics.get("licenses", "")
-                        ])
+                        writer.writerow([all_values[col] for col in output_columns])
                 return len(objects)
             except Exception as ex:
                 if args.debug: