chore: add markdown hygiene checks and governance docs #57
Open: FScholPer wants to merge 3 commits into main from split/pr51-docs-hygiene
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,27 @@ | ||
| # Assistant Runtime Alignment | ||
| | ||
| This repository aligns policy for multiple assistant runtimes (Copilot/VS Code, Codex, Claude) without duplicating governance content. | ||
| | ||
| ## Canonical instruction source | ||
| | ||
| - AGENTS.md is the canonical, runtime-neutral policy document. | ||
| - CLAUDE.md imports AGENTS.md for Claude compatibility. | ||
| - .github/copilot-instructions.md is runtime-specific glue for Copilot. | ||
| | ||
| ## MCP alignment model | ||
| | ||
| - Keep one approved MCP integration model for the organization. | ||
| - Configure runtime settings files for MCP-first behavior. | ||
| - Keep governance assets runtime-neutral and shared where possible. | ||
| | ||
| ## Repository template integration | ||
| | ||
| The Copier template distributes: | ||
| | ||
| - AGENTS.md | ||
| - CLAUDE.md | ||
| - .github/<instructions-file> | ||
| - .claude/settings.json | ||
| - .github/copilot/settings.json | ||
| | ||
| This keeps assistants aligned while preserving runtime-specific entrypoints and MCP-first integration. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,47 @@ | ||
| # Markdown Maintenance Playbook | ||
| | ||
| This repository uses a low-maintenance governance model. | ||
| | ||
| ## Goals | ||
| | ||
| - Keep SCORE-specific governance assets concise. | ||
| - Avoid local duplication of framework-generic workflow content. | ||
| - Detect markdown hygiene issues early. | ||
| | ||
| ## Operating Model | ||
| | ||
| 1. Keep only SCORE-specific contracts and policy in this repository. | ||
| 2. Inherit generic framework assets in adopter repositories. | ||
| 3. Use placeholders for runtime-specific naming. | ||
| 4. Validate markdown health in CI and before merge. | ||
| | ||
| ## Automated Checks | ||
| | ||
| The script at [scripts/check_markdown_hygiene.py](/scripts/check_markdown_hygiene.py) validates: | ||
| | ||
| - Duplicate markdown files by content hash. | ||
| - Broken local markdown links. | ||
| | ||
| Run locally: | ||
| | ||
| ```bash | ||
| python3 scripts/check_markdown_hygiene.py --root . --include .github --include README.md --include profile | ||
| ``` | ||
| | ||
| CI workflow: | ||
| | ||
| - [.github/workflows/docs-hygiene.yml](/.github/workflows/docs-hygiene.yml) | ||
| | ||
| ## Cadence | ||
| | ||
| - Pull request: automatic via CI. | ||
| - Weekly: scheduled CI run. | ||
| - Monthly: remove stale docs and confirm retained files are still SCORE-specific. | ||
| | ||
| ## Adopter Guidance | ||
| | ||
| When porting to another repository: | ||
| | ||
| 1. Copy this playbook, hygiene script, and workflow. | ||
| 2. Keep framework-generic assets out of the local overlay. | ||
| 3. Keep only SCORE-specific deltas and schemas under `.github/references/` and `.github/score/`. |
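The broken-link check listed under Automated Checks depends on how link targets are pulled out of the markdown text. The sketch below reuses the `MARKDOWN_LINK_RE` pattern from the script in this pull request and prints what it extracts; the sample text and the paths in it are made up for illustration.

```python
import re

# Same pattern as MARKDOWN_LINK_RE in scripts/check_markdown_hygiene.py.
MARKDOWN_LINK_RE = re.compile(r"\[[^\]]+\]\(([^)]+)\)")

# Hypothetical sample text; none of these paths come from the repository.
sample = "\n".join(
    [
        "See [the guide](docs/guide.md#setup) and [home](https://example.org).",
        'A titled link: [spec](docs/spec.md "Spec title").',
        "An image: ![logo](assets/logo.png)",
    ]
)

# findall returns the single capture group: everything between ( and ).
targets = MARKDOWN_LINK_RE.findall(sample)
for target in targets:
    print(repr(target))
```

Two things stand out in the extracted targets. Image links are matched as well, so image paths are checked like any other local link. A link with a title keeps the title inside the target (`docs/spec.md "Spec title"`), and since the script only strips anchors and query strings, such a link would probably be reported as broken. Whether that matters depends on whether the scanned docs use link titles.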
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,29 @@ | ||
| name: Markdown Hygiene | ||
| | ||
| on: | ||
|   pull_request: | ||
|     paths: | ||
|       - '**/*.md' | ||
|       - 'scripts/check_markdown_hygiene.py' | ||
|       - '.github/workflows/docs-hygiene.yml' | ||
|   schedule: | ||
|     - cron: '0 6 * * 1' | ||
|   workflow_dispatch: | ||
| | ||
| permissions: | ||
|   contents: read | ||
| | ||
| jobs: | ||
|   markdown-hygiene: | ||
|     runs-on: ubuntu-latest | ||
|     steps: | ||
|       - name: Checkout | ||
|         uses: actions/checkout@v4 | ||
| | ||
|       - name: Setup Python | ||
|         uses: actions/setup-python@v5 | ||
|         with: | ||
|           python-version: '3.12' | ||
| | ||
|       - name: Run markdown hygiene checks | ||
|         run: python scripts/check_markdown_hygiene.py --root . --include .github --include README.md --include profile |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,160 @@ | ||
| #!/usr/bin/env python3 | ||
| """Lightweight markdown hygiene checks for repository-scale governance docs. | ||
| | ||
| Checks: | ||
| - Duplicate markdown files by content hash | ||
| - Broken local markdown links (relative repo paths) | ||
| | ||
| Exit codes: | ||
| - 0: no issues | ||
| - 1: one or more issues found | ||
| """ | ||
| | ||
| from __future__ import annotations | ||
| | ||
| import argparse | ||
| import hashlib | ||
| import re | ||
| import sys | ||
| from pathlib import Path | ||
| from typing import Iterable | ||
| | ||
| MARKDOWN_LINK_RE = re.compile(r"\[[^\]]+\]\(([^)]+)\)") | ||
| DEFAULT_EXCLUDED_DIRS = {".git", ".venv", "venv", "node_modules", ".mypy_cache", ".pytest_cache"} | ||
| | ||
| | ||
| def parse_args() -> argparse.Namespace: | ||
|     parser = argparse.ArgumentParser(description="Check markdown hygiene in a repository") | ||
|     parser.add_argument("--root", default=".", help="Repository root path") | ||
|     parser.add_argument( | ||
|         "--include", | ||
|         action="append", | ||
|         default=[".github", "README.md", "profile"], | ||
|         help="Path (file/dir) under root to include; repeatable", | ||
|     ) | ||
|     return parser.parse_args() | ||
| | ||
| | ||
| def to_repo_relative(path: Path, root: Path) -> str: | ||
|     return path.relative_to(root).as_posix() | ||
| | ||
| | ||
| def gather_markdown_files(root: Path, include_paths: Iterable[str]) -> list[Path]: | ||
|     files: list[Path] = [] | ||
|     for include in include_paths: | ||
|         candidate = (root / include).resolve() | ||
|         if not candidate.exists(): | ||
|             continue | ||
|         if candidate.is_file() and candidate.suffix.lower() == ".md": | ||
|             files.append(candidate) | ||
|             continue | ||
|         if candidate.is_dir(): | ||
|             for path in candidate.rglob("*.md"): | ||
|                 if any(part in DEFAULT_EXCLUDED_DIRS for part in path.parts): | ||
|                     continue | ||
|                 files.append(path.resolve()) | ||
|     unique = sorted(set(files)) | ||
|     return unique | ||
| | ||
| | ||
| def sha256_of(path: Path) -> str: | ||
|     digest = hashlib.sha256() | ||
|     digest.update(path.read_bytes()) | ||
|     return digest.hexdigest() | ||
| | ||
| | ||
| def find_duplicate_markdown(files: list[Path]) -> list[list[Path]]: | ||
|     by_hash: dict[str, list[Path]] = {} | ||
|     for path in files: | ||
|         file_hash = sha256_of(path) | ||
|         by_hash.setdefault(file_hash, []).append(path) | ||
|     return [group for group in by_hash.values() if len(group) > 1] | ||
| | ||
| | ||
| def strip_anchor_and_query(target: str) -> str: | ||
|     no_anchor = target.split("#", 1)[0] | ||
|     no_query = no_anchor.split("?", 1)[0] | ||
|     return no_query | ||
| | ||
| | ||
| def is_external_link(target: str) -> bool: | ||
|     lowered = target.lower() | ||
|     return lowered.startswith(("http://", "https://", "mailto:", "tel:")) | ||
| | ||
| | ||
| def find_broken_local_links(files: list[Path], root: Path) -> list[tuple[Path, str, str]]: | ||
|     issues: list[tuple[Path, str, str]] = [] | ||
|     for markdown_file in files: | ||
|         text = markdown_file.read_text(encoding="utf-8") | ||
|         for raw_target in MARKDOWN_LINK_RE.findall(text): | ||
|             target = raw_target.strip() | ||
|             if not target or target.startswith("#") or is_external_link(target): | ||
|                 continue | ||
| | ||
|             # Template placeholders are examples, not resolvable links. | ||
|             if "{" in target or "}" in target: | ||
|                 continue | ||
| | ||
|             normalized = strip_anchor_and_query(target) | ||
|             if not normalized: | ||
|                 continue | ||
| | ||
|             # Absolute repo path style: /path/from/repo/root | ||
|             if normalized.startswith("/"): | ||
|                 resolved = (root / normalized.lstrip("/")).resolve() | ||
|             else: | ||
|                 resolved = (markdown_file.parent / normalized).resolve() | ||
| | ||
|             if not resolved.exists(): | ||
|                 issues.append((markdown_file, target, to_repo_relative(markdown_file, root))) | ||
|     return issues | ||
| | ||
| | ||
| def print_duplicate_report(duplicates: list[list[Path]], root: Path) -> None: | ||
|     if not duplicates: | ||
|         print("No duplicate markdown files detected.") | ||
|         return | ||
|     print("Duplicate markdown files detected:") | ||
|     for group in duplicates: | ||
|         print("- Duplicate group:") | ||
|         for path in group: | ||
|             print(f" - {to_repo_relative(path, root)}") | ||
| | ||
| | ||
| def print_broken_link_report(broken: list[tuple[Path, str, str]], root: Path) -> None: | ||
|     if not broken: | ||
|         print("No broken local markdown links detected.") | ||
|         return | ||
|     print("Broken local markdown links detected:") | ||
|     for markdown_file, target, _ in broken: | ||
|         rel = to_repo_relative(markdown_file, root) | ||
|         print(f"- {rel}: {target}") | ||
| | ||
| | ||
| def main() -> int: | ||
|     args = parse_args() | ||
|     root = Path(args.root).resolve() | ||
| | ||
|     files = gather_markdown_files(root, args.include) | ||
|     if not files: | ||
|         print("No markdown files found for the configured include paths.") | ||
|         return 0 | ||
| | ||
|     duplicates = find_duplicate_markdown(files) | ||
|     broken_links = find_broken_local_links(files, root) | ||
| | ||
|     print(f"Scanned {len(files)} markdown files.") | ||
|     print_duplicate_report(duplicates, root) | ||
|     print_broken_link_report(broken_links, root) | ||
| | ||
|     has_issues = bool(duplicates or broken_links) | ||
|     if has_issues: | ||
|         print("Markdown hygiene check failed.") | ||
|         return 1 | ||
| | ||
|     print("Markdown hygiene check passed.") | ||
|     return 0 | ||
| | ||
| | ||
| if __name__ == "__main__": | ||
|     sys.exit(main()) |
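One detail of `parse_args()` may be worth checking in review: `--include` combines `action="append"` with a non-empty default list. With argparse, values given on the command line are appended to that default rather than replacing it, so passing `--include` cannot narrow the scan below the defaults. The sketch below isolates that behavior. `build_parser` and `DEFAULT_INCLUDES` are hypothetical names introduced for illustration, and the `None`-default variant is only one possible alternative, not the author's code.

```python
import argparse


def build_parser(default_includes):
    """Hypothetical stand-in for the script's parser, reduced to --include."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--include", action="append", default=default_includes)
    return parser


# As in the diff: a non-empty default list combined with action="append".
as_written = build_parser([".github", "README.md", "profile"])
print(as_written.parse_args(["--include", "docs"]).include)
# ['.github', 'README.md', 'profile', 'docs']

# One possible alternative (an assumption): default to None, fall back afterwards.
DEFAULT_INCLUDES = [".github", "README.md", "profile"]
alternative = build_parser(None)
explicit = alternative.parse_args(["--include", "docs"]).include or DEFAULT_INCLUDES
fallback = alternative.parse_args([]).include or DEFAULT_INCLUDES
print(explicit)  # ['docs']
print(fallback)  # ['.github', 'README.md', 'profile']
```

For the invocation used in the workflow, which repeats the three defaults, the current behavior looks harmless because `gather_markdown_files` deduplicates through `set(files)`. It would only matter for an adopter who passes `--include` to scan a different or smaller set of paths.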
Why deviate from the normal S-CORE Bazel approach?