Copyright © 2025-2026 Blackout Secure | Apache License 2.0
Enterprise-grade automated sitemap generation (XML/TXT/GZIP) for static sites, SSG frameworks (Next.js, Gatsby, Hugo, Jekyll), and dynamic applications. Built for reliability, performance, and SEO best practices.
- Multiple Formats: XML, TXT, and GZIP compressed sitemaps
- Smart Discovery: Auto-detect site URLs and directories
- Framework Support: Works with Next.js, Gatsby, Hugo, Jekyll, Vite, and more
- SEO Optimized: Canonical URL parsing, link discovery, lastmod timestamps
- Validation: Built-in validation against sitemaps.org protocol
- Large Sites: Auto-splitting for sites with 50,000+ URLs
- Flexible: Customizable patterns, exclusions, and priorities
- Git Integration: Last modified dates from git history
- No Build Required: Can validate existing sitemaps without generation
- GitHub Actions environment (Ubuntu, macOS, or Windows)
- Built site files (HTML, CSS, JS, etc.)
- For git-based lastmod: `fetch-depth: 0` in checkout step
name: Generate Sitemap
on:
  push:
    branches: [main]
  workflow_dispatch:
jobs:
  sitemap:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0 # Required for git-based lastmod
      - name: Build your site
        run: npm run build # or your build command
      - name: Generate sitemap
        uses: blackoutsecure/bos-sitemap-generator@v1
        with:
          site_url: 'https://example.com'
          public_dir: 'dist'
- name: Build Next.js site
  run: npm run build
- name: Generate sitemap
  uses: blackoutsecure/bos-sitemap-generator@v1
  with:
    site_url: 'https://example.com'
    public_dir: 'out'
    lastmod_strategy: 'git'

- name: Build Gatsby site
  run: npm run build
- name: Generate sitemap
  uses: blackoutsecure/bos-sitemap-generator@v1
  with:
    site_url: 'https://example.com'
    public_dir: 'public'

- name: Build Hugo site
  run: hugo --minify
- name: Generate sitemap
  uses: blackoutsecure/bos-sitemap-generator@v1
  with:
    site_url: 'https://example.com'
    public_dir: 'public'

- name: Build Jekyll site
  run: bundle exec jekyll build
- name: Generate sitemap
  uses: blackoutsecure/bos-sitemap-generator@v1
  with:
    site_url: 'https://example.com'
    public_dir: '_site'

- name: Build Vite project
  run: npm run build
- name: Generate sitemap
  uses: blackoutsecure/bos-sitemap-generator@v1
  with:
    site_url: 'https://example.com'
    public_dir: 'dist'

- name: Generate sitemap with custom rules
  uses: blackoutsecure/bos-sitemap-generator@v1
  with:
    site_url: 'https://example.com'
    public_dir: 'dist'
    include_patterns: '**/*.html,**/*.htm,**/*.php'
    exclude_patterns: '**/*.map,**/drafts/**,**/private/**'
    exclude_urls: '*/admin/*,*/test/*'
    changefreq: 'weekly'
    priority: '0.8'

Include non-HTML pages or external resources:
- name: Generate sitemap with additional URLs
  uses: blackoutsecure/bos-sitemap-generator@v1
  with:
    site_url: 'https://example.com'
    public_dir: 'dist'
    additional_urls: 'https://example.com/api,https://example.com/app'

- name: Generate XML sitemap only
  uses: blackoutsecure/bos-sitemap-generator@v1
  with:
    site_url: 'https://example.com'
    public_dir: 'dist'
    generate_sitemap_txt: 'false'

| Input | Description | Example |
|---|---|---|
| `site_url` | Public base URL of your site | `https://example.com` |
| Input | Description | Default |
|---|---|---|
| `public_dir` | Directory containing built site files | `dist` |
| `sitemap_output_dir` | Where to write sitemap files | Same as `public_dir` |
| `include_patterns` | Glob patterns to include | `**/*.html,**/*.htm` |
| `exclude_patterns` | Glob patterns to exclude | `**/*.map` |
| `lastmod_strategy` | Source for lastmod dates | `git` |
| `generate_sitemap_gzip` | Create gzipped version | `true` |
| `generate_sitemap_txt` | Create TXT format | `true` |
| Input | Description | Valid Values |
|---|---|---|
| `changefreq` | How often pages change | `always`, `hourly`, `daily`, `weekly`, `monthly`, `yearly`, `never` |
| `priority` | Relative priority on your site | `0.0` to `1.0` |
| `parse_canonical` | Use canonical URLs from HTML | `true` (default) |
| `discover_links` | Auto-discover internal links | `true` (default) |
| Input | Description | Default |
|---|---|---|
| `additional_urls` | Extra URLs to include | - |
| `exclude_urls` | URL patterns to exclude | `*/sitemap*.xml,*/sitemap*.txt,*/sitemap*.xml.gz` |
| `exclude_extensions` | File extensions to exclude | `.zip,.exe,.dmg,.pkg,.deb,.rpm,.tar,.gz,.7z,.rar,.iso` |
| `sitemap_filename` | Main sitemap filename | `sitemap.xml` |
| `validate_sitemaps` | Validate existing sitemaps | - |
| `strict_validation` | Fail on validation issues | `true` |
- `git` - Use git commit timestamp (requires `fetch-depth: 0`)
- `filemtime` - Use file modification time
- `current` - Use build/generation time
- `none` - Omit lastmod tag
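For a workflow that does not need full git history, the `filemtime` strategy can be paired with the default shallow checkout. This is a minimal sketch using only the inputs documented above. Note that files created during a CI build usually carry the build time as their modification time, so the resulting lastmod values may be close to what `current` would produce.

```yaml
# Sketch: shallow checkout plus file-modification-time lastmod
- uses: actions/checkout@v4 # no fetch-depth: 0 needed for this strategy
- name: Build your site
  run: npm run build # or your build command
- name: Generate sitemap
  uses: blackoutsecure/bos-sitemap-generator@v1
  with:
    site_url: 'https://example.com'
    public_dir: 'dist'
    lastmod_strategy: 'filemtime'
```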
| Output | Description |
|---|---|
| `sitemap_path` | Path to main sitemap.xml |
| `sitemap_index_path` | Path to sitemap index (if split) |
| `sitemap_txt_path` | Path to TXT sitemap (if enabled) |
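Later steps can read these outputs through the standard GitHub Actions `steps.<id>.outputs` context. The sketch below assumes a step id of `sitemap`, which is a hypothetical name chosen for illustration; any id works as long as both steps agree on it.

```yaml
- name: Generate sitemap
  id: sitemap # hypothetical step id, referenced below
  uses: blackoutsecure/bos-sitemap-generator@v1
  with:
    site_url: 'https://example.com'
    public_dir: 'dist'
- name: Show generated sitemap paths
  run: |
    echo "XML sitemap: ${{ steps.sitemap.outputs.sitemap_path }}"
    echo "TXT sitemap: ${{ steps.sitemap.outputs.sitemap_txt_path }}"
```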
The action automatically validates:
- Sitemap size limits (50MB uncompressed per sitemaps.org)
- URL count limits (50,000 URLs per file)
- XML format validity
- URL format compliance
Set strict_validation: false to allow warnings without failing the workflow.
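As an illustration, a step that reports validation issues as warnings without failing the job might look like the following. It uses only inputs documented above; the step name is arbitrary.

```yaml
- name: Generate sitemap (validation warnings allowed)
  uses: blackoutsecure/bos-sitemap-generator@v1
  with:
    site_url: 'https://example.com'
    public_dir: 'dist'
    strict_validation: 'false' # validation issues are reported but do not fail the workflow
```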
You can use this action to validate existing sitemaps without generating new ones. This is useful for:
- Validating sitemaps from external sources
- Pre-deployment validation checks
- CI/CD quality gates
- name: Validate existing sitemaps
  uses: blackoutsecure/bos-sitemap-generator@v1
  with:
    site_url: 'https://example.com'
    public_dir: 'dist'
    validate_sitemaps: 'dist/sitemap.xml,dist/sitemap-index.xml'
    strict_validation: 'true'

You can validate multiple sitemaps by providing comma-separated paths. The validator checks:
- XML Sitemaps: Structure, namespace, URL count, URL format, priorities, and change frequencies
- TXT Sitemaps: URL format, line endings, encoding
- Sitemap Indexes: Structure, sitemap entries, and referenced sitemap URLs
- Size Compliance: Uncompressed file size limits
- Format Compliance: sitemaps.org protocol adherence
For sites with more than 50,000 URLs, the action automatically:
- Splits URLs into multiple sitemap files
- Creates a sitemap index file
- Ensures each file meets protocol limits
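To react to a split in a later step, one option is to branch on the `sitemap_index_path` output. The sketch below rests on two assumptions that are not confirmed by this README: the step id `sitemap` is a hypothetical name, and the output is assumed to be an empty string when no index file was created.

```yaml
- name: Generate sitemap
  id: sitemap # hypothetical step id
  uses: blackoutsecure/bos-sitemap-generator@v1
  with:
    site_url: 'https://example.com'
    public_dir: 'dist'
- name: Report sitemap index
  # Assumption: sitemap_index_path is empty when the sitemap was not split
  if: steps.sitemap.outputs.sitemap_index_path != ''
  run: echo "Sitemap index written to ${{ steps.sitemap.outputs.sitemap_index_path }}"
```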
Enable debug outputs to troubleshoot:
- name: Generate sitemap with debugging
  uses: blackoutsecure/bos-sitemap-generator@v1
  with:
    site_url: 'https://example.com'
    public_dir: 'dist'
    debug_list_files: 'true'
    debug_list_urls: 'true'
    debug_show_sitemap: 'true'

Available debug flags:
- `debug_list_files` - Show all discovered files
- `debug_list_canonical` - Show parsed canonical URLs
- `debug_list_urls` - Show all sitemap URLs
- `debug_show_sitemap` - Display XML content
- `debug_show_sitemap_txt` - Display TXT content
- `debug_show_exclusions` - Show excluded files/URLs
Cause: Build step may have failed or public_dir is incorrect.
Solution:
- Verify build completes successfully
- Check `public_dir` matches your build output location
- Enable `debug_list_files: 'true'` to see what's being scanned
- Verify files exist: `ls -la dist/`
Cause: Include patterns don't match files, or all files are excluded.
Solution:
- Check `include_patterns` - default is `**/*.html,**/*.htm`
- Verify files match the pattern
- Check `exclude_patterns` and `exclude_urls` for overlaps
- Use `debug_show_exclusions: 'true'` to see what's excluded
Cause: Git history not available or wrong strategy selected.
Solution:
- For `lastmod_strategy: 'git'`, ensure `fetch-depth: 0` in checkout:
  - uses: actions/checkout@v4
    with:
      fetch-depth: 0
- Switch to `lastmod_strategy: 'filemtime'` if git history is not available
- Use `lastmod_strategy: 'none'` to omit the lastmod tag
Cause: Generated XML doesn't match sitemaps.org protocol.
Solution:
- Check for invalid characters in URLs
- Ensure `priority` is between 0.0 and 1.0
- Validate `changefreq` values
- Use `debug_show_sitemap: 'true'` to inspect output
- Set `strict_validation: 'false'` temporarily to see warnings
Cause: parse_canonical or auto-detection is overriding site_url.
Solution:
- Set `parse_canonical: 'false'` to disable canonical parsing
- Ensure `site_url` input is provided explicitly
- Check if HTML files contain incorrect canonical tags
Cause: Missing permissions or git configuration.
Solution:
- Ensure proper git configuration:
  git config user.name "github-actions[bot]"
  git config user.email "github-actions[bot]@users.noreply.github.com"
- Check GitHub token permissions if using custom tokens
- Verify branch protection rules allow commits
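Putting these points together, a commit step might look like the sketch below. It assumes the sitemap files live in `dist/` and that this directory is tracked by git (not ignored); the commit message and file glob are illustrative choices, not part of the action itself.

```yaml
jobs:
  sitemap:
    runs-on: ubuntu-latest
    permissions:
      contents: write # allows the default GITHUB_TOKEN to push commits
    steps:
      - uses: actions/checkout@v4
      # ... build and sitemap generation steps go here ...
      - name: Commit generated sitemap
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git add dist/sitemap*
          # Commit only when something is staged, so a no-op run does not fail
          git diff --cached --quiet || git commit -m "chore: update sitemap"
          git push
```

If branch protection blocks direct pushes, the push will still fail regardless of token permissions.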
Answer: Run on every build or deployment. The example workflow above triggers on push to main and allows manual trigger via workflow_dispatch.
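If the site also changes outside of pushes (for example, content pulled in at build time), a scheduled trigger can be added alongside the existing ones. The cron expression below is a hypothetical choice for illustration.

```yaml
on:
  push:
    branches: [main]
  workflow_dispatch:
  schedule:
    - cron: '0 3 * * 1' # hypothetical schedule: Mondays at 03:00 UTC
```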
Answer: Yes, build your site first (which pre-renders dynamic pages), then run the action. Works with SSG frameworks that pre-render to static files.
Answer: By default, it indexes HTML/HTM files. Use include_patterns to add other types:
include_patterns: '**/*.html,**/*.htm,**/*.pdf,**/*.json'

Answer: Yes, use either:
- `exclude_urls`: URL patterns (e.g., `*/admin/*,*/test/*`)
- `exclude_patterns`: File patterns (e.g., `**/*.draft.html`)
Answer: Per sitemaps.org protocol:
- 50MB uncompressed per file
- 50,000 URLs per file
- Action auto-splits large sitemaps into index + multiple sitemaps
Answer: It discovers links from HTML <a href> tags if discover_links: 'true' (default). For API endpoints or content not in HTML, use additional_urls.
Answer: Yes, use the validate_sitemaps input:
- uses: blackoutsecure/bos-sitemap-generator@v1
  with:
    site_url: 'https://example.com'
    public_dir: 'dist'
    validate_sitemaps: 'dist/sitemap.xml'

Answer: Run the action multiple times with different `site_url` and `public_dir`:
- name: Generate sitemap for site 1
  uses: blackoutsecure/bos-sitemap-generator@v1
  with:
    site_url: 'https://example.com'
    public_dir: 'dist/site1'
    sitemap_output_dir: 'dist/site1'
- name: Generate sitemap for site 2
  uses: blackoutsecure/bos-sitemap-generator@v1
  with:
    site_url: 'https://example.org'
    public_dir: 'dist/site2'
    sitemap_output_dir: 'dist/site2'

Answer: Not automatically. Use `exclude_urls` or `exclude_patterns` to manually exclude paths that should be disallowed.
Answer: Once deployed:
- Google: Use Google Search Console
- Bing: Use Bing Webmaster Tools
- Others: Most search engines look for sitemap.xml at the site root or read Sitemap entries in robots.txt:
Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/sitemap.xml.gz
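The robots.txt entry can also be added during the workflow. The step below is a sketch that assumes the built site lives in `dist/` and that robots.txt belongs at its root; it appends the line only when it is not already present, and creates the file if it does not exist.

```yaml
- name: Add sitemap reference to robots.txt
  run: |
    # grep -qxF: quiet, match the whole line, treat the pattern as a fixed string
    grep -qxF "Sitemap: https://example.com/sitemap.xml" dist/robots.txt 2>/dev/null \
      || echo "Sitemap: https://example.com/sitemap.xml" >> dist/robots.txt
```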
General contribution guidelines (issue triage, PR style, test
expectations, security review) come from the organisation default at
blackoutsecure/.github/CONTRIBUTING.md,
which applies to every repo in the org. The repo-specific bits are
below.
All PRs target the dev branch. The main branch is built by
the Marketplace release pipeline (the launchpad reusable in
bos-automation-hub)
and is read-only to humans; PRs opened against main will be
closed.
# Install dev deps (Node 20+)
npm ci
# Build the action bundle (mocha pretest also runs this)
npm run build
# Run the test suite (this is what CI runs)
npm test
# Lint + format + test in one shot
npm run check
# Coverage report (HTML + text)
npm run coverage

- JavaScript: ESLint flat config (`eslint.config.js`) + Prettier (`.prettierrc.yaml`); both are managed, and CI runs `npm run check`.
- Bundle: `dist/index.js` is committed (ncc bundle). Marketplace consumers fetch the tag, not `npm install`, so the bundle MUST be in sync with `src/` on every release. CI checks for drift.
- Action contract: `action.yml` `inputs:`/`outputs:` are the published contract; changes are SemVer-significant.
- YAML (workflows): `actionlint` clean, pin third-party actions by SHA (not tag), minimise `permissions:` per job.
Releases promote dev → main via the launchpad's workflow_dispatch
mode = release. See the Marketplace launchpad reusable
for the full event-routing + allowlist model.
Copyright © 2025-2026 Blackout Secure
Licensed under the Apache License, Version 2.0. See LICENSE for details.
- Issues: GitHub Issues
- Security: see the organization-wide Security Policy and report via GitHub Security Advisories
- Sponsor: Support this project via GitHub Sponsors
Made with ❤️ by Blackout Secure