feat: add sitemap.xml and robots.txt for SEO and AI crawler protectio… #23 #24

Open
ankitrraj wants to merge 2 commits into StabilityNexus:main from ankitrraj:feature/23-sitemap-robots-txt

@ankitrraj ankitrraj commented Dec 13, 2025

Implemented sitemap.xml and robots.txt using Next.js App Router conventions to improve SEO and block AI crawlers from scraping site content.

Changes

  • Added app/sitemap.ts: auto-generates the sitemap with all public routes
  • Added app/robots.ts: controls crawler access and blocks AI bots

Features

Sitemap (/sitemap.xml)

  • Includes home, auctions, create, and dashboard pages
  • Proper priority and change frequency settings
  • Auto-updates lastModified timestamp

Robots (/robots.txt)

  • Allows standard search engine crawlers
  • Blocks /api/ and /dashboard/ from indexing
  • Blocks AI crawlers:
    • GPTBot (OpenAI)
    • ChatGPT-User
    • CCBot (Common Crawl)
    • anthropic-ai
    • Google-Extended (Bard/Gemini training)
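
For readers who want to see the shape of the robots rules described above, here is a minimal sketch. It is not the code from this PR: `buildRobots`, `RobotsRule`, and `RobotsConfig` are hypothetical names, and the local types only mirror the fields of Next.js's `MetadataRoute.Robots` so the snippet runs outside a Next.js project. In an actual `app/robots.ts` you would presumably import `MetadataRoute` from `next` and default-export a function that passes in the site URL (for example from `NEXT_PUBLIC_SITE_URL`).

```typescript
// Local stand-ins for the fields used here; a real app/robots.ts would
// return MetadataRoute.Robots imported from 'next' instead.
type RobotsRule = {
  userAgent: string;
  allow?: string | string[];
  disallow?: string | string[];
};

type RobotsConfig = {
  rules: RobotsRule[];
  sitemap: string;
};

// User agents listed in the PR description.
const AI_CRAWLERS = ['GPTBot', 'ChatGPT-User', 'CCBot', 'anthropic-ai', 'Google-Extended'];

// Hypothetical helper: builds the rules object for a given site URL.
function buildRobots(siteUrl: string): RobotsConfig {
  return {
    rules: [
      // Everyone else may crawl the site except the API and dashboard paths.
      { userAgent: '*', allow: '/', disallow: ['/api/', '/dashboard/'] },
      // Each listed AI crawler is disallowed from the whole site.
      ...AI_CRAWLERS.map((userAgent) => ({ userAgent, disallow: '/' })),
    ],
    sitemap: `${siteUrl}/sitemap.xml`,
  };
}
```

Keeping the crawler names in one array means adding or removing a bot is a one-line change. Note that robots.txt is advisory, so this only affects crawlers that choose to honor it.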

Summary by CodeRabbit

  • New Features

    • Added automatic site sitemap covering main pages (root, auctions, create).
    • Added robots configuration to control crawler access, explicitly blocking certain AI bots and restricting /api/ and /dashboard/ access.
  • Chores

    • Removed the dashboard route from the public sitemap.

coderabbitai Bot commented Dec 13, 2025

Contributor

Walkthrough

Adds two Next.js app-router metadata files: app/robots.ts exporting a robots rules object that blocks several AI crawlers and points to a sitemap; and app/sitemap.ts exporting a static sitemap array with three route entries (/, /auctions, /create).

Changes

| Cohort / File(s) | Summary |
|---|---|
| SEO & Metadata Configuration: app/robots.ts, app/sitemap.ts | Adds Next.js app-router metadata exports: robots() returns a MetadataRoute.Robots object (sitemap URL from NEXT_PUBLIC_SITE_URL default, per-bot disallow rules for GPTBot, ChatGPT-User, CCBot, anthropic-ai, Google-Extended, and a wildcard rule allowing / while disallowing /api/ and /dashboard/); sitemap() returns a MetadataRoute.Sitemap array with static entries for /, /auctions, and /create including lastModified, changeFrequency, and priority. |
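
The static sitemap summarized above can be sketched roughly as follows. This is an illustration, not the PR's code: `buildSitemap` and `SitemapEntry` are hypothetical names, the local type only mirrors the fields of `MetadataRoute.Sitemap` entries, and the `priority` values and the choice of which route is 'weekly' are assumptions (the review only confirms that /auctions uses 'daily' and that both 'daily' and 'weekly' appear).

```typescript
// Local stand-in for one MetadataRoute.Sitemap entry from 'next'.
type ChangeFrequency = 'always' | 'hourly' | 'daily' | 'weekly' | 'monthly' | 'yearly' | 'never';

type SitemapEntry = {
  url: string;
  lastModified?: Date | string;
  changeFrequency?: ChangeFrequency;
  priority?: number;
};

// Hypothetical helper: returns the three public routes named in the walkthrough.
// Priorities and the 'weekly' assignment are assumed for illustration.
function buildSitemap(siteUrl: string, lastModified: Date): SitemapEntry[] {
  return [
    { url: siteUrl, lastModified, changeFrequency: 'daily', priority: 1 },
    { url: `${siteUrl}/auctions`, lastModified, changeFrequency: 'daily', priority: 0.9 },
    { url: `${siteUrl}/create`, lastModified, changeFrequency: 'weekly', priority: 0.8 },
  ];
}
```

Passing `lastModified` in as an argument, rather than calling `new Date()` inside each entry, keeps the timestamp decision in one place.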

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Static configuration changes; low logic density.
  • Pay attention to:
    • Correct environment variable usage and fallback for BASE_URL.
    • Accuracy of disallowed user-agent list and path patterns.
    • Sitemap entries' URLs and metadata fields.

Possibly related issues

Poem

🐇 I hopped through routes and penned a little chart,

Sitemaps in paw and bot-blocking art,
Three paths now safe beneath the moon's light,
No stray GPT shall nibble tonight! 🌙✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main changes: adding sitemap.xml and robots.txt for SEO and AI crawler protection, which directly matches the implemented files (app/sitemap.ts and app/robots.ts) and their purposes. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (2)
app/sitemap.ts (2)

9-9: Consider caching lastModified timestamps.

Using new Date() for lastModified on every build means the timestamp changes even when content hasn't, which can signal to search engines that pages have been updated when they haven't. This may trigger unnecessary recrawls and could impact crawl budget.

For a static sitemap, consider using a fixed date or moving lastModified generation to build time. For a more sophisticated approach, fetch actual last modification dates from your data source:

// Example: Use a fixed build-time constant
const LAST_BUILD = new Date('2025-12-13');

// Or fetch from your data layer
const lastAuctionUpdate = await getLastAuctionUpdateDate();

Also applies to: 15-15, 21-21, 27-27
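
One way to get a stable timestamp, sketched under assumptions: `resolveLastModified` is a hypothetical helper, and the idea of a CI-injected ISO build timestamp is not something this PR defines. The helper prefers the supplied timestamp when it parses and otherwise falls back to a fixed date.

```typescript
// Hypothetical helper: prefer an explicit build timestamp (for example an ISO
// string injected by CI) and fall back to a fixed date when it is missing or
// does not parse.
function resolveLastModified(buildTimestamp: string | undefined, fallback: Date): Date {
  if (buildTimestamp) {
    const parsed = new Date(buildTimestamp);
    // An unparseable string yields an Invalid Date whose getTime() is NaN.
    if (!isNaN(parsed.getTime())) {
      return parsed;
    }
  }
  return fallback;
}

// Fixed fallback date, matching the date used in the example above.
const FALLBACK_LAST_MODIFIED = new Date('2025-12-13T00:00:00Z');

// With no build timestamp supplied, every route shares the fixed date.
const sharedLastModified = resolveLastModified(undefined, FALLBACK_LAST_MODIFIED);
```

The resulting value only changes when a new timestamp is deliberately supplied, so unchanged pages do not look freshly modified on every build.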


6-31: Consider adding dynamic auction routes to sitemap.

This sitemap is static and only includes 4 top-level routes. If your application has individual auction detail pages (e.g., /auctions/[id]), they should also be included in the sitemap for better SEO and discoverability.

Consider making the sitemap dynamic by fetching auction IDs:

export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
  // Static routes
  const routes = [
    {
      url: BASE_URL,
      lastModified: new Date(),
      changeFrequency: 'daily' as const,
      priority: 1,
    },
    // ... other static routes
  ];

  // Fetch auction IDs and add to sitemap
  const auctions = await fetchAuctionIds(); // implement this
  const auctionRoutes = auctions.map((id) => ({
    url: `${BASE_URL}/auctions/${id}`,
    lastModified: new Date(),
    changeFrequency: 'daily' as const,
    priority: 0.8,
  }));

  return [...routes, ...auctionRoutes];
}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 75b5108 and 5ea362d.

📒 Files selected for processing (2)
  • app/robots.ts (1 hunks)
  • app/sitemap.ts (1 hunks)
🔇 Additional comments (1)
app/robots.ts (1)

13-32: Good approach to blocking AI training crawlers.

The specific blocking of AI training bots (GPTBot, ChatGPT-User, CCBot, anthropic-ai, Google-Extended) aligns with the PR objectives and is a common practice for sites wanting to opt out of AI training datasets.

Comment thread app/robots.ts
Comment thread app/robots.ts Outdated
Comment thread app/sitemap.ts Outdated
Comment thread app/sitemap.ts Outdated

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (2)
app/sitemap.ts (2)

13-18: Good fix: /auctions changeFrequency is no longer “hourly”.
Matches the earlier feedback; “daily” is a safer default hint.


6-25: Good fix: /dashboard is no longer in the sitemap (aligns with robots blocking it).
Keeps sitemap and crawler directives consistent.
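
To guard against the sitemap and robots rules drifting apart again, a small check along these lines could run in a test. This is a sketch, not part of the PR: `isDisallowed` and `findConflicts` are hypothetical names, and the matching is a simplified prefix comparison rather than a full robots.txt matcher (no wildcards or `$` anchors).

```typescript
// Hypothetical check: does a sitemap URL fall under any disallowed prefix?
// The bare path (e.g. /dashboard for the prefix /dashboard/) is also flagged,
// even though strict prefix matching would not block it, because listing it
// in the sitemap contradicts the intent of the rule.
function isDisallowed(url: string, disallow: string[]): boolean {
  const path = new URL(url).pathname;
  return disallow.some((prefix) => {
    const bare = prefix.replace(/\/$/, '');
    return path === bare || path.indexOf(prefix) === 0;
  });
}

// Returns every sitemap URL that conflicts with the disallow list.
function findConflicts(sitemapUrls: string[], disallow: string[]): string[] {
  return sitemapUrls.filter((url) => isDisallowed(url, disallow));
}
```

An empty result from `findConflicts` means no sitemap entry points at a path the wildcard rule blocks.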

🧹 Nitpick comments (2)
app/sitemap.ts (2)

3-4: Normalize BASE_URL to avoid accidental double-slashes / non-canonical URLs.
If NEXT_PUBLIC_SITE_URL ends with /, ${BASE_URL}/auctions produces a URL containing //auctions; also consider enforcing https:// and stripping any path component.

-const BASE_URL = process.env.NEXT_PUBLIC_SITE_URL || 'https://hah.app'
+const BASE_URL = (process.env.NEXT_PUBLIC_SITE_URL || 'https://hah.app').replace(/\/+$/, '')

Also applies to: 8-21
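
A stricter normalization than the one-line `replace` above could also enforce https:// and drop any path component, as the comment suggests. The following is a sketch under assumptions: `normalizeBaseUrl` is a hypothetical helper, and forcing https is a choice that would need an exception for local development over http.

```typescript
// Hypothetical helper: reduce a configured site URL to "https://host[:port]".
// Uses the standard URL parser, which throws on input without a scheme
// (for example "hah.app"), so misconfiguration fails loudly.
function normalizeBaseUrl(raw: string | undefined, fallback: string): string {
  const candidate = (raw || fallback).trim();
  const parsed = new URL(candidate);
  // parsed.host keeps a non-default port but drops path, query, and fragment.
  return `https://${parsed.host}`;
}
```

With this in place, `${BASE_URL}/auctions` always yields a single slash between host and path, regardless of how the environment variable was written.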


6-25: lastModified: new Date() will churn and can trigger unnecessary recrawls.
Because it changes on every render/build, bots may treat everything as “just updated” constantly. Prefer a stable build-time value or real content timestamps per route.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5ea362d and e308080.

📒 Files selected for processing (2)
  • app/robots.ts (1 hunks)
  • app/sitemap.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • app/robots.ts
🔇 Additional comments (1)
app/sitemap.ts (1)

1-26: No changes required. MetadataRoute.Sitemap in Next.js 15.3.3 fully supports both changeFrequency and priority properties. The code correctly uses valid changeFrequency values ('daily' and 'weekly' are both allowed).
