Canonical URL fix for search engines (GitHub Issue #6) and other seo fixes"

2026-01-12 04:09:14 +00:00 · 2026-01-07 21:48:41 -08:00
parent b274ddf3c9
commit 1257fa220f
41 changed files with 537 additions and 55 deletions
--- a/content/blog/setup-guide.md
+++ b/content/blog/setup-guide.md
@@ -1408,6 +1408,50 @@ Your blog includes these API endpoints for search engines and AI:
 | `/openapi.yaml`                | OpenAPI 3.0 specification   |
 | `/llms.txt`                    | AI agent discovery          |

+## SEO and Bot Detection
+
+Your site includes intelligent bot detection that serves different responses to different visitors.
+
+### How It Works
+
+The `netlify/edge-functions/botMeta.ts` edge function intercepts requests and serves pre-rendered HTML with correct meta tags to:
+
+- **Social preview bots** (Twitter, Facebook, LinkedIn, Discord): Get Open Graph tags for link previews
+- **Search engine bots** (Google, Bing, DuckDuckGo): Get correct canonical URLs
+
+Regular browsers and AI crawlers receive the normal SPA and let JavaScript update the meta tags.
+
+### Configuration
+
+Edit the bot arrays at the top of `netlify/edge-functions/botMeta.ts` to customize which bots receive pre-rendered HTML:
+
+```typescript
+// Social preview bots - for link previews
+const SOCIAL_PREVIEW_BOTS = ["twitterbot", "facebookexternalhit", ...];
+
+// Search engine bots - for correct canonical URLs
+const SEARCH_ENGINE_BOTS = ["googlebot", "bingbot", ...];
+
+// AI crawlers - get normal SPA (can render JavaScript)
+const AI_CRAWLERS = ["gptbot", "claudebot", ...];
+```
+
+### Testing
+
+Verify bot detection with curl:
+
+```bash
+# Simulate Googlebot
+curl -H "User-Agent: Googlebot" https://yoursite.com/post-slug | grep canonical
+# Expected: correct page canonical
+
+# Normal request
+curl https://yoursite.com/post-slug | grep canonical
+# Expected: homepage canonical (JavaScript will update it)
+```
+
+See `FORK_CONFIG.md` for detailed configuration options.
+
 ## Import External Content

 Use Firecrawl to import articles from external URLs as markdown posts:
--- a/content/pages/changelog-page.md
+++ b/content/pages/changelog-page.md
@@ -11,6 +11,27 @@ docsSectionOrder: 4

 All notable changes to this project.

+## v2.12.0
+
+Released January 7, 2026
+
+**Canonical URL fix for search engines (GitHub Issue #6)**
+
+Fixed a mismatch where raw HTML was showing the homepage canonical URL instead of the page-specific canonical URL. Search engines that check raw HTML before rendering JavaScript now receive the correct canonical tags.
+
+**Changes:**
+
+- Added search engine bot detection (Google, Bing, DuckDuckGo, etc.) to serve pre-rendered HTML
+- Search engines now receive correct canonical URLs in the initial HTML response
+- Added SEO Bot Configuration documentation in FORK_CONFIG.md and setup-guide.md
+- Bot detection arrays are easily customizable in `netlify/edge-functions/botMeta.ts`
+
+**For forkers:**
+
+The bot detection configuration is documented with clear comments at the top of `botMeta.ts`. You can customize which bots receive pre-rendered HTML by editing the `SOCIAL_PREVIEW_BOTS`, `SEARCH_ENGINE_BOTS`, and `AI_CRAWLERS` arrays.
+
+---
+
 ## v2.11.0

 Released January 6, 2026
--- a/content/pages/home.md
+++ b/content/pages/home.md
@@ -31,4 +31,6 @@ agents. -->

 **Sync Commands** - Sync discovery commands to update AGENTS.md, CLAUDE.md, and llms.txt

-**Semantic search** - Find content by meaning, not just keywords, using vector embeddings.
+**Semantic search** - Find content by meaning, not just keywords.
+
+**Ask AI** - Chat with your site content. Get answers with sources.