fix: disable AI service links due to Netlify edge function issues

- Remove /api/raw Netlify Function that caused build failures
- Comment out ChatGPT/Claude/Perplexity buttons in CopyPageDropdown
- Keep Copy page, View as Markdown, Download as SKILL.md options
- Update blog post with detailed log of attempted solutions
- Clean up netlify.toml by removing broken redirect rule

Users can still copy markdown and paste into AI tools manually. The raw markdown files work in browsers but AI crawlers cannot fetch them reliably due to Netlify edge function interception.
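
Note on the user-agent bypass logged in the blog post (Attempt 3): the post calls `isAICrawler(userAgent)` but never shows its body. Below is a minimal sketch of what such a helper might look like, assuming a case-insensitive substring match against the same token list. The matching strategy and the null handling are assumptions for illustration, not the code that shipped in `botMeta.ts`.

```typescript
// Hypothetical helper: the body below is an assumption, only the name
// isAICrawler and the token list appear in the blog post.
const AI_CRAWLERS: readonly string[] = [
  "gptbot", "chatgpt", "chatgpt-user", "oai-searchbot",
  "claude-web", "claudebot", "anthropic", "perplexitybot",
];

// Returns true when the User-Agent header contains any known AI crawler token.
// A missing or empty header is treated as "not an AI crawler".
function isAICrawler(userAgent: string | null | undefined): boolean {
  if (!userAgent) return false;
  const ua = userAgent.toLowerCase();
  return AI_CRAWLERS.some((token) => ua.includes(token));
}
```

In an edge function this would typically be fed `request.headers.get("user-agent")`, which returns `null` when the header is absent. A substring match is deliberately loose, so it will also match any other agent string that happens to contain one of these tokens.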
2026-01-12 04:09:14 +00:00 · 2025-12-24 01:44:00 -08:00
parent 534f020999
commit e2eaa9c43b
4 changed files with 139 additions and 296 deletions
--- a/content/blog/netlify-edge-excludedpath-ai-crawlers.md
+++ b/content/blog/netlify-edge-excludedpath-ai-crawlers.md
@@ -5,7 +5,7 @@ date: "2025-12-21"
 slug: "netlify-edge-excludedpath-ai-crawlers"
 published: true
 tags: ["netlify", "edge-functions", "ai", "troubleshooting", "help"]
-readTime: "4 min read"
+readTime: "5 min read"
 featured: false
 ---
@@ -36,78 +36,157 @@ The page could not be loaded with the tools currently available, so its raw mark
 **Claude:**
 Works. Loads and reads the markdown successfully.
-## Current configuration
+## Attempted solutions log
-Static files exist in `public/raw/` and are served via `_redirects`:
+### December 24, 2025
-```
+**Attempt 1: excludedPath in netlify.toml**
 /raw/*         /raw/:splat    200
 ```
-Edge function configuration in `netlify.toml`:
+Added array of excluded paths to the edge function declaration:
 ```toml
 [[edge_functions]]
  path = "/*"
  function = "botMeta"
-  excludedPath = "/raw/*"
+  excludedPath = [
    "/raw/*",
    "/assets/*",
    "/api/*",
    "/.netlify/*",
    "/favicon.ico",
    "/favicon.svg",
    "/robots.txt",
    "/sitemap.xml",
    "/llms.txt",
    "/openapi.yaml"
  ]
 ```
-The `botMeta` function also has a code-level check:
+Result: ChatGPT and Perplexity still blocked.
 **Attempt 2: Hard bypass in botMeta.ts**
 Added early return at top of handler to guarantee static markdown is never intercepted:
 ```typescript
-// Skip if it's the home page, static assets, API routes, or raw markdown files
+const url = new URL(request.url);
 if (
-  pathParts.length === 0 ||
+  url.pathname.startsWith("/raw/") ||
-  pathParts[0].includes(".") ||
+  url.pathname.startsWith("/assets/") ||
-  pathParts[0] === "api" ||
+  url.pathname.startsWith("/api/") ||
-  pathParts[0] === "_next" ||
+  url.pathname.startsWith("/.netlify/") ||
-  pathParts[0] === "raw" // This check exists
+  url.pathname.endsWith(".md")
 ) {
  return context.next();
 }
 ```
-## Why it's not working
+Result: ChatGPT and Perplexity still blocked.
-Despite `excludedPath = "/raw/*"` and the code check, the edge function still intercepts requests to `/raw/*.md` before static files are served.
+**Attempt 3: AI crawler whitelist**
-According to Netlify docs, edge functions run before redirects and static file serving. The `excludedPath` should prevent the function from running, but it appears the function still executes and may be returning a response that blocks static file access.
+Added explicit bypass for known AI user agents:
 ## What we've tried
 1. Added `excludedPath = "/raw/*"` in netlify.toml
 2. Added code-level check in botMeta.ts to skip `/raw/` paths
 3. Verified static files exist in `public/raw/` after build
 4. Confirmed `_redirects` rule for `/raw/*` is in place
 5. Tested with different URLPattern syntax (`/raw/*`, `/**/*.md`)
 All attempts result in the same behavior: ChatGPT and Perplexity cannot access the files, while Claude can.
 ## Why Claude works
 Claude's web fetcher may use different headers or handle Netlify's edge function responses differently. It successfully bypasses whatever is blocking ChatGPT and Perplexity.
 ## The question
 How can we configure Netlify edge functions to truly exclude `/raw/*` paths so static markdown files are served directly to all AI crawlers without interception?
 Is there a configuration issue with `excludedPath`? Should we use a different approach like header-based matching to exclude AI crawlers from the botMeta function? Or is there a processing order issue where edge functions always run before static files regardless of exclusions?
 ## Code reference
 The CopyPageDropdown component sends these URLs to AI services:
 ```typescript
-const rawMarkdownUrl = `${origin}/raw/${props.slug}.md`;
+const AI_CRAWLERS = [
  "gptbot", "chatgpt", "chatgpt-user", "oai-searchbot",
  "claude-web", "claudebot", "anthropic", "perplexitybot"
 ];
 if (isAICrawler(userAgent)) {
  return context.next();
 }
 ```
-Example: `https://www.markdown.fast/raw/fork-configuration-guide.md`
+Result: ChatGPT and Perplexity still blocked.
-The files exist. The redirects are configured. The edge function has exclusions. But AI crawlers still cannot access them.
+**Attempt 4: Netlify Function at /api/raw/:slug**
 Created a serverless function to serve markdown files directly:
 ```javascript
 // netlify/functions/raw.js
 exports.handler = async (event) => {
  const slug = event.queryStringParameters?.slug;
  // Read from dist/raw/${slug}.md or public/raw/${slug}.md
  return {
    statusCode: 200,
    headers: { "Content-Type": "text/plain; charset=utf-8" },
    body: markdownContent
  };
 };
 ```
 With redirect rule:
 ```toml
 [[redirects]]
  from = "/api/raw/*"
  to = "/.netlify/functions/raw?slug=:splat"
  status = 200
  force = true
 ```
 Result: Netlify build failures due to function bundling issues and `package-lock.json` dependency conflicts.
 **Attempt 5: Header adjustments**
 Removed `Link` header from global scope to prevent header merging on `/raw/*`:
 ```toml
 [[headers]]
  for = "/*"
  [headers.values]
    X-Frame-Options = "DENY"
    # Link header removed from global scope
 [[headers]]
  for = "/index.html"
  [headers.values]
    Link = "</llms.txt>; rel=\"author\""
 ```
 Removed `X-Robots-Tag = "noindex"` from `/raw/*` headers.
 Result: ChatGPT and Perplexity still blocked.
 ### Why these attempts failed
 The core issue appears to be how ChatGPT and Perplexity fetch URLs. Their tools receive 400 or 403 responses even when `curl` from the command line works. This suggests:
 1. Netlify may handle AI crawler user agents differently at the CDN level
 2. The edge function exclusions work for browsers but not for AI fetch tools
 3. There may be rate limiting or bot protection enabled by default
 ## Current workaround
 Users can still share content with AI tools by:
 1. **Copy page** copies markdown to clipboard, then paste into any AI
 2. **View as Markdown** opens the raw `.md` file in a browser tab for manual copying
 3. **Download as SKILL.md** downloads in Anthropic Agent Skills format
 The direct "Open in ChatGPT/Claude/Perplexity" buttons have been disabled since the URLs don't work reliably.
 ## Working features
 Despite AI crawler issues, these features work correctly:
 - `/raw/*.md` files load in browsers
 - `llms.txt` discovery file is accessible
 - `openapi.yaml` API spec loads properly
 - Sitemap and RSS feeds generate correctly
 - Social preview bots (Twitter, Facebook, LinkedIn) receive OG metadata
 - Claude's web fetcher can access raw markdown
 ## Help needed
-If you've solved this or have suggestions, we'd appreciate guidance. The goal is simple: serve static markdown files at `/raw/*.md` to all clients, including AI crawlers, without edge function interception.
+If you've solved this or have suggestions, open an issue. We've tried:
-GitHub raw URLs work as a workaround, but we'd prefer to use Netlify-hosted files for consistency and to avoid requiring users to configure GitHub repo details when forking.
+- netlify.toml excludedPath arrays
 - Code-level path checks in edge functions
 - AI crawler user agent whitelisting
 - Netlify Functions as an alternative endpoint
 - Header configuration adjustments
 None have worked for ChatGPT or Perplexity. GitHub raw URLs remain the most reliable option for AI consumption, but require additional repository configuration when forking.
--- a/netlify.toml
+++ b/netlify.toml
@@ -5,13 +5,6 @@
 [build.environment]
  NODE_VERSION = "20"
 # API raw markdown endpoint for AI tools (ChatGPT, Claude, Perplexity)
 [[redirects]]
  from = "/api/raw/*"
  to = "/.netlify/functions/raw?slug=:splat"
  status = 200
  force = true
 # Raw markdown passthrough - explicit rule prevents SPA fallback from intercepting
 [[redirects]]
  from = "/raw/*"
--- a/netlify/functions/raw.js
+++ b/netlify/functions/raw.js
@@ -1,77 +0,0 @@
 const fs = require("fs");
 const path = require("path");
 /**
 * Netlify Function: /api/raw/:slug
 *
 * Serves raw markdown files for AI tools (ChatGPT, Claude, Perplexity).
 * Returns text/plain with minimal headers for reliable AI ingestion.
 */
 function normalizeSlug(input) {
  return (input || "").trim().replace(/^\/+|\/+$/g, "");
 }
 function tryRead(p) {
  try {
    if (!fs.existsSync(p)) return null;
    const body = fs.readFileSync(p, "utf8");
    if (!body || body.trim().length === 0) return null;
    return body;
  } catch {
    return null;
  }
 }
 exports.handler = async (event) => {
  const slugRaw =
    event.queryStringParameters && event.queryStringParameters.slug;
  const slug = normalizeSlug(slugRaw);
  if (!slug) {
    return {
      statusCode: 400,
      headers: {
        "Content-Type": "text/plain; charset=utf-8",
        "Access-Control-Allow-Origin": "*",
      },
      body: "missing slug",
    };
  }
  const filename = slug.endsWith(".md") ? slug : `${slug}.md`;
  const root = process.cwd();
  const candidates = [
    path.join(root, "public", "raw", filename),
    path.join(root, "dist", "raw", filename),
  ];
  let body = null;
  for (const p of candidates) {
    body = tryRead(p);
    if (body) break;
  }
  if (!body) {
    return {
      statusCode: 404,
      headers: {
        "Content-Type": "text/plain; charset=utf-8",
        "Access-Control-Allow-Origin": "*",
      },
      body: `not found: ${filename}`,
    };
  }
  return {
    statusCode: 200,
    headers: {
      "Content-Type": "text/plain; charset=utf-8",
      "Access-Control-Allow-Origin": "*",
      "Cache-Control": "public, max-age=3600",
    },
    body,
  };
 };
--- a/src/components/CopyPageDropdown.tsx
+++ b/src/components/CopyPageDropdown.tsx
@@ -1,85 +1,9 @@
 import { useState, useRef, useEffect, useCallback } from "react";
-import {
+import { Copy, Check, AlertCircle, FileText, Download } from "lucide-react";
  Copy,
  MessageSquare,
  Sparkles,
  Search,
  Check,
  AlertCircle,
  FileText,
  Download,
 } from "lucide-react";
 // Maximum URL length for query parameters (conservative limit)
 const MAX_URL_LENGTH = 6000;
 // AI service configurations
 interface AIService {
  id: string;
  name: string;
  icon: typeof Copy;
  baseUrl: string;
  description: string;
  supportsUrlPrefill: boolean;
  // Custom URL builder for services with special formats
  buildUrl?: (prompt: string) => string;
  // URL-based builder - takes raw markdown file URL for better AI parsing
  buildUrlFromRawMarkdown?: (rawMarkdownUrl: string) => string;
 }
 // AI services configuration - uses raw markdown URLs for better AI parsing
 const AI_SERVICES: AIService[] = [
  {
    id: "chatgpt",
    name: "ChatGPT",
    icon: MessageSquare,
    baseUrl: "https://chatgpt.com/",
    description: "Analyze with ChatGPT",
    supportsUrlPrefill: true,
    // Uses raw markdown file URL for direct content access
    buildUrlFromRawMarkdown: (rawMarkdownUrl) => {
      const prompt =
        `Attempt to load and read the raw markdown at the URL below.\n` +
        `If successful provide a concise summary and then ask what the user needs help with.\n` +
        `If not accessible do not guess the content. State that the page could not be loaded and ask the user how you can help.\n\n` +
        `${rawMarkdownUrl}`;
      return `https://chatgpt.com/?q=${encodeURIComponent(prompt)}`;
    },
  },
  {
    id: "claude",
    name: "Claude",
    icon: Sparkles,
    baseUrl: "https://claude.ai/",
    description: "Analyze with Claude",
    supportsUrlPrefill: true,
    buildUrlFromRawMarkdown: (rawMarkdownUrl) => {
      const prompt =
        `Attempt to load and read the raw markdown at the URL below.\n` +
        `If successful provide a concise summary and then ask what the user needs help with.\n` +
        `If not accessible do not guess the content. State that the page could not be loaded and ask the user how you can help.\n\n` +
        `${rawMarkdownUrl}`;
      return `https://claude.ai/new?q=${encodeURIComponent(prompt)}`;
    },
  },
  {
    id: "perplexity",
    name: "Perplexity",
    icon: Search,
    baseUrl: "https://www.perplexity.ai/search",
    description: "Research with Perplexity",
    supportsUrlPrefill: true,
    buildUrlFromRawMarkdown: (rawMarkdownUrl) => {
      const prompt =
        `Attempt to load and read the raw markdown at the URL below.\n` +
        `If successful provide a concise summary and then ask what the user needs help with.\n` +
        `If not accessible do not guess the content. State that the page could not be loaded and ask the user how you can help.\n\n` +
        `${rawMarkdownUrl}`;
      return `https://www.perplexity.ai/search?q=${encodeURIComponent(prompt)}`;
    },
  },
 ];
 // Extended props interface with optional metadata
 interface CopyPageDropdownProps {
  title: string;
@@ -321,67 +245,6 @@ export default function CopyPageDropdown(props: CopyPageDropdownProps) {
    setTimeout(() => setIsOpen(false), 1500);
  };
  // Generic handler for opening AI services
  // Uses /api/raw/:slug endpoint for AI tools (ChatGPT, Claude, Perplexity)
  // IMPORTANT: window.open must happen BEFORE any await to avoid popup blockers
  const handleOpenInAI = async (service: AIService) => {
    // Use /api/raw/:slug endpoint for AI tools - more reliable than static /raw/*.md files
    if (service.buildUrlFromRawMarkdown) {
      // Build absolute API URL using current origin
      // Uses Netlify Function endpoint that returns text/plain with minimal headers
      const apiRawUrl = new URL(
        `/api/raw/${props.slug}`,
        window.location.origin,
      ).toString();
      const targetUrl = service.buildUrlFromRawMarkdown(apiRawUrl);
      window.open(targetUrl, "_blank");
      setIsOpen(false);
      return;
    }
    // Other services: send full markdown content
    const markdown = formatAsMarkdown(props);
    const prompt = `Please analyze this article:\n\n${markdown}`;
    // Build the target URL using the service's buildUrl function
    if (!service.buildUrl) {
      // Fallback: open base URL FIRST (sync), then copy to clipboard
      window.open(service.baseUrl, "_blank");
      const success = await writeToClipboard(markdown);
      if (success) {
        setFeedback("url-too-long");
        setFeedbackMessage("Copied! Paste in " + service.name);
      } else {
        setFeedback("error");
        setFeedbackMessage("Failed to copy content");
      }
      clearFeedback();
      return;
    }
    const targetUrl = service.buildUrl(prompt);
    // Check URL length - if too long, open base URL then copy to clipboard
    if (isUrlTooLong(targetUrl)) {
      // Open window FIRST (must be sync to avoid popup blocker)
      window.open(service.baseUrl, "_blank");
      const success = await writeToClipboard(markdown);
      if (success) {
        setFeedback("url-too-long");
        setFeedbackMessage("Copied! Paste in " + service.name);
      } else {
        setFeedback("error");
        setFeedbackMessage("Failed to copy content");
      }
      clearFeedback();
    } else {
      // URL is within limits, open directly with prefilled content
      window.open(targetUrl, "_blank");
      setIsOpen(false);
    }
  };
  // Handle download skill file (Anthropic Agent Skills format)
  const handleDownloadSkill = () => {
    const skillContent = formatAsSkill(props);
@@ -423,6 +286,10 @@ export default function CopyPageDropdown(props: CopyPageDropdownProps) {
    }
  };
  // Suppress unused variable warnings for functions that may be used later
  void isUrlTooLong;
  void MAX_URL_LENGTH;
  return (
    <div className="copy-page-dropdown" ref={dropdownRef}>
      {/* Trigger button with ARIA attributes */}
@@ -484,33 +351,6 @@ export default function CopyPageDropdown(props: CopyPageDropdownProps) {
            </div>
          </button>
          {/* AI service options */}
          {AI_SERVICES.map((service) => {
            const Icon = service.icon;
            return (
              <button
                key={service.id}
                className="copy-page-item"
                onClick={() => handleOpenInAI(service)}
                role="menuitem"
                tabIndex={0}
              >
                <Icon size={16} className="copy-page-icon" aria-hidden="true" />
                <div className="copy-page-item-content">
                  <span className="copy-page-item-title">
                    Open in {service.name}
                    <span className="external-arrow" aria-hidden="true">
                      ↗
                    </span>
                  </span>
                  <span className="copy-page-item-desc">
                    {service.description}
                  </span>
                </div>
              </button>
            );
          })}
          {/* View as Markdown option */}
          <button
            className="copy-page-item"
@@ -553,6 +393,14 @@ export default function CopyPageDropdown(props: CopyPageDropdownProps) {
              </span>
            </div>
          </button>
          {/* AI service options temporarily disabled
           * ChatGPT, Claude, and Perplexity links were removed because
           * Netlify edge functions block AI crawler fetch requests to /raw/*.md
           * despite multiple configuration attempts. See blog post:
           * /netlify-edge-excludedpath-ai-crawlers for details.
           * Users can still copy markdown and paste into AI tools.
           */}
        </div>
      )}
    </div>