⚠️ DEV TOOL: This website intentionally contains SEO issues for testing. Not for public use.

Homepage -- Crawl Behaviour Issues

5 Intentional Issues

This page demonstrates intentional crawl-behaviour SEO issues. The robots.txt is misconfigured, the sitemap is absent, and AI bot directives are missing.

XML Sitemap Status
Issue #3: XML sitemap missing at /sitemap.xml

This site does not have a sitemap.xml. Search engines must discover pages by following links alone, which slows discovery and spends crawl budget on finding URLs.
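For contrast with the missing file, here is a minimal sketch of how a sitemap could be generated. The host `https://example.com` and the helper name `build_sitemap` are assumptions for illustration; the two paths are taken from this site's page list.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(base_url, paths):
    """Return sitemap XML as a string, with one <url><loc> entry per path."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for path in paths:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = base_url.rstrip("/") + path
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


sitemap_xml = build_sitemap(
    "https://example.com",  # hypothetical host, not stated on this page
    ["/seo-html/index.html", "/seo-html/indexable-page.html"],
)
print(sitemap_xml)
```

A real sitemap would list only canonical, indexable URLs; which pages qualify here is left open.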

robots.txt Configuration
Issue #5: robots.txt misconfigured
Issue #164: No Sitemap: directive in robots.txt

The /robots.txt file is intentionally broken: the Sitemap: directive is missing, and AI bots (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, CCBot) have no explicit rules.


AI Bot Directives
Issue #187: GPTBot not declared in robots.txt
Issue #188: ClaudeBot not declared in robots.txt
Issue #189: PerplexityBot not declared in robots.txt
Issue #190: Google-Extended not declared in robots.txt
Issue #191: CCBot not declared in robots.txt

  • GPTBot (OpenAI) -- no rule set
  • ClaudeBot (Anthropic) -- no rule set
  • PerplexityBot -- no rule set
  • Google-Extended -- no rule set
  • CCBot (Common Crawl) -- no rule set
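A scanner could detect the undeclared bots and the missing Sitemap: directive roughly as follows. This is a minimal sketch, not the scanner's actual logic: `audit_robots` is a hypothetical helper, and the `broken` string is an assumed stand-in because the real file contents are not shown on this page.

```python
AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended", "CCBot"]


def audit_robots(robots_txt):
    """Report AI bots with no User-agent line and whether a Sitemap: line exists."""
    declared = set()
    has_sitemap = False
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            declared.add(value.lower())
        elif field == "sitemap" and value:
            has_sitemap = True
    missing = [bot for bot in AI_BOTS if bot.lower() not in declared]
    return {"missing_ai_bots": missing, "has_sitemap": has_sitemap}


# Assumed example of a broken file (hypothetical contents).
broken = "User-agent: *\nDisallow: /seo-html/private/\n"
report = audit_robots(broken)
print(report)
```

Matching on the User-agent line alone only shows that a bot is named; whether the rules under it allow or block the bot is a separate check.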

AI Content Manifests
Issue #186: llms.txt missing at /llms.txt
Issue #199: llms-full.txt missing

Neither /llms.txt nor /llms-full.txt is published. AI systems that look for these files (a proposed convention, not a formal standard) get no curated summary of the content available for citation.

Crawl-delay Directive Missing
Issue #165: No Crawl-delay directive; server has no rate-limiting protection for crawlers

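The robots.txt file has no Crawl-delay directive. Crawl-delay is a non-standard, advisory hint: some crawlers (Bingbot, for example) honour it, while Googlebot ignores it. Without it, crawlers that would respect the hint receive no request to slow down, so an aggressive crawler can hit the server continuously and degrade its performance.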

Missing from robots.txt:

Crawl-delay: 10
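The presence of this directive can be checked with Python's standard `urllib.robotparser`, whose `crawl_delay()` returns `None` when no delay applies to the given user agent. The sketch below assumes two hypothetical robots.txt bodies; `crawl_delay_for` is an illustrative helper name.

```python
from urllib.robotparser import RobotFileParser


def crawl_delay_for(robots_txt, user_agent="*"):
    """Return the Crawl-delay that applies to user_agent, or None if absent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse text directly, no network fetch
    return parser.crawl_delay(user_agent)


# Hypothetical file contents, for illustration only.
without_delay = "User-agent: *\nDisallow: /seo-html/private/\n"
with_delay = "User-agent: *\nCrawl-delay: 10\nDisallow: /seo-html/private/\n"

print(crawl_delay_for(without_delay))
print(crawl_delay_for(with_delay))
```

Note that the directive only takes effect inside a User-agent group, which is why both sample bodies start with one.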

Search Results Pages Not Blocked
Issue #166: Search results pages (/?s=) not blocked in robots.txt; wastes crawl budget

Internal search result pages at /?s=query are fully crawlable. The Disallow: /?s= rule is absent from robots.txt, so search engines can crawl and index low-quality, duplicate search result pages, wasting crawl budget.

Should be in robots.txt but is missing:

Disallow: /?s=
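To illustrate the effect of the rule, here is a deliberately simplified prefix matcher. It is a sketch under stated assumptions: it ignores Allow rules and wildcard patterns, the helper names are hypothetical, and the sample file bodies are invented for the example.

```python
def disallowed_prefixes(robots_txt, user_agent="*"):
    """Collect Disallow values from groups that name user_agent (simplified)."""
    prefixes, agents, in_rules = [], [], False
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                agents, in_rules = [], False
            agents.append(value.lower())
        elif field in ("allow", "disallow", "crawl-delay"):
            in_rules = True
            if field == "disallow" and value and user_agent.lower() in agents:
                prefixes.append(value)
    return prefixes


def is_blocked(robots_txt, path):
    """True if path starts with any Disallow prefix from the '*' group."""
    return any(path.startswith(p) for p in disallowed_prefixes(robots_txt))


# Hypothetical file contents, for illustration only.
broken = "User-agent: *\nDisallow: /seo-html/private/\n"
fixed = broken + "Disallow: /?s=\n"

print(is_blocked(broken, "/?s=query"))
print(is_blocked(fixed, "/?s=query"))
```

A production check would follow the full matching rules of the Robots Exclusion Protocol, including Allow precedence and `*` and `$` patterns.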

SEO Issues Test Site

Internal developer testing pages — each page contains deliberately planted SEO issues for scanner validation.

  • /seo-html/index.html -- Test Site Home
  • /seo-html/issues-index.html -- Issues Index (Master List)
  • /seo-html/crawl-issues.html -- Crawl Behaviour Issues
  • /seo-html/head-tags-issues.html -- HTML Head Tag Issues
  • /seo-html/page-seo-basics.html -- Page SEO Basics Issues
  • /seo-html/image-issues.html -- Image Analysis Issues
  • /seo-html/security-headers-issues.html -- HTTP Security Issues
  • /seo-html/social-opengraph-issues.html -- Social & Open Graph Issues
  • /seo-html/structured-data-issues.html -- Structured Data Issues
  • /seo-html/international-seo-issues.html -- International SEO Issues
  • /seo-html/content-readability-issues.html -- Content & Readability Issues
  • /seo-html/careers.html -- Careers Page (JobPosting schema test)
  • /seo-html/location.html -- Location Page (LocalBusiness schema test)
  • /seo-html/events.html -- Events Page (Event + VideoObject schema test)
  • /seo-html/duplicate-elements-test.html -- Duplicate Elements Test Page
  • /seo-html/pagination-test.html -- Pagination Test Page
  • /seo-html/status-200.html -- HTTP 200 OK Test Page
  • /seo-html/status-301.html -- HTTP 301 Redirect Test Page
  • /seo-html/status-302.html -- HTTP 302 Redirect Test Page
  • /seo-html/status-404.html -- HTTP 404 Not Found Test Page
  • /seo-html/status-410.html -- HTTP 410 Gone Test Page
  • /seo-html/public/allowed-page.html -- Robots.txt Allowed Page
  • /seo-html/private/blocked-page.html -- Robots.txt Disallowed Page
  • /seo-html/noindex-page.html -- Robots Meta Noindex Page
  • /seo-html/indexable-page.html -- Robots Meta Indexable Page
  • /seo-html/canonical-blocked-test.html -- Canonical Blocked by Robots Page
  • /seo-html/page-a.html -- Canonical in Sitemap (page-a)
  • /seo-html/page-b.html -- Canonical NOT in Sitemap (page-b)
  • /seo-html/canonical-sitemap-match.html -- Canonical Sitemap Match
  • /seo-html/canonical-sitemap-mismatch.html -- Canonical Sitemap Mismatch