Homepage -- Crawl Behaviour Issues
5 Intentional Issues
This page demonstrates intentional crawl-behaviour SEO issues. The robots.txt is misconfigured, the sitemap is absent, and AI bot directives are missing.
XML Sitemap Status
Issue #3: XML sitemap missing at /sitemap.xml
This site does not publish a sitemap at /sitemap.xml. Search engines can discover pages only by following links, which wastes crawl budget and can leave poorly linked pages undiscovered.
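As a rough illustration of what the missing file could contain, the sketch below builds a minimal sitemap with Python's standard library. The page list and the example.com host are placeholders, not this site's real URLs, and the function name build_sitemap is made up for the example.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap.xml document (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical page list; example.com stands in for the real host.
pages = ["https://example.com/", "https://example.com/about"]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

Serving the resulting string at /sitemap.xml would be one way to resolve Issue #3; optional fields such as lastmod are left out to keep the sketch short.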
robots.txt Configuration
Issue #5: robots.txt misconfigured
Issue #164: No Sitemap: directive in robots.txt
The /robots.txt file is intentionally broken: the Sitemap: directive is missing, and AI bots (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, CCBot) have no explicit rules.
AI Bot Directives
Issue #187: GPTBot not declared in robots.txt
Issue #188: ClaudeBot not declared in robots.txt
Issue #189: PerplexityBot not declared in robots.txt
Issue #190: Google-Extended not declared in robots.txt
Issue #191: CCBot not declared in robots.txt
- GPTBot (OpenAI) -- no rule set
- ClaudeBot (Anthropic) -- no rule set
- PerplexityBot -- no rule set
- Google-Extended -- no rule set
- CCBot (Common Crawl) -- no rule set
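A scanner could detect these missing declarations along the following lines. This is only a sketch: the sample robots.txt body is invented for illustration, the helper names are hypothetical, and the parsing is deliberately simple (it only collects User-agent lines and does not evaluate rules).

```python
# The five AI crawler tokens listed on this page.
AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended", "CCBot"]


def declared_user_agents(robots_txt):
    """Collect every User-agent token named in a robots.txt body (lowercased)."""
    agents = set()
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "user-agent":
            agents.add(value.strip().lower())
    return agents


def undeclared_ai_bots(robots_txt):
    """Return the AI bots that have no explicit User-agent group."""
    declared = declared_user_agents(robots_txt)
    return [bot for bot in AI_BOTS if bot.lower() not in declared]


# Invented sample: only GPTBot has an explicit group.
sample = """\
User-agent: *
Disallow: /private/

User-agent: GPTBot
Disallow: /
"""
missing = undeclared_ai_bots(sample)
print(missing)  # ['ClaudeBot', 'PerplexityBot', 'Google-Extended', 'CCBot']
```

Matching is case-insensitive because crawlers generally compare user-agent tokens without regard to case.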
AI Content Manifests
Issue #186: llms.txt missing at /llms.txt
Issue #199: llms-full.txt missing
Neither /llms.txt nor /llms-full.txt is published. AI systems that look for these manifests cannot discover what content is available for citation.
Crawl-delay Directive Missing
Issue #165: No Crawl-delay directive -- server has no rate-limiting protection for crawlers
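The robots.txt file has no Crawl-delay directive. Crawl-delay is a non-standard, advisory directive: some crawlers (Bingbot, for example) honour it, while Googlebot ignores it. Without it, aggressive crawlers that would otherwise respect the directive can hammer the server continuously, exhausting crawl budget and degrading server performance.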
Missing from robots.txt:
Crawl-delay: 10
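One way to test for this directive is Python's urllib.robotparser, which exposes the parsed value through crawl_delay(). The sketch below compares two invented robots.txt bodies, one without the directive and one with it; the helper name crawl_delay_for is made up for the example.

```python
from urllib.robotparser import RobotFileParser


def crawl_delay_for(robots_txt, user_agent="*"):
    """Return the Crawl-delay that applies to user_agent, or None if unset."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


# Invented robots.txt bodies, before and after adding the directive.
without_delay = "User-agent: *\nDisallow:\n"
with_delay = "User-agent: *\nCrawl-delay: 10\nDisallow:\n"

print(crawl_delay_for(without_delay))  # None
print(crawl_delay_for(with_delay))     # 10
```

A scanner could raise Issue #165 whenever the function returns None for the user agents it cares about.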
Search Results Pages Not Blocked
Issue #166: Search results pages (/?s=) not blocked in robots.txt -- wastes crawl budget
Internal search result pages at /?s=query are fully crawlable. The Disallow: /?s= rule is absent from robots.txt, so search engines can crawl, and potentially index, low-quality duplicate search result pages, which wastes crawl budget.
Should be in robots.txt but is missing:
Disallow: /?s=
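The effect of the missing rule can be checked with urllib.robotparser from Python's standard library. In this sketch both robots.txt bodies and the example.com URLs are invented, and is_crawlable is a hypothetical helper name; the first body omits the rule and the second includes it.

```python
from urllib.robotparser import RobotFileParser


def is_crawlable(robots_txt, url, user_agent="*"):
    """Report whether user_agent may fetch url under the given robots.txt body."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Invented robots.txt bodies: the first lacks the rule, the second adds it.
current_rules = "User-agent: *\nDisallow:\n"
fixed_rules = "User-agent: *\nDisallow: /?s=\n"

# example.com stands in for the real host.
search_url = "https://example.com/?s=query"

print(is_crawlable(current_rules, search_url))
print(is_crawlable(fixed_rules, search_url))
```

With the rule in place, the search URL should be reported as blocked while ordinary pages such as /about remain crawlable. Individual crawlers implement their own matching, so this reflects the standard-library parser's interpretation only.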
SEO Issues Test Site
Internal developer testing pages — each page contains deliberately planted SEO issues for scanner validation.