Crawl Issues — Redirects, Depth & Canonicals
19 Intentional Issues
This page demonstrates redirect, crawl depth, canonical, and link-related SEO issues. Broken internal links and redirect chains are intentionally present.
Redirect Configuration & Loop Detection
Issue #8: Redirect types (301/302) not correctly configured
Issue #11: Redirect chains and loops present, which hurt SEO performance
Issue #220: Moved pages should return 301 for at least 1 year, not instant 404
This site uses 302 (temporary) redirects where 301 (permanent) redirects should be used. It also contains redirect chains and potential redirect loops (e.g., A → B → A or A → A) that waste crawl budget and cause browser errors.
Redirect Loop Checker Tool:
Run locally: node redirect-tracker.js --test
Tracked redirect loop examples detected programmatically: A → B → A, A → A.
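The source of redirect-tracker.js is not shown on this page, so the following is only a minimal Python sketch of the kind of check described above, not the tool's actual logic. It assumes redirects are available as a simple mapping from source to target; the function name trace_redirects and the max_hops limit are hypothetical.

```python
def trace_redirects(start, redirects, max_hops=10):
    """Follow a redirect mapping from `start` and classify the result.

    `redirects` maps each redirecting URL to its target (an assumed input
    format). Returns (path, status), where status is "ok", "chain", "loop",
    or "too_many_hops".
    """
    path = [start]
    seen = {start}
    current = start
    while current in redirects:
        current = redirects[current]
        path.append(current)
        if current in seen:
            # Revisiting a URL means the redirects cycle (A -> B -> A or A -> A).
            return path, "loop"
        if len(path) - 1 >= max_hops:
            return path, "too_many_hops"
        seen.add(current)
    # More than one hop before reaching a final page counts as a chain.
    status = "chain" if len(path) > 2 else "ok"
    return path, status
```

With this sketch, both loop examples listed above are reported as "loop": trace_redirects("A", {"A": "B", "B": "A"}) and trace_redirects("A", {"A": "A"}).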
www vs non-www
Issue #10: www vs non-www redirects not correctly set up
Both www.example.com and example.com serve content without a canonical redirect, creating duplicate content.
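One way to picture the missing redirect is a small function that computes the 301 target for a request. This is a sketch only: it assumes the non-www host is the preferred one (the page does not say which variant should win), and the name canonical_redirect is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_redirect(url, canonical_host="example.com"):
    """Return the URL a 301 should point to, or None if no redirect is needed.

    Assumption: the non-www host is canonical. Only the exact "www." variant
    of `canonical_host` is redirected; any other host is left alone.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host != "www." + canonical_host:
        return None
    # Keep scheme, path, query, and fragment; swap only the host.
    return urlunsplit(
        (parts.scheme, canonical_host, parts.path, parts.query, parts.fragment)
    )
```

Serving a permanent redirect to the returned URL would leave a single indexable host instead of two copies of every page.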
Crawl Depth
Issue #14: Pages more than 3 clicks from homepage, too deep for crawlers
Some pages on this site are buried more than 3 clicks from the homepage, making them harder for search engine crawlers to discover within their crawl budget.
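Click depth can be measured with a breadth-first search over the internal link graph. The sketch below assumes the graph is already available as a dictionary mapping each page to the pages it links to; the function names and the "/" homepage path are illustrative assumptions.

```python
from collections import deque


def click_depths(links, home="/"):
    """Breadth-first search: minimum number of clicks from `home` to each page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def too_deep(links, home="/", max_clicks=3):
    """Pages reachable from `home` only after more than `max_clicks` clicks."""
    depths = click_depths(links, home)
    return sorted(page for page, depth in depths.items() if depth > max_clicks)
```

Pages that never appear in the result of click_depths are not reachable from the homepage at all, which is a separate problem from depth.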
Orphan Pages
Issue #17: Pages with no internal links pointing to them
Several pages on this site have zero inbound internal links, making them invisible to search engine crawlers following links.
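Orphans can be found by comparing a full page inventory against the set of link targets. In this sketch the inventory (all_pages) is assumed to come from somewhere other than link crawling, for example a sitemap or a server file listing; the function name and inputs are hypothetical.

```python
def find_orphans(all_pages, links, home="/"):
    """Return pages that no other internal page links to.

    `links` maps each page to the pages it links to. Self-links are ignored
    because a page linking to itself does not help a crawler discover it.
    The homepage is excluded since it is the crawl entry point.
    """
    linked = {
        target
        for source, targets in links.items()
        for target in targets
        if target != source
    }
    return sorted(page for page in all_pages if page not in linked and page != home)
```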
External Link Security
Issue #50: External links missing rel=noopener noreferrer
External link without noopener noreferrer (intentional) →
Canonical Tag Issues
Issue #172: Canonical URL does not match URL in XML sitemap
Issue #175: Canonical page blocked by robots.txt
SEO Issue #172 — Canonical vs Sitemap Mismatch:
This page canonical: https://acmeanalytics.example.com/crawl-issues
Sitemap URL: http://acmeanalytics.example.com/crawl-issues
Protocol mismatch (https vs http): the canonical tag and the sitemap send conflicting signals, so Google may not honor the declared canonical
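A mismatch like this can be caught by comparing the two URLs component by component. The following is a minimal sketch; the function name canonical_mismatch is hypothetical, and hosts are compared case-insensitively while paths and queries are compared exactly.

```python
from urllib.parse import urlsplit


def canonical_mismatch(canonical, sitemap_url):
    """List the URL components that differ between canonical and sitemap URL."""
    a, b = urlsplit(canonical), urlsplit(sitemap_url)
    diffs = []
    if a.scheme != b.scheme:
        diffs.append("scheme")
    if a.netloc.lower() != b.netloc.lower():
        diffs.append("host")
    if a.path != b.path:
        diffs.append("path")
    if a.query != b.query:
        diffs.append("query")
    return diffs
```

For the two URLs listed above, the only differing component is the scheme.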
SEO Issue #175 — Canonical Blocked by robots.txt:
Canonical href: https://acmeanalytics.example.com/blocked-canonical/about
robots.txt rule: Disallow: /blocked-canonical/
Google can see the canonical tag but cannot fetch or validate the canonical URL
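Whether a canonical target is blocked can be tested with Python's standard urllib.robotparser module. This sketch assumes the Disallow rule sits under a "User-agent: *" group (the page shows only the rule itself) and uses "Googlebot" as an illustrative user agent; the helper name is_blocked is hypothetical.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt, url, user_agent="Googlebot"):
    """Return True if `robots_txt` disallows `user_agent` from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Assumed robots.txt content: the User-agent line is not shown on the page.
ROBOTS_TXT = "User-agent: *\nDisallow: /blocked-canonical/\n"
```

Calling is_blocked(ROBOTS_TXT, "https://acmeanalytics.example.com/blocked-canonical/about") reports the canonical href above as blocked.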
Image & Video Sitemaps
Issue #157: No image sitemap entries in XML sitemap
Issue #180: No video sitemap with required video:video entries
The sitemap declares image: and video: namespaces but contains ZERO <image:image> or <video:video> entries, reducing image/video discovery by Google.
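Counting the entries is a quick way to confirm this condition. The sketch below uses the standard library XML parser and assumes the sitemap uses Google's image and video sitemap namespace URIs; the function name count_media_entries is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by Google's image and video sitemap extensions.
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"
VIDEO_NS = "http://www.google.com/schemas/sitemap-video/1.1"


def count_media_entries(sitemap_xml):
    """Return (image_count, video_count) for a sitemap given as a string."""
    root = ET.fromstring(sitemap_xml)
    images = len(root.findall(f".//{{{IMAGE_NS}}}image"))
    videos = len(root.findall(f".//{{{VIDEO_NS}}}video"))
    return images, videos
```

A result of (0, 0) for a sitemap that declares both namespaces corresponds to the situation described above.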
AI Discoverability Files Missing
Issue #186: /llms.txt file missing, so AI crawlers cannot discover site content manifest
Issue #199: /llms-full.txt file missing, so AI systems cannot access full content
Files intentionally absent (each returns 404):
GET /llms.txt → 404 Not Found (Issue #186)
GET /llms-full.txt → 404 Not Found (Issue #199)
llms.txt (a proposed convention, not yet a formal standard) tells LLMs what site content is available for citation. llms-full.txt provides extended full-content access for AI ingestion. Neither file exists on this site; this is an intentional test case.
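The two requests above can be reproduced with a short script. This is a sketch under assumptions: the helper names are hypothetical, the status lookup is injectable so the logic can be exercised without network access, and any non-200 status is treated as missing.

```python
import urllib.error
import urllib.request


def http_status(url):
    """Issue a GET request and return the HTTP status code."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx responses; the code is on the exception.
        return err.code


def missing_ai_files(base_url, get_status=http_status,
                     paths=("/llms.txt", "/llms-full.txt")):
    """Return the paths under `base_url` that do not answer with 200."""
    base = base_url.rstrip("/")
    return [path for path in paths if get_status(base + path) != 200]
```

Against this site, both paths would be expected in the result, since each request returns 404.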
Page Indexability Audit
Issue #227: Critical pages not confirmed indexable via audit
No indexability audit has been performed. Some pages may inadvertently carry a noindex directive or be blocked in robots.txt.
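One piece of such an audit is detecting noindex directives. The sketch below checks the robots meta tag and an optional X-Robots-Tag header value using only the standard library; it does not cover robots.txt blocking, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() in ("robots", "googlebot"):
            content = attr.get("content") or ""
            self.directives += [d.strip().lower() for d in content.split(",")]


def has_noindex(html, x_robots_tag=""):
    """Return True if the page HTML or X-Robots-Tag header asks not to be indexed."""
    parser = RobotsMetaParser()
    parser.feed(html)
    header = [d.strip().lower() for d in x_robots_tag.split(",")]
    # "none" is shorthand for "noindex, nofollow".
    blocking = {"noindex", "none"}
    return bool(blocking & set(parser.directives + header))
```

Running has_noindex over the HTML and response headers of each critical page would give a first pass at confirming indexability.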