Non-Indexable Page Inlinks Only
This issue identifies indexable pages that are linked to only from non-indexable pages, which include noindex, canonicalised and robots.txt disallowed pages.
Noindex pages, and the links on them, will initially be crawled, but noindex pages will be removed from the index and crawled less often over time.
Links from these pages may also be crawled less and it has been debated by Googlers whether links will continue to be counted at all.
Links from canonicalised pages can be crawled initially, but PageRank may not flow as expected if indexing and link signals are passed to another page as indicated in the canonical. This may impact discovery and ranking.
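The three non-indexable cases described above can be illustrated with a minimal Python sketch. This is a simplified assumption of how a linking page might be classified, not how the SEO Spider works internally: the class `IndexabilityParser`, the function `non_indexable_reason` and the `robots_disallowed` flag are hypothetical names, and only the meta robots tag and canonical link element in the HTML are inspected.

```python
from html.parser import HTMLParser


class IndexabilityParser(HTMLParser):
    """Collects the meta robots noindex directive and canonical URL of a page."""

    def __init__(self):
        super().__init__()
        self.noindex = False
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            # e.g. <meta name="robots" content="noindex, follow">
            if "noindex" in (attrs.get("content") or "").lower():
                self.noindex = True
        elif tag == "link" and "canonical" in (attrs.get("rel") or "").lower().split():
            # e.g. <link rel="canonical" href="...">
            self.canonical = attrs.get("href")


def non_indexable_reason(url, html, robots_disallowed=False):
    """Return why a page is non-indexable, or None if no reason was found.

    robots_disallowed is a hypothetical flag supplied by the caller,
    since a disallowed page's HTML would not normally be fetched.
    """
    if robots_disallowed:
        return "robots.txt disallowed"
    parser = IndexabilityParser()
    parser.feed(html)
    if parser.noindex:
        return "noindex"
    if parser.canonical and parser.canonical != url:
        return "canonicalised"
    return None


page2 = "https://www.screamingfrog.co.uk/page2/"
html = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(non_indexable_reason(page2, html))  # noindex
```

The canonical check uses a plain string comparison, so a page whose canonical points to itself is not treated as canonicalised, while any other canonical target is.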
How to Analyse in the SEO Spider
View URLs with this issue in the ‘Links’ tab under the ‘Non-Indexable Page Inlinks Only’ filter.
To populate this filter ‘Crawl Analysis’ must be performed via ‘Crawl Analysis > Start’.
The pages that are non-indexable and link to the pages in this issue can be viewed via the ‘Inlinks’ tab.
Export in bulk via ‘Bulk Export > Links > Non-Indexable Page Inlinks Only’.
Pages disallowed by robots.txt can’t be crawled, so links from these pages will not be seen. Robots.txt disallowed pages will only be reported if ‘ignore robots.txt but report status’ is selected via ‘Config > Robots.txt’.
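Once the bulk export has been saved, the linking pages can be grouped by the page they link to. The following Python sketch assumes the export is a CSV file with ‘Source’ and ‘Destination’ columns. Those column names, and the function name `sources_by_destination`, are assumptions for illustration and should be checked against the actual export.

```python
import csv
import io
from collections import defaultdict


def sources_by_destination(csv_text, source_col="Source", dest_col="Destination"):
    """Group linking (source) URLs under each linked-to (destination) URL.

    The column names are assumed and may differ in a real export.
    """
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        grouped[row[dest_col]].append(row[source_col])
    return dict(grouped)


# Hypothetical export content using the example URLs from this page.
sample = (
    "Source,Destination\n"
    "https://www.screamingfrog.co.uk/page2/,https://www.screamingfrog.co.uk/page1/\n"
)

for destination, sources in sources_by_destination(sample).items():
    print(destination, "<-", ", ".join(sources))
```

For a file on disk, the text can be read with `open(path, newline="", encoding="utf-8").read()` and passed to the same function.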
What Triggers This Issue
This issue is triggered when indexable pages are only linked to from pages that are non-indexable, including those marked with noindex, those that have been canonicalised to other pages, or pages disallowed by robots.txt.
For example, if a page on the following website:
https://www.screamingfrog.co.uk/page1/
was linked to from only one page:
https://www.screamingfrog.co.uk/page2/
and https://www.screamingfrog.co.uk/page2/ had a meta noindex tag, then https://www.screamingfrog.co.uk/page1/ would be reported for this issue.
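The trigger condition can be expressed as a short Python sketch. It assumes the crawl data is available as a list of (source, target) link pairs plus a set of non-indexable URLs. Both inputs and the function name `non_indexable_inlinks_only` are hypothetical, and this illustrates the rule rather than the SEO Spider's own implementation.

```python
from collections import defaultdict


def non_indexable_inlinks_only(links, non_indexable):
    """Return indexable pages whose inlinks all come from non-indexable pages.

    links: iterable of (source_url, target_url) pairs.
    non_indexable: set of URLs that are noindex, canonicalised
        or disallowed by robots.txt.
    """
    inlinks = defaultdict(set)
    for source, target in links:
        inlinks[target].add(source)

    flagged = []
    for page, sources in inlinks.items():
        if page in non_indexable:
            continue  # only indexable pages are reported
        if all(source in non_indexable for source in sources):
            flagged.append(page)
    return sorted(flagged)


links = [
    ("https://www.screamingfrog.co.uk/page2/", "https://www.screamingfrog.co.uk/page1/"),
    ("https://www.screamingfrog.co.uk/", "https://www.screamingfrog.co.uk/page2/"),
]
non_indexable = {"https://www.screamingfrog.co.uk/page2/"}

print(non_indexable_inlinks_only(links, non_indexable))
# ['https://www.screamingfrog.co.uk/page1/']
```

Pages with no inlinks at all are not reported by this sketch, since the rule concerns pages that do have inlinks, all of which come from non-indexable pages.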
How To Fix
Ensure you link to important pages from indexable pages to avoid any uncertainty in discovery, indexing and ranking.
Consider whether the non-indexable pages linking to these pages should be non-indexable.
Further Reading
- Google Counts Links On Noindexed Pages? It Depends - From Search Engine Roundtable