- Response Codes
  - Internal No Response
  - Internal Client Error (4XX)
  - Internal Server Error (5XX)
  - Internal Redirect Loop
  - Internal Blocked by Robots.txt
  - Internal Blocked Resource
  - Internal Redirect Chain
  - External Blocked Resource
  - Internal Redirection (3XX)
  - Internal Redirection (Meta Refresh)
  - Internal Redirection (JavaScript)
  - External No Response
  - External Client Error (4XX)
  - External Server Error (5XX)
- Security
  - HTTP URLs
  - Mixed Content
  - Form URL Insecure
  - Form On HTTP URL
  - Missing HSTS Header
  - Unsafe Cross Origin Links
  - Protocol-Relative Resource Links
  - Missing Content-Security-Policy Header
  - Missing X-Content-Type-Options Header
  - Missing X-Frame-Options Header
  - Missing Secure Referrer-Policy Header
  - Bad Content Type
- Hreflang
  - Non-200 Hreflang URLs
  - Missing Return Links
  - Inconsistent Language & Region Return Links
  - Non-Canonical Return Links
  - Noindex Return Links
  - Incorrect Language & Region Codes
  - Multiple Entries
  - Not Using Canonical
  - Outside <head>
  - Unlinked Hreflang URLs
  - Missing Self Reference
  - Missing X-Default
- JavaScript
  - Noindex Only in Original HTML
  - Nofollow Only in Original HTML
  - Canonical Mismatch
  - Uses Old AJAX Crawling Scheme URLs
  - Uses Old AJAX Crawling Scheme Meta Fragment Tag
  - Pages with Blocked Resources
  - Contains JavaScript Links
  - Contains JavaScript Content
  - Page Title Only in Rendered HTML
  - Page Title Updated by JavaScript
  - Meta Description Only in Rendered HTML
  - Meta Description Updated by JavaScript
  - H1 Only in Rendered HTML
  - H1 Updated by JavaScript
  - Canonical Only in Rendered HTML
  - Pages With JavaScript Errors
- Links
  - Outlinks To Localhost
  - Pages Without Internal Outlinks
  - Non-Indexable Page Inlinks Only
  - Internal Nofollow Outlinks
  - Pages With High External Outlinks
  - Pages With High Internal Outlinks
  - Follow & Nofollow Internal Inlinks To Page
  - Internal Nofollow Inlinks Only
  - Pages With High Crawl Depth
  - Internal Outlinks With No Anchor Text
  - Non-Descriptive Anchor Text In Internal Outlinks
- AMP
  - Non-200 Response
  - Missing Non-AMP Return Link
  - Missing Canonical to Non-AMP
  - Non-Indexable Canonical
  - Missing <html amp> Tag
  - Missing/Invalid Doctype HTML Tag
  - Missing Head Tag
  - Missing Body Tag
  - Missing Canonical
  - Missing/Invalid Meta Charset Tag
  - Missing/Invalid Meta Viewport Tag
  - Missing/Invalid AMP Script
  - Missing/Invalid AMP Boilerplate
  - Contains Disallowed HTML
  - Other Validation Errors
  - Indexable
- PageSpeed
  - Eliminate Render-Blocking Resources
  - Properly Size Images
  - Defer Offscreen Images
  - Minify CSS
  - Minify JavaScript
  - Reduce Unused CSS
  - Reduce Unused JavaScript
  - Efficiently Encode Images
  - Serve Images in Next-Gen Formats
  - Enable Text Compression
  - Preconnect to Required Origins
  - Reduce Server Response Times (TTFB)
  - Preload Key Requests
  - Reduce JavaScript Execution Time
  - Serve Static Assets With An Efficient Cache Policy
  - Minimize Main-Thread Work
  - Image Elements Do Not Have Explicit Width & Height
  - Avoid Large Layout Shifts
  - Avoid Serving Legacy JavaScript to Modern Browsers
  - Avoid Multiple Page Redirects
  - Use Video Format for Animated Images
  - Avoid Excessive DOM Size
  - Ensure Text Remains Visible During Webfont Load
Internal Blocked by Robots.txt
Internal URLs blocked by the site’s robots.txt. These URLs cannot be crawled, which is a critical issue if you want the page content to be crawled and indexed by search engines.
How to Analyse in the SEO Spider
Use the ‘Response Codes’ tab with the ‘Internal’ and ‘Blocked by Robots.txt’ filters to view these URLs.
View URLs that link to URLs blocked by robots.txt using the lower ‘Inlinks’ tab and export them in bulk via ‘Bulk Export > Response Codes > Internal > Blocked by Robots.txt Inlinks’.
What Triggers This Issue
The issue is triggered when a directive within the site’s robots.txt file matches the URL in question. For example, the directive:
Disallow: /private/
would block search engines from crawling any URL that begins with:
https://www.screamingfrog.co.uk/private/
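To sanity-check which URLs a given set of rules will block, the rules can be run through a robots.txt parser. Below is a minimal sketch using Python’s standard-library urllib.robotparser with the example directive above; the URLs are illustrative, and this parser does not match Google’s interpretation in every edge case, so treat it as a quick check rather than a definitive verdict (see the ‘Real Robots.txt Parser’ link under Further Reading for an alternative tester).

```python
from urllib.robotparser import RobotFileParser

# The example directive from above. In practice you would fetch the live
# file instead, e.g. rp.set_url("https://www.screamingfrog.co.uk/robots.txt")
# followed by rp.read().
rules = """\
User-agent: *
Disallow: /private/
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# Illustrative URLs: the first matches the Disallow rule, the second does not.
for url in (
    "https://www.screamingfrog.co.uk/private/report.html",
    "https://www.screamingfrog.co.uk/blog/",
):
    verdict = "allowed" if rp.can_fetch("Googlebot", url) else "blocked"
    print(f"{url} -> {verdict}")
```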
How To Fix
Review the URLs to confirm they should be disallowed. If any are disallowed by mistake, the site’s robots.txt should be updated to allow them to be crawled.
For URLs that are correctly disallowed, consider whether you should be linking to them internally and remove links where appropriate.
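If only certain URLs under a disallowed path should be crawlable, a more specific Allow rule can carve them out of the broader Disallow. The sketch below extends the previous one; the /private/whitepaper.pdf path is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical fix: an Allow rule carves a single URL out of the broader
# Disallow (the /private/whitepaper.pdf path is illustrative). Google applies
# the most specific (longest) matching rule regardless of order; Python's
# parser applies the first matching rule in file order, so the Allow is
# listed first to keep both interpretations in agreement.
rules = """\
User-agent: *
Allow: /private/whitepaper.pdf
Disallow: /private/
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("Googlebot",
      "https://www.screamingfrog.co.uk/private/whitepaper.pdf"))  # True
print(rp.can_fetch("Googlebot",
      "https://www.screamingfrog.co.uk/private/anything-else"))   # False
```

Google resolves conflicting rules by picking the most specific (longest) matching path, with Allow winning ties, so the carve-out works regardless of rule order; the Allow is listed first here only because Python’s simpler parser applies the first matching rule.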
Further Reading
- Robots.txt Testing In The SEO Spider - From Screaming Frog
- How Google interprets the robots.txt specification - From Google
- Real Robots.txt Parser - From realrobotstxt.com