SEO Spider Tabs

Internal

The Internal tab combines data from all other tabs except the External and Custom tabs. It brings together data from the following tabs so that it can be viewed or exported in one place: response codes, URI, page titles, meta description, meta keywords, h1, h2, images, meta and canonical.

  • Address – The URI crawled.
  • Content – The content type of the URI.
  • Status Code – HTTP response code.
  • Status – The HTTP header response.
  • Indexability – Whether the URL is indexable or Non-Indexable.
  • Indexability Status – The reason why a URL is Non-Indexable. For example, if it’s canonicalised to another URL.
  • Title 1 – The (first) page title.
  • Title 1 Length – The character length of the page title.
  • Title 1 Pixel Width – The pixel width of the page title as described in our pixel width post.
  • Meta Description 1 – The meta description.
  • Meta Description Length 1 – The character length of the meta description.
  • Meta Description Pixel Width – The pixel width of the meta description.
  • Meta Keyword 1 – The meta keywords.
  • Meta Keywords Length – The character length of the meta keywords.
  • h1 – 1 – The first h1 (heading) on the page.
  • h1 – Len-1 – The character length of the h1.
  • h2 – 1 – The first h2 (heading) on the page.
  • h2 – Len-1 – The character length of the h2.
  • Meta Data 1 – Meta robots data.
  • Meta Refresh 1 – Meta refresh data.
  • Canonical Link Element – The canonical link element data.
  • Size – Size in bytes; divide by 1024 to convert to kilobytes. The value is taken from the Content-Length header if provided, otherwise it is set to zero. For HTML pages this is updated to the size of the (uncompressed) HTML in bytes.
  • Word Count – This is all ‘words’ inside the body tag. This does not include HTML markup. Our figures may not be exactly what doing this manually would find, as the parser performs certain fix-ups on invalid html. Your rendering settings also affect what HTML is considered. Our definition of a word is taking the text and splitting it by spaces. No consideration is given to visibility of content (such as text inside a div set to hidden).
  • Text Ratio – Number of non-HTML characters found in the HTML body tag on a page (the text), divided by the total number of characters the HTML page is made up of, and displayed as a percentage.
  • Crawl Depth – Depth of the page from the start page (number of ‘clicks’ away from the start page). Please note, redirects are counted as a level currently in our page depth calculations.
  • Inlinks – Number of internal hyperlinks to the URI. ‘Internal inlinks’ are links pointing to a given URI from the same subdomain that is being crawled.
  • Unique Inlinks – Number of ‘unique’ internal inlinks to the URI. ‘Internal inlinks’ are links pointing to a given URI from the same subdomain that is being crawled. For example, if ‘page A’ links to ‘page B’ 3 times, this would be counted as 3 inlinks and 1 unique inlink to ‘page B’.
  • % of Total – Percentage of unique internal inlinks (200 response HTML pages) to the URI. ‘Internal inlinks’ are links pointing to a given URI from the same subdomain that is being crawled.
  • Outlinks – Number of internal outlinks from the URI. ‘Internal outlinks’ are links from a given URI to other URIs on the same subdomain that is being crawled.
  • Unique Outlinks – Number of unique internal outlinks from the URI. ‘Internal outlinks’ are links from a given URI to other URIs on the same subdomain that is being crawled. For example, if ‘page A’ links to ‘page B’ on the same subdomain 3 times, this would be counted as 3 outlinks and 1 unique outlink to ‘page B’.
  • External Outlinks – Number of external outlinks from the URI. ‘External outlinks’ are links from a given URI to another subdomain.
  • Unique External Outlinks – Number of unique external outlinks from the URI. ‘External outlinks’ are links from a given URI to another subdomain. For example, if ‘page A’ links to ‘page B’ on a different subdomain 3 times, this would be counted as 3 external outlinks and 1 unique external outlink to ‘page B’.
  • Hash – Hash value of the page. This is a duplicate content check. If two hash values match the pages are exactly the same in content.
  • Response Time – Time in seconds to download the URI. More detailed information can be found in our FAQ.
  • Last-Modified – Read from the Last-Modified header in the server’s HTTP response. If the server does not provide this, the value will be empty.
  • URL Encoded Address – The URL actually requested by the SEO Spider, with all non-ASCII characters percent-encoded. See RFC 3986 for further details.
  • Title 2, meta description 2, h1-2, h2-2 etc – The Spider will collect data from the first two elements it encounters in the source code. Hence, h1-2 is data from the second h1 heading on the page.
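The difference between Inlinks and Unique Inlinks described above can be sketched in a few lines of Python. This is only an illustration of the counting rule, not the SEO Spider's own code; the function name `count_inlinks` and the (source, target) pair format are assumptions made for the example.

```python
from collections import Counter


def count_inlinks(links):
    """Return (inlinks, unique_inlinks) counters keyed by target URI.

    `links` is an iterable of (source, target) pairs, one per hyperlink found.
    The pair format is hypothetical and chosen for this example only.
    """
    links = list(links)
    # Every hyperlink counts towards Inlinks.
    inlinks = Counter(target for _source, target in links)
    # For Unique Inlinks a source page counts once per target,
    # however many times it links to it.
    unique_inlinks = Counter(target for _source, target in set(links))
    return inlinks, unique_inlinks


links = [
    ("/page-a", "/page-b"),
    ("/page-a", "/page-b"),
    ("/page-a", "/page-b"),
    ("/page-c", "/page-b"),
]
inlinks, unique_inlinks = count_inlinks(links)
print(inlinks["/page-b"], unique_inlinks["/page-b"])  # 4 2
```

Here ‘page A’ links to ‘page B’ three times and ‘page C’ links to it once, giving 4 inlinks and 2 unique inlinks.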

Filter by –

  • HTML – HTML pages.
  • JavaScript – Any JavaScript files.
  • CSS – Any style sheets discovered.
  • Images – Any images.
  • PDF – Any portable document files.
  • Flash – Any .swf files.
  • Other – Any other file types, like docs etc.

External

The External tab includes information about external URIs.

  • Address – The external URI address.
  • Content – The content type of the URI.
  • Status Code – HTTP response code.
  • Status – The HTTP header response.
  • Level – Depth of the page from the homepage or start page (number of ‘clicks’ away from the start page).
  • Inlinks – Number of links found pointing to the external URI.

Filter by –

  • HTML – HTML pages.
  • JavaScript – Any JavaScript files.
  • Images – Any images.
  • PDF – Any portable document files.

Protocol

The Protocol tab includes data on the protocol (HTTP vs HTTPS) used by internal and external URIs.

  • Address – The URI crawled.
  • Content – The content type of the URI.
  • Status Code – HTTP response code.
  • Status – The HTTP header response.
  • Indexability – Whether the URL is indexable or Non-Indexable.
  • Indexability Status – The reason why a URL is Non-Indexable. For example, if it’s canonicalised to another URL.
  • Canonical Link Element 1/2 etc – Canonical link element data on the URI. The Spider will find all instances if there are multiple.
  • Meta Robots 1/2 etc – Meta robots found on the URI. The Spider will find all instances if there are multiple.
  • X-Robots-Tag 1/2 etc – X-Robots-tag data. The Spider will find all instances if there are multiple.

Filter by –

  • HTTP – Insecure HyperText Transfer Protocol.
  • HTTPS – The secure version of HTTP.

Response codes

The response codes tab includes response information from internal and external URIs.

  • Address – The URI crawled.
  • Content – The content type of the URI.
  • Status Code – HTTP response code.
  • Status – The HTTP header response.
  • Indexability – Whether the URL is indexable or Non-Indexable.
  • Indexability Status – The reason why a URL is Non-Indexable. For example, if it’s canonicalised to another URL.
  • Inlinks – Number of internal inlinks to the URI. ‘Internal inlinks’ are links pointing to a given URI from the same subdomain that is being crawled.
  • Response Time – Time in seconds to download the URI. More detailed information can be found in our FAQ.
  • Redirect URI – If the address URI redirects, this column will include the redirect URI target. The status code above will display the type of redirect, 301, 302 etc.
  • Redirect Type – One of the following: HTTP Redirect (triggered by an HTTP header), HSTS Policy (turned around locally by the SEO Spider due to a previous STS header), JavaScript Redirect (triggered by execution of JavaScript, which can only happen when using JavaScript rendering) or MetaRefresh Redirect (triggered by a meta refresh tag in the HTML).

Filter by –

  • No Response – Where we receive no response to our request. Typically a malformed URI or a connection time out.
  • Blocked by Robots.txt – URLs are blocked by the site’s robots.txt.
  • Blocked Resource – URLs of blocked resources.
  • Success (2XX) – The URI requested was received, understood, accepted and processed successfully.
  • Redirection (3XX) – A redirection was encountered.
  • Redirection (JavaScript) – A JavaScript redirect was encountered.
  • Redirection (Meta Refresh) – A meta refresh was encountered.
  • Client Error (4XX) – Indicates a problem occurred with the request.
  • Server Error (5XX) – The server failed to fulfill an apparently valid request.

W3.org offers a full list of HTTP status codes with their exact descriptions.
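As a rough illustration, the status code based filters above can be expressed as a simple range check. This is a sketch only: the function name `response_filter` is hypothetical, and representing ‘no response’ as a status code of 0 is an assumption made for the example. The robots.txt, blocked resource, JavaScript and meta refresh filters depend on more than the status code, so they are left out.

```python
def response_filter(status_code):
    """Map an HTTP status code to one of the status code based filters.

    Assumption for this sketch: 0 stands for a request with no response.
    """
    if status_code == 0:
        return "No Response"
    if 200 <= status_code <= 299:
        return "Success (2XX)"
    if 300 <= status_code <= 399:
        return "Redirection (3XX)"
    if 400 <= status_code <= 499:
        return "Client Error (4XX)"
    if 500 <= status_code <= 599:
        return "Server Error (5XX)"
    # Anything else (for example 1XX) has no dedicated filter in this sketch.
    return "Other"


for code in (200, 301, 404, 503, 0):
    print(code, response_filter(code))
```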

URI

The URI tab includes data related to the URLs requested.

  • Address – The URI crawled.
  • Content – The content type of the URI.
  • Status Code – HTTP response code.
  • Status – The HTTP header response.
  • Indexability – Whether the URL is indexable or Non-Indexable.
  • Indexability Status – The reason why a URL is Non-Indexable. For example, if it’s canonicalised to another URL.
  • Hash – Hash value of the page. This is a duplicate content check. If two hash values match the pages are exactly the same in content.
  • Length – The character length of the URI.
  • Canonical 1 – The canonical link element data.

Filter by –

  • Non ASCII Characters – The URI has characters in it that are not included in the ASCII character encoding scheme.
  • Underscores – The URI has underscores within it which are not always seen as word separators.
  • Duplicate – This is a duplicate content check. It filters for all duplicate pages found via the hash value. If two hash values match the pages are exactly the same in content.
  • Parameters – The URI includes parameters such as ‘?’ or ‘&’ etc.
  • Over 115 characters – The URI is over 115 characters in length (hence getting fairly long).
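The URI based filters above can be approximated with a few string checks. The sketch below is illustrative only; `uri_filters` is a hypothetical helper, and treating ‘Parameters’ as ‘the URI has a query string’ is an assumption. The Duplicate filter is omitted because it compares hash values across pages rather than inspecting a single URI.

```python
from urllib.parse import urlsplit


def uri_filters(uri, max_length=115):
    """Return the names of the URI filters a single URI would fall under."""
    filters = []
    if not uri.isascii():
        filters.append("Non ASCII Characters")
    if "_" in uri:
        filters.append("Underscores")
    # Assumption: a URI 'includes parameters' when it has a query string.
    if urlsplit(uri).query:
        filters.append("Parameters")
    if len(uri) > max_length:
        filters.append(f"Over {max_length} characters")
    return filters


print(uri_filters("https://example.com/my_page?id=1"))
# ['Underscores', 'Parameters']
```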

 

Page titles

The page title tab includes data related to page titles.

  • Address – The URI crawled.
  • Occurrences – The number of page titles found on the page (maximum we find is 2).
  • Title 1/2 – The page title.
  • Title 1/2 length – The character length of the page title.
  • Indexability – Whether the URL is indexable or Non-Indexable.
  • Indexability Status – The reason why a URL is Non-Indexable. For example, if it’s canonicalised to another URL.

Filter by –

  • Missing – Any pages which have a missing page title.
  • Duplicate – Any pages which have duplicate page titles.
  • Over 65 characters – Any pages which have page titles over 65 characters in length.
  • Below 35 characters – Any pages which have page titles under 35 characters in length. This isn’t necessarily a bad thing, but you have more room to play with.
  • Same as h1 – Any page titles which match their h1.
  • Multiple – Any pages which have multiple page titles.
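The per-page title filters above can be sketched as follows. This is an illustration under stated assumptions, not the SEO Spider's implementation: `title_filters` is a hypothetical helper, surrounding whitespace is trimmed before comparison, and the h1 comparison is an exact match. The Duplicate filter is omitted because it needs the titles of every page in the crawl.

```python
def title_filters(titles, h1=None, max_chars=65, min_chars=35):
    """Return the page title filters that apply to one page.

    `titles` is the list of page titles found on the page, in source order.
    """
    if not titles or not titles[0].strip():
        return ["Missing"]
    filters = []
    first = titles[0].strip()
    if len(first) > max_chars:
        filters.append(f"Over {max_chars} characters")
    if len(first) < min_chars:
        filters.append(f"Below {min_chars} characters")
    # Assumption: 'same as h1' means an exact match after trimming whitespace.
    if h1 is not None and first == h1.strip():
        filters.append("Same as h1")
    if len(titles) > 1:
        filters.append("Multiple")
    return filters


print(title_filters(["Home"], h1="Home"))
# ['Below 35 characters', 'Same as h1']
```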

Meta description

The meta description tab includes data related to meta descriptions.

  • Address – The URI crawled.
  • Occurrences – The number of meta descriptions found on the page (maximum we find is 2).
  • Meta Description 1/2 – The meta description.
  • Meta Description 1/2 length – The character length of the meta description.
  • Indexability – Whether the URL is indexable or Non-Indexable.
  • Indexability Status – The reason why a URL is Non-Indexable. For example, if it’s canonicalised to another URL.

Filter by –

  • Missing – Any pages which have a missing meta description.
  • Duplicate – Any pages which have duplicate meta descriptions.
  • Over 320 characters – Any pages which have meta descriptions over 320 characters in length.
  • Below 70 characters – Any pages which have meta descriptions below 70 characters in length.
  • Multiple – Any pages which have multiple meta descriptions.

Meta keyword

The meta keywords tab includes data related to meta keywords. PLEASE NOTE – We advise ignoring the meta keywords tag. It is widely ignored by search engines, and Google in particular does not consider it at all when scoring sites for ranking.

  • Address – The URI crawled.
  • Occurrences – The number of meta keywords found on the page (maximum we find is 2).
  • Meta Keyword 1/2 – The meta keywords.
  • Meta Keyword 1/2 length – The character length of the meta keywords.
  • Indexability – Whether the URL is indexable or Non-Indexable.
  • Indexability Status – The reason why a URL is Non-Indexable. For example, if it’s canonicalised to another URL.

Filter by –

  • Missing – Any pages which have missing meta keywords.
  • Duplicate – Any pages which have duplicate meta keywords.
  • Multiple – Any pages which have multiple meta keywords.

 

h1

The h1 tab includes data related to the h1 heading.

  • Address – The URI crawled.
  • Occurrences – The number of h1s found on the page (maximum we find is 2).
  • h1- 1/2 – The h1 data.
  • h1-len- 1/2 – The character length of the h1.
  • Indexability – Whether the URL is indexable or Non-Indexable.
  • Indexability Status – The reason why a URL is Non-Indexable. For example, if it’s canonicalised to another URL.

Filter by –

  • Missing – Any pages which have a missing h1.
  • Duplicate – Any pages which have duplicate h1.
  • Over 70 characters – Any pages which have h1 over 70 characters in length.
  • Multiple – Any pages which have multiple h1.

h2

The h2 tab includes data related to the h2 heading.

  • Address – The URI crawled.
  • Occurrences – The number of h2s found on the page (maximum we find is 2).
  • h2- 1/2 – The h2 data.
  • h2-len- 1/2 – The character length of the h2.
  • Indexability – Whether the URL is indexable or Non-Indexable.
  • Indexability Status – The reason why a URL is Non-Indexable. For example, if it’s canonicalised to another URL.

Filter by –

  • Missing – Any pages which have a missing h2.
  • Duplicate – Any pages which have duplicate h2.
  • Over 70 characters – Any pages which have h2 over 70 characters in length.
  • Multiple – Any pages which have multiple h2.

 

Images

The images tab includes data related to any images crawled.

  • Address – The URI crawled.
  • Content – The content type of the image (jpeg, gif, png etc).
  • Size – Size of the image. File size is in bytes, divide by 1024 to convert to kilobytes.
  • Indexability – Whether the URL is indexable or Non-Indexable.
  • Indexability Status – The reason why a URL is Non-Indexable. For example, if it’s canonicalised to another URL.

Filter by –

  • Over 100kb – Large images over 100kb in size.
  • Missing Alt Text – Images that are missing alt text. Click the address (URI) of the image and then the ‘image info’ tab in the lower window pane to view which pages include the image and which of those pages are missing alt text for it.
  • Alt Text Over 100 Characters – Images which have one instance of alt text over 100 characters in length.

Canonicals

The canonicals tab includes information on canonical link elements and HTTP canonicals discovered during a crawl.

  • Address – The URI crawled.
  • Occurrences – The number of canonicals found (via both link element and HTTP).
  • Indexability – Whether the URL is indexable or Non-Indexable.
  • Indexability Status – The reason why a URL is Non-Indexable. For example, if it’s canonicalised to another URL.
  • Canonical Link Element 1/2 etc – Canonical link element data on the URI. The SEO Spider will find all instances if there are multiple.
  • HTTP Canonical 1/2 etc – Canonical issued via HTTP. The SEO Spider will find all instances if there are multiple.
  • Meta Robots 1/2 etc – Meta robots found on the URI. The SEO Spider will find all instances if there are multiple.
  • X-Robots-Tag 1/2 etc – X-Robots-tag data. The SEO Spider will find all instances if there are multiple.
  • rel=“next” and rel=“prev” – The SEO Spider collects these HTML link elements designed to indicate the relationship between URLs in a paginated series.

Filter by –

  • Contains Canonical – The URL has a canonical URL (either via link element or HTTP header), which could be self-referencing or to another URL (‘canonicalised’).
  • Self Referencing – The URL has a canonical which is the same URL as the URL crawled (hence, self referencing).
  • Canonicalised – The URL has a canonical set, that is different to the URL crawled. The URL is ‘canonicalised’ to another location.
  • Missing – There’s no canonical URL present either as a link element, or via HTTP header.
  • Multiple – There are multiple canonicals set for a URL (either multiple link elements, HTTP header, or combined).
  • Non-Indexable Canonical – The canonical URL is a non-indexable page.
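The relationship between the canonical filters above can be sketched as a small classifier. This is a simplified illustration: `canonical_filters` is a hypothetical helper, URLs are compared as exact strings, and only the first canonical is used for the self referencing check. The Non-Indexable Canonical filter is omitted because it needs the crawl result of the canonical URL itself.

```python
def canonical_filters(url, canonicals):
    """Return the canonical filters that apply to `url`.

    `canonicals` lists every canonical URL found for the page, whether it
    came from a link element or an HTTP header.
    """
    if not canonicals:
        return ["Missing"]
    filters = ["Contains Canonical"]
    # Assumption: exact string comparison against the first canonical found.
    if canonicals[0] == url:
        filters.append("Self Referencing")
    else:
        filters.append("Canonicalised")
    if len(canonicals) > 1:
        filters.append("Multiple")
    return filters


print(canonical_filters("https://example.com/a", ["https://example.com/b"]))
# ['Contains Canonical', 'Canonicalised']
```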

Directives

The directives tab includes all information related to meta robots, canonicals and rel=“next” and rel=“prev” link elements crawled by the SEO Spider.

  • Address – The URI crawled.
  • Meta Robots 1/2 etc – Meta robots found on the URI. The Spider will find all instances if there are multiple.
  • Meta Refresh 1/2 etc – Meta Refresh found on the URI. The Spider will find all instances if there are multiple.
  • Canonical Link Element 1/2 etc – Canonical link element data on the URI. The Spider will find all instances if there are multiple.
  • HTTP Canonical 1/2 etc – Canonical issued via HTTP. The Spider will find all instances if there are multiple.
  • X-Robots-Tag 1/2 etc – X-Robots-tag data. The Spider will find all instances if there are multiple.
  • rel=“next” and rel=“prev” – The SEO Spider collects these HTML link elements designed to indicate the relationship between URLs in a paginated series.

Filter by –

  • Index
  • Noindex
  • Follow
  • Nofollow
  • None – This does not mean there are no directives in place. It means the meta tag ‘none’ is being used, which is the equivalent to “noindex, nofollow”.
  • NoArchive
  • NoSnippet
  • NoODP
  • NoYDIR
  • NoImageIndex
  • NoTranslate
  • Unavailable_After
  • Refresh
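To illustrate how a meta robots value maps onto the filters above, including the special case of ‘none’, here is a minimal parsing sketch. The function name `parse_meta_robots` is hypothetical, and the handling is deliberately simple: it splits on commas, so a directive value that itself contains a comma is not handled.

```python
def parse_meta_robots(content):
    """Split a meta robots content value into a set of lower-case directives."""
    directives = set()
    for part in content.split(","):
        # Directives such as unavailable_after carry a value after a colon;
        # only the directive name is kept here.
        name = part.split(":", 1)[0].strip().lower()
        if name:
            directives.add(name)
    # 'none' is the equivalent of "noindex, nofollow".
    if "none" in directives:
        directives.update({"noindex", "nofollow"})
    return directives


print(sorted(parse_meta_robots("none")))
# ['nofollow', 'noindex', 'none']
```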

 

hreflang

The hreflang tab includes details of hreflang annotations crawled by the SEO Spider, delivered by HTML link element, HTTP Header or XML Sitemap.

Please note, ‘Extract Hreflang‘ and ‘Crawl Hreflang‘ options need to be enabled (under ‘Config > Spider’) for this tab and respective filters to be populated. To extract hreflang annotations from XML Sitemaps during a regular crawl ‘Crawl Linked XML Sitemaps‘ must be selected as well.

The hreflang tab columns include –

  • Address – The URI crawled.
  • Title 1/2 etc – The page title element of the page.
  • Occurrences – The number of hreflang discovered on a page.
  • HTML hreflang 1/2 etc – The hreflang language and region code from any HTML link element on the page.
  • HTML hreflang 1/2 URL etc – The hreflang URL from any HTML link element on the page.
  • HTTP hreflang 1/2 etc – The hreflang language and region code from the HTTP Header.
  • HTTP hreflang 1/2 URL etc – The hreflang URL from the HTTP Header.
  • Sitemap hreflang 1/2 etc – The hreflang language and region code from the XML Sitemap. Please note, this only populates when crawling the XML Sitemap in list mode.
  • Sitemap hreflang 1/2 URL etc – The hreflang URL from the XML Sitemap. Please note, this only populates when crawling the XML Sitemap in list mode.

Filter by –

  • Contains Hreflang – These are simply any URLs that have rel=”alternate” hreflang annotations from any implementation, whether link element, HTTP header or XML Sitemap.
  • Non-200 Hreflang URLs – These are URLs contained within rel=”alternate” hreflang annotations that do not have a 200 response code, such as URLs blocked by robots.txt, no responses, 3XX (redirects), 4XX (client errors) or 5XX (server errors). Hreflang URLs must be crawlable and indexable and therefore non-200 URLs are treated as errors, and ignored by the search engines. The non-200 hreflang URLs can be seen in the lower window ‘URL Info’ pane with a ‘non-200’ confirmation status. They can be exported in bulk via the ‘Reports > Hreflang > Non-200 Hreflang URLs’ export.
  • Unlinked Hreflang URLs – These are URLs that are only discoverable via rel=”alternate” hreflang link annotations. Hreflang annotations do not pass PageRank like a traditional anchor tag, so this might be a sign of a problem with internal linking, or the URLs contained in the hreflang annotation. This filter requires ‘crawl analysis‘ to be populated.
  • Missing Confirmation Links – These are URLs with missing return links (or ‘return tags’ in Google Search Console) to them, from their alternate pages. Hreflang is reciprocal, so all alternate versions must confirm the relationship. When page X links to page Y using hreflang to specify it as its alternate page, page Y must have a return link. No return links means the hreflang annotations may be ignored or not interpreted correctly. The missing confirmation links URLs can be seen in the lower window ‘URL Info’ pane with a ‘missing’ confirmation status. They can be exported in bulk via the ‘Reports > Hreflang > Missing Confirmation Links’ export.
  • Inconsistent Language & Region Confirmation Links – This filter includes URLs with inconsistent language and regional return links to them. This is where a return link has a different language or regional value than the URL is referencing itself. The inconsistent language confirmation URLs can be seen in the lower window ‘URL Info’ pane with an ‘Inconsistent’ confirmation status. They can be exported in bulk via the ‘Reports > Hreflang > Inconsistent Language Confirmation Links’ export.
  • Non Canonical Confirmation Links – URLs with non canonical confirmation links to them. Hreflang should only include canonical versions of URLs, so this filter picks up return links that go to URLs that are not canonical versions. The non canonical confirmation URLs can be seen in the lower window ‘URL Info’ pane with a ‘Non Canonical’ confirmation status. They can be exported in bulk via the ‘Reports > Hreflang > Non Canonical Confirmation Links’ export.
  • Noindex Confirmation Links – Confirmation links which have a ‘noindex’ meta tag. All pages within a set should be indexable, and hence any return URLs with ‘noindex’ may result in the hreflang relationship being ignored. The noindex confirmation links URLs can be seen in the lower window ‘URL Info’ pane with a ‘noindex’ confirmation status. They can be exported in bulk via the ‘Reports > Hreflang > Noindex Confirmation Links’ export.
  • Incorrect Language & Region Codes – This simply checks for URLs with incorrect language and regional code values. These can be viewed in the lower window ‘URL Info’ pane with an ‘invalid’ status.
  • Multiple Entries – URLs with multiple entries to a language or regional code. For example, page X links to both page Y and page Z using the same ‘en’ hreflang value. This filter will also pick up multiple implementations, for example, if hreflang annotations were discovered as link elements and via HTTP header.
  • Missing Self Reference – URLs missing a self referencing hreflang attribute. URLs should have their own self referencing rel=”alternate” hreflang annotation.
  • Not Using Canonical – URLs not using the canonical URL on the page in its own hreflang annotation. Hreflang should only include canonical versions of URLs.
  • Missing X-Default – URLs missing an X-Default hreflang attribute. This is optional, and not necessarily an error or issue.
  • Missing – URLs missing an hreflang attribute completely. These might be valid of course, if they aren’t multiple versions of a page.
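The reciprocal rule behind the Missing Confirmation Links filter can be sketched as below. This only illustrates the idea of a return link; the function name and the `hreflang` mapping (page URL to a dict of language code and alternate URL) are assumptions made for the example, and URLs are compared as exact strings.

```python
def missing_confirmation_links(hreflang):
    """Return (page, alternate) pairs where the alternate has no return link.

    `hreflang` maps each page URL to a dict of {language code: alternate URL}.
    """
    missing = []
    for page, alternates in hreflang.items():
        for alternate in alternates.values():
            if alternate == page:
                continue  # a self reference needs no return link
            # Pages that were not crawled are treated as having no annotations.
            return_links = hreflang.get(alternate, {}).values()
            if page not in return_links:
                missing.append((page, alternate))
    return missing


hreflang = {
    "https://example.com/en/": {
        "en": "https://example.com/en/",
        "de": "https://example.com/de/",
    },
    # The German page only references itself, so the English page
    # is missing a confirmation link from it.
    "https://example.com/de/": {"de": "https://example.com/de/"},
}
print(missing_confirmation_links(hreflang))
# [('https://example.com/en/', 'https://example.com/de/')]
```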

For more information on hreflang, please read our guide on ‘How to Audit Hreflang‘.

AJAX

The Ajax tab specifically refers to the now deprecated Google AJAX crawling scheme.

If the site uses AJAX / JavaScript, but does not have escaped fragment URLs with HTML snapshots, then you’ll need to adjust the configuration to JavaScript rendering to crawl the site. This mode is only available in the paid version, and will render content like a modern day browser, rendering content, crawling and indexing JavaScript and dynamically generated content. This configuration can be adjusted under ‘Configuration > Spider > Rendering tab > JavaScript’.

The AJAX tab shows both ugly and pretty URLs, with filters for hash fragments. Some AJAX pages may not use hash fragments (such as a homepage), so the ‘fragment’ meta tag can be used to recognise an AJAX page. In the same way as Google, the SEO Spider will then fetch the ugly version of the URL.

  • Pretty URL – The pretty URL of the page.
  • Ugly URL – The ugly URL actually requested.
  • Status Code – HTTP response code.
  • Status – The HTTP header response.
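For readers unfamiliar with the scheme, the mapping from a pretty URL to an ugly URL can be sketched as follows. This is a minimal illustration of the deprecated AJAX crawling scheme, in which a ‘#!’ hash fragment is moved into an ‘_escaped_fragment_’ query parameter. The function names are hypothetical, and the set of characters escaped is an assumption based on the published scheme that may not match the SEO Spider exactly.

```python
def _escape_fragment(fragment):
    """Percent-encode the characters the scheme treats as special (assumed set)."""
    out = []
    for byte in fragment.encode("utf-8"):
        if byte <= 0x20 or byte in (0x23, 0x25, 0x26, 0x2B) or byte >= 0x7F:
            out.append(f"%{byte:02X}")
        else:
            out.append(chr(byte))
    return "".join(out)


def ugly_url(pretty_url, has_fragment_meta=False):
    """Return the ugly URL for a pretty URL.

    `has_fragment_meta` marks a page that uses the 'fragment' meta tag
    instead of a '#!' hash fragment.
    """
    base, separator, fragment = pretty_url.partition("#!")
    if not separator:
        if not has_fragment_meta:
            return pretty_url  # not an AJAX page under this scheme
        base, fragment = pretty_url.partition("#")[0], ""
    joiner = "&" if "?" in base else "?"
    return f"{base}{joiner}_escaped_fragment_={_escape_fragment(fragment)}"


print(ugly_url("https://example.com/#!key=value"))
# https://example.com/?_escaped_fragment_=key=value
```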

Please read our guide on crawling JavaScript websites.

AMP

The AMP tab includes Accelerated Mobile Pages (AMP) discovered during a crawl. These are identified via the HTML AMP Tag, and rel=”amphtml” inlinks.

The tab includes filters for common SEO issues and validation errors using the AMP Validator.

Please note, ‘Extract AMP Links‘ and ‘Crawl AMP Links‘ options need to be enabled (under ‘Config > Spider’) for this tab and respective filters to be populated.

The AMP tab columns include –

  • Address – The URI crawled.
  • Occurrences – The number of canonicals found (via both link element and HTTP).
  • Indexability – Whether the URL is indexable or Non-Indexable.
  • Indexability Status – The reason why a URL is Non-Indexable. For example, if it’s canonicalised to another URL.
  • Title 1 – The (first) page title.
  • Title 1 Length – The character length of the page title.
  • Title 1 Pixel Width – The pixel width of the page title.
  • h1 – 1 – The first h1 (heading) on the page.
  • h1 – Len-1 – The character length of the h1.
  • Size – Size in bytes; divide by 1024 to convert to kilobytes. The value is taken from the Content-Length header if provided, otherwise it is set to zero. For HTML pages this is updated to the size of the (uncompressed) HTML in bytes.
  • Word Count – This is all ‘words’ inside the body tag. This does not include HTML markup. Our figures may not be exactly what doing this manually would find, as the parser performs certain fix-ups on invalid html. Your rendering settings also affect what HTML is considered. Our definition of a word is taking the text and splitting it by spaces. No consideration is given to visibility of content (such as text inside a div set to hidden).
  • Text Ratio – Number of non-HTML characters found in the HTML body tag on a page (the text), divided by the total number of characters the HTML page is made up of, and displayed as a percentage.
  • Crawl Depth – Depth of the page from the start page (number of ‘clicks’ away from the start page). Please note, redirects are counted as a level currently in our page depth calculations.
  • Response Time – Time in seconds to download the URI. More detailed information can be found in our FAQ.

Filter by the following SEO related items –

  • Non-200 Response – The AMP URLs do not respond with a 200 ‘OK’ status code. These will include URLs blocked by robots.txt, no responses, redirects, client and server errors. This filter requires ‘crawl analysis‘ to be populated.
  • Non-Confirming Canonical – The canonical desktop version of the URL, does not contain a rel=”amphtml” URL back to the AMP URL. This could simply be missing from the desktop version, or there might be a configuration issue with the AMP canonical.
  • Missing Non-AMP Canonical – The AMP URL’s canonical does not go to a desktop version, but to another AMP URL.
  • Non-Indexable Canonical – The AMP canonical URL is a non-indexable page. Generally the desktop equivalent should be an indexable page.
  • Indexable – The AMP URL is indexable. AMP URLs with a desktop equivalent should be non-indexable (as they should have a canonical to the desktop equivalent). Standalone AMP URLs (without an equivalent) should be indexable.
  • Non-Indexable – The AMP URL is non-indexable. This is usually because they are correctly canonicalised to the desktop equivalent.

The following filters help identify common issues relating to AMP specifications. The SEO Spider uses the AMP Validator for validation of AMP pages.

Filter by –

  • Missing HTML AMP Tag – AMP HTML documents must contain a top-level <html ⚡> or <html amp> tag. This filter requires ‘crawl analysis‘ to be populated.
  • Missing/Invalid Doctype HTML Tag – AMP HTML documents must start with the HTML doctype declaration (<!doctype html>).
  • Missing Head Tag – AMP HTML documents must contain head tags (they are optional in HTML).
  • Missing Body Tag – AMP HTML documents must contain body tags (they are optional in HTML).
  • Missing Canonical – AMP URLs must contain a canonical tag inside their head that points to the regular HTML version of the AMP HTML document, or to itself if no such HTML version exists.
  • Missing/Invalid Meta Charset Tag – AMP HTML documents must contain a meta charset="utf-8" tag as the first child of their head tag.
  • Missing/Invalid Meta Viewport Tag – AMP HTML documents must contain a meta name="viewport" content="width=device-width,minimum-scale=1" tag inside their head tag. It’s also recommended to include initial-scale=1.
  • Missing/Invalid AMP Script – AMP HTML documents must contain a script async src="https://cdn.ampproject.org/v0.js" tag inside their head tag.
  • Missing/Invalid AMP Boilerplate – AMP HTML documents must contain the AMP boilerplate code in their head tag.
  • Contains Disallowed HTML – This flags any AMP URLs with disallowed HTML for AMP.
  • Other Validation Errors – This flags any AMP URLs with other validation errors not already covered by the above filters.
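A few of the markup requirements above can be checked with a short presence test. The sketch below is not the AMP Validator and is far less strict: it only looks for the existence of certain tags using Python's standard `html.parser`, ignores their position (for example, whether the charset tag is the first child of the head), and skips the AMP boilerplate and disallowed HTML checks entirely. The names `AmpChecker`, `REQUIRED` and `missing_amp_markup` are hypothetical.

```python
from html.parser import HTMLParser

AMP_RUNTIME = "https://cdn.ampproject.org/v0.js"
REQUIRED = [
    "Doctype HTML Tag", "HTML AMP Tag", "Head Tag", "Body Tag",
    "Canonical", "Meta Charset Tag", "Meta Viewport Tag", "AMP Script",
]


class AmpChecker(HTMLParser):
    """Record which of the required pieces of AMP markup are present."""

    def __init__(self):
        super().__init__()
        self.found = set()

    def handle_decl(self, decl):
        if decl.strip().lower() == "doctype html":
            self.found.add("Doctype HTML Tag")

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # valueless attributes map to None
        if tag == "html" and ("amp" in attrs or "⚡" in attrs):
            self.found.add("HTML AMP Tag")
        elif tag == "head":
            self.found.add("Head Tag")
        elif tag == "body":
            self.found.add("Body Tag")
        elif tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.found.add("Canonical")
        elif tag == "meta" and (attrs.get("charset") or "").lower() == "utf-8":
            self.found.add("Meta Charset Tag")
        elif tag == "meta" and (attrs.get("name") or "").lower() == "viewport":
            self.found.add("Meta Viewport Tag")
        elif tag == "script" and "async" in attrs and attrs.get("src") == AMP_RUNTIME:
            self.found.add("AMP Script")


def missing_amp_markup(html):
    """Return the names of required items not found in `html`."""
    checker = AmpChecker()
    checker.feed(html)
    checker.close()
    return [name for name in REQUIRED if name not in checker.found]


page = """<!doctype html>
<html amp>
<head>
<meta charset="utf-8">
<script async src="https://cdn.ampproject.org/v0.js"></script>
<link rel="canonical" href="https://example.com/page">
</head>
<body>Hello</body>
</html>"""
print(missing_amp_markup(page))  # ['Meta Viewport Tag']
```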

For more information on AMP, please read our guide on ‘How to Audit & Validate AMP‘.
