An SEO’s Guide To Crawling HSTS

HTTP Strict Transport Security

HTTP Strict Transport Security (HSTS) is a standard, defined in RFC 6797, by which a web server can declare to a client that it should only be accessed via HTTPS. The client, typically a web browser or crawler, will then make all future requests to that host over HTTPS, even when following a link to an HTTP URL. When this happens, the SEO Spider (as of version 8.0) shows a Status Code of 307, a Status of “HSTS Policy” and a Redirect Type of “HSTS Policy”.

Here’s the SEO Spider running on our HSTS test site https://www.screamingprojects.com/hsts/.

Here’s how Chrome handles the same situation (I’ve ticked the ‘Preserve Log’ option at the top, otherwise the 307 is lost).

Unlike a 301 or a 302, this redirect isn’t actually sent by the web server. It’s just an internal representation in the browser and the SEO Spider: the HTTP request never reaches the web server and is turned around internally.

When a web server declares that it should only be contacted via HTTPS, it sets an expiry on that declaration, so representing the rewrite as a 307 makes sense, as 307 means ‘Temporary Redirect’.
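As a rough illustration of the internal rewrite and its expiry, here is a minimal Python sketch. It is not how Chrome or the SEO Spider is implemented; the class name HstsStore and its methods are hypothetical, and it only matches exact hostnames and ignores ports.

```python
import time
from urllib.parse import urlsplit


class HstsStore:
    """Toy in-memory record of hosts that have declared an HSTS policy."""

    def __init__(self, clock=time.time):
        self._clock = clock      # injectable so expiry can be simulated
        self._expires_at = {}    # host -> absolute expiry time in seconds

    def note_policy(self, host, max_age):
        """Record a policy seen on an HTTPS response; max_age of 0 removes it."""
        host = host.lower()
        if max_age <= 0:
            self._expires_at.pop(host, None)
        else:
            self._expires_at[host] = self._clock() + max_age

    def resolve(self, url):
        """Return (status, url). Status is 307 when the URL is rewritten
        internally, or None when the request would go out unchanged."""
        parts = urlsplit(url)
        host = (parts.hostname or "").lower()
        expiry = self._expires_at.get(host)
        if parts.scheme == "http" and expiry is not None and self._clock() < expiry:
            # No network request is made here: the client swaps the scheme itself.
            return 307, parts._replace(scheme="https").geturl()
        return None, url
```

Once a policy has been noted for a host, resolve() reports a 307 for any HTTP URL on that host until the policy expires, which mirrors what shows up in the crawl and in Chrome's network log.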

Protocol Overview

The HSTS protocol is based on the server sending a single header called Strict-Transport-Security which must only be sent over HTTPS. If it is sent over HTTP it is ignored. There are two directives associated with the header:

  • max-age: This is mandatory and specifies the number of seconds for which the server must only be contacted via HTTPS.
  • includeSubDomains: This is optional. If set, it specifies that the HSTS policy should also apply to all subdomains.

Examples

Enables HSTS for a year:

Strict-Transport-Security: max-age=31536000

Forces expiry of HSTS Policy:

Strict-Transport-Security: max-age=0

Enables HSTS policy for a month for this domain, and all subdomains:

Strict-Transport-Security: max-age=2592000 ; includeSubDomains
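
To make the header format concrete, here is a small Python sketch that parses the header values shown above. The function name parse_hsts is hypothetical, and this is a simplified reading of RFC 6797 rather than a complete implementation (for example, it does not reject duplicate directives).

```python
def parse_hsts(header_value):
    """Parse a Strict-Transport-Security value.

    Returns (max_age_seconds, include_subdomains), or None when the
    mandatory max-age directive is missing or is not a whole number.
    """
    max_age = None
    include_subdomains = False
    for directive in header_value.split(";"):
        name, _, value = directive.strip().partition("=")
        name = name.strip().lower()        # directive names are case-insensitive
        value = value.strip().strip('"')   # the value may be quoted
        if name == "max-age":
            if not (value.isascii() and value.isdigit()):
                return None
            max_age = int(value)
        elif name == "includesubdomains":
            include_subdomains = True
    if max_age is None:
        return None
    return max_age, include_subdomains


# The three examples from this section:
print(parse_hsts("max-age=31536000"))                     # (31536000, False)
print(parse_hsts("max-age=0"))                            # (0, False)
print(parse_hsts("max-age=2592000 ; includeSubDomains"))  # (2592000, True)
```

For reference, 31536000 seconds is 365 days and 2592000 seconds is 30 days.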

Benefits

As the HTTP to HTTPS rewriting happens internally on the client, there are several key benefits to this over just using a site-wide HTTP -> HTTPS redirect.

  • Reduced communication over non-secure protocols.
  • Improved performance, as a round trip is avoided each time an HTTP link is encountered.
  • Reduced load on the web server.

A site-wide HTTP -> HTTPS redirect is still needed, however, as the Strict-Transport-Security header is ignored unless it’s sent over HTTPS. So if the first visit to your site is not via HTTPS, you still need that initial redirect to HTTPS to deliver the Strict-Transport-Security header.
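
The first-visit case can be sketched as a simple check over a recorded redirect chain. This is an illustrative Python helper only: the function name first_visit_delivers_hsts and the (url, status, headers) tuple shape are assumptions, and the example.com chains are made-up data rather than real crawl output.

```python
def first_visit_delivers_hsts(hops):
    """hops: list of (url, status, headers) tuples in the order they were fetched.

    Returns True if any HTTPS response in the chain carries the
    Strict-Transport-Security header.
    """
    for url, _status, headers in hops:
        if not url.lower().startswith("https://"):
            continue  # the header is ignored on plain HTTP responses
        if any(name.lower() == "strict-transport-security" for name in headers):
            return True
    return False


# With the site-wide redirect in place, the HTTPS hop delivers the policy.
with_redirect = [
    ("http://example.com/", 301, {"Location": "https://example.com/"}),
    ("https://example.com/", 200, {"Strict-Transport-Security": "max-age=31536000"}),
]
# Without it, the header only ever arrives over HTTP and is ignored.
without_redirect = [
    ("http://example.com/", 200, {"Strict-Transport-Security": "max-age=31536000"}),
]
print(first_visit_delivers_hsts(with_redirect))     # True
print(first_visit_delivers_hsts(without_redirect))  # False
```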

If you have any more questions about how to use the Screaming Frog SEO Spider, then please do get in touch with our support team.
