Posted 23 November, 2020 by screamingfrog in Screaming Frog SEO Spider

We are pleased to launch Screaming Frog SEO Spider version 14.0, codenamed internally as ‘megalomaniac’.

Since the release of version 13 in July, we’ve been busy working on the next round of features for version 14, based upon user feedback and, as always, a little internal steer.

Let’s talk about what’s new in this release.

1) Dark Mode

While arguably not the most significant feature in this release, it is used throughout the screenshots – so it makes sense to talk about it first. You can now switch to dark mode, via ‘Config > User Interface > Theme > Dark’.

Not only will this help reduce eye strain for those who work in low light (everyone living in the UK right now), it also looks super cool – and is speculated (by me now) to increase your technical SEO skills significantly.

The non-eye-strained among you may notice we’ve also tweaked some other styling elements and graphs, such as those in the right-hand overview and site structure tabs.

2) Google Sheets Export

You’re now able to export directly to Google Sheets.

You can add multiple Google accounts and quickly connect to any of them to save your crawl data. Exports appear in Google Drive within a ‘Screaming Frog SEO Spider’ folder, and are accessible via Sheets.

Many of you will already be aware that Google Sheets isn’t really built for scale and has a 5m cell limit. This sounds like a lot, but when you have 55 columns by default in the Internal tab (which can easily triple depending on your config), it means you can only export around 90k rows (55 x 90,000 = 4,950,000 cells).
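
As a rough sketch of the arithmetic above, the snippet below estimates how many rows fit under the cell limit for a given number of columns. The function and constant names are made up for illustration, and the calculation ignores the header row, so treat the result as an upper bound rather than an exact figure.

```python
# Rough capacity check for a Google Sheets export.
# Names here are illustrative only, not part of the SEO Spider.
SHEETS_CELL_LIMIT = 5_000_000  # cell limit quoted in the post


def max_exportable_rows(columns: int, cell_limit: int = SHEETS_CELL_LIMIT) -> int:
    """Return the largest whole number of rows that fits under the cell limit."""
    if columns <= 0:
        raise ValueError("columns must be positive")
    return cell_limit // columns


default_rows = max_exportable_rows(55)   # default Internal tab width
tripled_rows = max_exportable_rows(165)  # the same tab with three times the columns
print(default_rows, tripled_rows)
```

With 55 columns this lands just above the ‘around 90k rows’ mentioned above, and tripling the columns cuts the row budget to a third.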

If you need to export more, use a different export format that’s built for the size (or reduce your number of columns). We had started work on writing to multiple sheets, but really, Sheets shouldn’t be used in that way.

This has also been integrated into scheduling and the command line. This means you can schedule a crawl, which automatically exports any tabs, filters, exports or reports to a Sheet within Google Drive.

You’re able to choose to create a timestamped folder in Google Drive, or overwrite an existing file.

This should be helpful when sharing data in teams, with clients, or for Google Data Studio reporting.

3) HTTP Headers

You can now store, view and query full HTTP headers. This can be useful when analysing various scenarios which are not covered by the default headers extracted, such as details of caching status, set-cookie, content-language, feature policies, security headers etc.

You can choose to extract them via ‘Config > Spider > Extraction’ and selecting ‘HTTP Headers’. The request and response headers will then be shown in full in the lower window ‘HTTP Headers’ tab.

The HTTP response headers also get appended as columns in the Internal tab, so they can be viewed, queried and exported alongside all the usual crawl data.

Headers can also be exported in bulk via ‘Bulk Export > Web > All HTTP Headers’.

4) Cookies

You can now also store cookies from across a crawl. You can choose to extract them via ‘Config > Spider > Extraction’ and selecting ‘Cookies’. These will then be shown in full in the lower window Cookies tab.

You’ll need to use JavaScript rendering mode to get an accurate view of cookies that are loaded on the page using JavaScript or pixel image tags.

The SEO Spider will collect cookie name, value, domain (first or third party), expiry as well as attributes such as secure and HttpOnly.
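
To make those fields concrete, here is a minimal sketch that pulls the same details out of a single Set-Cookie header using Python’s standard library. It is not how the SEO Spider works internally. The `describe_cookie` helper, the example hosts and the first or third party check (a simple comparison of the cookie domain against the page host) are all assumptions for illustration.

```python
from http.cookies import SimpleCookie


def describe_cookie(set_cookie_header: str, page_host: str) -> dict:
    """Parse one Set-Cookie header value into the fields listed above."""
    jar = SimpleCookie()
    jar.load(set_cookie_header)
    name, morsel = next(iter(jar.items()))

    # Fall back to the page host when the cookie sets no Domain attribute.
    cookie_domain = (morsel["domain"] or page_host).lstrip(".").lower()
    host = page_host.lower()

    # Simplified party check (an assumption): first party when the page host
    # equals, or is a subdomain of, the cookie domain.
    first_party = host == cookie_domain or host.endswith("." + cookie_domain)

    return {
        "name": name,
        "value": morsel.value,
        "domain": cookie_domain,
        "party": "first" if first_party else "third",
        "expires": morsel["expires"] or None,
        "secure": bool(morsel["secure"]),
        "http_only": bool(morsel["httponly"]),
    }


example = describe_cookie(
    "sid=abc123; Domain=.tracker.example; Path=/; "
    "Expires=Wed, 21 Oct 2026 07:28:00 GMT; Secure; HttpOnly",
    page_host="www.shop.example",
)
print(example)
```

A real audit would need a public suffix list to decide first or third party reliably; the suffix comparison here is only a stand-in.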

This data can then be analysed in aggregate to help with cookie audits, such as those for GDPR via ‘Reports > Cookies > Cookie Summary’.

You can also highlight multiple URLs at a time to analyse in bulk, or export via the ‘Bulk Export > Web > All Cookies’.

Please note – When you choose to store cookies, the auto exclusion performed by the SEO Spider for Google Analytics tracking tags is disabled to provide an accurate view of all cookies issued.

This means crawls will affect your analytics reporting, unless you choose to exclude any tracking scripts from firing by using the Exclude configuration (‘Config > Exclude’) or filter out the ‘Screaming Frog SEO Spider’ user-agent, similar to excluding PSI in this FAQ.

5) Aggregated Site Structure

The SEO Spider now displays the number of URLs discovered in each directory when in directory tree view (which you can access via the tree icon next to ‘Export’ in the top tabs).

This helps you better understand the size and architecture of a website, and some users find it more logical to use than traditional list view.
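
The idea of counting URLs per directory can be sketched in a few lines of Python. This is only an illustration, and it assumes that a URL counts towards every directory above it and that a final path segment without a trailing slash is a page rather than a directory. The function name and URLs are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit


def count_urls_per_directory(urls):
    """Count how many URLs sit at or below each directory path."""
    counts = Counter()
    for url in urls:
        parts = urlsplit(url)
        segments = [s for s in parts.path.split("/") if s]
        # Treat the last segment as a page unless the path ends with a slash.
        if segments and not parts.path.endswith("/"):
            segments = segments[:-1]
        prefix = f"{parts.scheme}://{parts.netloc}/"
        counts[prefix] += 1
        for segment in segments:
            prefix += segment + "/"
            counts[prefix] += 1
    return counts


crawled = [
    "https://example.com/",
    "https://example.com/blog/",
    "https://example.com/blog/post-a",
    "https://example.com/blog/2020/post-b",
    "https://example.com/shop/item-1",
]
tree = count_urls_per_directory(crawled)
for directory in sorted(tree):
    print(directory, tree[directory])
```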

Alongside this update, we’ve improved the right-hand ‘Site Structure’ tab to show an aggregated directory tree view of the website. This helps quickly visualise the structure of a website, and identify where issues are at a glance, such as indexability of different paths.

If you’ve found areas of a site with non-indexable URLs, you can switch the ‘view’ to analyse the ‘indexability status’ of those different path segments to see the reasons why they are considered as non-indexable.

You can also toggle the view to crawl depth across directories to help identify any internal linking issues to areas of the site, and more.

This wider aggregated view of a website should help you visualise the architecture, and make better decisions for different sections and segments.

6) New Configuration Options

We’ve introduced two new significant configuration options – ‘Ignore Non-Indexable URLs for On-Page Filters’ and ‘Ignore Paginated URLs for Duplicate Filters’.

Both are enabled by default and can be found under ‘Config > Spider > Advanced’. They mean non-indexable pages won’t be flagged in appropriate on-page filters for page titles, meta descriptions, or headings.

This means URLs won’t be considered as ‘Duplicate’, or ‘Over X Characters’ or ‘Below X Characters’ if for example they are noindex, and hence non-indexable. Paginated pages won’t be flagged for duplicates either.
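
The effect of the first option can be pictured with a small sketch. The page records, field names and `duplicate_titles` helper below are hypothetical; they simply show a duplicate title check that skips non-indexable URLs when the option is on.

```python
from collections import defaultdict


def duplicate_titles(pages, ignore_non_indexable=True):
    """Group URLs by page title and return titles shared by more than one URL."""
    by_title = defaultdict(list)
    for page in pages:
        if ignore_non_indexable and not page["indexable"]:
            continue  # the URL is still crawled, it is just left out of this check
        by_title[page["title"]].append(page["url"])
    return {title: urls for title, urls in by_title.items() if len(urls) > 1}


pages = [
    {"url": "/widgets", "title": "Widgets", "indexable": True},
    {"url": "/widgets?sort=price", "title": "Widgets", "indexable": False},  # noindex
    {"url": "/gadgets", "title": "Gadgets", "indexable": True},
]

print(duplicate_titles(pages))                              # option enabled
print(duplicate_titles(pages, ignore_non_indexable=False))  # option disabled
```

With the option enabled, the noindex variant is ignored and nothing is reported; with it disabled, both ‘Widgets’ URLs are grouped as duplicates.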

If you’re crawling a staging website which has noindex across all pages, remember to disable these options.

These options are a little different to the ‘respect’ configuration options, which remove non-indexable URLs from appearing at all. Non-indexable URLs will still appear in the interface; they just won’t be flagged for relevant issues.

Other Updates

Version 14.0 also includes a number of smaller updates and bug fixes, outlined below.

There’s now a new filter for ‘Missing Alt Attribute’ under the ‘Images’ tab. Previously, missing and empty alt attributes would both appear under the single ‘Missing Alt Text’ filter. However, it can be useful to separate these, as decorative images should have empty alt text (alt=""), rather than leaving out the alt attribute, which can cause issues in screen readers. Please see our How To Find Missing Image Alt Text & Attributes tutorial.
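
The distinction between a missing alt attribute and empty alt text is easy to see in code. The sketch below uses Python’s built-in HTML parser to sort images into three buckets. The class name and sample markup are invented for illustration, and this is not the SEO Spider’s own logic.

```python
from html.parser import HTMLParser


class ImageAltAudit(HTMLParser):
    """Sort <img> tags into missing-attribute, empty-alt and has-alt-text buckets."""

    def __init__(self):
        super().__init__()
        self.missing_attribute = []
        self.empty_alt = []
        self.with_alt_text = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src", "")
        if "alt" not in attributes:
            self.missing_attribute.append(src)   # no alt attribute at all
        elif not (attributes["alt"] or "").strip():
            self.empty_alt.append(src)           # alt="" (fine for decorative images)
        else:
            self.with_alt_text.append(src)


audit = ImageAltAudit()
audit.feed(
    '<img src="logo.png" alt="Company logo">'
    '<img src="divider.png" alt="">'
    '<img src="photo.jpg">'
)
print(audit.missing_attribute, audit.empty_alt, audit.with_alt_text)
```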

Headless Chrome used in JavaScript rendering has been updated to keep up with evergreen Googlebot.

‘Accept Cookies’ has been adjusted to ‘Cookie Storage’, with three options – Session Only, Persistent and Do Not Store. The default is ‘Session Only’, which mimics Googlebot’s stateless behaviour.

The ‘URL’ tab has new filters available around common issues including Multiple Slashes (//), Repetitive Path, Contains Space and URLs that might be part of an Internal Search.
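
For a feel of what these URL checks look for, here is a hedged sketch of similar heuristics. The exact rules the SEO Spider applies are not described here, so everything below is an assumption: ‘Repetitive Path’ is treated as two identical adjacent path segments, and ‘Internal Search’ as a ‘/search’ path or one of a few common query parameters in the hypothetical `SEARCH_PARAMS` set.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical list of query parameters that often indicate site search.
SEARCH_PARAMS = {"q", "s", "search", "query"}


def url_issues(url):
    """Return the names of the URL checks (as guessed here) that a URL trips."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    query_keys = set(parse_qs(parts.query, keep_blank_values=True))

    issues = []
    if "//" in parts.path:
        issues.append("Multiple Slashes")
    if any(a == b for a, b in zip(segments, segments[1:])):
        issues.append("Repetitive Path")
    if " " in url or "%20" in url:
        issues.append("Contains Space")
    if "/search" in parts.path.lower() or SEARCH_PARAMS & query_keys:
        issues.append("Internal Search")
    return issues


for candidate in [
    "https://example.com/blog//post",
    "https://example.com/shop/shop/item",
    "https://example.com/my%20page",
    "https://example.com/?s=frogs",
    "https://example.com/about",
]:
    print(candidate, url_issues(candidate))
```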

The ‘Security’ tab now has a filter for ‘Missing Secure Referrer-Policy Header’.

There’s now a ‘HTTP Version’ column in the Internal and Security tabs, which shows which version the crawl was completed under. This is in preparation for supporting HTTP/2 crawling in line with Googlebot.

You’re now able to right click and ‘close’ or drag and move the order of lower window tabs, in a similar way to the top tabs.

Non-Indexable URLs are now not included in the ‘URLs not in Sitemap’ filter, as we presume they are correctly non-indexable and therefore shouldn’t be flagged. Please see our tutorial on ‘How To Audit XML Sitemaps’ for more.

Google rich result feature validation has been updated in line with the ever-changing documentation.

The ‘Google Rich Result Feature Summary’ report, available via ‘Reports’ in the top-level menu, has been updated to include a ‘% eligible’ for rich results, based upon errors discovered. This report also includes the total and unique number of errors and warnings discovered for each Rich Result Feature as an overview.

That’s everything for now, and we’ve already started work on features for version 15. If you experience any issues, please let us know via support and we’ll help.

Thank you to everyone for all their feature requests, feedback, and continued support.

Now, go and download version 14.0 of the Screaming Frog SEO Spider and let us know what you think!