Screaming Frog SEO Spider Update – Version 16.0
We’re excited to announce Screaming Frog SEO Spider version 16.0, codenamed internally as ‘marshmallow’.
Here’s what’s new in our latest update.
Crawl Original & Rendered HTML
One of the fundamental changes in this update is that the SEO Spider will now crawl both the original and rendered HTML to identify pages that have content or links only available client-side and report other key differences.
For example, our homepage apparently has 4 additional words in the rendered HTML, which was new to us.
Aha! There are the 4 words. Thanks, Highcharts.
Compare HTML Vs Rendered HTML
The two-phase approach of crawling the raw and rendered HTML can help pick up on easy-to-miss problem scenarios, such as the original HTML having a noindex meta tag, but the rendered HTML not having one.
Previously, by crawling only the rendered HTML, the page would be deemed indexable. In reality, Google sees the noindex in the original HTML first and subsequently skips rendering, meaning the removal of the noindex won’t be seen and the page won’t be indexed.
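To make the noindex scenario concrete, here is a minimal Python sketch of the comparison. It is not how the SEO Spider is implemented; the helper names (`RobotsMetaParser`, `has_noindex`, `noindex_only_in_original`) and the two HTML snippets are made up for illustration, and it only looks at the meta robots tag, not the X-Robots-Tag header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            # content="noindex, follow" -> {"noindex", "follow"}
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_noindex(html: str) -> bool:
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def noindex_only_in_original(original_html: str, rendered_html: str) -> bool:
    """True when the raw HTML carries a noindex that the rendered HTML has lost."""
    return has_noindex(original_html) and not has_noindex(rendered_html)


# Illustrative inputs: JavaScript has swapped the robots meta tag after load.
original = '<html><head><meta name="robots" content="noindex, follow"></head><body></body></html>'
rendered = '<html><head><meta name="robots" content="index, follow"></head><body><p>Hello</p></body></html>'

print(noindex_only_in_original(original, rendered))  # prints True
```

A check that only inspected `rendered` would report this page as indexable, which is exactly the blind spot described above.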
Shadow DOM & iFrames
Another enhancement we’ve wanted to make is to improve our rendering to better match Google’s own behaviour. Giacomo Zecchini’s recent ‘Challenges of building a search engine like web rendering service’ talk at SMX Advanced provides an excellent summary of some of the challenges and edge cases.
After research and testing, both shadow DOM content and iframes are now supported in the SEO Spider’s rendering, as we try to mimic Google’s web rendering service as closely as possible.
They are enabled by default, but can be disabled when required via ‘Config > Spider > Rendering’. There are further improvements we’d like to make in this area, and if you spot any interesting edge cases then drop us an email.
2) Automated Crawl Reports For Data Studio
Data Studio is commonly the tool of choice for SEO reporting today, whether that’s for your own reports, clients or the boss. To help automate this process to include crawl report data, we’ve introduced a new Data Studio friendly custom crawl overview export available in scheduling.
This has been purpose-built to allow users to select crawl overview data to be exported as a single summary row to Google Sheets. It will automatically append new scheduled exports to a new row in the same sheet in a time series.
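The ‘one summary row per scheduled crawl’ idea can be sketched in a few lines of Python. This is only an illustration of the time series layout: it appends to a local CSV file as a stand-in for Google Sheets, the function name `append_summary_row` is hypothetical, and the dates and counts are invented sample values.

```python
import csv
import tempfile
from pathlib import Path


def append_summary_row(path, summary):
    """Append one crawl summary as a new row, writing the header only once."""
    path = Path(path)
    write_header = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(summary))
        if write_header:
            writer.writeheader()
        writer.writerow(summary)


with tempfile.TemporaryDirectory() as folder:
    sheet = Path(folder) / "crawl_overview.csv"  # stand-in for the Google Sheet

    # Two scheduled crawls, one week apart (sample values only).
    append_summary_row(sheet, {
        "Date": "2021-09-20",
        "Total Internal Indexable URLs": 1200,
        "Total Internal Non-Indexable URLs": 85,
    })
    append_summary_row(sheet, {
        "Date": "2021-09-27",
        "Total Internal Indexable URLs": 1215,
        "Total Internal Non-Indexable URLs": 79,
    })

    with sheet.open(newline="", encoding="utf-8") as handle:
        rows = list(csv.reader(handle))

print(len(rows))  # prints 3: one header row plus one row per crawl
```

Because every crawl adds a single row with the same columns, a reporting tool can treat the date column as the time dimension and chart each metric over time.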
The new crawl overview summary in Google Sheets can then be connected to Data Studio to be used for a fully automated Google Data Studio crawl report. You’re able to copy our very own Screaming Frog Data Studio crawl report template, or create your own better versions!
This allows you or a team to monitor site health and be alerted to issues without having to even open the app. It also allows you to share progress with non-technical stakeholders visually.
Please read our tutorial on ‘How To Automate Crawl Reports In Data Studio’ to set this up.
We’re excited to see alternative Screaming Frog Data Studio report templates, so if you’re a Data Studio whizz and have one you’d like to share with the community, let us know and we will include it in our tutorial.
3) Advanced Search & Filtering
The inbuilt search function has been improved. It defaults to regular text search, but allows you to switch to regex, choose from a variety of predefined filters (including a ‘does not match regex’) and combine rules (and/or).
The search bar displays the syntax used by the search and filter system, so this can be formulated by power users to build common searches and filters quickly, without having to click the buttons to run searches.
The syntax can just be pasted or written directly into the search box to run searches.
4) Translated UI
Alongside English, the GUI is now available in Spanish, German, French and Italian to further support our global users. It will detect the language used on your machine on startup, and default to using it.
Language can also be set within the tool via ‘Config > System > Language’.
A big shoutout and thank you to the awesome MJ Cachón, Riccardo Mares, Jens Umland and Benjamin Thiers at Digimood for their time and amazing help with the translations. We truly appreciate it. You all rock.
Technical SEO jargon alongside the complexity and subtleties in language makes translations difficult, and while we’ve worked hard to get this right with amazing native speaking SEOs, you’re welcome to drop us an email if you have any suggestions to improve further.
We may support additional languages in the future as well.
Version 16.0 also includes a number of smaller updates and bug fixes, outlined below.
- ‘Total Internal Indexable URLs’ and ‘Total Internal Non-Indexable URLs’ have been added to the ‘Overview’ tab and report.
- You’re now able to open saved crawls via the command line and export any data and reports.
- The include and exclude configurations have both been changed to partial regex matching by default. This means you can just type in ‘blog’ rather than, say, ‘.*blog.*’.
- The HTTP refresh header is now supported and reported!
- Scheduling now includes a ‘Duplicate’ option to improve efficiency. This is super useful for custom Data Studio exports, where it saves time selecting the same metrics for each scheduled crawl.
- Alternative images in the picture element are now supported when the ‘Extract Images from srcset Attribute’ config is enabled. A bug where alternative images could be flagged with missing alt text has been fixed.
- The Google Analytics integration now has a search function to help find properties.
- The ‘Max Links per URL to Crawl’ limit has been increased to 50k.
- The default ‘Max Redirects to Follow’ limit has been adjusted to 10, in line with Googlebot before it shows a redirect error.
- PSI requests are now 5x faster, as we realised Google increased their quotas!
- Updated a tonne of Google rich result feature changes for structured data validation.
- Improved forms based authentication further to work in more scenarios.
- Fix macOS launcher to trigger Rosetta install automatically when required.
- Ate plenty of bugs.
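The switch to partial regex matching for include and exclude, noted in the list above, is easiest to see side by side. The sketch below uses Python’s `re` module purely to illustrate the general difference between full and partial matching; the SEO Spider’s own regex engine may differ in details, and the URLs and helper names are made up.

```python
import re

# Hypothetical URLs discovered during a crawl.
urls = [
    "https://example.com/blog/post-1",
    "https://example.com/about",
    "https://example.com/news/blog-roundup",
]


def full_match(pattern, url):
    """The pattern must describe the entire URL."""
    return re.fullmatch(pattern, url) is not None


def partial_match(pattern, url):
    """The pattern only needs to match somewhere within the URL."""
    return re.search(pattern, url) is not None


# Full matching: a bare 'blog' matches nothing, so wildcards are required.
bare_full = [u for u in urls if full_match("blog", u)]
wildcard_full = [u for u in urls if full_match(".*blog.*", u)]

# Partial matching: the bare 'blog' is enough.
bare_partial = [u for u in urls if partial_match("blog", u)]

print(bare_full)                      # prints []
print(wildcard_full == bare_partial)  # prints True
print(bare_partial)
```

With partial matching, anchors such as `^` and `$` are the way to opt back in to stricter matching when a pattern should only apply at the start or end of a URL.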
That’s everything! As always, thanks to everyone for their continued feedback, suggestions and support. If you have any problems with the latest version, do just let us know via support and we will help.
Now, download version 16.0 of the Screaming Frog SEO Spider and let us know what you think in the comments.
Small Update – Version 16.1 Released 27th September 2021
We have just released a small update to version 16.1 of the SEO Spider. This release is mainly bug fixes and small improvements –
- Updated some Spanish translations based on feedback.
- Updated SERP Snippet preview to be more in sync with current SERPs.
- Fix issue preventing the Custom Crawl Overview report for Data Studio working in languages other than English.
- Fix crash resuming crawls with saved Internal URL configuration.
- Fix crash caused by highlighting a selection then clicking another cell in both list and tree views.
- Fix crash duplicating a scheduled crawl.
Small Update – Version 16.2 Released 18th October 2021
We have just released a small update to version 16.2 of the SEO Spider. This release is mainly bug fixes and small improvements –
- Fix issue with corrupt fonts for some users.
- Fix bug in the UI that allowed you to schedule a crawl without a crawl seed in Spider Mode.
- Fix stall opening saved crawls.
- Fix issues with upgrades of database crawls using excessive disk space.
- Fix issue with exported HTML visualisations missing pop up help.
- Fix issue with PSI going too fast.
- Fix issue with Chromium requesting webcam access.
- Fix crash when cancelling an export.
- Fix crash accessing visualisations configuration using languages other than English.
Small Update – Version 16.3 Released 4th November 2021
We have just released a small update to version 16.3 of the SEO Spider. This release is mainly bug fixes and small improvements –
- The Google Search Console integration now has new filters for search type (Discover, Google News, Web etc) and supports regex as per the recent Search Analytics API update.
- Fix issue with Shopify and CloudFront sites loading in Forms Based authentication browser.
- Fix issue with cookies not being displayed in some cases.
- Give unique names to Google Rich Features and Google Rich Features Summary report file names.
- Fix crash running on macOS Monterey.
- Fix right click focus in visualisations.
- Fix crash in Spelling and Grammar UI.
- Fix crash when exporting invalid custom extraction tabs on the CLI.
- Fix crash when flattening shadow DOM.
- Fix crash generating a crawl diff.
- Fix crash when Chromium can’t be initialised.