Scheduling

Dan Sharp

Posted 19 September, 2018


You’re able to schedule crawls to run automatically within the SEO Spider, as a one-off, or at chosen intervals. This feature can be found under ‘File > Scheduling’ within the app.

Click ‘Add’ to set up a scheduled crawl.

Scheduling

Choose the task name, project it should be saved in, date and interval of the scheduled crawl.

Scheduling General Tab

You’re able to pre-select the mode (Spider or List), website address to crawl, saved configuration and authentication config for the scheduled crawl.

Scheduling Spider tab

The APIs tab allows you to choose which APIs to connect to for the scheduled crawl, including Google Analytics, Search Console, PageSpeed Insights, Majestic, Ahrefs and Moz.

Scheduling API tab

The Exports tab has a number of sub-tabs, including Output Settings, Spreadsheets, Local Files, Looker Studio, Presets and Notifications.

Each of these tabs will display a warning at the top that ‘Headless Mode’ needs to be configured for exports.

Scheduling export headless mode warning

This simply means the UI won’t be visible when the SEO Spider runs the scheduled crawl. This is a requirement to avoid users clicking buttons while the scheduled crawl is performing actions, which might be confusing and result in mistakes.

Headless mode can be enabled under the ‘Spider’ tab.

Scheduling Headless Mode

The export options will then be available for selection.

The Output Settings allow users to select how files should be exported – the local output folder or Google Drive account, timestamped folders or overwriting, spreadsheet format (CSV, Excel, or Google Sheets), and an option to consolidate exports into tabs in a single spreadsheet.

Scheduled Crawl Output Settings

The Spreadsheets tab allows you to select any tabs and filters, bulk exports and reports to export.

Scheduling Spreadsheets tab

Local files options include saving a crawl (crawls are autosaved in database storage mode, so this is typically not required), creating an XML Sitemap or an Image Sitemap.

Scheduling Local Files

The Looker Studio tab allows users to export a timeseries report to Google Sheets that can be connected to Looker Studio.

Scheduling LookerStudio

This has been purpose-built to allow users to select crawl overview data to be exported as a single summary row to Google Sheets. It will automatically append new scheduled exports to a new row in the same sheet in a time series.
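The append behaviour can be sketched as follows. This is a minimal illustration of the time-series pattern, not the SEO Spider’s implementation: a local CSV stands in for the Google Sheet, and the column names are hypothetical.

```python
# Sketch of the time-series export pattern: each scheduled crawl appends
# one summary row to the same sheet; a header is written only once.
# Column names here are hypothetical, not the SEO Spider's actual columns.
import csv
import io

def append_summary_row(sheet: io.StringIO, row: dict) -> None:
    """Append one crawl-overview row, writing a header only for a new sheet."""
    is_new_sheet = sheet.tell() == 0
    writer = csv.DictWriter(sheet, fieldnames=row.keys())
    if is_new_sheet:
        writer.writeheader()
    writer.writerow(row)

sheet = io.StringIO()
append_summary_row(sheet, {"date": "2018-09-19", "urls_crawled": 1250, "response_errors": 12})
append_summary_row(sheet, {"date": "2018-09-26", "urls_crawled": 1310, "response_errors": 9})

rows = list(csv.DictReader(io.StringIO(sheet.getvalue())))
print(len(rows))  # one row per scheduled crawl
```

Each weekly crawl adds a new row, so the sheet grows into a time series that Looker Studio can chart directly.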

Please read our tutorial on ‘How To Automate Crawl Reports In Looker Studio’ to set this up.

The Presets tab allows you to set up the export of multiple reports as a single preset to use across scheduled crawls to improve efficiency.

Scheduling Presets

The Notifications options allow email notifications to be sent once a crawl has completed. Please note, the exports themselves are not currently sent with the email.

Scheduling Notifications

Scheduling & Google Sheets

When selecting to export, you can choose to automatically export any tabs, filters, bulk exports or reports to Google Sheets by switching the ‘format’ to gsheet. This will save a Google Sheet within your Google Drive account in a ‘Screaming Frog SEO Spider’ folder.

The ‘project name’ and ‘crawl name’ used in scheduling will be used as folders for the exports. So for example, a ‘Screaming Frog’ project name and ‘Weekly Crawl’ task name will sit within Google Drive like below.

Google Drive Location For Scheduling

You’re also able to choose to overwrite the existing file (if present), or create a timestamped folder in Google Drive using the ‘Output Mode’ options under ‘Output Settings’.


Tips On Scheduling

There are a few things to remember when using scheduling.

  • If you’re using database storage mode, there is no need to ‘save’ crawls in scheduling, as they are stored automatically within the SEO Spider’s database. Crawls can be opened via the ‘File > Crawls’ menu in the application after the scheduled crawl has been performed. Please see our guide on saving, opening, exporting & importing crawls.
  • A new instance of the SEO Spider is started for a scheduled crawl. So if there is an overlap of crawls, multiple instances of the SEO Spider will run at the same time, rather than there being a delay until the previous crawl has completed. Hence, we recommend considering your system resources and timing of crawls appropriately.
  • The SEO Spider will run in headless mode (meaning without an interface) when scheduled to export data. This is to avoid any user interaction or the application starting in front of you and options being clicked, which would be a little strange.
  • This scheduling is within the user interface. If you’d prefer to use the command line to operate the SEO Spider, please see our command line interface guide.
  • If you experience any issues with a scheduled crawl, the first step is to look under ‘File > Scheduling’ and ensure the crawl is set up as ‘Valid’. If it’s not valid, then click through the tabs to find the issue and correct any highlighted problems. If the scheduled crawl is valid, click ‘File > Scheduling > History’ and check to see if the crawl has an ‘end’ date and time, or if there are any errors reported under the ‘error’ column.
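For those who prefer the command line route mentioned above, a scheduled crawl can be sketched roughly as below. This is an illustrative sketch only: the flags shown are based on the command line interface guide and may differ by version or platform, so check `screamingfrogseospider --help` before relying on them; all paths and the cron schedule are placeholders.

```shell
# Hypothetical example: a weekly headless crawl with exports, scheduled via
# cron instead of the UI scheduler. Flags per the CLI guide; verify with
# `screamingfrogseospider --help` as they may vary by version.
# Example crontab entry (every Monday at 03:00):
#   0 3 * * 1 /usr/bin/screamingfrogseospider [flags below] >> /var/log/sf-crawl.log 2>&1
screamingfrogseospider --crawl https://www.example.com \
  --headless \
  --config /path/to/saved.seospiderconfig \
  --output-folder /path/to/exports \
  --timestamped-output \
  --export-tabs "Internal:All" \
  --export-format csv
```

As with the UI scheduler, headless mode keeps the interface from appearing, and each run writes its exports into a new timestamped folder.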

Dan Sharp is founder & Director of Screaming Frog. He has developed search strategies for a variety of clients from international brands to small and medium-sized businesses and designed and managed the build of the innovative SEO Spider software.

