Command line interface set-up

If you are running on a platform that won’t allow you to run the user interface at all, then you’ll need to follow the instructions in this guide before running the SEO Spider via the Command Line.

If you can run the User Interface, please do so before running on the Command Line. This will allow you to accept the End User Licence Agreement (EULA), enter your licence key and select a storage mode.

When the User Interface is not available to perform the initial run, you have to edit a few configuration files. The location of these varies depending on platform:

Windows:
C:\Users\USERNAME\.ScreamingFrogSEOSpider\
macOS:
~/.ScreamingFrogSEOSpider/
Ubuntu:
~/.ScreamingFrogSEOSpider/
From now on we’ll refer to this as the .ScreamingFrogSEOSpider directory.


Entering Your Licence Key

Create a file in your .ScreamingFrogSEOSpider directory called licence.txt. Enter (copy and paste to avoid typos) your licence username on the first line and your licence key on the second line, and save the file.
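
For example, with purely placeholder details (not a real licence), the two lines of licence.txt would look like this:
yourusername
ABCD-1234-5678-EFGH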


Accepting the EULA

Create or edit the file spider.config in your .ScreamingFrogSEOSpider directory. Locate and edit or add the following line:
eula.accepted=15
Save the file and exit. Please note, the number may need to be adjusted to match a later EULA version.


Choosing Storage Mode

The default storage mode is memory. If you are happy to use memory storage you don’t need to change anything. To change to database storage mode edit the file spider.config in your .ScreamingFrogSEOSpider directory. Add or edit the storage.mode property to be:
storage.mode=DB
The default path is a directory called db in your .ScreamingFrogSEOSpider directory. If you would like to change this, add or edit the storage.db_dir property. Depending on your OS, the path will have to be entered differently.

Windows:
storage.db_dir=C\:\\Users\\USERNAME\\dbfolder
macOS:
storage.db_dir=/Users/USERNAME/dbfolder
Ubuntu:
storage.db_dir=/home/USERNAME/dbdir
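
Putting these together, a spider.config prepared for a headless Linux machine using database storage might contain lines like the following (the path is illustrative, and the EULA version number may differ):
eula.accepted=15
storage.mode=DB
storage.db_dir=/home/USERNAME/dbfolder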

Disabling the Embedded Browser

To disable the embedded browser, add or edit the following line in spider.config:
embeddedBrowser.enable=false


Memory Allocation

We recommend changing memory allocation within the SEO Spider interface under ‘File > Settings > Memory Allocation’, then running the SEO Spider via the CLI once it’s set.

However, users who are running truly headless (with no monitor attached) tend to run on Linux based operating systems. To configure this on a Linux operating system, you need to create or modify a file named ‘.screamingfrogseospider’ in the home directory (i.e. ~/.screamingfrogseospider).

Add or modify the following line accordingly (shown for an 8GB configuration):
-Xmx8g
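
If you are working at a shell on the headless machine, one quick way to create this file is with a redirect. This is just a sketch for an 8GB allocation, and note it will overwrite any existing contents of the file:
echo "-Xmx8g" > ~/.screamingfrogseospider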


Connecting To APIs

To utilise APIs we recommend using the user interface to set up and authorise credentials before using the CLI. However, when the user interface is not available, the APIs can be utilised by copying across the required folders set up on another machine, or by editing the spider.config file, depending on the API.

Google Analytics and Google Search Console both require connecting and authorising using the user interface due to OAuth, so a different machine which is able to use the SEO Spider interface should be used to set these up. Once authorised, the credentials can then be copied over to the machine where the user interface is not available.

Please navigate to the respective local Screaming Frog SEO Spider user folder –

Windows:
C:\Users\USERNAME\.ScreamingFrogSEOSpider\
macOS:
~/.ScreamingFrogSEOSpider/
Ubuntu:
~/.ScreamingFrogSEOSpider/
And copy the ‘analytics’ and ‘search_console’ folders.

Then paste them into the Screaming Frog SEO Spider user folder on the machine without a user interface.
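
If both machines are reachable over SSH, one way to do this in a single step is with scp. This is only a sketch, and user@headless-host is a placeholder for the machine without a user interface:
scp -r ~/.ScreamingFrogSEOSpider/analytics ~/.ScreamingFrogSEOSpider/search_console user@headless-host:~/.ScreamingFrogSEOSpider/

Once the credentials are in place, the APIs can then be utilised using the CLI as normal with the following commands.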

Use the Google Analytics API during crawl.
--use-google-analytics "google account" "account" "property" "view" "segment"
Use the Google Search Console API during crawl.
--use-google-search-console "google account" "website"
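
Pulling this together, a full headless crawl using both APIs might look something like the example below. The URL and quoted account details are placeholders, and this assumes the Ubuntu CLI launcher name screamingfrogseospider (the executable name differs on other platforms):
screamingfrogseospider --crawl https://www.example.com --headless --use-google-analytics "user@example.com" "Account" "Property" "View" "All Users" --use-google-search-console "user@example.com" "https://www.example.com"
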
PSI, Ahrefs, Majestic and Moz all require the ‘spider.config’ file to be edited and updated with their respective API keys. The spider.config file can be found in the Screaming Frog SEO Spider user folder as shown above.

To use each API, simply paste in the relevant line and replace ‘APIKEY’ with the key provided by each service.
PSI.secretkey=APIKEY
ahrefs.authkey=APIKEY
majestic.authkey=APIKEY
moz.secretkey=APIKEY
The APIs can then be utilised using the CLI as normal with the following commands.

Use the PageSpeed Insights API during crawl.
--use-pagespeed
Use the Ahrefs API during crawl.
--use-ahrefs
Use the Majestic API during crawl.
--use-majestic
Use the Mozscape API during crawl.
--use-mozscape
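
As with the Google APIs, these switches are simply appended to an otherwise normal headless crawl, for example (the URL is a placeholder, and only the APIs you have configured should be included):
screamingfrogseospider --crawl https://www.example.com --headless --use-pagespeed --use-majestic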


Exporting to Google Drive

To utilise Google Drive exports via the CLI, the machine will require appropriate credentials, similar to any other API.

These credentials can be authorised via the user interface, or by copying the ‘google_drive’ folder from another machine as described above.
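
With credentials in place, Google Drive can then be used as an export destination from the CLI by choosing the gsheet export format, for example (the exported tab is illustrative):
--export-tabs "Internal:All" --export-format gsheet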


Configuring A Crawl

If a feature or configuration option isn’t available as a specific command line option (like exclude or JavaScript rendering), you will need to use the user interface to set the exact configuration you wish and save the configuration file.

You can then supply that configuration file when using the CLI to utilise those features.

When the user interface is not available, we recommend setting up the configuration file on a machine where it is available first, transferring over the saved .seospiderconfig file, and then supplying it via the command line. The command line for supercool-config.seospiderconfig would be as follows –
--config "C:\Users\Your Name\Desktop\supercool-config.seospiderconfig"
