Command line interface set-up
If you are running on a platform that won’t allow you to run the user interface at all, then you’ll need to follow the instructions in this guide before running the SEO Spider via the Command Line.
If you can run the User Interface, please do so before running on the Command Line. This will allow you to accept the End User Licence Agreement (EULA), enter your licence key and select a storage mode.
When the User Interface is not available to perform the initial run, you have to edit a few configuration files. On each platform these live in a directory named .ScreamingFrogSEOSpider inside your user home directory (for example C:\Users\Your Name\.ScreamingFrogSEOSpider on Windows, or ~/.ScreamingFrogSEOSpider on macOS and Ubuntu).
From now on we’ll refer to this as the .ScreamingFrogSEOSpider directory.
Entering Your Licence Key
Create a file in your .ScreamingFrogSEOSpider directory called licence.txt. Enter (copy and paste to avoid typos) your licence username on the first line and your licence key on the second line, and save the file.
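As a sketch, the file can be created from a terminal like this (the username and key below are placeholders for your own licence details):

```shell
# Create the licence file in the .ScreamingFrogSEOSpider directory.
# 'your-licence-username' and 'YOUR-LICENCE-KEY' are placeholders.
mkdir -p ~/.ScreamingFrogSEOSpider
printf '%s\n%s\n' 'your-licence-username' 'YOUR-LICENCE-KEY' \
  > ~/.ScreamingFrogSEOSpider/licence.txt
```

The username goes on the first line and the key on the second, exactly as described above.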
Accepting the EULA
Create or edit the file spider.config in your .ScreamingFrogSEOSpider directory. Locate and edit or add the following line:
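The property in question is eula.accepted, whose value is the version number of the EULA being accepted. The version changes between releases, so 16 below is purely illustrative — check Screaming Frog’s current documentation for the value matching your release:

```
eula.accepted=16
```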
Save the file and exit.
Choosing Storage Mode
The default storage mode is memory. If you are happy to use memory storage you don’t need to change anything. To change to database storage mode edit the file spider.config in your .ScreamingFrogSEOSpider directory. Add or edit the storage.mode property to be:
The default path is a directory called db in your .ScreamingFrogSEOSpider directory. If you would like to change this add or edit the storage.db_dir property. Depending on your OS the path will have to be entered differently.
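As a sketch (the paths below are placeholders), the relevant spider.config lines look like this. Note that spider.config is a Java properties file, so on Windows the backslashes and drive colon in the path must be escaped:

```
storage.mode=DB
# macOS/Ubuntu style path:
storage.db_dir=/home/youruser/screamingfrogdb
# Windows style path (escaped for the properties format):
# storage.db_dir=C\:\\Users\\Your Name\\screamingfrogdb
```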
Memory Allocation
We recommend changing memory allocation within the SEO Spider interface under ‘Configuration > System > Memory’, then running the SEO Spider via the CLI once it’s set.
However, users who are running truly headless (with no monitor attached) tend to run on Linux-based operating systems. To configure memory allocation on a Linux operating system, modify or create a file named ‘.screamingfrogseospider’ in your home directory (i.e. ~/.screamingfrogseospider).
Add or modify the following line accordingly (shown for an 8GB configuration):
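The file takes standard Java VM arguments, so the maximum heap is set with the usual -Xmx option — for an 8GB allocation:

```
-Xmx8g
```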
Connecting To APIs
To utilise APIs we recommend using the user interface to set up and authorise credentials before using the CLI. However, when the user interface is not available, the APIs can be utilised by copying across the required folders set up on another machine, or by editing the spider.config file, depending on the API.
Google Analytics and Google Search Console both require connecting and authorising via the user interface due to OAuth, so a different machine that can run the SEO Spider interface should be used to set these up. Once authorised, the credentials can be copied over to the machine where the user interface is not available.
Navigate to the local Screaming Frog SEO Spider user folder (the .ScreamingFrogSEOSpider directory described above) and copy the ‘analytics’ and ‘search_console’ folders.
Paste those folders into the Screaming Frog SEO Spider user folder on the machine without a user interface. The APIs can then be utilised via the CLI as normal with the following commands.
Use the Google Analytics API during crawl.
--use-google-analytics [google account] [account] [property] [view] [segment]
Use the Google Search Console API during crawl.
--use-google-search-console [google account] [website]
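For example, a full invocation combining these options with the standard --crawl and --headless arguments might look like the sketch below (the URL, Google account and account/property/view names are all placeholders to replace with your own details):

```shell
# Placeholder values shown; substitute your own crawl URL and Google account details.
screamingfrogseospider --crawl https://example.com --headless \
  --use-google-analytics "user@example.com" "Account Name" "Property Name" "View Name" "All Users" \
  --use-google-search-console "user@example.com" "https://example.com"
```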
PSI, Ahrefs, Majestic and Moz all require the ‘spider.config’ file to be edited and updated with their respective API keys. The spider.config file can be found in the Screaming Frog SEO Spider user folder as shown above.
To use each API, simply paste in the relevant API line and replace ‘APIKEY’ with the API key provided by each provider.
The APIs can then be utilised using the CLI as normal with the following commands.
Use the PageSpeed Insights API during crawl.
Use the Ahrefs API during crawl.
Use the Majestic API during crawl.
Use the Mozscape API during crawl.
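As a sketch, these correspond to simple flags on the CLI — the option names below are taken from Screaming Frog’s command line documentation, but verify them against the --help output of your version:

```shell
# Placeholder URL; check 'screamingfrogseospider --help' for your version's exact flags.
screamingfrogseospider --crawl https://example.com --headless --use-pagespeed
# Similarly: --use-ahrefs, --use-majestic and --use-mozscape for the other APIs.
```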
Exporting to Google Drive
To utilise Google Drive exports in CLI the machine will require appropriate credentials, similar to any other API.
These credentials can be authorised via the user-interface or by copying the ‘google_drive’ folder from another machine as described above.
Configuring A Crawl
A crawl configuration can be saved from the user interface (‘File > Configuration > Save As’) as a .seospiderconfig file, which you can then supply when using the CLI to utilise those features.
When the user interface is not available, we recommend setting up the configuration file first on a machine where it is available, transferring over the saved .seospiderconfig file and then supplying it via the command line. The command line argument for supercool-config.seospiderconfig would be as follows –
--config "C:\Users\Your Name\Desktop\supercool-config.seospiderconfig"
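Put together with a crawl, a full headless invocation using that argument might look like the sketch below (the URL and file path are illustrative):

```shell
screamingfrogseospider --crawl https://example.com --headless --config "C:\Users\Your Name\Desktop\supercool-config.seospiderconfig"
```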