SEO Spider FAQ
What do Indexable & Non-Indexable mean?
Every URL discovered in a crawl is classified as either 'Indexable' or 'Non-Indexable'.
'Indexable' means a URL that can be crawled, responds with a ‘200’ status code and is permitted to be indexed.
'Non-Indexable' is a URL that can't be crawled, doesn't respond with a '200' status code, or has an instruction not to be indexed.
Every non-indexable URL has an 'Indexability Status' associated with it, which explains quickly why it isn't indexable.
Non-indexable can include URLs that are the following -
- Blocked by robots.txt.
- No Response.
- Client Error (4XX).
- Server Error (5XX).
- Noindex (or 'None').
To stop self referencing meta refresh URLs being considered as 'non-indexable', untick the 'Respect Self Referencing Meta Refresh' configuration under 'Configuration > Spider > Advanced'.
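The classification described above can be sketched as a small decision function. This is only an illustration of the rules as worded in this answer, not the SEO Spider's actual implementation: the function name, its parameters, the use of status code 0 for "no response" and the 'Non-200 Response' label are all assumptions made for the example.

```python
from typing import Tuple


def classify_indexability(blocked_by_robots: bool,
                          status_code: int,
                          noindex: bool) -> Tuple[str, str]:
    """Return (indexability, indexability_status) for one URL.

    Hypothetical helper: mirrors the rules in the text above.
    """
    if blocked_by_robots:
        return ("Non-Indexable", "Blocked by robots.txt")
    if status_code == 0:
        # Assumption: 0 is used here to mean no response was received.
        return ("Non-Indexable", "No Response")
    if 400 <= status_code < 500:
        return ("Non-Indexable", "Client Error (4XX)")
    if 500 <= status_code < 600:
        return ("Non-Indexable", "Server Error (5XX)")
    if status_code != 200:
        # Hypothetical label for any other non-200 response.
        return ("Non-Indexable", "Non-200 Response")
    if noindex:
        return ("Non-Indexable", "Noindex")
    return ("Indexable", "")
```

A URL is only reported as 'Indexable' here when every check passes, which matches the definition given above: crawlable, a '200' response, and no instruction against indexing.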
Do you have an API?
In short, no. The SEO Spider is a desktop application you download, install and run locally. So there is no API.
There is a command line interface to use the tool programmatically. There is also a scheduling feature built into the SEO Spider.
Why is the GUI text garbled?
This is triggered by a local font issue, normally caused by having duplicate Arial fonts installed.
To investigate, open the "Font Book" application and go to "Edit->Look for Enabled Duplicates..." to remove any duplicates. After resolving these, try restarting the SEO Spider. If you still have an issue, go back to Font Book and check whether there are any messages about your Arial fonts needing repair. If so, repair them and restart the SEO Spider. If you still have an issue, go to "File->Restore Standard Fonts...". The fonts removed by this will go into a separate folder in Font Book, so you'll be able to add them back in as needed.
How do I maintain order in a list mode export?
If you wish to export data in list mode in the same order it was uploaded, then use the ‘Export’ button which appears next to the ‘upload’ and ‘start’ buttons at the top of the user interface.
The data in the export will be in the same order and include all of the exact URLs in the original upload, including duplicates or any fix-ups performed.
Why Do I Receive An Error When Granting Access To My Google Account?
After allowing the SEO Spider access to your Google account, you should be redirected to a screen confirming that access was granted. If you receive an error instead, there are a few things to check:
- Is there any security software running on your machine preventing the SEO Spider listening on the port specified in the URL? The port is the number after localhost: in the address bar, for example 63212.
- Is your browser sending the request, intended for localhost, to a proxy instead? You can sometimes tell this if the failure screen mentions the name of a proxy server, such as Squid for example.
What hardware is recommended?
In short: For crawls under 100-200k URLs, a 64-bit OS and 8GB of RAM should be sufficient. To be able to crawl millions of URLs, an SSD and 16GB of RAM are recommended.
Hard Disk: We highly recommend having an SSD and switching the SEO Spider to database storage mode to crawl large websites. A 500GB SSD will suffice, but 1TB is recommended if you're performing lots of large crawls.
Memory: The SEO Spider stores all crawl data in memory by default, but it can be configured to store data within a database to crawl more URLs. The more memory you have allocated, the more URLs you will be able to crawl in both regular memory storage mode and database storage mode. To be able to allocate more than 1GB of memory you need a 64-bit operating system. Most PCs purchased in the last five years will be running a 64-bit OS. So the most important thing is to make sure you have plenty of memory available. Each website is unique in terms of how much memory it requires, so we cannot give exact figures on how much memory is required to crawl a certain number of URLs. As a very rough guide, a 64-bit machine with 8GB of RAM will generally allow you to crawl about 200,000 URLs in memory storage mode. In database storage mode, this should allow you to crawl approx. 5 million URLs.
CPU: The speed of a crawl will normally be limited by the website itself, rather than the SEO Spider, as most sites limit the number of concurrent connections they will accept from a single IP. When crawling hundreds of thousands of URLs some operations will be limited by CPU, such as sorting and searching, so a fast CPU will help minimise these slowdowns.
Why does the spider show in the task bar but not on screen?
The spider is opening off screen, possibly due to a multi monitor setup that has recently changed. To move the spider on to the active monitor, use Alt + Tab to select the spider, then hold down the Windows key and use the arrow keys to move the Spider window into view.
What IP address and ports does the SEO Spider use?
The SEO Spider runs from the machine it is installed on, so the IP address is simply that of this machine/network. You can find out what this is by typing “IP Address” into Google.
The local port used for the connection will be from the ephemeral range. The port being connected to will generally be port 80, the default http port or port 443, the default https port. Other ports will be connected to if the site being crawled or any of its links specify a different port. For example: http://www.example.com:8080/home.html
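As a rough illustration of which destination port a given URL implies, the sketch below uses Python's standard `urllib.parse` module. The helper name `destination_port` is hypothetical and is not part of the SEO Spider; it only applies the rule described above (an explicit port in the URL wins, otherwise 80 for http and 443 for https).

```python
from urllib.parse import urlsplit

# Default ports for the two schemes mentioned above.
DEFAULT_PORTS = {"http": 80, "https": 443}


def destination_port(url: str) -> int:
    """Return the port a client would connect to for this URL."""
    parts = urlsplit(url)
    if parts.port is not None:
        # The URL names a port explicitly, e.g. :8080
        return parts.port
    return DEFAULT_PORTS[parts.scheme]
```

For the example URL above, `destination_port("http://www.example.com:8080/home.html")` gives 8080, which is the port a firewall rule would need to allow.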
How many users are permitted to use one licence?
Licences are individual per user. A single licence key is for a single assigned user. If you have five people from your team that wish to use the SEO Spider, you will require 5 user licences.
Discounts are available for 5 users or more, as shown in our pricing.
Please see section 3 of our terms and conditions for full details.
Why is my Licence Key saying it’s invalid?
If the SEO Spider says your ‘licence key is invalid’, then please check the following, as the licence keys we provide always work.
Licence keys are displayed on screen when you check out, sent in an email with the subject "Screaming Frog SEO Spider licence details" and are available at any time by logging into your account.
- Ensure you are using the username we provided for your licence key, as this isn't always the same as your account username and it's not your email address. This is by far the most common issue we see.
- Copy and paste the username and licence key, as they are not designed to be entered manually.
- Please also double check you have inserted the provided ‘Username’ in the ‘Username’ field and the provided ‘Licence Key’, in the ‘Licence Key’ field.
- Ensure you are not entering a Log File Analyser licence into the SEO Spider.
- Ensure you are not entering an SEO Spider licence into the Log File Analyser.
I have lost my licence or invoice, how do I get another one?
If you have lost your licence key or invoice from the 22nd of September 2014 onwards, please log in to your account to retrieve the details.
If you have lost your account password, then simply request a new password via the form.
If you purchased a licence before the 22nd of September 2014, then please contact firstname.lastname@example.org with your username or e-mail you used to pay for the premium version.
How do I buy a licence?
Simply click on the ‘buy a licence’ option in the SEO Spider ‘licence’ menu or visit our purchase a licence page directly.
You can then create an account & make payment. When this is complete, you will be provided with your licence key to open up the tool & remove the crawl limit. If you have just purchased a licence and have not received it, please check your spam / junk folder. You can also view your licence(s) details and invoice(s) by logging into your account.
Please note, the account login has only been active from the 22nd of September 2014. If you purchased before this date, it won’t be available and you can contact us for any information.
Will an SEO Spider licence work in the Log File Analyser?
No, the Screaming Frog SEO Spider is a separate product to the Log File Analyser. They have different licences, which will need to be purchased individually. You can purchase a Log File Analyser licence here.
Do you offer discounts on bulk licence purchases?
Yes, please see our SEO Spider licence page for more details on discounts.
I have purchased a licence, why have I not received it?
If you have just purchased a licence and have not received your licence, please check your spam / junk folder. Licences are sent immediately upon purchase. You can also view your licence(s) details and invoice(s) by logging into your account.
Why is my credit card payment being declined?
There are a few reasons this could happen:
- Incorrect card details: Double check you have filled out your card details correctly.
- Incorrect billing address: Please check the billing address you provided matches the address of the payment card.
- Blocked by payment provider: Please contact your card issuer. Screaming Frog does not have access to failure reasons. It’s quite common for a card issuer to block international purchases.
Do you work with resellers?
Resellers can purchase an SEO Spider licence online on behalf of a client. Please be aware that licence usernames are automatically generated from the account name entered during checkout. If you require a custom username, then please request a PayPal invoice in advance.
For resellers who are unable to purchase online with PayPal or a credit card and encumber us with admin such as vendor forms, we reserve the right to charge an administration fee of £50.
What is the part number?
There is no part number or SKU.
Where can I get Form W-9 information?
Screaming Frog is a UK based company, so this is not applicable.
Why won’t the SEO Spider crawl my website?
This could be for a number of reasons:
- The very first thing to look at is the status code and status in the Internal tab. The site should respond with a 200 status code and 'OK' status. However, if it doesn't, please read our guide on common HTTP status codes when crawling, what they mean and how to resolve any issues.
- The site is blocked by robots.txt. The 'status code' column in the internal tab will be a '0' and the 'status' column for the URL will say 'Blocked by Robots.txt'. You can configure the SEO Spider to ignore robots.txt under 'Configuration > Robots.txt > Settings'.
- The site behaves differently depending on User-Agent. Try changing the User-Agent under Configuration->HTTP Header->User Agent.
- The site requires Cookies. Can you view the site with cookies disabled in your browser after clearing your cache? Licenced users can enable cookies by going to Configuration->Spider and ticking “Allow Cookies” in the “Advanced” tab.
- The ‘nofollow’ attribute is present on links not being crawled. There is an option in Configuration->Spider under the “Basic” tab to follow ‘nofollow’ links.
- The page has a page level ‘nofollow’ attribute. This could be set by either a meta robots tag or an X-Robots-Tag in the HTTP header. These can be seen in the “Directives” tab in the “Nofollow” filter. To ignore the NoFollow directive go to Configuration -> Spider, tick "Follow Internal 'No Follow'" and recrawl.
- The website is using framesets. The SEO Spider does not crawl the frame src attribute.
- The website requires an Accept-Language header (under Configuration->HTTP Header, add a header called 'Accept-Language' with a value of 'en-gb').
- The Content-Type header did not indicate the page is HTML. This is shown in the Content column and should be either
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
Why does the SEO Spider freeze?
This will generally be due to the SEO Spider reaching its memory limit. Please read how to increase memory.
Why do I get a “Connection Error” response?
A connection error or connection timeout is reported when there is an issue receiving any response at all. This is generally due to network issues or proxy settings. Please check that you can connect to the internet. If you have changed the SEO Spider proxy settings (under configuration, proxy), please ensure that these are correct (or that they are switched off).
Why do I get a “403 Forbidden” error response?
The 403 Forbidden status code occurs when a web server denies access to the SEO Spider’s request for some reason.
If this happens consistently and you can see the website in a browser, it could be the web server behaves differently depending on User Agent. In the premium version try adjusting the User Agent setting under Configuration->HTTP Header->User Agent. For example, try crawling as a bot, such as ‘Googlebot Regular’, or as a browser, such as ‘Chrome’.
If this happens intermittently during a crawl, it could be due to the speed the Spider is requesting pages overwhelming the server. In the premium version of the SEO Spider you can reduce the speed of requests. If you are running the ‘lite’ version you may find that right clicking the URL and choosing re-spider will help.
Why Am I Experiencing A Different Response In A Browser?
The SEO Spider HTTP request is often different to a traditional browser and other tools, so you can sometimes experience a different response than if you visit the page or use a different tool to check the response.
The SEO Spider simply reports on the response given to it by the server when it makes a request, which won’t be incorrect, but can differ from what might be experienced elsewhere. Some of the common factors that can cause servers to give a different response, that are configurable in the SEO Spider are -
- User-Agent - The SEO Spider uses its own user-agent by default, and so do browsers. You can find the User-Agent configuration under ‘Configuration > HTTP Header > User-Agent’. If you adjust this to a browser user-agent (Chrome etc), you may experience a different response.
- Cookies - By default the SEO Spider doesn't accept cookies (similar to Google). However, browsers do. If you disable cookies in your browser, you may see that the page doesn't load anymore, issues session IDs into the URL, or redirects to itself. You can 'allow cookies' under 'Configuration > Spider > Advanced'.
- Accept-Language Header - Your browser will supply an accept language header with your language. Similar to Googlebot, the SEO Spider doesn't supply an Accept-Language header for requests by default. However, you can adjust the Accept-Language configuration under 'Configuration > HTTP Header > Accept-Language'.
- Speed - Servers can respond differently when under stress and load. Their responses can be less stable. We recommend reducing the crawl speed and seeing if the responses then change, and using WireShark to verify responses independently.
Why is the character encoding incorrect?
The SEO Spider determines the character encoding of a web page from the “charset=” parameter in the HTTP Content-Type header, for example “text/html; charset=UTF-8”.
You can see this in the SEO Spider’s interface in the ‘Content’ columns (in various tabs). If this is not present in the http header, the SEO Spider will then read the first 2048 bytes of the html page to see if there is a charset within the html.
For example –
<meta http-equiv="Content-Type" content="text/html; charset=windows-1255">
If this is not the case, we continue assuming the page is UTF-8.
The Spider does log any character encoding issues. If there is a specific page that is causing problems, perform a crawl of only that page by setting the maximum number of URLs to crawl to be 1, then crawling the URL. You may see a line in the trace.txt log file (the location is C:\Users\Yourprofile\.ScreamingFrogSEOSpider\trace.txt):
20-06-12 20:32:50 INFO seo.spider.net.InputStreamWrapper:logUnsupportedCharset Unsupported Encoding 'windows-' reverting to 'UTF-8' on page 'http://www.example.com' java.io.UnsupportedEncodingException: windows-'.
This could be an error on the site or you may need to install an additional language pack.
To fix this, specify the character encoding either in the Content-Type field of the accompanying HTTP header, or ensure the charset parameter in the source code is within the first 2048 bytes of the HTML, inside the head element.
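The lookup order described in this answer (HTTP header first, then the first 2048 bytes of the HTML, then a UTF-8 fallback) can be sketched as follows. This is a simplified illustration and not the SEO Spider's own code: the function name `detect_charset`, its parameters and the regular expression are assumptions made for the example.

```python
import re
from typing import Optional

# Matches charset=VALUE, with or without a quote before the value.
CHARSET_RE = re.compile(r'''charset\s*=\s*["']?([A-Za-z0-9._:-]+)''',
                        re.IGNORECASE)


def detect_charset(content_type_header: Optional[str], body: bytes) -> str:
    """Pick a character encoding using the order described above."""
    # 1. Prefer the charset parameter of the HTTP Content-Type header.
    if content_type_header:
        match = CHARSET_RE.search(content_type_header)
        if match:
            return match.group(1)
    # 2. Otherwise look for a charset in the first 2048 bytes of the HTML.
    head = body[:2048].decode("ascii", errors="ignore")
    match = CHARSET_RE.search(head)
    if match:
        return match.group(1)
    # 3. Otherwise assume UTF-8.
    return "UTF-8"
```

A meta tag that appears after the first 2048 bytes is never seen by step 2, which is why the charset declaration should sit early in the head element.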
Why is the SEO Spider not finding images?
There are generally two reasons for this:
- The images are blocked by robots.txt. You can either ignore robots.txt or customise the robots.txt to allow crawling.
Why do I get a ‘Project open failed java.io.EOFException’ when attempting to open a saved crawl?
This means the crawl did not save completely, which is why it can’t be opened. EOF stands for ‘end of file’, which means the SEO Spider was unable to read to the expected end of the file. This can be due to the SEO Spider crashing during save, which is normally due to running out of memory. This can also happen if you exit the SEO Spider during save, or your machine crashes for example. Unfortunately there is no way to open or retrieve the crawl data, as it’s incomplete and therefore lost. Please also consider increasing your memory allocation, which will help reduce any problems saving a crawl in the future.
Why isn’t my Include/Exclude function working?
Please note Include/Exclude are case sensitive, so any functions need to match the URL exactly as it appears.
Functions will only be applied to URLs that have not yet been discovered by the Spider. Any URLs that have already been discovered and queued for crawling will not be affected, hence it is recommended the crawl is restarted between updates to ensure the results are accurate.
Functions will not be applied to the starting URL of a crawl or URLs in list mode.
.* is the regex wildcard.
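To illustrate why case and wildcards matter, here is a minimal sketch of an exclude filter using Python's `re` module. It is not the SEO Spider's implementation: the helper `apply_exclude` is hypothetical, Python's regex dialect is not identical to the one the SEO Spider uses, and the sketch assumes a pattern has to match the whole URL, as the wording above suggests.

```python
import re
from typing import Iterable, List


def apply_exclude(urls: Iterable[str], exclude_patterns: List[str]) -> List[str]:
    """Drop every URL that is fully matched by any exclude pattern.

    Matching is case sensitive, so '/blog/' and '/Blog/' are different.
    """
    compiled = [re.compile(pattern) for pattern in exclude_patterns]
    return [url for url in urls
            if not any(regex.fullmatch(url) for regex in compiled)]


urls = [
    "http://www.example.com/Blog/post",
    "http://www.example.com/blog/post",
    "http://www.example.com/about",
]
# Only the lower-case /blog/ URL is removed.
remaining = apply_exclude(urls, [r".*/blog/.*"])
```

Under that whole-URL assumption, a pattern such as `/blog/` without the surrounding `.*` wildcards would exclude nothing, which is a common reason a function appears not to work.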
Why do I get “error opening file for writing” when installing?
Please reboot your computer and restart the installation process.
Do you support Macs below macOS Version 10.7.3 (& 32-Bit Macs)?
From version 2.50 the SEO Spider requires a version of Java not supported by this version of macOS. This means older 32-bit Macs (the last of which we understand were made 8-9 years ago) will not be able to use the latest version of the SEO Spider. Newer 64-bit Macs which haven’t yet updated their version of macOS will need to update their OS before installing Java.
We do still support version 2.40 for macOS versions below 10.7.3 (and 32-bit Macs), which can be downloaded here. This version has considerably fewer features than the current version, as described in our release history.
The Spider GUI doesn’t have the latest flat style used in Yosemite
Unfortunately we are at the mercy of Oracle to update their Mac look and feel to more closely match the new style introduced in macOS Yosemite. There is a Java bug related to this at JDK-8052173. This will be updated in a future Java release.
How do I provide feedback?
Feedback is welcome, please just follow the steps on the support page to submit feedback. Please note we will try to read all messages but might not be able to reply to all of them. We will update this FAQ as we receive additional questions and feedback.
How do I use the configuration options?
You cannot use the configuration options in the lite version of the tool. You will need to buy a licence to open up this menu, which you can do by clicking the ‘buy a licence’ option in the Spider’s interface under ‘license’.
What do each of the configuration options do?
How do I bulk export all images missing alt text?
You can bulk export data via the ‘bulk export’ option in the top level navigation menu. Simply choose the ‘images missing alt text’ option to export all references of images without alt text. Please see more on exporting in our user guide.
How is the response time calculated?
How do I increase memory?
Where can I see the pages blocked by robots.txt?
You can simply view URLs blocked via robots.txt in the UI (within the ‘Internal’ and ‘Response Codes’ tabs for example). Ensure you have the ‘Show internal URLs blocked by robots.txt’ configuration ticked under 'Configuration > Robots.txt > Settings'.
You can view external URLs blocked by robots.txt within the 'External' and 'Response Codes' tabs by ticking the ‘Show External URLs blocked by robots.txt’ configuration under 'Configuration > Robots.txt > Settings'.
Disallowed URLs will appear with a ‘status’ as ‘Blocked by Robots.txt’ and there’s a ‘Blocked by Robots.txt’ filter under the ‘Response Codes’ tab, where these can be viewed.
The ‘Blocked by Robots.txt’ filter also displays a ‘Matched Robots.txt Line’ column, which provides the line number and disallow path of the robots.txt entry that’s excluding each URL. If multiple lines in robots.txt block a URL, the SEO Spider will just report on the first encountered, similar to Google within Search Console.
Please see our guide on using the SEO Spider as a robots.txt tester.
If you’re using the older 2.40 Mac version of the SEO Spider, you can view the ‘Total Blocked by robots.txt’ for a crawl on the right-hand side of the user interface in the ‘Summary’ section of the overview tab. This count includes both internal and external URLs. Currently, there isn’t a way of seeing which URLs have been blocked in the user interface. However, it is possible to get this information from the SEO Spider log file, after a crawl. Each time a URL is blocked by robots.txt, it will be reported like this:
2015-02-18 08:56:09,652 [RobotsMain 1] INFO - robots.txt file prevented the spider of 'http://www.example.com/page.html', reason 'Blocked by line 2: Disallow: http://www.example.com/'. You can choose to ignore robots.txt files in the Spider configuration.
You can view the log file(s) by either going to the location shown for ‘Log File’ under Help->Debug, or downloading and unzipping the log files from Help->Debug->Save Logs.
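If you need the blocked URLs as a list, the log lines can be filtered with a short script. The sketch below assumes every entry follows the exact wording of the sample log line shown above; the regular expression and the `blocked_urls` helper are illustrative and not something the SEO Spider provides.

```python
import re
from typing import Iterable, List, Tuple

# Built from the sample log line above; other log formats will not match.
BLOCKED_RE = re.compile(
    r"robots\.txt file prevented the spider of '(?P<url>[^']+)', "
    r"reason '(?P<reason>.*?)'\."
)


def blocked_urls(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (url, reason) pairs for every robots.txt block in the log."""
    results = []
    for line in lines:
        match = BLOCKED_RE.search(line)
        if match:
            results.append((match.group("url"), match.group("reason")))
    return results


sample = [
    "2015-02-18 08:56:09,652 [RobotsMain 1] INFO - robots.txt file prevented "
    "the spider of 'http://www.example.com/page.html', reason 'Blocked by "
    "line 2: Disallow: http://www.example.com/'. You can choose to ignore "
    "robots.txt files in the Spider configuration.",
    "2015-02-18 08:56:10,001 [Main 1] INFO - some unrelated log entry",
]
pairs = blocked_urls(sample)
```

To run it against a real log, pass an open file object instead of `sample`, for example `blocked_urls(open(path, encoding="utf-8"))`, where `path` is the log file location shown under Help->Debug.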
Can the SEO Spider crawl staging or development sites that are password protected or behind a login?
The SEO Spider supports two forms of authentication, standards based which includes basic and digest authentication, and web forms based authentication.
Basic & Digest Authentication
There is no set-up required for basic and digest authentication, it is detected automatically during a crawl of a page which requires a login. If you visit the website and your browser gives you a pop-up requesting a username and password, that will be basic or digest authentication. If the login screen is contained in the page itself, this will be a web form authentication, which is discussed in the next section.
Often sites in development will also be blocked via robots.txt, so make sure this is not the case or use the 'ignore robots.txt' configuration. Then simply insert the staging site URL, crawl and a pop-up box will appear, just like it does in a web browser, asking for a username and password. Enter your credentials and the crawl will continue as normal. You cannot pre-enter login credentials – they are entered when URLs that require authentication are crawled. This feature does not require a licence key. Try the following pages to see how authentication works in your browser, or in the SEO Spider.
- Basic Authentication Username:user Password: password
- Digest Authentication Username:user Password: password
Web Form Authentication
There are other web forms and areas which require you to log in with cookies for authentication to be able to view or crawl them. The SEO Spider allows users to log in to these web forms within the SEO Spider’s built in Chromium browser, and then crawl them. This feature requires a licence to use it.
To log in, simply navigate to ‘Configuration > Authentication’ then switch to the ‘Forms Based’ tab, click the ‘Add’ button, enter the URL for the site you want to crawl, and a browser will pop up allowing you to log in.
Please note – This is a very powerful feature, and should therefore be used responsibly. The SEO Spider clicks every link on a page; when you’re logged in that may include links to log you out, create posts, install plugins, or even delete data.
How do I block the SEO Spider from crawling my site?
The spider obeys the robots.txt protocol. Its user agent is ‘Screaming Frog SEO Spider’ so you can include the following in your robots.txt if you wish the Spider not to crawl your site:
User-agent: Screaming Frog SEO Spider
Disallow: /
Please note – There is an option to ‘ignore’ robots.txt and change user-agent, which is down to the responsibility of the user entirely.
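One way to sanity-check a rule like this before deploying it is Python's standard `urllib.robotparser` module. This is only a sketch: Python's parser follows its own matching rules, which may differ in detail from how the SEO Spider interprets robots.txt, and the helper name `is_allowed` is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule from the answer above.
ROBOTS_TXT = """\
User-agent: Screaming Frog SEO Spider
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Report whether robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


spider_allowed = is_allowed("Screaming Frog SEO Spider",
                            "http://www.example.com/page.html")
other_allowed = is_allowed("Googlebot", "http://www.example.com/page.html")
```

Because the rule names only one user agent, other crawlers are unaffected by it in this check.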
Why does the number of URLs crawled not match the number of results indexed in google or errors reported within Google Search Console?
There are a number of reasons why the number of URLs found in a crawl might not match the number of results indexed in Google (via a site: query), or why errors reported in the SEO Spider might not match those in Google Search Console.
First of all, crawling and indexing are quite separate, so there will always be some disparity. URLs might be crawled, but it doesn’t always mean they will actually be indexed in Google. This is an important area to consider, as there might be content in Google’s index which you didn’t know existed, or no longer want indexed for example. Equally, you may find more URLs in a crawl than in Google’s index due to directives used (noindex, canonicalisation) or even duplicate content, low site reputation etc.
Secondly, the SEO Spider only crawls the internal links of a website at the time of the crawl. Google (more specifically Googlebot) crawls the entire web, so not just the internal links of a website for discovery, but also external links pointing to a website.
Googlebot’s crawl is also not a snapshot in time, it’s over the duration of a site’s lifetime from when it’s first discovered. Therefore, you may still find old URLs in their index (perhaps from discontinued products or an old section of the site which still serve a 200 ‘OK’ response) that aren't linked to anymore, or content that is only linked to via external sources. The SEO Spider won’t be able to discover URLs which are not linked to internally, like orphan pages or URLs only accessible by external links.
There are other reasons as well, these may include –
- Google include URLs which are blocked via robots.txt in their search results number. Don’t forget, robots.txt just stops a URL from being crawled, it doesn’t stop the URL from being indexed and appearing in Google.
- Google crawl XML sitemaps. The SEO Spider does not currently crawl XML sitemaps by default, you currently have to upload them in list mode. The reason we decided against crawling XML sitemaps by default is that it shouldn’t make up for a site’s architecture. If a page is not linked to in the site’s internal link structure, and only in an XML sitemap, it will help it be discovered and indexed, but the chances are it won’t perform very well organically. This is obviously because it won’t be passed any real PageRank, like a proper internal link. So we believe, it’s useful to analyse websites via the natural crawling and indexing process of internal links to get a better idea of a site’s set-up. There are some scenarios where it does make sense to crawl XML sitemaps though and we may make this possible in the future as an option.
- Google’s results number via a site: query can be pretty unreliable!
- Google’s error reporting can be pretty slow and outdated!
Can I crawl more than one site at a time?
Yes. There are two ways you can do this:
1) Open up multiple instances of the SEO Spider, one for each domain you want to crawl. Mac users check here.
2) Use list mode (Mode->List). Remove the search depth limit (Configuration->Spider->Limits and untick “Limit Search Depth”), untick “Ignore robots.txt” (Configuration->Robots.txt->Settings), then upload your list of domains to crawl.
Why is my sitemap missing some URIs?
Canonicalised, robots.txt blocked, noindex and paginated URIs are not included in the sitemap by default. You may choose to include these in your sitemap by ticking the appropriate checkbox(es) in the 'Pages' tab when you export the sitemap.
Please read our user guide on XML Sitemap Creation.
Why is my regex extracting more than expected?
If you are using a regex like .* that contains a greedy quantifier, you may end up matching more than you want. The solution is to use a lazy quantifier instead, like .*?
For example, if you are trying to extract the id from JSON containing "id":"007", "name":"James Bond" and you use
"id":"(.*)" you will get:
007", "name":"James Bond
If you use
"id":"(.*?)" you will extract:
007
Why doesn’t GA data populate against my URLs?
The URLs in your chosen Google Analytics view have to match the URLs discovered in the SEO Spider crawl exactly, for data to be matched and populated accurately. If they don’t match, then GA data won’t be able to be matched and won’t populate. This is the single most common reason.
If Google Analytics data does not get pulled into the SEO Spider as you expected, then analyse the URLs under ‘Behaviour > Site Content > Landing Pages’ and ‘Behaviour > Site Content > All Pages’ depending on which dimension you choose in your query. Try clicking on the URLs to open them in a browser to see if they load correctly.
You can also export the ‘orphan pages’ report which shows a list of URLs returned from the Google Analytics & Search Analytics (from Search Console) APIs for your query, that didn’t match URLs in the crawl. Check the URLs with source as ‘GA’ for Google Analytics specifically (those marked as ‘GSC’ are Google Search Analytics, from Google Search Console). The URLs here need to match those in the crawl, for the data to be matched accurately.
If they don’t match, then the SEO Spider won’t be able to match up the data accurately. We recommend checking your default Google Analytics view settings (such as ‘default page’) and filters such as ‘extended URL’ hacks, which all impact how URLs are displayed and hence matched against a crawl. If you want URLs to match up, you can often make the required amends within Google Analytics or use a ‘raw’ unedited view (you should always have one of these ideally).
Please note – There are some very common scenarios where URLs in Google Analytics might not match URLs in a crawl, so we cover these by matching trailing and non-trailing slash URLs and case sensitivity (upper and lowercase characters in URLs). Google doesn’t pass the protocol (HTTP or HTTPS) via their API, so we also match this data automatically as well.
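To illustrate the kind of matching described above, the sketch below compares URLs after ignoring the protocol, letter case and trailing slashes. It is not the SEO Spider's own implementation; the `match_key` helper and the example URLs are hypothetical.

```python
import re

def match_key(url: str) -> str:
    """Build a comparison key that ignores protocol, case and trailing slashes."""
    # Drop http:// or https://, since the protocol is not available from the API.
    without_scheme = re.sub(r'^https?://', '', url.strip(), flags=re.IGNORECASE)
    # Ignore upper/lowercase differences and any trailing slash.
    return without_scheme.lower().rstrip('/')

# Hypothetical crawl URLs and Google Analytics URLs.
crawl_urls = ["https://www.example.com/Shoes/", "http://www.example.com/about"]
ga_urls = ["www.example.com/shoes", "www.example.com/contact"]

crawl_by_key = {match_key(u): u for u in crawl_urls}
matched = {u: crawl_by_key[match_key(u)] for u in ga_urls if match_key(u) in crawl_by_key}
unmatched = [u for u in ga_urls if match_key(u) not in crawl_by_key]

print(matched)    # {'www.example.com/shoes': 'https://www.example.com/Shoes/'}
print(unmatched)  # ['www.example.com/contact']
```

URLs left in `unmatched` are the ones worth checking against your Google Analytics view settings and filters.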
Why Am I Running Out Of Disk Space?
When using Database Storage mode the SEO Spider monitors how much disk space you have and will automatically pause if you have less than 5GB remaining. If you receive this warning you can free up some disk space to continue the crawl.
If you are unable to free up any disk space you can either configure the SEO Spider to use another drive with more space by going to Configuration->System->Storage and selecting a folder on another disk, or switch to Memory Storage by going to Configuration->System->Storage and selecting Memory Storage. Changing either of these settings requires a restart, so if you'd like to continue the current crawl you will have to save it and reload it in after restarting.
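If you want to check free space on a drive before starting a large crawl, a short script can do it. This is a minimal sketch using Python's standard library, not part of the SEO Spider; the `has_enough_space` helper is hypothetical, and treating 5GB as 5 × 1024³ bytes is an assumption.

```python
import shutil

# Assumption: 5GB interpreted as 5 * 1024**3 bytes.
PAUSE_THRESHOLD_BYTES = 5 * 1024**3

def has_enough_space(path: str, threshold: int = PAUSE_THRESHOLD_BYTES) -> bool:
    """Return True if the drive holding `path` has at least `threshold` bytes free."""
    return shutil.disk_usage(path).free >= threshold

# Check the drive that holds the current directory.
free_gb = shutil.disk_usage(".").free / 1024**3
print(f"Free space: {free_gb:.1f}GB, enough for crawling: {has_enough_space('.')}")
```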
Do you have an affiliate program?
No, we do not have an affiliate program for the SEO Spider software at this time.
Can I Use An External SSD?
If you don't have an internal SSD and you'd like to crawl large websites using database storage mode, then an external SSD can help.
There are a few things to remember with this set-up. It's important to ensure your machine has USB 3.0 and your system supports UASP mode. Most new systems do automatically, if you already have USB 3.0 hardware. When you connect the external SSD, ensure you connect to the USB 3.0 port, otherwise reading and writing will be slow.
USB 3.0 ports generally have a blue inside (as recommended in their specification), but not always, and you will typically need to connect a blue-ended USB cable to the blue USB 3.0 port. Simple!
After that, you need to switch to database storage mode ('Configuration > System > Storage'), and then select the database location on the external SSD (the 'D' drive in the example below). You will then need to restart the SEO Spider, before beginning the crawl.
Why do I get Error Initialising Embedded Browser on startup?
This is normally triggered by some third-party software, such as a firewall or antivirus. Please try disabling this or adding an exception. The exception you need to add varies depending on what operating system you are using:
You can prevent this initialisation happening by going to Configuration->System->Embedded Browser.
Why do I get a blank screen?
If the SEO Spider user interface isn't rendering for you then the chances are you've run into this Java bug. In our experience this seems to be an issue with Intel HD 5xx series graphics cards. We've had fewer of these reports recently, so a driver update may help resolve this issue.
If not, please close the SEO Spider, then open up the following file in a text editor:
C:\Program Files (x86)\Screaming Frog SEO Spider\ScreamingFrogSEOSpider.l4j.ini
then add the following under the -Xmx line:
(You may have a permission issue here, so copying the file to your desktop, editing it, then copying it back may be easier).
Now when you start the SEO Spider the user interface should render correctly.
How Do I Crawl Wix Websites?
In short, you shouldn't have to do anything special to crawl Wix websites anymore. Wix use dynamic rendering to show a server-side-rendered (SSR) version of their website to search bots, browsers, and the Screaming Frog SEO Spider user-agent.
Wix websites were historically set-up using Google's now deprecated AJAX crawling scheme, with escaped fragment URLs. Google announced they would stop using the old AJAX crawling scheme in Q2 of 2018 (and will render #! URLs and content instead).
If you experience any problems crawling Wix websites, double check your user-agent is either Googlebot or Screaming Frog SEO Spider (Config > User-Agent).
Why do the results change between crawls?
The most common reasons for this are:
- Crawl settings are different, which can lead to different pages being crawled or different responses being given, leading to different results.
- The site has changed, meaning the different elements of the crawl are reported differently.
- The SEO Spider receives different responses, such as specific URLs timing out or giving server errors. This could mean fewer pages are discovered overall, and that these are inconsistent between crawls. Remember to double check under 'Response Codes > No Responses', then right click and 're-spider' any URLs that might have intermittent issues (such as timing out or server errors).
Why Does My Connection To Google Analytics Fail?
If you are receiving the following error when trying to connect to Google Analytics or Search Console: Please read our guide on resolving this.
Why is the SEO Spider not finding a particular page or set of pages?
The SEO Spider finds pages by scanning the HTML code of the entered starting URL for <a href> links, which it will then crawl to find more links. Therefore to find a page there must be a clear linking path to it from the starting point of a crawl for the SEO Spider to follow.
If there is a clear path, then these links or the pages the links are on must exist in a way the SEO Spider either cannot 'see' or crawl.
Hence please check the following:
- No links or linking pages have ‘nofollow’ attributes or directives preventing the SEO Spider from following these links. By default the SEO Spider obeys ‘nofollow’ directives unless the 'follow internal nofollow' configuration is checked.
- The expected page(s) are on the same subdomain as your starting page. By default links to different subdomains are treated as external unless the Crawl all subdomains option is checked.
- If the expected page(s) are in a different subfolder to the starting point of the crawl, the Crawl outside start folder option is checked.
- You do not have an Include or Exclude function set up that is limiting the crawl.
- Ensure category pages (or similar) were not temporarily unreachable during the crawl, giving a connection timeout, server error etc. preventing linked pages from being discovered.
- By default the SEO Spider won't crawl the XML Sitemap of a website to discover new URLs. However, you can select to 'Crawl Linked XML Sitemaps' in the configuration.
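The checks above can be illustrated with a small link extractor. This is a minimal sketch using Python's standard library, not the SEO Spider's own parser; the `LinkCollector` class, the `crawlable_links` helper and the example HTML are hypothetical. It keeps only `<a href>` links that are on the same subdomain as the starting URL and are not marked ‘nofollow’.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit

class LinkCollector(HTMLParser):
    """Collect (absolute_url, is_nofollow) pairs from <a href> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if not href:
            return
        rel = (attrs.get("rel") or "").lower().split()
        # Resolve relative links against the page URL.
        self.links.append((urljoin(self.base_url, href), "nofollow" in rel))

def crawlable_links(html, base_url):
    """Return links on the same subdomain as base_url that are not nofollow."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    host = urlsplit(base_url).hostname
    return [url for url, nofollow in collector.links
            if not nofollow and urlsplit(url).hostname == host]

html = '''
<a href="/products/">Products</a>
<a href="https://blog.example.com/post">Blog</a>
<a href="/private/" rel="nofollow">Private</a>
<span data-url="/hidden/">Not a link</span>
'''
print(crawlable_links(html, "https://www.example.com/"))
# ['https://www.example.com/products/']
```

In this example the blog link is on a different subdomain, the private link is ‘nofollow’, and the span is not an `<a href>` link at all, so only one URL would be queued.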
What happens when the licence expires?
When the licence expires, the SEO Spider returns to the restricted free lite version. The Spider’s configuration options are unavailable, there is a 500 URI maximum crawl limit and previously saved crawls cannot be opened.
To remove the crawl limit, use all the features and configuration options and open up saved crawls, simply purchase a licence upon expiry.
What additional features does a licence provide?
In the same way as the free ‘lite’ version, there are no restrictions on the number of websites you can crawl with a licence. Licences are, however, individual per user. If you have five members of the team who would like to use the licensed version, you will need five licences.
Can I use my licence on more than one device?
Why can’t my Licence Key be saved (Unable to update licence file)?
The SEO Spider stores the licence in a file called licence.txt in the user's home directory in a ‘.ScreamingFrogSEOSpider’ folder. You can see this location by going to Help->Debug and looking at the line labeled “Licence File”. Please check the following to resolve this issue:
- Ensure you are able to create the licence file in the correct location.
- If you are using a Mac, see the answer to this stackoverflow question.
- If you are using Windows, it could be that the default user.home value supplied to Java is incorrect. Ideally your IT team should fix this. As a workaround you can add:
-Duser.home=DRIVE_LETTER:\path\to\new\directory\
to the ScreamingFrogSEOSpider.l4j.ini file that controls memory settings.
Is it possible to move my licence to a new computer?
Yes, please take a note of your licence key (you can find this under ‘Licence’ and ‘Enter Licence...’ in the software), then uninstall the SEO Spider on the old computer, before installing and entering your licence on the new machine. If you experience any issues during this move, please contact our support.
How do I renew my licence?
How much does the Screaming Frog SEO Spider cost?
As standard you download the lite version of the tool which is free. However, without a licence the SEO Spider is limited to crawling a maximum of 500 URIs each crawl. The configuration options of the Spider and the custom source code search feature are also only available in the licensed version.
For £149 per annum you can purchase a licence which opens up the Spider’s configuration options and removes restrictions on the 500 URI maximum crawl. A licence is required per individual using the tool. When the licence expires, the SEO Spider returns to the restricted free lite version.
What payment methods do you accept & from which countries?
We accept PayPal and most major credit and debit cards. The price of the SEO Spider is in pound sterling (GBP). If you are outside of the UK, please take a look at the current exchange rate to work out the cost. (The automatic currency conversion will be dependent on the current foreign exchange rate and perhaps your card issuer). We do not accept cheques (or checks!)
I’m a business in the EU, can I pay without VAT?
Yes, if you are not in the UK. To do this you must have a valid VAT number and enter this on the Billing page during checkout. Select business and enter your VAT number as shown below: Your VAT number will be checked against the VIES system and VAT removed if it is valid. The VIES system does go down from time to time, so if this happens please try again later. Unfortunately we cannot refund VAT once a purchase has been made.
Do you have a refund policy?
Absolutely! If you are not completely satisfied with the SEO Spider you purchased from this website, you can get a full refund if you contact us within 14 days of purchasing the software. To obtain a refund, please follow the procedure below.
Contact us via email@example.com or support and provide the following information:
- Your contact information (last name, first name and email address).
- Your order number.
- Your reason for refund! If there's an issue, we can generally help.
- For downloaded items, please provide proof that the software has been uninstalled from all your computers and will never be installed or used any more (screenshots will suffice).
If you have purchased your item by PayPal the refund is re-credited to the same PayPal account used to purchase the software.
If you have purchased your item using any other payment method, we will issue the refund by BACS, once approved by our Financial Department.
For any questions concerning this policy, please contact us at support.
How is the software delivered?
The software needs to be downloaded from our website. The licence key is delivered electronically by email.
What is the reseller price?
We do not offer discounted rates for resellers. The price is £149 (GBP) per year, per user.
Can I get a quote in a currency other than GBP?
No, we only sell in GBP.
Why am I experiencing slow down or hanging upon exports & saving crawls?
This will generally be due to the SEO Spider reaching its memory limit. Please read how to increase memory.
Why do I get a “Connection Refused” response?
Connection refused is displayed in the Status column when the SEO Spider's connection attempt has been refused at some point between the local machine and website. If this happens for all sites consistently then it is an issue with the local machine/network. Please check the following:
- You can view websites in your browser.
- Make sure you have the latest version of the SEO Spider installed.
- That software such as ZoneAlarm, anti-virus (such as the premium version of Avira Antivirus, and Kaspersky) or firewall protection software are not blocking your machine/SEO Spider from making requests. The SEO Spider needs to be trusted / accepted. We recommend your IT team is consulted on what might be the cause in office environments.
- The proxy is not accidentally ‘on’, under Configuration->Proxy. Ensure the box is not ticked, or the proxy details are accurate and working.
- If you are trying to crawl a secure site (https://) and not using version 8.0 or above, please see here.
- Changing the User Agent under Configuration->HTTP Header->User Agent.
- Adjusting the crawl speed / number of threads under Configuration->Speed.
- In the ‘lite’ version where you cannot control the speed, try right clicking on the URL and choosing re-spider.
Why do I get a “Connection Timeout” response?
Connection timeout occurs when the SEO Spider struggles to receive an HTTP response at all and the request times out. It can often be due to a slow responding website or server when under load, or it can be due to network issues. We recommend the following –
- Ensure you can view the website (or any websites) in your browser and check their loading time for any issues. Hard refresh your browser to ensure you’re not seeing a cached version.
- Increase the default response timeout configuration of 10 seconds, up to 20 or 30 seconds if the website is slow responding.
- Decrease the speed of the crawl in the SEO Spider configuration to decrease load on any servers struggling to respond. Try 1 URL per second for example.
- Ensure the proxy settings are not enabled accidentally and if enabled that the details are accurate.
- Ensure that ZoneAlarm, anti-virus or firewall protection software (such as the premium version of Avira Antivirus) are not blocking your machine from making requests. The SEO Spider needs to be trusted / accepted. We generally recommend consulting your IT team, who know your systems, on what might be the cause.
Why Do URLs Redirect to Themselves?
When a website requires cookies this often appears in the SEO Spider as if the starting URL is redirecting to itself or to another URL and then back to itself (any necessary cookies are likely being dropped along the way). This can also be seen when viewing in a browser with cookies disabled:
The easiest way to work around this issue is to first load up the page using forms based authentication.
‘Configuration > Authentication > Forms Based’
Select ‘Add’, then enter the URL that is redirecting, and wait for the page to load before clicking ‘OK’.
The SEO Spider's in-built Chromium browser has thus accepted the cookies, and you should now be able to crawl the site normally.
A secondary method to bypass this kind of redirect is to ensure the ‘Allow Cookies’ configuration is set:
'Configuration > Spider > Advanced > Allow Cookies'
To bypass the redirect behaviour, as the SEO Spider only crawls each URL once, a parameter must be added to the starting URL:
A URL rewriting rule that removes this parameter when the spider is redirected back to the starting URL must then be added:
Configuration > URL Rewriting > Remove Parameters
The SEO Spider should then be able to crawl normally from the starting page now it has any required cookies.
Why are page titles &/or meta descriptions not being displayed/displayed incorrectly?
If the site or URL in question has page titles and meta descriptions, but one (or both!) are not showing in the SEO Spider this is generally due to the following reasons -
1) The SEO Spider reads up to a maximum of 20 meta tags. So, if there are over 20 meta tags and the meta description is after the 20th meta tag, it will be ignored.
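To check whether a page is affected, you can count its meta tags and see where the meta description falls. This is a minimal sketch using Python's standard library; the `MetaCounter` class, the `description_position` helper and the example HTML are hypothetical, and the parsing may differ in detail from the SEO Spider's own.

```python
from html.parser import HTMLParser

MAX_META_TAGS = 20  # the limit described above

class MetaCounter(HTMLParser):
    """Count <meta> tags and record the position of the first meta description."""

    def __init__(self):
        super().__init__()
        self.count = 0
        self.description_position = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        self.count += 1
        name = (dict(attrs).get("name") or "").lower()
        if name == "description" and self.description_position is None:
            self.description_position = self.count

def description_position(html):
    """Return the 1-based position of the meta description among meta tags, or None."""
    counter = MetaCounter()
    counter.feed(html)
    return counter.description_position

# Hypothetical page with 20 other meta tags before the description.
html = '<meta name="other" content="x">' * 20 + '<meta name="description" content="About us">'
position = description_position(html)
print(position, position > MAX_META_TAGS)  # 21 True
```

A position above 20 suggests the meta description should be moved higher in the `<head>`.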
Does the SEO Spider crawl PDFs?
The SEO Spider will check links to PDF documents. These URLs can be seen under the PDF filter in the Internal and External tabs. It does not parse PDF documents to find links to crawl.
Why won’t my crawl complete?
First ensure the Spider is still crawling the site and, if so, look at the URLs it has been finding. The type of URLs the Spider has been finding will explain why the crawl percentage is not increasing:
- URLs seem normal – The Spider keeps finding new URLs on a very large site. Consider splitting the crawl up into sections.
- Many similar URLs with different parameters – The Spider keeps finding the same URLs with different parameters, possibly from faceted navigation. Try setting the query string limit to 0 (Configuration->Spider, “Limit Number of Query Strings” in the “Limits” tab).
- There are many long URLs with parts that repeat themselves – There is a relative linking error where the Spider keeps finding URLs that cause a never ending loop. Use the exclude feature to exclude the offending URLs.
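To spot the last case in an exported list of URLs, you can look for paths where the same part repeats several times in a row. This is a minimal sketch in Python, not a feature of the SEO Spider; the `has_repeating_path` helper, the threshold of three repeats and the example URLs are all assumptions.

```python
import re
from urllib.parse import urlsplit

# One or more path segments repeated at least three times in a row,
# e.g. /blog/blog/blog/ or /a/b/a/b/a/b/ (the threshold is an assumption).
REPEATING_PATH = re.compile(r'((?:/[^/]+)+?)\1{2,}(?=/|$)')

def has_repeating_path(url: str) -> bool:
    """Return True if the URL path contains a part that repeats consecutively."""
    return REPEATING_PATH.search(urlsplit(url).path) is not None

# Hypothetical URLs from a crawl export.
urls = [
    "https://www.example.com/blog/blog/blog/post",
    "https://www.example.com/products/shoes/",
    "https://www.example.com/a/b/a/b/a/b/",
]
looping = [u for u in urls if has_repeating_path(u)]
print(looping)
# ['https://www.example.com/blog/blog/blog/post', 'https://www.example.com/a/b/a/b/a/b/']
```

Once you know which part of the path repeats, you can find the page with the broken relative link and exclude the offending URLs.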
Why does the Installer take a while to start?
Because Windows Defender is running a security scan on it, this can take up to a couple of minutes. Unfortunately when downloading the file using Google Chrome it gives no indication that it is running the scan. Internet Explorer does give an indication of this, and Firefox does not scan at all. If you go directly to your downloads folder and run the installer from there you don’t have to wait for the security scan to run.
Can I do a silent install?
Yes, by issuing the following command:
By default this will install the SEO Spider to:
C:\Program Files (x86)\Screaming Frog SEO Spider
You can choose an alternative location by using the following command:
ScreamingFrogSEOSpider-VERSION.exe /S /D=C:\My Folder
How can I open multiple instances of the SEO Spider?
To open additional instances of the SEO Spider on macOS, open a Terminal and type the following:
open -n /Applications/Screaming\ Frog\ SEO\ Spider.app/
How do I submit a bug / receive support?
Please follow the steps on the support page so we can help you as quickly as possible. Please note, we only offer full support for premium users of the tool although we will generally try and fix any issues.
What operating systems does the SEO Spider run on?
The SEO Spider runs on Windows, Mac and Linux. It’s a Java application and requires a Java 8 runtime environment or later to run. You can check here to see the system requirements to run Java. You can download the SEO Spider for free and try it.
Mac: If you are using macOS 10.7.2 or lower please see this faq.
Linux: We provide an Ubuntu package for Linux. If you would like to run the SEO Spider on a non-Debian based distribution please extract the jar file from the .deb and run it manually.
Windows: The SEO Spider can also be run on the server variants and Windows 10. From version 9.0 onwards, the SEO Spider doesn't run on Windows XP.
Please note that the rendering feature is not available on older operating systems.
How do I bulk export all image alt text?
You can bulk export data via the ‘bulk export’ option in the top level navigation menu. Simply choose the ‘all images’ option to export all images and associated alt text found in your crawl. Please see more on exporting in our user guide.
How does the Spider treat robots.txt?
The Screaming Frog SEO Spider is robots.txt compliant. It checks robots.txt in the same way as Google. So it will check robots.txt of the (sub) domain and follow directives for all robots and specifically any for Googlebot. The tool also supports URL matching of file values (wildcards * / $) like Googlebot. Please see the above document for more information or our robots.txt section in the user guide. You can turn this feature off in the premium version.
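To get a rough idea of which URLs a set of robots.txt directives would block, you can test them with a short script. This is a minimal sketch using Python's standard `urllib.robotparser`, which is a different parser from the SEO Spider's and, unlike Googlebot, does not implement the * and $ wildcards; the rules, URLs and `is_allowed` helper are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, parsed locally without any network request.
rules = """
User-agent: *
Disallow: /private/
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

def is_allowed(url: str, user_agent: str = "Screaming Frog SEO Spider") -> bool:
    """Return True if the parsed rules allow `user_agent` to fetch `url`."""
    return parser.can_fetch(user_agent, url)

print(is_allowed("https://www.example.com/products/"))            # True
print(is_allowed("https://www.example.com/private/report.html"))  # False
```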
How many URI can the Spider crawl?
The SEO Spider uses a configurable hybrid storage engine, which enables it to crawl millions of URLs. However, it does require configuration (explained below) and the correct hardware.
By default the SEO Spider will crawl using RAM, rather than saving to disk. This has advantages, but it cannot crawl at scale, without lots of RAM allocated.
In standard memory storage mode there isn't a set number of pages it can crawl; it is dependent on the complexity of the site and the user's machine specifications. The SEO Spider sets a maximum memory of 1GB for 32-bit and 2GB for 64-bit machines, which enables it to crawl between 5k-100k URLs of a site.
You can increase the SEO Spider’s memory allocation, and crawl into hundreds of thousands of URLs purely using RAM. A 64-bit machine with 8GB of RAM will generally allow you to crawl a couple of hundred thousand URLs, if the memory allocation is increased.
The SEO Spider can be configured to save crawl data to disk, which enables it to crawl millions of URLs. However, we recommend using this option with a Solid State Drive (SSD), as hard disk drives are significantly slower at writing and reading data. This can be configured by selecting ‘Database Storage’ mode (under ‘Configuration > System > Storage’).
As a rough guide, an SSD and 8GB of RAM in database storage mode should allow the SEO Spider to crawl approx. 5 million URLs.
Please see our guide on crawling large websites for more information.
Why does the URI completed total not match what I export?
The ‘Completed’ URI total is the number of URIs the SEO Spider has encountered. This is the total URI crawled, plus any ‘Internal’ and ‘External’ URI blocked by robots.txt.
Depending on the settings in the robots.txt section of the ‘Configuration > Spider > Basic’ menu, these blocked URI may not be visible in the SEO Spider interface.
If the ‘Respect Canonical’ or ‘Respect Noindex’ options in the ‘Configuration > Spider > Advanced’ tab are checked, then these URI will count towards the ‘Total Encountered’ (Completed Total) and ‘Crawled’, but will not be visible within the SEO Spider interface.
The ‘Response Codes’ Tab and Export will show all URLs encountered by the Spider except those hidden by the settings detailed above.
Do you collect data & can you see the websites I am crawling?
We do not see what sites you are crawling or the data you have crawled. All crawl data is stored on your machine.
Google APIs use the OAuth 2.0 protocol for authentication and authorisation, and obviously, the data provided via Google Analytics and other APIs is only accessible locally on your machine.
The software does not contain any spyware, malware or adware (as verified by Softpedia for Windows and macOS).
Why does the number of URLs crawled (or errors discovered) not match another crawler?
First of all, the free ‘lite’ version is restricted to a 500 URL crawl limit and obviously a website might be significantly larger. If you have a licence, the main reason an SEO Spider crawl might discover more or fewer links (and indeed broken links etc) than another crawler is simply down to the different default configuration set-ups of each.
By default the SEO Spider will respect robots.txt, respect ‘nofollow’ of internal and external URLs & crawl canonicals. Other crawlers sometimes don’t respect these by default, which is why there might be differences. These can all be adjusted to your own preferences within the configuration.
While crawling more URLs might seem to be a good thing, actually it might be completely unnecessary and a waste of time and effort. So please choose wisely what you want to crawl.
We believe the SEO Spider is the most advanced crawler available and it will often find more URLs than other crawlers, as it crawls canonicals and AJAX similar to Googlebot, which other crawlers might not do as standard or within their current capability. There are other reasons as well, these may include –
- User-agent, speed or time of the crawl may play a part.
- Some other crawlers may use XML sitemaps for discovery and crawling. The SEO Spider does not currently crawl XML sitemaps by default, you currently have to upload them in list mode. The reason we decided against crawling XML sitemaps by default is that it shouldn’t make up for a site’s architecture. If a page is not linked to in the site’s internal link structure, and only in an XML sitemap, it will help it be discovered and indexed, but the chances are it won’t perform very well organically. This is obviously because it won’t be passed any real PageRank, like a proper internal link. So we believe, it’s useful to analyse websites via the natural crawling and indexing process of internal links to get a better idea of a site’s set-up. There are some scenarios where it does make sense to crawl XML sitemaps though and we may make this possible in the future as an option.
- Some other crawlers might crawl analytics landing pages, or URLs in Google Search Console Tools top pages. Again, this is not the natural crawling and indexing process, but might be something we consider in the future.
How do I create an XML sitemap?
Read our ‘How To Create An XML Sitemap‘ tutorial, which explains how to generate an XML Sitemap, include or exclude pages or images and runs through all the configuration settings available.
How do I extract multiple matches of a regex?
If you want all the H1s from the following HTML:
Then we can use:
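As a separate illustration of collecting every match rather than just the first, here is a minimal sketch in Python. The HTML and the expression are hypothetical and chosen only for this example; `re.findall` returns all non-overlapping matches of the capture group.

```python
import re

# Hypothetical HTML containing two H1 elements.
html = '<h1>First heading</h1><p>Some text</p><h1>Second heading</h1>'

# The lazy (.*?) keeps each match inside a single <h1>...</h1> pair.
headings = re.findall(r'<h1>(.*?)</h1>', html)

print(headings)  # ['First heading', 'Second heading']
```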
Why am I experiencing slow down?
There are a number of reasons why you might be experiencing a slow crawl rate or slow down of the SEO Spider. These include –
- If you’re performing a large crawl, you might be reaching the memory capacity of the SEO Spider. Learn how to increase the SEO Spider’s memory and read our guide on crawling large websites.
- Slow response of the site or server (or specific directives for hitting them too hard).
- Internet connection.
- Problems with the site you are crawling.
- Large pages or files.
- Crawling or viewing a large number of URIs.
Why doesn’t the GA API data in the SEO Spider match what’s reported in the GA interface?
There are a number of reasons why data fetched via the Google API into the SEO Spider might be different to the data reported within the Google Analytics interface. First of all, we recommend triple checking that you’re viewing the exact same account, property, view, segment, date range, metrics and dimensions. LandingPagePath and PagePath will of course provide very different results, for example! If data still doesn’t match, then there are some common reasons why –
- The Google API can just return slightly different metrics – We’ve tested this and sometimes the data from the API, can just be a little different to what’s reported in the interface.
- We use default sampling, and your settings in Google Analytics might be different.
- We use ga:hostname dimension and a ga:hostname==www.yourdomain.co.uk filter, to remove other domains which might be using the same GA tracking code as your core domain. Google does not do this by default in the interface, so landing page sessions for your homepage, might be inflated for example.
Can The SEO Spider Work On A Chromebook?
We don't have a Chromebook version of the SEO Spider. However, you can install Crouton, set up Ubuntu and download and install the Ubuntu version of the SEO Spider.
Please note, Chromebooks are not very powerful and are generally limited to 4GB of RAM. This will mean memory is restricted, and the number of URLs that can be crawled will also be limited. You can read more about SEO Spider memory in our user guide.
Why can't I generate an image sitemap from a list of images?
The image sitemap protocol requires the HTML page the image is referenced on to be included in the sitemap. A list of images alone does not have this information, so a sitemap cannot be generated.
Details on Google's requirements for image sitemaps can be seen at - https://support.google.com/webmasters/answer/178636.