Log File Analyser Configuration
Workspace
Here you can set an alternative location to store your project data.
By default the Log File Analyser stores projects in the following locations:
- Windows: {main drive letter}:\Users\{username}\.ScreamingFrogLogfileAnalyser\projects
- macOS: /Users/{username}/.ScreamingFrogLogfileAnalyser/projects
- Ubuntu: /home/{username}/.ScreamingFrogLogfileAnalyser/projects
When choosing an alternative location there are two things to consider: performance and size. The faster the drive, the better the Log File Analyser will perform. Choosing a network drive here is a sure way to kill performance! The Log File Analyser requires at least as much free space as the size of the logs you are importing.
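As a rough way to check the size requirement before pointing the workspace at a new drive, the sketch below compares the free space on a candidate location with the total size of a folder of logs. The function names and the idea of passing two paths are illustrative assumptions, not part of the Log File Analyser.

```python
import shutil
from pathlib import Path


def total_log_size(log_dir: Path) -> int:
    """Sum the sizes, in bytes, of all regular files under log_dir."""
    return sum(p.stat().st_size for p in log_dir.rglob("*") if p.is_file())


def has_enough_space(workspace: Path, log_dir: Path) -> bool:
    """Return True if the workspace drive has at least as much free
    space as the logs occupy (the minimum stated above)."""
    free_bytes = shutil.disk_usage(workspace).free
    return free_bytes >= total_log_size(log_dir)
```

Calling `has_enough_space(Path("/path/to/workspace"), Path("/path/to/logs"))` with your own paths returns False when the drive is too small. It says nothing about drive speed, which still needs to be judged separately.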
Troubleshooting
- If you get a red X rather than a green tick next to Workspace Directory, hover over it to see the error message.
- If the error message includes “OverlappingFileLockException”, this means you are using an ExFAT or MS-DOS (FAT) file system, which is not supported on macOS due to JDK-8205404. To resolve this, choose a drive with a different format or reformat your drive. You can use the Disk Utility application to view the current format and reformat the drive.
User-Agents
You’re able to configure the user-agents you wish to import into a project when creating a new project. You can choose from the pre-defined list of common search engine bot user-agents, or de-select those that are not relevant to you. This helps improve performance and reduces disk usage by focusing only on bots of interest.

You can also add your own custom user-agents, which are then stored and can be selected for projects.

The Log File Analyser will by default import data for the following search bot user-agents –
- All Googlebots – This includes Googlebot and Googlebot Smartphone.
- Googlebot
- Googlebot Smartphone
- Bingbot
- Yandex
- Baidu
However, as discussed above, this is entirely configurable. Similar to the date range, you can switch user-agent using the drop down filter in the top right of the application.
Verify Bots
You can automatically verify search engine bots, either when uploading a log file or retrospectively after you have uploaded log files to a project.
When uploading logs, you’ll be given the opportunity to tick the ‘verify bots’ option under the ‘User Agents’ tab.

If you have already imported log files, or would like to verify search engine bots retrospectively, then you can do so under the ‘Project > Verify Bots’ menu.

Search engine bots are often spoofed by other bots or crawlers, including our own SEO Spider software when emulating requests from specific search engine user-agents. Hence, when analysing logs, it’s important to know which events are genuine and which can be discounted.
The Log File Analyser will verify all major search engine bots according to their individual guidelines. For Googlebot and Bingbot, the Log File Analyser will perform an extremely fast lookup against their publicly confirmed IP lists (Google IPs, Bing IPs) to confirm they are genuine.
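The IP list check can be pictured as a membership test of the requesting IP against a set of published CIDR ranges. The following is a minimal sketch of that idea, not the Log File Analyser's own code. The ranges shown are reserved documentation networks used as hypothetical placeholders; real ranges come from the lists each search engine publishes.

```python
import ipaddress

# Hypothetical placeholder ranges (reserved documentation networks),
# NOT real search engine ranges.
EXAMPLE_BOT_RANGES = ["192.0.2.0/24", "2001:db8::/32"]


def ip_in_ranges(ip: str, cidr_ranges: list[str]) -> bool:
    """Return True if ip falls inside any of the given CIDR ranges."""
    address = ipaddress.ip_address(ip)
    networks = (ipaddress.ip_network(cidr) for cidr in cidr_ranges)
    # Only compare addresses and networks of the same IP version.
    return any(
        address.version == net.version and address in net for net in networks
    )
```

Because this is a purely local comparison with no network round trip, it is much faster than a DNS-based check.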
For other bots such as Yandex and Baidu, the Log File Analyser will perform a reverse DNS lookup on the requesting IP and verify that the returned hostname belongs to the expected domain. It will then run a forward DNS lookup on that hostname, using the host command, to verify that it resolves back to the original requesting IP. This takes significantly longer to verify, so remove these user-agents from your analysis if they are not required.
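The reverse-then-forward check described above can be sketched as follows. This is an illustration only: it uses Python's `socket` module rather than the host command, and the function names, injectable resolver parameters and the `.bot.example` suffix are hypothetical. The forward lookup used here is IPv4-only.

```python
import socket


def reverse_lookup(ip: str) -> str:
    """Reverse DNS: return the primary hostname for an IP address."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname: str) -> list[str]:
    """Forward DNS: return the IPv4 addresses a hostname resolves to."""
    return socket.gethostbyname_ex(hostname)[2]


def verify_bot_ip(ip, allowed_suffixes, reverse=reverse_lookup, forward=forward_lookup):
    """Return True only if ip reverse-resolves to a hostname ending in an
    allowed suffix AND that hostname resolves back to the same ip."""
    try:
        hostname = reverse(ip).rstrip(".").lower()
    except OSError:  # covers socket.herror and socket.gaierror
        return False
    # Suffixes should start with "." so "evilbot.example" cannot pass
    # for ".bot.example".
    if not any(hostname.endswith(suffix) for suffix in allowed_suffixes):
        return False
    try:
        return ip in forward(hostname)
    except OSError:
        return False
```

Each event costs two DNS round trips, which is why this style of verification is slower than a lookup against a published IP list.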
After validation, you can use the ‘verification status’ filter to view log events that are verified or spoofed, or to see whether there were any errors in verification.

Troubleshooting
If you find all events being marked as Spoofed there are a few things to check:
- Is the Remote Host being read? Check the Remote Host value associated with the Events marked as spoofed. To do this click on one of the Events and look at the Remote Host value in the lower window pane. Remote Host is not mandatory, so if this was not available in the imported log file, it won’t be possible to verify the Event.
- Does the Remote Host have correct-looking values? If the Remote Host values all come from a single IP, or a small selection of IPs (head over to the IP tab to see Unique IPs), then it’s likely these are from a load balancer. You’ll need to have the log format adjusted by the site administrator or hosting provider to include the real IP address. Before doing this, double check that the real IP is not already in the log file: open the log file in a text editor and inspect a few of the lines to see whether there is more than one IP address on each line. If so, please send the first few lines of the log or the Log File Analyser debug logs (Help->Debug->Save Logs) to our support team so we can make sure this isn’t a parsing issue.
- Verify Manually: For Googlebot, the Log File Analyser verifies as Google recommends. Try this yourself and, if you get different results, please let us know. If not, go ahead and request that the real IP is added to the log.
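For large log files, checking by eye whether lines contain more than one IP address can be tedious. The sketch below is one way to automate that inspection. It only looks for IPv4 addresses, and the function name and sample log line are made up for illustration.

```python
import ipaddress
import re

# Four dot-separated groups of 1-3 digits; validated properly below.
CANDIDATE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")


def ipv4_addresses(line: str) -> list[str]:
    """Return every valid IPv4 address found in a log line, in order."""
    found = []
    for token in CANDIDATE.findall(line):
        try:
            ipaddress.IPv4Address(token)
        except ValueError:
            continue  # e.g. 999.1.1.1 is not a real address
        found.append(token)
    return found


# Hypothetical log line with a load balancer IP first and a second IP at the end.
sample = '192.0.2.1 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 512 "-" "Googlebot/2.1" "203.0.113.7"'
print(ipv4_addresses(sample))  # ['192.0.2.1', '203.0.113.7']
```

If most lines return two or more addresses, the real client IP may already be present in the log, and it is worth contacting support before asking for the log format to be changed.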
If you find any Events marked as “Verification Error”:
- Select “Verification Error” in the top level dropdown, then go to the Events tab, click on an Event and look at the “Verification Status” column. This will tell you why the “Verification Error” has occurred. “Invalid Bot IP” means the DNS lookup failed for the IP address of this event.
Include
This feature allows you to supply a list of regular expressions for URLs to import into a project. So if you only want to analyse certain domains or paths, such as the /blog/ or /products/ pages on a huge site, you can do that to save time and resources, and to allow more granular analysis.

Only analysing certain paths reduces import time, analysis time and disk usage.
Exclude
This feature allows you to supply a list of regular expressions for URLs to exclude from being imported into a project. So if you want to exclude a section of the website, such as a /forum/ or /blog/, you can do that to save time and resources, and to allow more granular analysis.

Excluding paths reduces import time, analysis time and disk usage.
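It can help to try include and exclude patterns against a handful of URLs before importing. The sketch below does that under two assumptions that are not confirmed by this guide: that each pattern must match the whole URL (hence the `.*` on either side, which also works if partial matches are accepted), and that a URL is imported when it matches an include pattern (or no include patterns are set) and matches no exclude pattern. The function name and sample URLs are illustrative.

```python
import re


def keep_url(url: str, include: list[str], exclude: list[str]) -> bool:
    """Decide whether a URL would be imported under the assumed rules."""
    # Assumption: an empty include list means "include everything".
    included = not include or any(re.fullmatch(p, url) for p in include)
    excluded = any(re.fullmatch(p, url) for p in exclude)
    return included and not excluded


include_patterns = [r".*/blog/.*", r".*/products/.*"]
exclude_patterns = [r".*/forum/.*"]

sample_urls = [
    "https://www.example.com/blog/log-analysis",
    "https://www.example.com/products/widget",
    "https://www.example.com/forum/thread-1",
    "https://www.example.com/about",
]
kept = [u for u in sample_urls if keep_url(u, include_patterns, exclude_patterns)]
print(kept)  # the /blog/ and /products/ URLs only
```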
Date Range
In the top right hand side of the application, you can change the date range of your view across the project. There are 3 preset date ranges (the last day, the last 7 days and the last 30 days), as well as an option for a custom date range.

You can also skip backwards and forwards with dates using the arrows at the side. This will update the date range for all tabs, not just the tab you’re on.
Import History
You’re able to view the log file import history of a project by clicking on ‘Project > Import History’ via the top level menu.

This allows you to view the first and last events from the log files, as well as the import date, number of events contained within the log file, site URL provided, log file format and the file name.
By clicking on the individual import rows, you can also delete import history, if you accidentally import incorrect logs.
Timezone
The Log File Analyser stores all events in Coordinated Universal Time (UTC). To view in your local timezone you can adjust the UTC offset to match your current timezone. You can select this either at project creation time:

Or by going to ‘Project > Settings > General’ and clicking on the ‘Timezone’ dropdown when you have a project open.
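To see what a UTC offset does to an event timestamp, the following sketch converts a UTC time to a local time using a fixed offset. It illustrates the arithmetic only; the function name and example timestamp are hypothetical, and a fixed offset does not account for daylight saving changes.

```python
from datetime import datetime, timedelta, timezone


def to_local(utc_event: datetime, offset_hours: float) -> datetime:
    """Shift a timezone-aware UTC datetime to a fixed UTC offset."""
    local_zone = timezone(timedelta(hours=offset_hours))
    return utc_event.astimezone(local_zone)


# An event stored at 23:30 UTC, viewed with a UTC+5:30 offset.
event = datetime(2024, 3, 1, 23, 30, tzinfo=timezone.utc)
print(to_local(event, 5.5).isoformat())  # 2024-03-02T05:00:00+05:30
```

Note that the event moves to the following calendar day in this example, which can change which day it is counted under when viewing by date.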