Log File Analyser Configuration

Workspace

Here you can set an alternative location to store your project data.

By default the Log File Analyser stores projects in the following locations:

  • Windows: {main drive letter}:\Users\{username}\.ScreamingFrogLogfileAnalyser\projects
  • macOS: /Users/{username}/.ScreamingFrogLogfileAnalyser/projects
  • Ubuntu: /home/{username}/.ScreamingFrogLogfileAnalyser/projects

When choosing an alternative location there are two things to consider: performance and size. The faster the drive, the better the Log File Analyser will perform. Choosing a network drive here is a sure way to kill performance! The Log File Analyser requires at least as much space as the size of the logs you are importing.
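As a rough way to check the size requirement before importing, the sketch below compares the total size of a folder of logs with the free space on the drive that holds the workspace. It is only an illustration of the rule of thumb above, not part of the Log File Analyser; the function names and the paths in the usage note are hypothetical.

```python
import shutil
from pathlib import Path


def total_log_size(log_dir: Path) -> int:
    """Sum the sizes, in bytes, of all regular files under log_dir."""
    return sum(p.stat().st_size for p in log_dir.rglob("*") if p.is_file())


def has_enough_space(log_bytes: int, free_bytes: int) -> bool:
    """Rule of thumb from the text: free space must be at least the log size."""
    return free_bytes >= log_bytes


def check_workspace(log_dir: Path, workspace: Path) -> bool:
    """Compare the logs to be imported with free space on the workspace drive."""
    free_bytes = shutil.disk_usage(workspace).free
    return has_enough_space(total_log_size(log_dir), free_bytes)
```

For example, `check_workspace(Path("logs"), Path("D:/lfa-projects"))` would return False if the logs are larger than the space left on that drive. Both paths here are placeholders.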

User-Agents

You’re able to configure the user-agents you wish to import into a project when creating a new project. You can choose from the pre-defined list of common search engine bot user-agents, or de-select those that are not relevant to you. This helps improve performance and reduces disk usage by focusing only on bots of interest.

Log File Analyser user-agent configuration

You can also add your own custom user-agents, which are then stored and can be selected for projects.

Log File Analyser custom user-agents

The Log File Analyser will by default import data for the following search bot user-agents:

  • All Googlebots – This includes Googlebot and Googlebot Smartphone.
  • Googlebot
  • Bingbot
  • Googlebot Smartphone
  • Yandex
  • Baidu

However, as discussed above, this is entirely configurable. Similar to the date range, you can switch user-agent using the drop-down filter in the top right of the application.

Verify Bots

You can now automatically verify search engine bots, either when uploading a log file or retrospectively after you have uploaded log files to a project.

When uploading logs, you’ll be given the opportunity to tick the ‘verify bots’ option under the ‘User Agents’ tab.

verify search bots

If you have already imported log files, or would like to verify search engine bots retrospectively, then you can do so under the ‘Project > Verify Bots’ menu.

verify search engine bots retrospectively

Search engine bots are often spoofed by other bots or crawlers, including our own SEO Spider software when emulating requests from specific search engine user-agents. Hence, when analysing logs, it’s important to know which events are genuine and which can be discounted.

The Log File Analyser will verify all major search engine bots according to their individual guidelines. For example, for Googlebot verification, the Log File Analyser will perform a reverse DNS lookup on the requesting IP, verify that the returned host name is in the expected domain, and then run a forward DNS lookup using the host command to verify that the host name resolves back to the original requesting IP.
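The reverse and forward lookup described above can be sketched in Python as follows. This is an illustration of the general technique using the standard `socket` module rather than the host command, and it is not the Log File Analyser's own implementation. The domain suffixes and the returned status strings are assumptions chosen for the example.

```python
import socket
from typing import Callable, List

# Assumption: host name suffixes treated as genuine Googlebot domains.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_lookup(ip: str) -> str:
    """Reverse DNS: IP address -> primary host name."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(host: str) -> List[str]:
    """Forward DNS: host name -> list of IPv4 addresses."""
    return socket.gethostbyname_ex(host)[2]


def verify_googlebot(
    ip: str,
    reverse: Callable[[str], str] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
) -> str:
    """Return 'verified', 'spoofed' or 'error' (hypothetical labels)."""
    try:
        host = reverse(ip)
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return "error"
    # Step 2: the host name must sit under one of the expected domains.
    if not host.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return "spoofed"
    try:
        addresses = forward(host)
    except OSError:
        return "error"
    # Step 3: the forward lookup must lead back to the requesting IP.
    return "verified" if ip in addresses else "spoofed"
```

The lookup functions are passed in as parameters so the logic can be exercised with stub resolvers, without touching the network.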

After validation, you can use the ‘verification status’ filter to view log events that are verified or spoofed, or that had errors in verification.

bot verification status filter

Troubleshooting

If you find all events being marked as Spoofed, there are a few things to check:

  • Is the Remote Host being read? Check the Remote Host value associated with the Events marked as spoofed. To do this, click on one of the Events and look at the Remote Host value in the lower window pane. Remote Host is not mandatory, so if this was not available in the imported log file, it won’t be possible to verify the Event.
  • Does the Remote Host have correct-looking values? If the Remote Host values are all from a single IP or a small selection of IPs (head over to the IP tab to see Unique IPs), then it’s likely these are from a load balancer. You’ll need to have the log format adjusted by the site administrator or hosting provider to include the real IP address. Before doing this, double check that the real IP is not already in the log file: open the log file in a text editor and inspect a few of the lines. Is there more than one IP address on each line? If so, please send the first few lines of the log or the Log File Analyser debug logs (Help->Debug->Save Logs) to our support team so we can make sure this isn’t a parsing issue.
  • Verify manually: For Googlebot, the Log File Analyser verifies as Google recommends. Try this yourself, and if you get different results please let us know. If not, go ahead and request that the real IP is added to the log.
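For large files, the manual check for more than one IP address per line can be done with a short script instead of a text editor. The sketch below is a minimal example that only looks for IPv4-style addresses; the function name and the sample log lines are made up for illustration.

```python
import re
from typing import Iterable, List, Tuple

# Loose IPv4 pattern: four dot-separated groups of 1-3 digits.
# It does not validate octet ranges and ignores IPv6 addresses.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def lines_with_multiple_ips(lines: Iterable[str]) -> List[Tuple[int, List[str]]]:
    """Return (line number, addresses) for lines holding more than one IP."""
    hits = []
    for number, line in enumerate(lines, start=1):
        found = IPV4.findall(line)
        if len(found) > 1:
            hits.append((number, found))
    return hits


# Made-up sample lines: the first carries a second, forwarded address.
sample = [
    '10.0.0.5 - - "GET / HTTP/1.1" 200 512 "Googlebot/2.1" 203.0.113.9',
    '10.0.0.5 - - "GET /about HTTP/1.1" 200 734 "Googlebot/2.1"',
]
print(lines_with_multiple_ips(sample))
```

If most lines show a second address, the real client IP is probably already in the file and the log format or parsing is the more likely problem.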

If you find any Events marked as “Verification Error”:

  • Select to show “Verification Error” in the top level dropdown, then go to the Events tab, click on an Event and look at the “Verification Status” column. This will tell you why the “Verification Error” has occurred. “Invalid Bot IP” means the DNS lookup failed for the IP address of this event.

Include

This feature allows you to supply a list of regular expressions for URLs to import into a project. So if you only want to analyse certain domains or paths, like the /blog/ or /products/ pages on a huge site, you can now do that to save time and resources and to allow more granular analysis.

Log File Analyser Include

Only analysing certain paths will reduce import time, analysis time and disk usage.
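To get a feel for how a list of include patterns narrows down the URLs, here is a small Python sketch. It assumes a URL is kept when any pattern matches the whole URL; how the Log File Analyser itself applies the expressions may differ, and the patterns and URLs are examples only.

```python
import re
from typing import Iterable, List, Pattern


def compile_includes(patterns: Iterable[str]) -> List[Pattern[str]]:
    """Compile each include expression once, up front."""
    return [re.compile(p) for p in patterns]


def is_included(url: str, includes: List[Pattern[str]]) -> bool:
    """Assumption: keep the URL if any pattern matches it in full."""
    return any(rx.fullmatch(url) for rx in includes)


# Example patterns for the /blog/ and /products/ sections of a site.
includes = compile_includes([r".*/blog/.*", r".*/products/.*"])

urls = [
    "https://example.com/blog/post-1",
    "https://example.com/products/widget",
    "https://example.com/about/",
]
kept = [u for u in urls if is_included(u, includes)]
print(kept)
```

Testing patterns like this against a handful of known URLs before importing can catch an expression that is too narrow or too broad.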

Date Range

In the top right hand side of the application, you can change the date range of your view across the project. There are 3 preset date ranges: the last day, the last 7 days and the last 30 days, as well as an option for a custom date range.

date range

You can also skip backwards and forwards with dates using the arrows at the side. This will update the date range for all tabs, not just the tab you’re on.

Import History

You’re able to view the log file import history of a project by clicking on ‘Project > Import History’ via the top level menu.

import history

This allows you to view the first and last events from the log files, as well as the import date, number of events contained within the log file, site URL provided, log file format and the file name.

By clicking on the individual import rows, you can also delete import history, for example if you accidentally imported incorrect logs.

Timezone

The Log File Analyser stores all events in Coordinated Universal Time (UTC). To view in your local timezone you can adjust the UTC offset to match your current timezone. You can select this either at project creation time:

Log File Analyser New Project

Or by going to ‘Project > Settings > General’ and clicking on the ‘Timezone’ dropdown when you have a project open.
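The effect of the UTC offset can be illustrated with Python's standard `datetime` module. This is only a sketch of the conversion from a stored UTC timestamp to a fixed-offset local time; the function name and the sample timestamp are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def to_local(event_time: datetime, offset_hours: float) -> datetime:
    """Shift a UTC event time to a fixed UTC offset, given in hours."""
    if event_time.tzinfo is None:
        # Treat naive timestamps as UTC, matching how events are stored.
        event_time = event_time.replace(tzinfo=timezone.utc)
    return event_time.astimezone(timezone(timedelta(hours=offset_hours)))


# A made-up event late on 15 January UTC, viewed at UTC+2.
event = datetime(2024, 1, 15, 23, 30, tzinfo=timezone.utc)
print(to_local(event, 2).isoformat())  # 2024-01-16T01:30:00+02:00
```

Note that the event moves to the next calendar day at UTC+2, so the chosen offset can change which day an event is counted under.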
