Google Analytics integration

Configuration > API Access > Google Universal Analytics / Google Analytics 4

You can connect to the Google Universal Analytics API and GA4 API and pull in data directly during a crawl. The SEO Spider can fetch user and session metrics, as well as goal conversions and ecommerce (transactions and revenue) data for landing pages, so you can view your top performing pages when performing a technical or content audit.

To set this up, start the SEO Spider and go to ‘Configuration > API Access’ and choose ‘Google Universal Analytics’ or ‘Google Analytics 4’.

Next, connect to a Google account (which has access to the Analytics account you wish to query) by granting the ‘Screaming Frog SEO Spider’ app permission to access your account to retrieve the data.

Google APIs use the OAuth 2.0 protocol for authentication and authorisation. The SEO Spider will remember any Google accounts you authorise within the list, so you can ‘connect’ quickly upon starting the application each time.

GA4 Login

Once connected in Universal Analytics, you can choose the relevant Google Analytics account, property, view, segment and date range.

Universal Analytics Account, Property and View selection

For GA4, you can select the analytics account, property and Data Stream.

GA4 Data Streams

Then simply select the metrics that you wish to fetch for Universal Analytics –

Universal Analytics Metrics

Or for GA4 –

GA4 Metrics

By default the SEO Spider collects the following 11 metrics in Universal Analytics –

  1. Sessions
  2. % New Sessions
  3. New Users
  4. Users
  5. Bounce Rate
  6. Page Views Per Session
  7. Avg Session Duration
  8. Page Value
  9. Goal Conversion Rate
  10. Goal Completions All
  11. Goal Value All

For UA you can select up to 30 metrics at a time from their API.

By default the SEO Spider collects the following 7 metrics in GA4 –

  1. Sessions
  2. Engaged Sessions
  3. Engagement Rate
  4. Views
  5. Conversions
  6. Event Count
  7. Total Revenue

For GA4 you can select up to 65 metrics available via their API.

You can read more about the metrics available and the definition of each metric from Google for Universal Analytics and GA4.

You can also set the dimension of each individual metric against either full page URL (‘Page Path’ in UA), or landing page, which are quite different (and both useful depending on your scenario and objectives).

Google Analytics Dimensions

For GA4 there is also a ‘filters’ tab, which allows you to select additional dimensions. For example, you can choose first user or session channel grouping with dimension values, such as ‘organic search’ to refine to a specific channel.
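
As a rough illustration of what such a query might look like, here is a minimal sketch that builds a request body in the shape of a GA4 Data API `runReport` call, using the seven default GA4 metrics, one dimension, and an optional channel filter. The API field names (`screenPageViews`, `conversions`, `sessionDefaultChannelGroup` and so on) are assumed mappings from the labels in this guide and should be checked against Google's schema. The helper `build_report_request` is hypothetical and is not how the SEO Spider itself is implemented.

```python
import json

# Assumed GA4 Data API names for the seven default metrics listed above.
DEFAULT_GA4_METRICS = [
    "sessions", "engagedSessions", "engagementRate", "screenPageViews",
    "conversions", "eventCount", "totalRevenue",
]


def build_report_request(dimension, start_date, end_date, channel=None, limit=100_000):
    """Build a runReport-style request body (hypothetical helper)."""
    body = {
        "dateRanges": [{"startDate": start_date, "endDate": end_date}],
        # e.g. "pagePath" or "landingPage" (assumed API names)
        "dimensions": [{"name": dimension}],
        "metrics": [{"name": name} for name in DEFAULT_GA4_METRICS],
        # Highest-session URLs first, mirroring the ordering described in this guide.
        "orderBys": [{"metric": {"metricName": "sessions"}, "desc": True}],
        "limit": limit,
    }
    if channel is not None:
        # Restrict rows to a single session channel group, e.g. "Organic Search".
        body["dimensionFilter"] = {
            "filter": {
                "fieldName": "sessionDefaultChannelGroup",
                "stringFilter": {"matchType": "EXACT", "value": channel},
            }
        }
    return body


request = build_report_request(
    "landingPage", "2024-01-01", "2024-01-31", channel="Organic Search"
)
print(json.dumps(request, indent=2))
```

The body is only constructed and printed here; sending it would require an authorised client and a property ID, which are outside the scope of this sketch.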

GA4 Filters

URLs in Google Analytics might not always match URLs in a crawl, so the SEO Spider can automatically match trailing and non-trailing slash URLs, and URLs that differ only by case (upper and lowercase characters). Google doesn’t pass the protocol (HTTP or HTTPS) via their API, so HTTP and HTTPS versions of a URL are also matched automatically.

Google Analytics General Config

When selecting either of the above options, please note that data from Google Analytics is sorted by sessions, so matching is performed against the URL with the highest number of sessions. Data is not aggregated for those URLs.

The following options are available –

  • Match Trailing and Non-Trailing Slash URLs – Allows both http://example.com/contact and http://example.com/contact/ to match either http://example.com/contact or http://example.com/contact/ from GA, whichever has the highest number of sessions.
  • Match Uppercase & Lowercase URLs – Allows http://example.com/contact.html, http://example.com/Contact.html and http://example.com/CONTACT.html to match the version of this URL from GA with the highest number of sessions.
  • Limit Max Results – If you have hundreds of thousands of URLs in GA, you can limit the number of URLs queried. Results are ordered by sessions by default, so the data returned covers the top performing pages (the top 100,000 URLs).
  • Crawl New URLs Discovered in Google Analytics – This means any new URLs discovered in Google Analytics (that are not found via hyperlinks) will be crawled. If this option isn’t enabled, new URLs discovered via Google Analytics will only be available to view in the ‘Orphan Pages’ report. They won’t be added to the crawl queue, be viewable within the user interface, or appear under the respective tabs and filters. Please see our guide on finding orphan pages.
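
The matching behaviour described above can be sketched as follows. This is a minimal illustration rather than the SEO Spider's actual implementation: the function names `match_key` and `best_ga_rows`, the row format, and the sample data are all assumptions. The sample GA rows include a protocol purely so they can be parsed the same way as crawled URLs.

```python
from urllib.parse import urlsplit


def match_key(url, match_slash=True, match_case=True):
    """Build a comparison key for a URL.

    The protocol is always dropped. Trailing slashes and letter case are
    ignored only when the corresponding option is enabled.
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    if match_slash and len(path) > 1:
        path = path.rstrip("/") or "/"
    key = parts.netloc + path
    if parts.query:
        key += "?" + parts.query
    return key.lower() if match_case else key


def best_ga_rows(ga_rows, **options):
    """For each key, keep only the GA row with the most sessions (no summing)."""
    best = {}
    for row in ga_rows:
        key = match_key(row["url"], **options)
        if key not in best or row["sessions"] > best[key]["sessions"]:
            best[key] = row
    return best


# Hypothetical GA data: two variants of the same page.
ga_rows = [
    {"url": "https://example.com/contact", "sessions": 40},
    {"url": "https://example.com/Contact/", "sessions": 95},
]
lookup = best_ga_rows(ga_rows)
crawled = "http://example.com/contact/"
matched = lookup.get(match_key(crawled))
print(matched["url"], matched["sessions"])
```

With both options enabled, the crawled URL picks up the data of the variant with the most sessions, and the 40 sessions of the other variant are not added to it. Passing `match_case=False` to both functions would keep the two variants separate.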

Google Analytics data will be fetched and displayed in the respective columns within the ‘Internal’ and ‘Analytics’ tabs.

There’s an ‘API’ progress bar in the top right. When this has reached 100%, analytics data will start appearing against URLs in real time. The more URLs and metrics queried, the longer this process can take, but generally it’s extremely quick.

Google Analytics Data populating in the SEO Spider

There are 5 filters currently under the ‘Analytics’ tab, which allow you to filter the Google Analytics data –

  • Sessions Above 0 – This simply means the URL in question has 1 or more sessions.
  • Bounce Rate Above 70% – This means the URL has a bounce rate over 70%, which you may wish to investigate. In some scenarios this is normal though!
  • No GA Data – This means that for the metrics and dimensions queried, the Google API didn’t return any data for the URLs in the crawl. So the URLs either didn’t receive any sessions, or perhaps the URLs in the crawl are just different to those in GA for some reason.
  • Non-Indexable with GA Data – This means the URL is non-indexable, but still has data from GA.
  • Orphan URLs – This means the URL was only discovered via GA, and was not found via an internal link during the crawl.
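
The logic behind these five filters can be sketched as below. This is only an illustration of the definitions above: the record layout (`ga`, `indexable`, `found_via_ga`, `found_via_links`) and the function `analytics_filters` are hypothetical and do not reflect the SEO Spider's internal data model.

```python
def analytics_filters(url):
    """Return the names of the 'Analytics' tab filters a URL record falls under."""
    ga = url.get("ga")  # None when the API returned no row for this URL
    filters = []
    if ga is None:
        filters.append("No GA Data")
    else:
        if ga.get("sessions", 0) > 0:
            filters.append("Sessions Above 0")
        if ga.get("bounce_rate", 0.0) > 0.70:
            filters.append("Bounce Rate Above 70%")
        if not url["indexable"]:
            filters.append("Non-Indexable with GA Data")
    # Orphan: discovered only through GA, never via an internal link.
    if url["found_via_ga"] and not url["found_via_links"]:
        filters.append("Orphan URLs")
    return filters


# Hypothetical record: a non-indexable page that only GA knows about.
record = {
    "ga": {"sessions": 12, "bounce_rate": 0.82},
    "indexable": False,
    "found_via_ga": True,
    "found_via_links": False,
}
print(analytics_filters(record))
```

A single URL can appear under several filters at once, as the sample record does.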

Please read the following FAQs for various issues with accessing Google Analytics data in the SEO Spider –

  1. Why do I receive an error when granting access to my Google account?
  2. Why does my connection to Google Analytics fail?
  3. Why doesn’t GA data populate against my URLs?
  4. Why doesn’t the GA API data in the SEO Spider match what’s reported in the GA interface?
  5. Why can’t I see GA4 properties when I connect my Google Analytics account?

Please note, Google APIs use the OAuth 2.0 protocol for authentication and authorisation, and the data provided via Google Analytics and other APIs is only accessible locally on your machine. We cannot view and do not store that data ourselves. Please see more in our FAQ.

Using the Google Analytics 4 API is subject to their standard property quotas for core tokens.

