Google Analytics integration
Configuration > API Access > Google Universal Analytics / Google Analytics 4
You can connect to the Google Universal Analytics API and GA4 API and pull in data directly during a crawl. The SEO Spider can fetch user and session metrics, as well as goal conversions and ecommerce (transactions and revenue) data for landing pages, so you can view your top performing pages when performing a technical or content audit.
To set this up, start the SEO Spider and go to ‘Configuration > API Access’ and choose ‘Google Universal Analytics’ or ‘Google Analytics 4’.
Next, connect to a Google account (which has access to the Analytics account you wish to query) by granting the ‘Screaming Frog SEO Spider’ app permission to access your account to retrieve the data.
Google APIs use the OAuth 2.0 protocol for authentication and authorisation. The SEO Spider will remember any Google accounts you authorise within the list, so you can ‘connect’ quickly upon starting the application each time.
Once connected in Universal Analytics, you can choose the relevant Google Analytics account, property, view, segment and date range.
For GA4, you can select the analytics account, property and Data Stream.
Then simply select the metrics that you wish to fetch for Universal Analytics or GA4.
By default the SEO Spider collects the following 11 metrics in Universal Analytics –
- Sessions
- % New Sessions
- New Users
- Users
- Bounce Rate
- Page Views Per Session
- Avg Session Duration
- Page Value
- Goal Conversion Rate
- Goal Completions All
- Goal Value All
For UA you can select up to 30 metrics at a time from their API.
By default the SEO Spider collects the following 7 metrics in GA4 –
- Sessions
- Engaged Sessions
- Engagement Rate
- Views
- Conversions
- Event Count
- Total Revenue
For GA4 you can select up to 65 metrics available via their API.
You can read more about the metrics available and the definition of each metric from Google for Universal Analytics and GA4.
You can also set the dimension of each individual metric against either full page URL (‘Page Path’ in UA), or landing page, which are quite different (and both useful depending on your scenario and objectives).
For GA4 there is also a ‘filters’ tab, which allows you to select additional dimensions. For example, you can choose first user or session channel grouping with dimension values, such as ‘organic search’ to refine to a specific channel.
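To make the relationship between metrics, dimensions and filters more concrete, here is a minimal sketch of what an equivalent request body for the GA4 Data API `runReport` method could look like. It is an illustration only, not a description of what the SEO Spider sends internally. The mapping from the metric labels above to API names, the `build_report_request` helper, the date range and the `sessionDefaultChannelGroup` filter field are all assumptions made for this example, and API names can change between versions.

```python
import json

# Assumed mapping from the default GA4 metric labels listed above to
# GA4 Data API metric names. Check Google's schema before relying on it.
DEFAULT_GA4_METRICS = {
    "Sessions": "sessions",
    "Engaged Sessions": "engagedSessions",
    "Engagement Rate": "engagementRate",
    "Views": "screenPageViews",
    "Conversions": "conversions",
    "Event Count": "eventCount",
    "Total Revenue": "totalRevenue",
}


def build_report_request(dimension="landingPage", channel=None):
    """Build a runReport-style request body as a plain dict.

    `dimension` would be "pagePath" or "landingPage" in this sketch.
    `channel` optionally restricts rows to one session channel group,
    for example "Organic Search".
    """
    body = {
        "dimensions": [{"name": dimension}],
        "metrics": [{"name": name} for name in DEFAULT_GA4_METRICS.values()],
        # Hypothetical date range chosen for the example.
        "dateRanges": [{"startDate": "30daysAgo", "endDate": "yesterday"}],
    }
    if channel is not None:
        body["dimensionFilter"] = {
            "filter": {
                "fieldName": "sessionDefaultChannelGroup",
                "stringFilter": {"matchType": "EXACT", "value": channel},
            }
        }
    return body


if __name__ == "__main__":
    print(json.dumps(build_report_request(channel="Organic Search"), indent=2))
```

Building the body as a plain dict keeps the example free of client libraries and credentials; sending it would still require an authorised OAuth 2.0 session against your own property.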
There are scenarios where URLs in Google Analytics might not match URLs in a crawl. The most common ones are covered by automatically matching trailing and non-trailing slash URLs, and by ignoring case (uppercase and lowercase characters in URLs). Google doesn’t pass the protocol (HTTP or HTTPS) via their API, so HTTP and HTTPS versions of a URL are also matched automatically.
When using either of these matching options, please note that data from Google Analytics is sorted by sessions, so matching is performed against the URL with the highest number of sessions. Data is not aggregated across the matching URLs.
The following options are available –
- Match Trailing and Non-Trailing Slash URLs – Allows both http://example.com/contact and http://example.com/contact/ to match either http://example.com/contact or http://example.com/contact/ from GA, whichever has the highest number of sessions.
- Match Uppercase & Lowercase URLs – Allows http://example.com/contact.html, http://example.com/Contact.html and http://example.com/CONTACT.html to match the version of this URL from GA with the highest number of sessions.
- Limit Max Results – If you have hundreds of thousands of URLs in GA, you can choose to limit the number of URLs to query. Results are ordered by sessions by default, so the data returned is for the top performing pages (the top 100,000 URLs).
- Crawl New URLs Discovered in Google Analytics – This means any new URLs discovered in Google Analytics (that are not found via hyperlinks) will be crawled. If this option isn’t enabled, then new URLs discovered via Google Analytics will only be available to view in the ‘Orphan Pages’ report. They won’t be added to the crawl queue, be viewable within the user interface, or appear under the respective tabs and filters. Please see our guide on finding orphan pages.
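The matching behaviour described in the first two options can be sketched as follows. This is a simplified illustration, not the SEO Spider's actual implementation. The function names `match_key`, `index_ga_rows` and `lookup` are hypothetical, and GA rows are assumed to arrive as `(url, sessions)` pairs with no protocol.

```python
from urllib.parse import urlsplit


def match_key(url, ignore_slash=True, ignore_case=True):
    """Reduce a URL to a comparison key. The protocol is always dropped."""
    # GA rows are assumed to have no scheme, so add "//" to help the parser.
    parts = urlsplit(url if "://" in url else "//" + url)
    host = parts.netloc.lower()  # hostnames are case-insensitive anyway
    path = parts.path or "/"
    if ignore_slash and len(path) > 1:
        path = path.rstrip("/")
    if parts.query:
        path += "?" + parts.query
    if ignore_case:
        path = path.lower()
    return host + path


def index_ga_rows(rows, **options):
    """Keep, for each key, the single GA row with the most sessions.

    Sessions are not summed across variants, mirroring the note that
    data is not aggregated.
    """
    best = {}
    for url, sessions in rows:
        key = match_key(url, **options)
        if key not in best or sessions > best[key][1]:
            best[key] = (url, sessions)
    return best


def lookup(crawl_url, index, **options):
    """Return the matched (ga_url, sessions) pair, or None if no match."""
    return index.get(match_key(crawl_url, **options))


if __name__ == "__main__":
    ga_rows = [
        ("example.com/contact", 40),
        ("example.com/contact/", 120),
        ("example.com/Contact.html", 5),
    ]
    index = index_ga_rows(ga_rows)
    print(lookup("http://example.com/contact", index))
    print(lookup("https://example.com/CONTACT.html", index))
```

In this sketch both `/contact` and `/contact/` resolve to the `/contact/` row, because it has the most sessions, and its 120 sessions are reported without adding the 40 from the other variant.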
Google Analytics data will be fetched and displayed in the respective columns within the ‘Internal’ and ‘Analytics’ tabs.
There’s an ‘API’ progress bar in the top right and when this has reached 100%, analytics data will start appearing against URLs in real-time. The more URLs and metrics queried the longer this process can take, but generally it’s extremely quick.
There are 5 filters currently under the ‘Analytics’ tab, which allow you to filter the Google Analytics data –
- Sessions Above 0 – This simply means the URL in question has 1 or more sessions.
- Bounce Rate Above 70% – This means the URL has a bounce rate over 70%, which you may wish to investigate. In some scenarios this is normal though!
- No GA Data – This means that for the metrics and dimensions queried, the Google API didn’t return any data for the URLs in the crawl. So the URLs either didn’t receive any sessions, or perhaps the URLs in the crawl are just different to those in GA for some reason.
- Non-Indexable with GA Data – This means the URL is non-indexable, but still has data from GA.
- Orphan URLs – This means the URL was only discovered via GA, and was not found via an internal link during the crawl.
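The five filters above can be read as simple rules over each URL's crawl and GA data. The sketch below expresses them that way. It is an interpretation of the descriptions only; the `UrlRecord` type, its field names and the `analytics_filters` function are hypothetical and not part of the SEO Spider.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class UrlRecord:
    url: str
    indexable: bool
    found_in_crawl: bool                  # discovered via an internal link
    sessions: Optional[int] = None        # None means GA returned no row
    bounce_rate: Optional[float] = None   # percentage from 0 to 100


def analytics_filters(record: UrlRecord) -> List[str]:
    """Return the names of the 'Analytics' tab filters a URL would fall under."""
    has_ga_data = record.sessions is not None
    filters = []
    if has_ga_data and record.sessions > 0:
        filters.append("Sessions Above 0")
    if record.bounce_rate is not None and record.bounce_rate > 70:
        filters.append("Bounce Rate Above 70%")
    if not has_ga_data:
        filters.append("No GA Data")
    if has_ga_data and not record.indexable:
        filters.append("Non-Indexable with GA Data")
    if has_ga_data and not record.found_in_crawl:
        filters.append("Orphan URLs")
    return filters


if __name__ == "__main__":
    records = [
        UrlRecord("https://example.com/a", True, True, sessions=10, bounce_rate=85.0),
        UrlRecord("https://example.com/b", True, True),
        UrlRecord("https://example.com/c", False, False, sessions=3, bounce_rate=20.0),
    ]
    for record in records:
        print(record.url, analytics_filters(record))
```

A URL can fall under several filters at once, which is why the function returns a list rather than a single label.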
Please read the following FAQs for various issues with accessing Google Analytics data in the SEO Spider –
- Why do I receive an error when granting access to my Google account?
- Why does my connection to Google Analytics fail?
- Why doesn’t GA data populate against my URLs?
- Why doesn’t the GA API data in the SEO Spider match what’s reported in the GA interface?
- Why can’t I see GA4 properties when I connect my Google Analytics account?
Please note, Google APIs use the OAuth 2.0 protocol for authentication and authorisation, and the data provided via Google Analytics and other APIs is only accessible locally on your machine. We cannot view and do not store that data ourselves. Please see more in our FAQ.
Using the Google Analytics 4 API is subject to their standard property quotas for core tokens.