How To Automate The URL Inspection API

The Google URL Inspection API allows users to request the data Search Console has about the indexed version of a URL, including index status, coverage, rich results, mobile usability and more.

This means you’re able to check in bulk whether URLs are indexed on Google, and if there are warnings or issues.
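
To illustrate what’s happening under the hood, here’s a minimal sketch of querying the URL Inspection API directly in Python with google-api-python-client. It assumes a service account that has been added as a user to the Search Console property; the key file path and URLs are placeholders.

```python
# Minimal sketch of calling the URL Inspection API directly.
# Assumes a service account JSON key, with the service account added
# as a user to the Search Console property. Paths/URLs are placeholders.
from google.oauth2 import service_account
from googleapiclient.discovery import build

SCOPES = ["https://www.googleapis.com/auth/webmasters.readonly"]
creds = service_account.Credentials.from_service_account_file(
    "service-account.json", scopes=SCOPES
)
service = build("searchconsole", "v1", credentials=creds)

response = service.urlInspection().index().inspect(
    body={
        "inspectionUrl": "https://www.example.com/page/",  # URL to check
        "siteUrl": "https://www.example.com/",             # verified property
    }
).execute()

result = response["inspectionResult"]["indexStatusResult"]
print(result.get("verdict"), "-", result.get("coverageState"))
```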

The URL Inspection API has been integrated into the Screaming Frog SEO Spider, so users can pull in data for up to 2k URLs per property a day alongside all the usual crawl data.

URL Inspection API Data

This tutorial shows you how to use the SEO Spider to collect URL Inspection data, options to work with or around the 2k URL limit, and how to automate URL Inspection API data and reporting to monitor indexing.


How to Connect to The URL Inspection API

Click ‘Config > API Access > Google Search Console’, connect to a Search Console account, choose the property and then under the ‘URL Inspection’ tab, select ‘Enable URL Inspection’.

Enable URL Inspection API

When you perform a crawl, URL Inspection API data will then be populated in the ‘Search Console’ tab, alongside the usual Search Analytics data (impressions, clicks, etc).

URL Inspection API Filters

The Search Console tab includes the following URL Inspection API-related filters –

  • URL Is Not on Google – The URL is not indexed by Google and won’t appear in the search results. This filter can include non-indexable URLs (such as those that are ‘noindex’) as well as Indexable URLs that could be indexed but currently aren’t. It’s a catch-all filter for anything not on Google according to the API.
  • Indexable URL Not Indexed – Indexable URLs found in the crawl that are not indexed by Google and won’t appear in the search results. This can include URLs that are unknown to Google, or those that have been discovered but not indexed, and more.
  • URL is on Google, But Has Issues – The URL has been indexed and can appear in Google Search results, but there are some problems with mobile usability, AMP or Rich results that might mean it doesn’t appear in an optimal way.
  • User-Declared Canonical Not Selected – Google has chosen to index a different URL to the one declared by the user in the HTML. Canonicals are hints, and sometimes Google does a great job of this, other times it’s less than ideal.
  • Page Is Not Mobile Friendly – The page has issues on mobile devices.
  • AMP URL Is Invalid – The AMP has an error that will prevent it from being indexed.
  • Rich Result Invalid – The URL has an error with one or more rich result enhancements that will prevent the rich result from showing in the Google search results.

You can export Google Rich Result types, errors and warnings, details on referring pages and Sitemaps via the ‘Bulk Export > URL Inspection’ menu.

URL Inspection API bulk export
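
Once exported, the data is straightforward to slice programmatically. As a rough example, the snippet below filters an exported ‘Search Console’ tab CSV with pandas to list Indexable URLs that aren’t indexed – the filename and column names (‘Address’, ‘Indexability’, ‘Coverage’) are assumptions, so check them against your own export.

```python
# Rough sketch: filter an exported 'Search Console' tab CSV for
# indexable URLs that Google hasn't indexed. The filename and column
# names ('Address', 'Indexability', 'Coverage') are assumptions --
# check them against your own export before relying on this.
import pandas as pd

df = pd.read_csv("search_console_all.csv")

not_indexed = df[
    (df["Indexability"] == "Indexable")
    & (df["Coverage"].str.contains("not indexed", case=False, na=False))
]
print(not_indexed[["Address", "Coverage"]].to_string(index=False))
```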

How to Focus On Key Sections or Pages

URL Inspection data will be populated against the first 2k URLs found in the crawl, which runs breadth-first (ordered by crawl depth) from the start page.

Use the SEO Spider configuration to focus the crawl to key sections, pages or a variety of template types.

Some of the main options include the ‘Include’ and ‘Exclude’ configurations (‘Config > Include’ and ‘Config > Exclude’), which limit the crawl to URLs matching (or not matching) your chosen patterns.

Additionally, under ‘Config > API Access > Google Search Console’ and the ‘URL Inspection’ tab, you can ‘Ignore Non-Indexable URLs for URL Inspection’, if you’re only interested in data for URLs that are Indexable in a crawl.

URL Inspection API Ignore Non-Indexable URLs

This avoids wasting the 2k query budget on URLs you don’t care about.


How to Work With The 2k Per Day Limit

Google enforces a limit of 2k queries per day, per property, for the URL Inspection API.

Google didn’t build the API to allow webmasters to check whether every single URL on their website is indexed – they consider it pretty normal for some URLs not to be indexed. The purpose of the API is to allow users to check more than one URL at a time, and to get a better sample across templates outside of GSC.

If you have hit the 2k URLs per day, per property limit for the URL Inspection API, you will receive this message.

URL Inspection Daily Quota limit

The crawl itself will obviously continue and complete; URLs just won’t continue to be populated with URL Inspection data. If you’d like data for more URLs, then you have two options.

1) Patience (Wait A Day!)

Let the crawl finish, wait 24 hours, re-open the crawl, connect to the API again, and then bulk highlight and ‘re-spider’ the next 2k URLs to get URL Inspection API data.

Right Click Re-Spider to Update URL Inspection API Data

Or export the previous crawl, copy the URLs you want URL Inspection data for, and upload them in list mode, before exporting and combining with the previous day’s crawl data.
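
If you take the export route each day, stitching the files back together is simple to script. Here’s a sketch using pandas, assuming daily CSV exports where the ‘Address’ column holds the URL – later files win, so each URL keeps its most recent inspection data.

```python
# Sketch: combine daily crawl exports, keeping the most recent
# URL Inspection result per URL. The file pattern and 'Address'
# column name are assumptions -- adjust to your own exports.
import glob
import pandas as pd

frames = [pd.read_csv(path) for path in sorted(glob.glob("exports/day_*.csv"))]
combined = pd.concat(frames, ignore_index=True)

# Files are read oldest-first, so keep="last" retains the newest row per URL.
combined = combined.drop_duplicates(subset="Address", keep="last")
combined.to_csv("url_inspection_combined.csv", index=False)
```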

2) Verify Multiple Properties & Enable ‘Use Multiple Properties’

You can verify multiple subdomains and subfolders as separate properties in Search Console for a site. Each property would have a 2k URL limit for the URL Inspection API.

For example, all URLs within /blog/ can have their own 2k query limit if verified as a property.

URL Inspection API Multiple Properties

If you have multiple subdomains or subfolders set up as separate properties, then enable the ‘Use Multiple Properties’ configuration found in ‘Config > API Access > GSC > URL Inspection’.

URL Inspection API Multiple Properties Config

The SEO Spider will automatically detect all relevant properties in the account, and use the most specific property to request data for the URL.

This means it’s now possible to get far more than 2k URLs with URL Inspection API data in a single crawl, if there are multiple properties set up – without having to perform multiple crawls.
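
To picture the ‘most specific property’ behaviour, the toy sketch below picks the longest verified property prefix that matches each URL – this mirrors the idea, not the SEO Spider’s actual implementation.

```python
# Toy illustration of 'most specific property' matching: choose the
# longest verified property prefix that matches the URL. This mirrors
# the idea, not the SEO Spider's actual implementation.
properties = [
    "https://www.example.com/",
    "https://www.example.com/blog/",
    "https://docs.example.com/",
]

def most_specific_property(url):
    matches = [p for p in properties if url.startswith(p)]
    return max(matches, key=len) if matches else None

print(most_specific_property("https://www.example.com/blog/post-1/"))
# -> https://www.example.com/blog/ (which has its own 2k daily quota)
```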


How to Automate URL Inspection Data & Index Monitoring

There are various ways you can automate crawls and fetch URL Inspection API data to monitor indexing of the most important pages on a website. The simplest is to use scheduling, the export for Google Data Studio feature and our new URL Inspection API Data Studio template.
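
Alternatively, if you’d rather drive automation from your own systems, scheduled crawls can also be launched headlessly via the SEO Spider’s command line interface. Below is a rough sketch that wraps the CLI in Python (e.g. for a cron job) – the flags mirror the documented command line options, but verify them against your installed version, and all paths are placeholders.

```python
# Sketch: launch a headless, list-mode crawl from your own scheduler
# (e.g. cron). The flags mirror the SEO Spider's documented command
# line options -- verify them against your installed version.
# All paths are placeholders.
import subprocess

subprocess.run(
    [
        "screamingfrogseospider",  # binary name on Linux; differs on Windows/macOS
        "--headless",
        "--crawl-list", "/data/top-urls.txt",  # the .txt seed file of key URLs
        "--config", "/data/url-inspection.seospiderconfig",  # saved config with URL Inspection enabled
        "--output-folder", "/data/exports",
        "--timestamped-output",
        "--export-tabs", "Search Console:All",
    ],
    check=True,  # raise if the crawl fails, so the scheduler can alert
)
```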

If you’re already automating crawl reports in Data Studio, then you’ll be familiar with this process, as there’s a dedicated page for URL Inspection data within this report. Let’s run through the process of automating index monitoring.

1) Schedule A List Mode Crawl

Go to ‘File > Scheduling’ and under ‘General’ choose a task name, project name and a daily interval.

Next, click ‘Start Options’ and switch ‘Crawler Mode’ to ‘List’. For ‘Crawl Seed’, click ‘browse’ and select a .txt file with the URLs you want to check every day for URL Inspection data.

URL Inspection API Scheduling

You could crawl a website in regular ‘Spider’ mode if it’s under 2k URLs and gather index data for every single URL.

However, websites are often much larger and there can also be many URLs you don’t care about. So it makes sense to focus on the most important URLs on the site.

This might be the top 10, 20 or 100 URLs, rather than 2k. Many websites have a small number of really key landing pages that drive the majority of their revenue.

2) Use A Crawl Config with URL Inspection API Enabled

For ‘Crawl Config’ in scheduling ‘Start Options’, ensure you supply a saved configuration file that has ‘Enable URL Inspection’ activated under ‘Config > API Access > Google Search Console > URL Inspection’.

URL Inspection API Scheduling Config

Setting up a saved configuration is simple. In the SEO Spider interface, select the configuration you want, then click ‘File > Config > Save As’. This is the file that needs to be supplied as the ‘Crawl Config’.

3) Select The Google Search Console API

Enable the ‘Google Search Console’ API, click ‘Configure’ and select the account and property.

URL Inspection API GSC

4) Export For Data Studio

On the ‘Export’ tab, enable ‘Headless’ and choose the ‘Google Drive Account’ to export the URL Inspection API data to a Google Sheet.

Next, click ‘Export For Data Studio’ and then the ‘Configure’ button next to it.

Scheduling Export tab for URL Inspection API

The ‘Configure’ button will then show a list of available metrics from tabs and filters on the left, which need to be selected for the export by clicking the right arrow.

Export for Data Studio Available Metrics

Select the ‘Site Crawled’, ‘Date’ and ‘Time’ metrics, and then search for ‘Search Console’ to see the list of metrics available for this tab. Select the bottom 7 metrics, which are related to URL Inspection, and click the right arrow.

Export For Data Studio Metrics Selected

When the scheduled crawl has run, the ‘Export for Data Studio’ Google Sheet will be exported into your chosen Google Drive account.

URL Inspection API Data In Google Sheets

By default the ‘Export for Data Studio’ location is ‘My Drive > Screaming Frog SEO Spider > Project Name > [task_name]_crawl_summary_report’.

5) Connect to URL Inspection Google Data Studio Template

Now make a copy of our URL Inspection Monitoring Data Studio template and connect to your own Google Sheet with data from the ‘Export for Data Studio’ crawl summary report.

URL Inspection Data Studio Template

You now have a daily index monitoring system for the most important URLs on the website, which will alert you to any URLs that are not indexed, or have issues.

If you’re not familiar with how to take a copy of a Data Studio dashboard and connect it to a different data source, have a read of our ‘Connecting to Data Studio’ guide, and follow the same process.
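
If you’d also like alerts outside of Data Studio, the same Google Sheet can be read by a script. Here’s a minimal sketch using gspread – the sheet name and the metric column heading are assumptions, so match them to your own crawl summary report.

```python
# Minimal monitoring sketch: read the 'Export for Data Studio' Google
# Sheet and flag crawls with URLs not on Google. The sheet name and
# column heading are assumptions -- match them to your own report.
import gspread

gc = gspread.service_account(filename="service-account.json")  # placeholder key
sheet = gc.open("task_name_crawl_summary_report").sheet1       # placeholder name

for row in sheet.get_all_records():  # one row per scheduled crawl
    count = row.get("Search Console: URL Is Not on Google", 0)
    if count:
        print(f"Alert ({row.get('Date')}): {count} URLs not on Google")
```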


Summary

This tutorial should help you utilise the SEO Spider to fetch the URL Inspection API data you need.

Check out our Screaming Frog SEO Spider user guide, FAQs and tutorials for more advice and tips.

If you have any further queries, feedback or suggestions to improve our URL Inspection API or Data Studio integration in the SEO Spider then just get in touch with our team via support.
