SEO Spider

How To Crawl With AI Prompts

Introduction

The SEO Spider allows you to connect to OpenAI, Gemini, Anthropic and Ollama APIs and set up custom prompts to run against crawl data.

Having AI on hand while crawling opens up a world of possibilities. It enables you to run prompts against elements of a page while crawling.

You could use AI for all kinds of purposes, such as:

  • Generating alt text of images.
  • Language, sentiment or intent analysis of page content.
  • Scraping specific data.
  • Extracting embeddings from page content for a variety of analysis.

And much more!

This tutorial walks you through how to use our direct AI API Integrations for basic prompts, how to save custom prompts to the library and advanced use cases.


How To Set Up AI Prompts

There are various preset AI prompts within the SEO Spider, but you can also write your own custom prompts. Here’s how to get started.


1) Select An AI Provider

Click ‘Config > API Access > AI’ and select an AI provider you wish to use.

There are paid LLMs including OpenAI, Gemini and Anthropic. Alternatively, you can use Ollama, a free framework for local LLMs.

Follow our guides above on how to set up an account for each provider.

Select an AI Provider

Later in this tutorial we also show you how you can use Grok and Deepseek using the custom endpoint in OpenAI.

Tip! Gemini is free in select regions such as the US and UK through AI Studio. Check out the ‘free’ plan as it’s excellent. There are free and paid account types with different rate limits.


2) Enter Your API Key

When you have created an account and have an API key, enter it into the appropriate API key field on the ‘Account Information’ tab.

OpenAI API Key

If you’re using Ollama, then there isn’t an API key, and you can ignore this step.

While you’re on the ‘Account Information’ tab, remember to click ‘Connect’.

Connect to API

3) Configure Your Prompt

Navigate to the ‘Prompt Configuration’ tab to configure up to 100 prompts against crawl data.

The ‘Add from Library’ function includes half a dozen prompts for inspiration that you can select.

OpenAI Add From Library

Alternatively, click ‘Add’ to configure a custom prompt.

Add an AI Prompt

For OpenAI in the example, you can select the category of model (ChatGPT, Moderation, Embeddings, Image Generation or Text To Speech), the model used (for example, ‘gpt-4o’), content type and data to be used for the prompt, such as page text, HTML, or a custom extraction, as well as write your custom prompt.

The default conversational model (such as ChatGPT) is used for each provider, so typically you can just write your prompt in the ‘Enter Prompt’ field.

Enter your prompt

Tip! ‘Page Text’ is based upon the content area settings. This automatically excludes the nav and footer elements of a page, but you can customise it so it provides the exact content you wish to send to the LLM.

The warning icon to the right-hand side of the prompt field is warning that because ‘Page Text’ is selected for the prompt, that ‘Store HTML‘ must be selected.


4) Enable Store HTML

To use ‘Page Text’ or ‘HTML’ for a prompt, you will need to enable ‘Store HTML‘ via ‘Config > Spider > Extraction’.

Store HTML

If you have not selected one of these elements, then you can skip this step!


5) Test Your Prompt

To test a prompt, use the ‘play’ icon to the right of the prompt field.

Test your prompt

In the prompt tester, input the URL to test and click the ‘Test’ button to display both the extraction, and response.

Prompt tester

The OpenAI response in the lower window shows that the prompt is working as expected with the output as ‘English’.


6) Crawl the Site

Enter the website you wish to crawl in the ‘Enter URL to spider’ box and hit ‘Start’.

Crawl the site

Wait until the crawl and API progress bar reaches 100%, or view data in real-time.


7) View Results in the API Tab

Data from the prompt will appear in the AI tab and the filter and column with the prompt name set-up earlier. If the name wasn’t updated, it will just remain as ‘AI Provider: 1’.

AI Tab Results

Prompt data will also appear in the Internal tab, combined with all other data from the crawl for additional analysis, exporting or reporting where required.


Having AI prompts available while crawling opens up a world of possibilities.

Often the beauty in the functionality is not the common use cases, but the unique problems that are presented to SEOs where there isn’t an obvious ‘out of the box’ solution, and AI can offer that extra layer of flexibility.

While there are many different use cases, common ways to include prompts include:

Generating Alt Text

Alt text is essential for accessibility, but it can be difficult to get resource to write them for images.

AI is great for automating repetitive, low-level tasks, and LLMs can now view the image and describe them accurately.

Generate Alt Text

Sometimes AI can lack the context from the page itself, but typically the results are still useful.

Generating Meta Descriptions

While meta descriptions are not used as a signal in scoring directly, they do impact CTR in the SERPs. We therefore urge caution using AI to generate them. All descriptions should be thoroughly reviewed pre-publishing to a live site.

Generate Meta Descriptions

In the prompt, it’s advisable to include a character length limit, as well as a call to action at the end.

Classifying Language of Pages

LLMs are excellent at identifying the language of content.

Classify Language of a Page

This can be useful when pages have a mixture of languages, or when verifying the language of pages matches hreflang attributes.

Classifying Intent of Pages

It can be useful to classify pages as commercial or informational during content auditing tasks.

Classifying Intent of Pages

LLMs are not perfect at this, so we recommend carefully reviewing the results.

Sentiment Analysis

Use a prompt to classify text based upon Google’s NLP API sentiment classification of positive, negative, neutral or mixed.

Sentiment Analysis

Detecting Inappropriate Content

Identify any content that might be of concern to users or search engines.

Detecting Inappropriate Content

Extracting Entities

Quickly identify the key entities that are most important or central to the content.

Extracting Entities

Extracting Data

While we recommend using custom extraction to extract data from content as it’s quick and doesn’t cost credits, data can be easily extracted via prompts as well.

Extracting Data

Vector Embeddings

The SEO Spider is able to utilise vector embeddings to identify semantically similar pages and low relevance content, as well as semantic search and the content cluster diagram visualisation.

Embeddings configuration

Check out the tutorials linked above on how to use embeddings in the software.


Saving & Opening Custom Prompts

To save a custom prompt, simply click the ‘Save’ icon next to the prompt after creation.

Save Prompt

Custom user prompts can be selected via ‘Add from Library’ and the ‘User’ tab.

Saved Prompts in the user library

Prompts can be exported and shared in JSON format with other users via the export button in the ‘Add From Library’ menu.

Export User Prompts

These can then be imported using the same menu and ‘Import’ icon.

AI prompts can be saved as part of your configuration by setting them up and then using the ‘Config > Profiles > Save As’ menu.

Configuration Profiles

They can also be enabled as the default configuration via the same menu.


Multiple Prompt Targets

You can click the cog against a prompt and write a more advanced prompt, including using multiple prompt target elements.

For example, you can use page text, and custom extractors and be really specific.

Multiple Prompt Targets

This opens an ‘Edit Prompt’ window. Click the ‘Advanced Prompt’ setting and select multiple elements within your prompt.


Run Prompts For Specific Segments & Issues

You’re able to choose to run AI prompts against URLs that match a specific segment.

This means you can set up segments for different scenarios you wish AI prompts to be run against, and not waste credits.

Click ‘Config > Segments’, then ‘Add’ and select the conditions of the segment. You can select ‘Issues’ and a specific issue you want the AI prompt to run against, such as ‘Missing Alt Text’.

Segment for specific Issue

In the ‘Prompt Configuration’, you can then click the ‘Cog’ icon next to the prompt, click the ‘No Segment Matching’ dropdown and select the segment you set up.

Match a Prompt to a Segment or Issue

This means, only images with missing alt text will have alt text generated for them via the prompt.

Alt text created only for images missing alt text

This is obviously far more efficient than blindly creating alt text for all images on a site.


Custom Endpoints

You can customise the OpenAI endpoint, which allows you to enable private LLM APIs and other AI providers that use the same structure.

For example, you can use DeepSeek or Grok by customising the endpoint, and entering the relevant API key in the ‘Account Information’ tab.

Custom Endpoint

You can also customise the model parameters, headers, and limit page content length to reduce token exceeded errors on long content pages.

The endpoints for common LLMs that are compatible with the OpenAI API format include:

Azure OpenAIAPI Reference

https://YOUR_RESOURCE_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT_NAME/chat/completions?api-version=2024-06-01

DeepSeekAPI Reference

https://api.deepseek.com

GrokAPI Reference

https://api.x.ai/

It’s also possible to customise the endpoints of Gemini and Ollama in the same way.


Generate Images & Text Speech

You can also use OpenAI and Gemini for image and text speech generation. As an example, this can be used to crawl blog posts, and create a hero image for each of them.

Image Generation using AI

The SEO Spider will show an image or sound preview in the UI, which you can expand, or listen to.


Summary

The guide above should illustrate how to use the SEO Spider with AI prompts to enrich the crawl data and improve efficiency of repetitive tasks that otherwise get left in dev queues.

We urge users to utilise these AI capabilities responsibly for genuine ‘value-add’ use cases.

Please also read our Screaming Frog SEO Spider FAQs and full user guide for more on the tool.

Get in touch with our support team with any queries.

Join the mailing list for updates, tips & giveaways

Back to top