How To Use Custom Search

Introduction To Custom Search

The SEO Spider allows you to find anything you want in the HTML or text of a website using its custom search feature.

This can be helpful when verifying analytics tags or discovering which pages have certain words or phrases, such as an old brand name, ‘out of stock’ or key phrases for internal linking opportunities.

You’re able to configure up to 100 search filters using custom search. Each filter lets you input text or regex and find pages that either ‘contain’ or ‘do not contain’ your chosen input, and reports the number of occurrences.

This tutorial walks you through how to use the feature, common scenarios and more advanced searches.

To get started, download the SEO Spider, which is free for crawling up to 500 URLs. However, custom search does require a paid licence.


1) Add Custom Search Filters

Click ‘Config > Custom > Search’ from the top-level menu to open the custom search configuration.

Custom Search

Then click ‘Add’ (in the bottom right) to set up a custom search filter.

Add Custom Search Filter

A custom search filter will appear. You’re able to add up to 100 separate filters in a crawl.

Custom Search Filter

2) Input Your Search

Now enter your search in the ‘Enter Search Query’ box and adjust each search filter’s options.

From left to right, you can name the search filter, select ‘contains’ or ‘does not contain’, choose ‘text’ or ‘regex’, input your search query and choose where the search is performed (HTML, page text, a specific element, XPath and more).

Custom Search Filters

The example above shows a search for ‘Out of stock’ across any page’s text and a search for any pages that do not contain a Google Tag Manager tracking code in the HTML head element of a page.

When the filters are set up, you can click ‘OK’ and run a crawl to perform the search.


3) Crawl The Website

Type or copy in the website you wish to crawl in the ‘Enter URL to spider’ box and hit ‘Start’.

Custom Search Site Crawl

Wait until the crawl finishes and reaches 100%, or watch in real-time as the custom search tab filters populate.


4) View Data In The Custom Search Tab & Filters

Click on the Custom Search tab to view the results of your custom search in real-time. By default, data from all searches are shown together in the tab, but the filters can be used to refine the data to only show each separate filter.

Custom Search Results Data

The ‘contains’ filter will show the number of occurrences of the search, while a ‘does not contain’ search will either return ‘Contains’ or ‘Does Not Contain’.

In this search, there are 2 pages with ‘Out of stock’ text, each containing the word just once – while the GTM code was not found on any of the 10 pages.

These numbers can also be seen in the right-hand ‘Overview’ pane, which updates the filter counts in real-time.

Custom Search Right Hand Overview
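The difference between the two filter types can be sketched in a few lines of Python. This is only an illustration of the reporting behaviour described above, not how the SEO Spider is implemented. The `pages` data, the function names and the `googletagmanager.com` search string are all assumptions made for the example, as is the case-insensitive matching.

```python
import re

# Hypothetical crawl results: URL -> (page text, HTML head). Not real data.
pages = {
    "/product-a/": ("Blue widget. Out of stock.", "<head><title>A</title></head>"),
    "/product-b/": ("Red widget. In stock.", "<head><title>B</title></head>"),
}


def contains_count(text: str, query: str) -> int:
    """Model a 'contains' filter: count occurrences, ignoring case."""
    return len(re.findall(re.escape(query), text, flags=re.IGNORECASE))


def does_not_contain_label(text: str, query: str) -> str:
    """Model a 'does not contain' filter: report a label rather than a count."""
    return "Contains" if query.lower() in text.lower() else "Does Not Contain"


for url, (text, head) in pages.items():
    print(
        url,
        contains_count(text, "Out of stock"),
        does_not_contain_label(head, "googletagmanager.com"),
    )
```

In this made-up data only the first page has ‘Out of stock’ text, and neither head section includes the assumed tag string.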

5) Exporting

Export custom search data by clicking the ‘export’ button, which works alongside the filters and your current view.

Custom Search Exporting

You can also export ‘inlinks’ (the source pages that link) to custom search filters via ‘Bulk Export > Custom Search > Filter X Inlinks’.

Custom Search Bulk Exporting

Advanced Search Filter Options

Custom search becomes really powerful when you combine filters and adjust their configuration, in particular by using regex and choosing where the search is performed.

Case Sensitivity

To perform a case sensitive search when searching for ‘text’, click on the arrows to the right side of the box to expand the text area and choose ‘case sensitive’.

Case Sensitivity with Custom Search

‘Regex’ is case sensitive by default. To make it case insensitive, use (?i) before the word. For example –

(?i)optimisation

Would match against ‘optimisation’ and ‘OPTIMISATION’, or even ‘OpTiMiSaTiOn’.

Case sensitivity can be particularly useful when searching for misspellings of brand names, or acronyms etc.
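If you’d like to check a pattern before adding it to a filter, you can try it in a few lines of Python. Python’s `re` module stands in for the SEO Spider’s own regex engine here, so treat this as an approximation. The `samples` list is made up for the example.

```python
import re

# (?i) at the start of the pattern switches on case-insensitive matching.
pattern = re.compile(r"(?i)optimisation")

# Made-up sample strings; the last one uses the US spelling with a 'z'.
samples = ["optimisation", "OPTIMISATION", "OpTiMiSaTiOn", "optimization"]

matches = [s for s in samples if pattern.search(s)]
print(matches)
```

The first three strings match regardless of case, while the ‘z’ spelling does not, as it’s a different word rather than a different case.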

Exact & Multiple Words

You can choose to search using regular text, or for more advanced uses you can switch to regex.

Custom search regex

For example, using regular expressions you can match exact words using the following.

\bword\b

This would match a particular word (‘word’ in this case), as \b matches word boundaries.

This can be useful when searching for words or phrases that can appear inside other words, like ‘pr’ (which appears in ‘promotion’, ‘pre-rendering’ and more on our site!).

Without using word boundaries ‘pr’ is found 12 times on our digital PR page. With an exact, case sensitive match it’s actually 0.

Exact word search using regex
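The effect of word boundaries can be reproduced with a small Python sketch. The sentence in `text` is invented for illustration (it isn’t the content of our digital PR page), and Python’s `re` module is used as a stand-in for the SEO Spider’s regex engine.

```python
import re

# Invented sample sentence, not real page content.
text = "Our digital PR team handles promotion and pre-rendering projects."

# Without word boundaries, 'pr' matches inside longer words.
loose = len(re.findall(r"pr", text))

# With \b on both sides, only the standalone, lowercase word 'pr' matches.
exact = len(re.findall(r"\bpr\b", text))

# Adding (?i) keeps the exact-word match but ignores case, so 'PR' counts.
exact_any_case = len(re.findall(r"(?i)\bpr\b", text))

print(loose, exact, exact_any_case)
```

Here the loose search finds ‘pr’ inside ‘promotion’, ‘pre-rendering’ and ‘projects’, the exact case sensitive search finds nothing, and the case insensitive exact search finds only ‘PR’.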

You can also combine words together in a search. For example, if you wanted to find any pages with the words ‘natural’, ‘organic’ or ‘free’, you could combine them in a single filter using a pipe.

\bnatural\b|\borganic\b|\bfree\b

This will count every instance of each of the words. For example, our ‘search engine optimisation’ page has ‘organic’ 3 times and ‘natural’ and ‘free’ once each, to make 5 in total.

Multiple Words Custom Search

You’re able to click on the heading to sort by occurrences as shown in the example.
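To see how a piped pattern adds up its occurrences, here is a minimal Python sketch. The `text` is a made-up sentence written to mirror the counts mentioned above, not the real page copy, and Python’s `re` module is only an approximation of the SEO Spider’s regex engine.

```python
import re
from collections import Counter

# The same pattern as above: any one of three exact words.
pattern = re.compile(r"\bnatural\b|\borganic\b|\bfree\b")

# Made-up sample text, not real page content.
text = (
    "Grow organic traffic with organic search. Our organic approach "
    "earns natural links, and the first audit is free."
)

hits = pattern.findall(text)   # every matched word, in order of appearance
per_word = Counter(hits)       # breakdown per word
total = len(hits)              # the single number a combined filter reports

print(per_word, total)
```

A single filter only reports the total, so if you need the breakdown per word, separate filters for each word are the simpler route.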

Combining Searches

You’re able to combine filters and view them together at the same time. So if you wanted to search for any page that contains a word, but does not contain another word, use multiple filters and view them together in the custom search tab.

Combining Search Filters

In this example, you can see that there are no instances where the words ‘crawler’ and ‘best’ are not both used. This is appropriate!
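The logic of combining a ‘contains’ filter with a ‘does not contain’ filter can be sketched as follows. The `pages` dictionary and the `contains` helper are hypothetical and only model the idea, rather than how the SEO Spider stores or compares results.

```python
# Hypothetical page text keyed by URL. Not real crawl data.
pages = {
    "/seo-spider/": "The best website crawler for technical audits.",
    "/blog/": "A crawler fetches pages by following links.",
    "/about/": "We are a search marketing agency.",
}


def contains(text: str, word: str) -> bool:
    """Case-insensitive plain text check (an assumption for this sketch)."""
    return word.lower() in text.lower()


# Pages that contain 'crawler' but do not contain 'best'.
flagged = [
    url
    for url, text in pages.items()
    if contains(text, "crawler") and not contains(text, "best")
]

print(flagged)
```

In the tool itself, you’d get the same view by sorting or filtering the two columns side by side in the custom search tab.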

Search In

Custom search will check the raw HTML or rendered HTML depending on your rendering mode. By default, it will check the raw HTML, but if you’ve configured JavaScript rendering mode it will check the rendered HTML.

You’re then able to refine exactly where the custom search is performed.

Custom Search In

The 7 options available give you control over where you search –

  • HTML – The full HTML of the web page.
  • Page Text – The text of web pages, excluding any HTML.
  • Page Text No Anchors – The text of web pages, excluding any HTML or any text contained within HTML anchor tags (also known as A Elements). This can be helpful when searching for words that are also included in link text within menus, which can cause every page to be flagged to contain the search otherwise.
  • HTML Head – The HTML head of the web page.
  • HTML Body – The HTML body of the web page, which can include both HTML and page text.
  • XPath – You’re able to supply an XPath to specify the location in the HTML where the search is performed. For example, if you wanted to run the search only against text contained in h3 headings, you could supply //h3.
  • Content Area – You can specify the content area used for word count, near duplicate content analysis and spelling and grammar checks – which can also be selected for custom search. By default this includes text contained within the body HTML element, excluding both the nav and footer elements to focus on the main content of the page. HTML elements, classes and IDs can be excluded and included, as per the content area guide.
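To illustrate why ‘Page Text No Anchors’ can give a different count to ‘Page Text’, here is a rough Python sketch using the standard library’s `html.parser`. The `TextCollector` class, the `page_text` function and the sample HTML are all assumptions made for this example. The SEO Spider’s own text extraction will differ in detail.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect the text of a page, optionally skipping text inside <a> tags."""

    def __init__(self, skip_anchors: bool = False):
        super().__init__()
        self.skip_anchors = skip_anchors
        self.anchor_depth = 0   # > 0 while we are inside an <a> element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.anchor_depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self.anchor_depth:
            self.anchor_depth -= 1

    def handle_data(self, data):
        if self.skip_anchors and self.anchor_depth:
            return              # ignore link text
        self.chunks.append(data)


def page_text(html: str, skip_anchors: bool = False) -> str:
    parser = TextCollector(skip_anchors)
    parser.feed(html)
    parser.close()
    # Join the chunks and collapse runs of whitespace.
    return " ".join(" ".join(parser.chunks).split())


# Made-up page: 'Pricing' appears in a menu link and in the body copy.
html = (
    '<body><nav><a href="/pricing">Pricing</a></nav>'
    "<p>See our pricing page for details.</p></body>"
)

with_anchors = page_text(html).lower().count("pricing")
without_anchors = page_text(html, skip_anchors=True).lower().count("pricing")
print(with_anchors, without_anchors)
```

With the menu link included, every page sharing that menu would be flagged. Skipping anchor text leaves only the mention in the body copy.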

Choosing where to search is often very powerful. A good example of this is finding where we misspell ‘Screaming Frog’ as ‘Screaming frog’, without a capital ‘F’ on our own website.

Running a case sensitive search with ‘Page Text’ brings back 7 occurrences on our broken links blog post.

Screaming frog custom search

However, when checking the page the misspellings are within the ‘comments’ section of the blog post, rather than in the main blog body.

To exclude this comments section from the custom search, you can right-click in a browser and ‘view source’ of the HTML and search for the appropriate ‘comments’ section in the HTML.

This shows an HTML ID of ‘comments’, which can be used for exclusion.

Exclude HTML ID in Content Settings For Custom Search

The ‘comments’ ID can then be excluded in ‘Content Area’ under ‘Configuration > Content > Area’.

Content Area Used For Spelling & Grammar Check

The comments section then won’t be analysed for custom search, and re-running the search shows there are 0 occurrences on this page.

Custom search using content area
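The effect of excluding an HTML ID from the content area can be modelled with another short Python sketch. It’s a simplified illustration under stated assumptions: the `ContentAreaText` class, the `text_excluding_id` function and the sample HTML are invented for the example, and the list of void tags is deliberately partial.

```python
from html.parser import HTMLParser

# Tags with no closing tag, so they must not affect the nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ContentAreaText(HTMLParser):
    """Collect page text, skipping everything inside one excluded element ID."""

    def __init__(self, excluded_id=None):
        super().__init__()
        self.excluded_id = excluded_id
        self.skip_depth = 0     # > 0 while inside the excluded element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth += 1
        elif self.excluded_id is not None and dict(attrs).get("id") == self.excluded_id:
            self.skip_depth = 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.chunks.append(data)


def text_excluding_id(html: str, excluded_id=None) -> str:
    parser = ContentAreaText(excluded_id)
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Made-up blog post: the misspelling only appears in the comments.
html = (
    "<article><p>Screaming Frog finds broken links.</p></article>"
    '<div id="comments"><p>I love Screaming frog!</p>'
    "<p>Screaming frog is great.</p></div>"
)

before = text_excluding_id(html).count("Screaming frog")
after = text_excluding_id(html, "comments").count("Screaming frog")
print(before, after)
```

The case sensitive count drops to 0 once the ‘comments’ element is excluded, which mirrors the result shown above.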

Multi-Line

You’re able to expand your custom search to span multiple lines in the HTML. This means it can be used to find full code in HTML, such as Google Analytics tracking codes (other analytics platforms are available).

Click on the arrows to the right side of the search query box to expand the text area and you can input an entire GTM container snippet for example.

Multi Line Custom Search

This means you don’t need to compromise by searching for smaller single lines or words of a tracking tag. You can verify the whole snippet.
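The benefit of checking the whole snippet rather than a single line can be shown with a small Python sketch. The `snippet` below is a made-up placeholder rather than a real tracking code, and the exact substring comparison is an assumption about matching, so whitespace handling in the SEO Spider may differ.

```python
# Made-up placeholder snippet, not a real analytics or GTM tag.
snippet = """<script>
  window.exampleLayer = window.exampleLayer || [];
  window.exampleLayer.push({event: "page_view"});
</script>"""

# One hypothetical page has the full snippet, the other a truncated copy.
complete_page = "<html><head>\n" + snippet + "\n</head><body></body></html>"
truncated_page = (
    "<html><head>\n<script>\n"
    "  window.exampleLayer = window.exampleLayer || [];\n"
    "</script>\n</head><body></body></html>"
)

single_line = "window.exampleLayer = window.exampleLayer || [];"


def has_text(html: str, query: str) -> bool:
    """Exact substring check, including line breaks and indentation."""
    return query in html


# The single-line search cannot tell the two pages apart...
print(has_text(complete_page, single_line), has_text(truncated_page, single_line))
# ...while the multi-line search only matches the complete snippet.
print(has_text(complete_page, snippet), has_text(truncated_page, snippet))
```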

Analyse With Crawl Data

Custom search filter data is automatically appended to the ‘Internal’ tab, which combines all internal data in a crawl.

Custom Search With Crawl Data

So you can match the custom searches against other crawl data for more insight.

Extracting Data

Finally, it’s worth reiterating that custom search doesn’t ‘scrape’ or extract data; it only searches.

To extract content, you’ll need to use custom extraction instead.


Summary

The guide above should illustrate how to use the SEO Spider to find words, phrases, tracking tags or any snippets of text across pages on your website.

Please also read our Screaming Frog SEO Spider FAQs and full user guide for more information on the tool.

If you have any further queries, feedback or suggestions to improve custom search in the SEO Spider then just get in touch with our team via support.