Custom search
The custom search tab works alongside the custom search configuration. The custom search feature allows you to search the source code of HTML pages and can be configured by clicking ‘Config > Custom > Search’.
You can configure up to 100 search filters in the custom search configuration, which allow you to input your regex and find pages that either ‘contain’ or ‘do not contain’ your chosen input. The results appear within the custom search tab as outlined below.
Columns
This tab includes the following columns.
- Address – The URI crawled.
- Content – The content type of the URI.
- Status Code – HTTP response code.
- Status – The HTTP header response.
- Contains: [x] – The number of times [x] appears within the source code of the URL. [x] is the search query that has been entered in the custom search configuration.
- Does Not Contain: [y] – Returns either ‘Contains’ or ‘Does Not Contain’ for [y]. [y] is the search query that has been entered in the custom search configuration.
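As a rough illustration of how the two result columns above relate to a filter, the following Python sketch counts regex matches for a ‘contains’ filter and returns a label for a ‘does not contain’ filter. The `SearchFilter` class, the `evaluate` function and the sample page source are hypothetical names invented for this example, not part of the tool. Python's `re` module is used purely for illustration, and its regex dialect may differ from the one the tool uses.

```python
import re
from dataclasses import dataclass


@dataclass
class SearchFilter:
    """Hypothetical stand-in for one custom search filter."""
    pattern: str  # regex to look for in the page source
    mode: str     # "contains" or "does_not_contain"


def evaluate(source: str, search_filter: SearchFilter):
    """Return the value a result column might show for one page."""
    # Count non-overlapping regex matches in the raw source.
    count = sum(1 for _ in re.finditer(search_filter.pattern, source))
    if search_filter.mode == "contains":
        # 'Contains: [x]' column: number of occurrences.
        return count
    if search_filter.mode == "does_not_contain":
        # 'Does Not Contain: [y]' column: a label rather than a count.
        return "Contains" if count else "Does Not Contain"
    raise ValueError(f"unknown mode: {search_filter.mode!r}")


# Sample data, assumed for illustration only.
page_source = "<html><body><p>Out of stock</p><p>out of stock</p></body></html>"

stock_filter = SearchFilter(pattern=r"(?i)out of stock", mode="contains")
tracking_filter = SearchFilter(pattern=r"UA-\d+", mode="does_not_contain")

print(evaluate(page_source, stock_filter))
print(evaluate(page_source, tracking_filter))
```

With this sample source, the first filter reports a count of 2 and the second reports ‘Does Not Contain’, mirroring the kind of values the two columns hold.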
Filters
This tab includes the following filters.
- [Search Filter Name] – Filters are dynamic, and will match the name of the custom configuration and relevant column. They show URLs that either contain or do not contain the query string entered.