External

The External tab includes data about external URLs. URLs classed as ‘External’ are on a different subdomain from the start page of the crawl.
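
As a rough illustration of that rule, the sketch below compares the hostname of a URL with the hostname of the start page. The function name `is_external` and the example URLs are hypothetical, and the plain hostname comparison is an assumption made for illustration; the tool's own matching rules and configuration options may differ.

```python
from urllib.parse import urlsplit


def is_external(url: str, start_url: str) -> bool:
    """Return True when url sits on a different hostname than start_url.

    Assumption: 'different subdomain' is modelled as a plain hostname
    comparison; this is not taken from the tool's implementation.
    """
    # urlsplit().hostname is already lower-cased; it is None for URLs
    # without a network location, so fall back to an empty string.
    url_host = urlsplit(url).hostname or ""
    start_host = urlsplit(start_url).hostname or ""
    return url_host != start_host


if __name__ == "__main__":
    start = "https://www.example.com/"
    for candidate in ("https://www.example.com/about",
                      "https://blog.example.com/post",
                      "https://other.org/"):
        print(candidate, is_external(candidate, start))
```

Under this assumption, a URL on `blog.example.com` would be treated as external to a crawl started at `www.example.com`, while other pages on `www.example.com` would not.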


Columns

This tab includes the following columns.

  • Address – The external URL address.
  • Content – The content type of the URL.
  • Status Code – The HTTP response code.
  • Status – The HTTP header response.
  • Crawl Depth – Depth of the page from the homepage or start page (number of ‘clicks’ away from the start page).
  • Inlinks – Number of links found pointing to the external URL.

Filters

This tab includes the following filters.

  • HTML – HTML pages.
  • JavaScript – Any JavaScript files.
  • CSS – Any style sheets discovered.
  • Images – Any images.
  • PDF – Any portable document files.
  • Flash – Any .swf files.
  • Other – Any other file types, like docs etc.
  • Unknown – Any URLs with an unknown content type, either because it has not been supplied or because the URL can’t be crawled. For example, URLs blocked by robots.txt also appear here, as their file type is unknown.
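
The filters above can be thought of as buckets keyed on a URL's content type. The following is a minimal sketch of such a mapping, assuming the bucket is chosen from the `Content-Type` response header alone. The function name `external_filter` and the exact MIME types listed are assumptions for illustration, not the tool's actual rules.

```python
from typing import Optional


def external_filter(content_type: Optional[str]) -> str:
    """Map a Content-Type header value to one of the filter names above.

    Assumption: the MIME type lists below are illustrative, not exhaustive.
    """
    if not content_type:
        # No content type supplied, e.g. the URL could not be crawled.
        return "Unknown"
    # Drop parameters such as "; charset=UTF-8" and normalise case.
    mime = content_type.split(";")[0].strip().lower()
    if not mime:
        return "Unknown"
    if mime in ("text/html", "application/xhtml+xml"):
        return "HTML"
    if mime in ("application/javascript", "text/javascript",
                "application/x-javascript"):
        return "JavaScript"
    if mime == "text/css":
        return "CSS"
    if mime.startswith("image/"):
        return "Images"
    if mime == "application/pdf":
        return "PDF"
    if mime == "application/x-shockwave-flash":
        return "Flash"
    # Anything else, such as word-processor documents.
    return "Other"


if __name__ == "__main__":
    for header in ("text/html; charset=UTF-8", "image/png",
                   "application/msword", None):
        print(header, "->", external_filter(header))
```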