XML sitemap creation

Table of Contents

General

Configuration Options

Spider Crawl Tab

Spider Extraction Tab

Spider Limits Tab

Spider Rendering Tab

Spider Advanced Tab

Spider Preferences Tab

Other Configuration Options

Tabs

XML sitemap creation

The Screaming Frog SEO Spider allows you to create an XML sitemap or a specific image XML sitemap, located under ‘Sitemaps’ in the top level navigation.

Create An XML Sitemap

The ‘XML Sitemap’ feature allows you to create an XML Sitemap with all HTML 200 response pages discovered in a crawl, as well as PDFs and images. The ‘Images Sitemap’ is a little bit different to the ‘XML Sitemap’ option and including ‘images’. This option includes all images with a 200 response and ONLY pages that have images on them.

If you have over 49,999 URLs the SEO Spider will automatically create additional sitemap files and create a sitemap index file referencing the sitemap locations. The SEO Spider conforms to the standards outlined in sitemaps.org protocol.

Read our detailed tutorial on how to use the SEO Spider as an XML Sitemap Generator, or continue below for a quick overview of each of the XML Sitemap configuration options.


Adjusting Pages To Include

By default, only HTML pages with a ‘200’ response from a crawl will be included in the sitemap, so no 3XX, 4XX or 5XX responses. Pages which are ‘noindex’, ‘canonicalised’ (the canonical URL is different to the URL of the page), paginated (URLs with a rel=“prev”) or PDFs are also not included as standard, but this can be adjusted within the XML Sitemap ‘pages’ configuration.

pages to include in the xml sitemap

If you have crawled URLs which you don’t want included in the XML Sitemap export, then simply highlight them in the user interface, right click and ‘remove’ before creating the XML sitemap. Alternatively you can export the ‘internal’ tab to Excel, filter and delete any URLs that are not required and re-upload the file in list mode before exporting the sitemap. Alternatively, simply block them via the exclude feature or robots.txt before a crawl.


Last Modified

It’s optional whether to include the ‘lastmod’ attribute in a XML Sitemap, so this is also optional in the SEO Spider. This configuration allows you to either use the server response, or a custom date for all URLs.

last modified

Priority

‘Priority’ is an optional attribute to include in an XML Sitemap. You can ‘untick’ the ‘include priority tag’ box, if you don’t want to set the priority of URLs.

xml sitemap priority

Change Frequency

It’s optional whether to include the ‘changefreq’ attribute and the SEO Spider allows you to configure these based from the ‘last modification header’ or ‘level’ (depth) of the URLs. The ‘calculate from last modified header’ option means if the page has been changed in the last 24 hours, it will be set to ‘daily’, if not, it’s set as ‘monthly’.

change frequency

Images

It’s entirely optional whether to include images in the XML sitemap. If the ‘include images’ option is ticked, then all images under the ‘Internal’ tab (and under ‘Images’) will be included by default. As shown in the screenshot below, you can also choose to include images which reside on a CDN and appear under the ‘external’ tab within the UI.

image xml sitemap

Typically images like logos or social profile icons are not included in an image sitemap, so you can also choose to only include images with a certain number of source attribute references to help exclude these. Often images like logos are linked to sitewide, while images on product pages for example might only be linked to once or twice. There is a IMG Inlinks column in the ‘images’ tab which shows how many times an image is referenced to help decide the number of ‘inlinks’ which might be a suitable for inclusion.

Join the mailing list for updates, tips & giveaways

Back to top