PDF

Posted 5 December, 2022 by Dan Sharp in PDF

Configuration > Spider > Extraction > PDF

Store PDF

This setting saves PDFs to disk during a crawl. The stored PDFs can be bulk exported via ‘Bulk Export > Web > All PDF Documents’, or just their text content can be exported as .txt files via ‘Bulk Export > Web > All PDF Content’.

When PDFs are stored, each PDF can be viewed in the ‘Rendered Page’ tab, and its text content can be viewed in the ‘View Source’ tab using the ‘Visible Content’ filter.
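
Once the text content has been exported as .txt files, it can be post-processed outside the SEO Spider. The following is a minimal sketch rather than part of the tool: the function name is hypothetical, and it assumes the exported files sit in a single folder and are readable as UTF-8.

```python
from pathlib import Path


def summarise_pdf_text(export_dir):
    """Return {file name: word count} for every .txt file in export_dir.

    export_dir is assumed to be the folder chosen for the
    'All PDF Content' bulk export (hypothetical layout: one .txt per PDF).
    """
    counts = {}
    for txt_file in sorted(Path(export_dir).glob("*.txt")):
        # errors="replace" keeps the script running if a file
        # contains bytes that are not valid UTF-8.
        text = txt_file.read_text(encoding="utf-8", errors="replace")
        # Whitespace-separated tokens are used as a rough word count.
        counts[txt_file.name] = len(text.split())
    return counts
```

Calling `summarise_pdf_text("pdf_content")`, where `pdf_content` is whatever folder was chosen for the export, gives a quick view of which PDFs contain little or no extractable text.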

Extract PDF Properties

By default, the PDF title and keywords are extracted. These appear in the ‘Title’ and ‘Meta Keywords’ columns in the Internal tab of the SEO Spider.

Google will convert the PDF to HTML and use the PDF title as the title element and the keywords as meta keywords, although it doesn’t use meta keywords in scoring.

By enabling ‘Extract PDF properties’, the following additional properties will also be extracted:

Subject
Author
Creation Date
Modification Date
Page Count
Word Count

These new columns are displayed in the Internal tab.
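
If the Internal tab is exported to CSV, the extracted properties can be checked with a short script. This is a sketch under stated assumptions, not the tool's own behaviour: the function name is hypothetical, and the column headers used here (‘Address’, ‘Content Type’, ‘Title’, ‘Author’) are assumed, so they should be adjusted to match the headers in the actual export.

```python
import csv
import io


def pdfs_missing_property(csv_file, column,
                          address_column="Address",
                          type_column="Content Type"):
    """Return addresses of PDF rows whose `column` value is empty.

    All column names are assumptions about the export layout.
    """
    missing = []
    for row in csv.DictReader(csv_file):
        # Treat a row as a PDF if its content type mentions "pdf".
        is_pdf = "pdf" in (row.get(type_column) or "").lower()
        if is_pdf and not (row.get(column) or "").strip():
            missing.append(row.get(address_column))
    return missing


# Small in-memory sample standing in for an exported file.
sample = io.StringIO(
    "Address,Content Type,Title,Author\n"
    "https://example.com/a.pdf,application/pdf,Annual Report,\n"
    "https://example.com/b.pdf,application/pdf,,Jane\n"
    "https://example.com/,text/html,,\n"
)
print(pdfs_missing_property(sample, "Title"))
```

With a real export, the `io.StringIO` sample would be replaced by an opened file, for example `open("internal_all.csv", newline="", encoding="utf-8")`, where the file name is again only illustrative.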

Extract Link Text

When this setting is enabled, the SEO Spider will attempt to locate the text associated with links within PDFs. When it is disabled, the anchor text columns for these links will be blank.

The anchor text can be viewed in the lower Outlinks (and Inlinks) tabs associated with links.

Depending on the format of the PDF, this can be inaccurate, slow and memory intensive.

Dan Sharp

Dan Sharp is founder & Director of Screaming Frog. He has developed search strategies for a variety of clients from international brands to small and medium-sized businesses and designed and managed the build of the innovative SEO Spider software.
