If you’re not familiar with the world of internet marketing and someone told you that you should use Screaming Frog, what would you think? You’d probably call them crazy. As it turns out, though, Screaming Frog is an excellent tool far too few people actually use to its full extent.
Now, when people talk about Screaming Frog, they aren’t talking about the company and their web marketing services. You’re certainly free to contract them for any and all SEO work you want done, but that’s not what I’m here for.
I’m here to teach you how to use the free tool they provide, the Screaming Frog SEO Spider Tool.
A spider is a piece of software that crawls around on a website, harvesting data and presenting it to the owner of the spider. Google has a fleet of these things it uses to index the Internet as completely as possible, for use in the search results.
Other search engines – everything from Yahoo and Bing to oddballs like Million Short – either use search spiders of their own or pull index data from other entities that use spiders.
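If you’ve never thought about what a spider actually does under the hood, here’s a toy version in Python to make the concept concrete. To be clear, this is my own bare-bones illustration, not how Screaming Frog or Googlebot actually works; the page cap and the minimal error handling are simplifications.

```python
# A toy spider: fetch a page, harvest its links, follow them, repeat.
# An illustration of the concept only -- real crawlers are far more
# sophisticated (and far more polite).
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen

class LinkParser(HTMLParser):
    """Collects the href of every <a> tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start_url, max_pages=50):
    """Breadth-first crawl of a single domain, returning the URLs found."""
    domain = urlparse(start_url).netloc
    seen, queue = {start_url}, [start_url]
    while queue and len(seen) < max_pages:
        url = queue.pop(0)
        try:
            html = urlopen(url, timeout=10).read().decode("utf-8", "replace")
        except Exception:
            continue  # a real spider would record the error status instead
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)
            # stay on the same domain and skip anything already visited
            if urlparse(absolute).netloc == domain and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return seen
```

The real tools layer politeness rules, parallel fetching, and data extraction on top of that same fetch-parse-follow loop.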
This particular spider is a desktop application you can download and run on your local PC, regardless of platform. It fetches SEO data, including URLs, metadata, Schema categories, and more.
The primary benefit of Screaming Frog’s Spider is the ability to search for and filter various SEO issues. You don’t have to have a deep knowledge of SEO to figure out what is and isn’t done properly; the tool will help filter it for you. It can find bad redirects, meta refreshes, duplicate pages, missing metadata, and a whole lot more.
The tool is extremely robust. The data it collects includes server and link errors, redirects, URLs blocked by robots.txt, external and internal links and their status, the security status of links, URL issues, issues with page titles, metadata, page response time, page word count, canonicalization, link anchor text, images with their URLs, sizes, and alt text, and a heck of a lot more.
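To give you a feel for what “collecting SEO data” means in practice, here’s a rough sketch of the kind of per-URL record a spider assembles. The field names and the regex-based parsing are my own simplifications – a real crawler uses a proper HTML parser – but the shape of the data is the point.

```python
# A rough sketch of the per-URL record an SEO spider builds.
# The regexes are deliberately crude; real tools parse HTML properly.
import re
from urllib.request import urlopen

def audit_url(url):
    response = urlopen(url, timeout=10)
    html = response.read().decode("utf-8", "replace")

    title = re.search(r"<title[^>]*>(.*?)</title>", html, re.I | re.S)
    description = re.search(
        r'<meta[^>]+name=["\']description["\'][^>]+content=["\'](.*?)["\']',
        html, re.I | re.S)
    text = re.sub(r"<[^>]+>", " ", html)  # strip tags for a crude word count

    return {
        "url": url,
        "status_code": response.status,
        "title": title.group(1).strip() if title else None,
        "meta_description": description.group(1).strip() if description else None,
        "word_count": len(text.split()),
    }

print(audit_url("https://example.com/"))
```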
Essentially, when I talk about doing a site audit or a content audit, everything I recommend you harvest can be harvested with Screaming Frog, and a whole lot more. Plus, since the tool is made to be SEO-friendly, it follows Google’s AJAX crawling scheme.
Now, the basic version is the Lite version of the tool, which you can download and use for free. However, it limits you in several notable ways. Primarily, you can only crawl 500 URLs with it, and you lack access to some custom options, Google Analytics integration, and a handful of other features.
If you have a medium or large site with more than 500 URLs you want to crawl, I highly recommend buying the full license. It’s an annual fee of 99 British pounds, which works out, as of this writing, to about $140. Given that that’s under $12 per month, most businesses can easily afford it, and it’s well worth the price.
By default, Screaming Frog obeys the same directives as the Googlebot, including your robots.txt rules and your nofollow and noindex tags.
However, if you want, you can give it unique directives using its own user agent, “Screaming Frog SEO Spider”. This allows you to control it more directly, and potentially give it more access than Google gets. You can read more about how to do that at the bottom of their download page.
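As a sketch of what that looks like – the /staging/ folder here is a made-up example, not something on your site – a robots.txt that keeps Googlebot out of a section while giving Screaming Frog free rein might read:

```
# Googlebot is kept out of the staging section...
User-agent: Googlebot
Disallow: /staging/

# ...while Screaming Frog's own user agent gets full access.
User-agent: Screaming Frog SEO Spider
Disallow:
```

An empty Disallow line means “allow everything” for that user agent.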
Regardless of the size of your site, unless you’re 100% certain you’ve done everything right and you haven’t made a mistake – you’re wrong if you believe that, by the way – the first thing you want to do is complete a total site crawl.
I’m going to be assuming you’re using the full version of Screaming Frog to make sure you haven’t missed anything.
Again, it’s super cheap, just buy the license.
If you’ve found that Screaming Frog crashes when crawling a large site, you’re probably running into memory limits. The spider will use all of the memory allocated to it, and on a big enough site, that can be more than your computer can handle.
To throttle it and keep it from crashing, go back to the spider configuration menu. Under Advanced, check “pause on high memory usage.” This pauses the spider before it exhausts your resources, so you can save the crawl rather than lose it.
If you find that your crawl is timing out, it might be because the server can’t handle requests as fast as the spider sends them. To rate-limit your crawling, go to the Speed submenu in the Configuration menu and pick a limit for the number of requests it can make per second.
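Under the hood, rate limiting is a simple idea. Here’s the concept in plain Python – not Screaming Frog’s actual implementation, and fetch here is a stand-in for whatever download function you like:

```python
import time

def rate_limited_fetch(urls, fetch, max_per_second=2):
    """Call fetch(url) for each URL, never exceeding max_per_second."""
    interval = 1.0 / max_per_second
    results = []
    for url in urls:
        started = time.monotonic()
        results.append(fetch(url))
        # sleep off whatever remains of this request's time slot
        elapsed = time.monotonic() - started
        if elapsed < interval:
            time.sleep(interval - elapsed)
    return results
```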
If you want to use proxies with your crawling – for competitive research or to avoid bot-capture blocking – open the Configuration menu and click Proxy. From within this menu, you can set up the proxy of your choice.
Screaming Frog supports pretty much any kind of proxy you want to use, though you’ll want to make sure it’s fast and responsive; otherwise, your crawl will probably take forever.
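If you’re curious what routing traffic through a proxy amounts to, here’s the idea in a few lines of Python, assuming the popular requests library; the proxy address is a placeholder, not a real server:

```python
# The address below is a placeholder -- substitute your own proxy.
import requests

proxies = {
    "http": "http://127.0.0.1:8080",
    "https": "http://127.0.0.1:8080",
}

response = requests.get("https://example.com/", proxies=proxies, timeout=10)
print(response.status_code)
```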
Links are difficult to audit because they can be difficult to harvest. How many links do you have on a typical page? Couple that with all of your URL parameters, and you have a lot of information to gather. Here’s how to do it with the spider.
From here you can export the data, or sort and filter it as much as you like – for example, pulling out broken links, as in the sketch below.
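Here’s one way to dig through an exported crawl with Python and pandas. The file and column names (“Address”, “Status Code”) are what I’d expect from a Screaming Frog CSV export, but check your own export’s header row before trusting them:

```python
# Check your export's actual header row -- the names below are assumptions.
import pandas as pd

df = pd.read_csv("internal_all.csv")

# Surface broken pages: anything that returned a 4xx or 5xx status
broken = df[df["Status Code"] >= 400].sort_values("Status Code")
print(broken[["Address", "Status Code"]])
```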
Content audits are hugely important, because many of the most important search ranking factors today are content-based. Site speed, HTTPS integration, mobile integration, Schema.org markup; these are all important, but they aren’t as important as having high-quality content, good images, and a lack of duplication.
Sitemaps are incredibly helpful for Google, as they let the search engine know where all of your pages are and when they were last updated. You can generate one in a number of different ways, but Screaming Frog has its own method if you want to use it. All you need to do is crawl your site completely, including all subdomains.
Then click on the “Advanced Export” menu and choose the bottom option, XML Sitemap. This saves your sitemap as an XML file. If you want to edit it, open it in Excel as a read-only XML table and ignore any warnings that pop up. In table form, you can edit your sitemap easily and save it back out as an XML file. When that’s done, you can submit it to Google.
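For reference, the finished file follows the standard sitemap protocol, so it should look something like this – the URLs and dates are made up, of course:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2017-01-15</lastmod>
  </url>
  <url>
    <loc>https://www.example.com/about/</loc>
    <lastmod>2017-01-10</lastmod>
  </url>
</urlset>
```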
If you are finding that certain sections of your site are not being indexed, you may have a robots.txt rule blocking those subfolders, or a noindex directive on the pages themselves. Additionally, if a page has no internal links pointing to it, crawlers won’t be able to find it.
Make sure any page you know exists, but that doesn’t show up in the crawl, has an internal link pointing to it.
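If you want to double-check for orphan pages programmatically, one approach – sketched below with placeholder file names – is to compare the URLs in your sitemap against the URLs your crawl actually reached:

```python
# File names here are placeholders for your own sitemap and crawl export.
import xml.etree.ElementTree as ET

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def sitemap_urls(path):
    """Every <loc> URL in a standard XML sitemap."""
    return {loc.text.strip() for loc in ET.parse(path).iter(NS + "loc")}

def crawled_urls(path):
    """One URL per line, e.g. copied out of a crawl export."""
    with open(path, encoding="utf-8") as f:
        return {line.strip() for line in f if line.strip()}

orphans = sitemap_urls("sitemap.xml") - crawled_urls("crawled_urls.txt")
for url in sorted(orphans):
    print("No crawl path found to:", url)
```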
And, there you have it! The beginner’s guide to Screaming Frog.