How to Find and Delete Low Quality Content on Your Site

The views of contributors are their own, and not necessarily those of SEOBlog.com

Back in 2011, Google released an algorithmic update called Panda, and in the five years since, it has become one of the backbones of Google’s web search. Panda, and many of the tertiary algorithms surrounding it – the zoo of Penguin, Hummingbird, Pigeon, and all the rest – tend to focus on analyzing the quality of content on a site. Prior to Panda, links and keywords were the most important metrics. Yet, if you remember those old days, you would recognize how easy it was to game those metrics and get a mediocre or spammy page ranked in the top spots on Google search results.

Why do I bring this up? Because even though Panda has not been publicly updated as a stand-alone algorithm for a while, it’s still there and it still matters. That means the quality of content on your site matters. If you have old, low quality content floating around, don’t ignore it. Chances are pretty good that it’s hurting your search ranking, and dealing with it could give you a significant boost.

On a large old site, though, you could have thousands, tens of thousands, or more pages all over the place. It would be a massive pain to try to manually go through them and analyze them for quality. Thankfully, you can use a few tools to make it all significantly easier.

Step 1: Crawl Your Site

The first thing you need to do is make a comprehensive list of all of the content on your site. Again, this would be a huge pain to do manually, so I’m going to recommend a tool to you. The tool is called Screaming Frog, and you can find it here.

Screaming Frog is a very comprehensive web crawling spider. It has a ton of features, from auditing content to analyzing meta data, checking links, creating a sitemap, integrating with Google Analytics, and a whole lot more. The free version is sufficient for what we’re doing today, with one caveat: it only works on sites with under 500 URLs to crawl. If you have more than that, and you probably will, you’ll want to buy a license for the paid version. This costs about $200 (150 British Pounds) for a one year license.

I’m aware that it’s sometimes a tough sell to spend that much money on a tool without knowing how to use it, so feel free to get the free version and experiment. The wide array of features means the cost is well worth it if you learn to make use of them, but I understand if you don’t want to dive into it all right away. Still, I’m going to assume that you bought the paid version and go from there.

With Screaming Frog, you will need to set a few options before you crawl your site. With the program open, click Configuration and click Spider. Under basic options, make sure every box is unchecked, then check “crawl all subdomains” to make sure you get every page on your site. If you know all of your content pages live under a specific subdomain, you can crawl just that subdomain and anything nested beneath it, without crawling the site’s other subdomains. With those options set, input your website URL in the “URL to Spider” box up at the top, then hit start.

Depending on how large your site is, how good your internet connection is, and how much memory your computer has, the crawl could take a while. Just take a break and let it do its work. When it finishes, you’ll be presented with a ton of data about your site, all organized based on the URL of each individual page. You can see the URL, the page titles and meta data, link information, HTTP response codes, and a lot more.

Once you have all of that data on hand, you can start to run an audit. Before you do, though, you should understand the markers of quality on the modern web.

Step 2: Understand Thin Content

What you’re looking for on your site are pages that could be holding you back as a whole, because of some lackluster element. I’m not going to cover site-wide issues here, like having poorly optimized meta data throughout your site or not using human-readable URLs. Those are good to fix, but they’re outside the scope of this article.

So, what constitutes poor quality content? You can look at it like this:

  • Content on your site that gets no organic traffic might be thin or low quality content. However, it could also just be old content, which is perfectly fine. You don’t want to delete or move old content that isn’t hurting you, because Google uses it as a sign of age and trust. Removing it makes your site look smaller or sparser, neither of which you want.
  • Content on your site that has low rankings on Google search could be considered thin content. However, you need to consider context with this; a page could rank #30 and still be great content, you just have a ton of competition for that search. On the other hand, a piece of thin content could rank in the top 10, just because it’s one of the few pieces on the web that works for that search.
  • Content that has little or no social shares could be considered thin content. Of course, this is only applicable if you were using social media at the time the content was published. If it’s old enough to have been published before social media was a big part of your marketing, the social share count isn’t a valuable metric.
  • Content of under 1,000 words is very likely to be thin content. There are occasionally reasons for content to be short, but most of them can still benefit from being expanded. I’ll discuss this more later.

Essentially, what I’m going to do is consider word count to be the number one indicator of content quality, and let you use the other metrics to help you judge what to do with that content once you have identified it.

Step 3: Identify Potential Thin Content

Going back to your data in Screaming Frog, go to the “internal” tab and sort your pages by word count. Export this data as a CSV file so you can work on it. Anything under, say, 1,500 words should be pulled aside. Everything else can be ignored; it’s probably fine.
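
If you would rather filter the export with a script than in a spreadsheet, here is a minimal sketch of that step. It assumes the CSV has an "Address" column for the URL and a "Word Count" column; check the header row of your own export and adjust the names if they differ. The function name and the sample data are made up for illustration.

```python
import csv
import io

def find_short_pages(csv_file, max_words=1500,
                     url_column="Address", count_column="Word Count"):
    """Return (url, word_count) pairs under max_words, shortest first.

    The column names are assumptions about the export; pass your own
    if the header row of your CSV uses different labels.
    """
    short_pages = []
    for row in csv.DictReader(csv_file):
        raw_count = (row.get(count_column) or "").replace(",", "").strip()
        if not raw_count.isdigit():
            continue  # skip rows without a usable word count
        words = int(raw_count)
        if words < max_words:
            short_pages.append((row[url_column], words))
    return sorted(short_pages, key=lambda page: page[1])

# Tiny made-up export, used only to demonstrate the function.
sample_export = io.StringIO(
    "Address,Word Count\n"
    "https://example.com/long-guide,2400\n"
    "https://example.com/quick-note,310\n"
    "https://example.com/faq,980\n"
)
short_pages = find_short_pages(sample_export)
for url, words in short_pages:
    print(words, url)
```

For a real export, open the file with `open("internal_all.csv", newline="", encoding="utf-8")` (the filename here is hypothetical) and pass the file object in place of `sample_export`.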

Any piece of content under 1,500 words has to have a good reason to be that short. Here are some valid reasons:

  • The URL is a category page, 404 page, or other site page that is not actually a piece of content. These pages are generally not major factors for search ranking.
  • The URL is a link-based FAQ sending visitors to other pages for answers to questions. This is not a good strategy, however; you should ideally merge the questions and answers into one large page, like this one.
  • The URL is the answer page for a specific question on the FAQ mentioned above. Again, merge these with the main page and redirect them. You’ll get a lot more value out of it.
  • The URL is a product page and only has a couple images and a product description on it. The ideal path for this, though, is to go the Amazon route and fill the page with a ton of supplementary content to make it more valuable.

If, however, the page is short because it’s a simple blog post and you didn’t think you needed to write much about it, that’s probably thin content. Covering an issue on a superficial level is fine, but you’re always going to be outranked by anyone covering it in more depth. It’s better if you find ways to expand the content, or remove it so it isn’t holding you back.

Go ahead and filter your list of short URLs down to the pages that have no valid reason to be short. Those are your thin content candidates. Once you have that list, you can move on to the next step.

Step 4: Determine a Plan of Action

At this point, you have a list of the thin content on your site. Now you have to decide what to do with it. This is the point where you want to have some idea of those other metrics, like traffic, social shares, and incoming links. Use your analytics suite of choice to gather that data.

The way I see it, there are generally four possible options.

  1. You can leave the content as it is, unchanged. This is not generally a great idea, because if the content is holding you back, you’re just living with that anchor. You had better have a good reason to leave the content as it is.
  2. You can delete the content. This is generally going to be the ideal option if the content gets no traffic, has no links, no social shares, and no information of value. If it’s duplicated from another site, if it’s redundant to another part of your site, or if it’s spam in some way, that’s a red flag making it well worth deleting.
  3. You can merge the content with other pieces of content. For example, if you have written three short blog posts about slightly different aspects of one topic, those might be better off merged into one. This is an even better idea if you notice that traffic typically goes from one to another. The FAQ example is another case where merging is a good idea. Pick the best of the pages and make it the default page, merge the relevant information from the others into it, and redirect the other URLs to the primary URL. This will consolidate the value of the pages into one place.
  4. You can improve the content on its own. This is generally going to be your ideal course of action when the content is valuable but short. If it has traffic, links, social shares, and other forms of external value, it’s going to be a good idea to give the content a little attention. Older posts can be updated with new data, guides can be updated with more detail and images, etc. Work to improve the content in any way you can think to do so. Shoot for at least 2,000 words, if not more.

What you do specifically will depend on your own analysis and your own personal thresholds. If you’re on the edge, go ahead and err on the side of saving the content and improving it in some way.
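
To make that triage repeatable across a long list, you could encode your own thresholds in a small script. The sketch below is one possible reading of the options above, not a definitive rule set: the `PageMetrics` fields and the `suggest_action` function are hypothetical names, the numbers would come from your own analytics, and "anything above zero counts as value" is an assumption you should tune. It never suggests leaving a page alone, since that choice needs a reason only you can judge.

```python
from dataclasses import dataclass

@dataclass
class PageMetrics:
    """Hypothetical per-page numbers pulled from your analytics suite."""
    url: str
    monthly_visits: int = 0
    inbound_links: int = 0
    social_shares: int = 0
    # True if the page covers a slice of a topic another page also covers.
    related_to_other_page: bool = False

def suggest_action(page: PageMetrics) -> str:
    """Suggest 'merge', 'delete', or 'improve' for a thin page."""
    if page.related_to_other_page:
        return "merge"
    has_external_value = (
        page.monthly_visits > 0
        or page.inbound_links > 0
        or page.social_shares > 0
    )
    if not has_external_value:
        # Only a suggestion: confirm by hand that nothing of value is lost.
        return "delete"
    # When in doubt, err on the side of saving and improving the page.
    return "improve"

# Made-up pages, used only to demonstrate the function.
pages = [
    PageMetrics("https://example.com/old-promo"),
    PageMetrics("https://example.com/widget-tips-part-2",
                monthly_visits=40, related_to_other_page=True),
    PageMetrics("https://example.com/short-guide",
                monthly_visits=120, inbound_links=3),
]
suggestions = {page.url: suggest_action(page) for page in pages}
for url, action in suggestions.items():
    print(action, url)
```

Treat the output as a first pass to review by hand, especially anything marked for deletion.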

Step 5: Taking Action

Before you go and start making changes, you want to do a few things. First, make a backup of your site. You don’t want to make changes that tank your site and have no option for recovering the lost value. Sometimes a page you think is thin is a page Google doesn’t mind, and removing it can hurt you.

Second, take snapshots of your important traffic and search ranking metrics. You want to know where your site starts so you know what effect your changes have.

Third, identify any other issues that could be having a negative effect on your SEO, particularly on those pages. Image captions, page meta data, URL structure, scripts, site speed; these are changes you can make while you’re making large changes to your site, in hopes of getting all of the improvement at once.

Once you’ve set up that groundwork, go ahead and make the content changes you want to make. You have two choices here; you can either change them gradually or you can change them all at once. Doing it all at once is likely to make your search ranking fluctuate a lot for a week or two, while Google figures out what you did. You might go up and you might go down before you settle in your new position, so don’t freak out until your metrics have stabilized.
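
If any of your changes involve merging pages, each retired URL needs a redirect to the primary page. How you set that up depends on your server or CMS. As a rough sketch, assuming an Apache server with mod_alias enabled, a short script could turn a mapping of old URLs to primary URLs into `Redirect 301` lines for your configuration. The function name and the example URLs are made up for illustration.

```python
from urllib.parse import urlsplit

def build_redirect_rules(merged_pages):
    """Turn {old_url: primary_url} into Apache 'Redirect 301' lines.

    Assumes Apache with mod_alias; other servers use different syntax.
    """
    rules = []
    for old_url, primary_url in sorted(merged_pages.items()):
        old_path = urlsplit(old_url).path or "/"
        rules.append(f"Redirect 301 {old_path} {primary_url}")
    return rules

# Made-up example: two FAQ answer pages merged into the main FAQ page.
merged_pages = {
    "https://example.com/faq/shipping-times": "https://example.com/faq",
    "https://example.com/faq/returns": "https://example.com/faq",
}
rules = build_redirect_rules(merged_pages)
print("\n".join(rules))
```

Test each redirect in a browser after you deploy it, since a typo in one path can send visitors to the wrong page.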

Next, just to make sure Google understands what you’ve done, create a comprehensive sitemap and upload it to Google Search Console (formerly Google Webmaster Tools). This helps ensure that any lingering pages, new pages, or potentially missing old pages are indexed by the search engine.
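
Screaming Frog and most CMS plugins can generate the sitemap for you. If you want to see what the file contains, or build one from your cleaned-up URL list, here is a minimal sketch that follows the sitemaps.org XML format. The function name and the example URLs are hypothetical, and it only writes the required `loc` element for each page.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body

# Made-up URL list, used only to demonstrate the function.
remaining_urls = [
    "https://example.com/",
    "https://example.com/faq",
]
sitemap_xml = build_sitemap(remaining_urls)
print(sitemap_xml)
```

Save the result as a UTF-8 file, for example `sitemap.xml` at the root of your site, before submitting it.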

That’s it! Once you’ve made your changes, all that’s left is to monitor your metrics and see how they respond. Ideally, your site will have improved.

Written by Dan Virgillito

Dan Virgillito is a freelance content strategist with a passion for good storytelling and all things digital. He lived in the Netherlands, Poland, England and Sicily. Say hi on Twitter.
