In the early days of Google Panda, duplicate content became public enemy number one. Sites of all stripes, in all color hats, found themselves faced with penalties for low-quality content, scraped content, or duplicate content. In the years since, Google has refined its detection and penalty system, but the fear remains. Duplicate content is bad, isn’t it?
The Problem with Ecommerce
Ecommerce sites were hit heavily by Panda, though many of the reasons they suffered have been minimized in the last year or so. Still, to identify the solution, one must first know the problem. Ecommerce pages suffered from all of the major flaws targeted by Panda.
- Duplicate content. A storefront with 50 very similar products, each on its own page, would carry a significant amount of overlapping content, because the specifications those products share are repeated on every page.
- Thin content. A storefront using a template shared among many other online commerce sites would likely be classified as thin, particularly if the issue with duplicate product descriptions exists as well. Google initially viewed such sites as providing no real unique value to users.
These are just the problems specific to ecommerce sites. In addition, duplicate content can be created by a wide range of URL quirks, including whether a URL uses www, whether it ends with a trailing /, and whether a page is served with an .htm or .html extension.
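To make the URL quirks concrete, here is a minimal Python sketch that collapses those three variants into one comparable form and groups the URLs that end up identical. The function names and the example.com URLs are made up for illustration, and the rules are deliberately simplistic; this is not how any search engine actually normalizes URLs.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Collapse www/no-www, trailing-slash and .htm/.html variants (illustrative rules only)."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]  # treat www and non-www hosts as the same site
    path = parts.path
    if path.endswith(".htm"):
        path += "l"  # treat page.htm and page.html as the same page
    path = path.rstrip("/") or "/"  # drop trailing slashes, but keep the root path
    return urlunsplit((parts.scheme.lower(), host, path, parts.query, ""))


def group_duplicates(urls):
    """Return only the groups where several raw URLs share one normalized form."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize_url(url)].append(url)
    return {key: found for key, found in groups.items() if len(found) > 1}


if __name__ == "__main__":
    crawl = [
        "http://www.example.com/hinges/",
        "http://example.com/hinges",
        "http://example.com/about.htm",
        "http://example.com/about.html",
        "http://example.com/contact",
    ]
    for canonical, variants in group_duplicates(crawl).items():
        print(canonical, "<-", variants)
```

Running a list of crawled URLs through group_duplicates gives a quick picture of how many "pages" are really the same page under different addresses.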
Considering all of this, duplicate content hits a huge number of sites online. Yet very few actually received penalties, and fewer still continue to receive them. What’s the deal?
Google’s Stance on Duplicate Content
The short answer is that Google doesn’t actively penalize most duplicate content. The worst that can happen to your ecommerce site, if it suffers from these duplicate content issues, is that the duplicate pages are ignored by the search engine. In essence, they may as well not exist. They won’t earn you a penalty, but they won’t bring in any additional SEO benefit either.
The reason for this is that Google knows very well that there is a wide range of quirks, many of which webmasters don’t even notice, that produce duplicate content when a site is indexed. Even simple things, like including a printer-friendly version of your site, count as duplicate content.
With all things Google, it comes down to one factor: value. Duplicate content on the order of product descriptions and printer-friendly pages is valuable. If you sell six products, all six have different pages, and forcing you to remove five of them is unfeasible. These pages are not uselessly scraped or intentionally copied for SEO purposes; they’re legitimate parts of your business.
The sites that actually receive duplicate content penalties are those that scrape content from other sites to pass off as their own. They are the sites that use duplicate content in an attempt to bypass search regulations. Google actively penalizes sites attempting to circumvent the rules.
Fixing the Harmless Problems
Even though the duplicate content doesn’t harm your search ranking, you should still consider fixing the problems. You may not receive an active penalty, but you can probably optimize your site. Consider:
Issue: In any individual product category, there are a dozen or more products with very similar descriptions, triggering duplicate content warnings.
Solution: Write unique content for each product. It can be difficult to uniquely describe 30 different types of door hinge, but there are freelancers available who will write unique product descriptions for a fee. With a tight budget, focus on the most profitable items first and work backwards; it doesn’t need to be done all at once.
Issue: Even with unique product descriptions for most products, many still trigger duplicate warnings. A common example is the same product with a separate page for each color.
Solution: Create a single product page with selection boxes for individual parameters. For a shoe, for instance, there can be a selection box for color and one for size. Any other pages, such as individual pages for colors, can be redirected to the primary page using a 301 redirect to pass along any SEO power they may have. Warning: some preconstructed ecommerce platforms are not SEO friendly, as they append these indicators to the URL, creating hundreds of duplicate pages. Consider switching to an SEO-friendly platform if this remains a persistent issue.
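As a rough illustration of the redirect step, the Python sketch below decides what a server might answer for a requested path: a 301 pointing at the primary product page for retired color pages, or a plain 200 for everything else. The paths, the VARIANT_TO_PRIMARY table and the resolve function are all hypothetical; a real store would usually configure this in its ecommerce platform or web server rather than in application code.

```python
from http import HTTPStatus
from typing import Dict, Tuple

# Hypothetical mapping from retired per-color pages to the single primary page.
VARIANT_TO_PRIMARY: Dict[str, str] = {
    "/shoes/trail-runner-red": "/shoes/trail-runner",
    "/shoes/trail-runner-blue": "/shoes/trail-runner",
}


def resolve(path: str) -> Tuple[int, str]:
    """Return (status, location): a permanent redirect for variants, 200 otherwise."""
    primary = VARIANT_TO_PRIMARY.get(path)
    if primary is not None:
        # 301 tells crawlers the move is permanent, so the old URL can be dropped.
        return HTTPStatus.MOVED_PERMANENTLY, primary
    return HTTPStatus.OK, path


if __name__ == "__main__":
    for requested in ("/shoes/trail-runner-red", "/shoes/trail-runner"):
        status, location = resolve(requested)
        print(requested, "->", int(status), location)
```

Keeping the mapping in one table makes it easy to audit that every retired variant points at a page that still exists.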
Issue: Faceted navigation – the style of ecommerce site with a large block of products and a bank of filters to the side – is a very powerful tool for actual users. Unfortunately, each filter is appended to the URL, creating an exponential number of pages registering as duplicate content.
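Solution: Block faceted URLs with your robots.txt file. This keeps the crawlers from seeing anything more than the basic selection of products; all appended URLs may as well not exist. You can also add rel=canonical tags to each page, telling the search engine which page is “real” if any do get through. Keep in mind that a crawler can only read a canonical tag on a page it is allowed to fetch, so the tag works as a fallback for filtered URLs your robots.txt rules miss.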
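The canonical half of that advice can be sketched in a few lines of Python: strip the filter parameters from a faceted URL and emit a link tag pointing at what remains. The parameter names in FACET_PARAMS and the example.com URLs are assumptions for illustration; your platform will have its own filter names, and some parameters (a page number, say) may deserve to be kept.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical filter parameters appended by a faceted navigation.
FACET_PARAMS = {"color", "size", "brand", "sort"}


def canonical_url(url: str) -> str:
    """Drop facet parameters so every filtered view points at one base listing."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in FACET_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url: str) -> str:
    """Build the tag to place in the <head> of the filtered page."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


if __name__ == "__main__":
    print(canonical_link_tag("https://example.com/hinges?color=brass&size=3in"))
```

Every combination of filters then declares the same unfiltered listing as the "real" page, no matter how many filter URLs exist.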
Issue: Product description pages, even when condensed for parameters of a given product, are still considered thin.
Solution: Find a way to increase the density of content on the page. Add more descriptors and expand product descriptions. Solicit reviews for individual products and display them on the page. Add more related content links, including related products and a “users also bought” window, if possible. Anything that adds more content to each individual page increases the thickness of the content.
Too Much, Not Too Little
Another possible issue large ecommerce sites face is too much content. In this era of content, that seems like a contradiction, but it makes sense. If you have a warehouse of 10,000 products you sell, you don’t want 10,000 product pages on your site. Even compressing into category pages can only do so much. This particularly becomes an issue with paginated search results. 10,000 products at 5 products per page of search results still comes to 2,000 pages of too-similar content.
The best solution to this issue is to mark up each page in the series with rel=next and rel=prev link tags (the attribute value is prev, not previous). This tells Google that each page of the search results is part of a larger whole, not an individual page meant to stand alone. This consolidates your link juice and keeps the SEO power flowing.
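A minimal Python sketch of that markup follows, assuming a hypothetical ?page=N URL scheme and helper names invented for this example. Note that the HTML attribute value is prev. The first page gets only a next tag, the last page only a prev tag, and every page in between gets both.

```python
from typing import List


def page_count(total_products: int, per_page: int) -> int:
    """Number of result pages needed, rounding up for a partial last page."""
    return (total_products + per_page - 1) // per_page


def pagination_links(base_url: str, page: int, total_pages: int) -> List[str]:
    """Build the <head> link tags that tie one results page to its neighbors."""
    if not 1 <= page <= total_pages:
        raise ValueError("page must be between 1 and total_pages")
    links = []
    if page > 1:
        links.append(f'<link rel="prev" href="{base_url}?page={page - 1}">')
    if page < total_pages:
        links.append(f'<link rel="next" href="{base_url}?page={page + 1}">')
    return links


if __name__ == "__main__":
    total = page_count(10_000, 5)  # the 10,000-product catalog at 5 per page
    print(total, "pages")
    for tag in pagination_links("https://example.com/search", 2, total):
        print(tag)
```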
Of course, as mentioned above, Google isn’t going to penalize you for not digging into the deep details of your ecommerce platform. The search giant knows that many small businesses don’t have the time or the knowledge to make many of these changes, which is why it doesn’t penalize you for any of these issues. If you put the changes off, you may miss out on some organic SEO growth, but you save the time and energy of making them. It’s only malicious content duplication that you need to worry about in terms of penalties.