You’ve got to hand it to Matt Cutts. The poor guy is inundated with what seems like hundreds of SEO questions every month, ranging from the absolutely dim-witted to the completely esoteric. Worse, he’s seemingly confronted with questions that are already answered in Google’s “Webmaster Guidelines,” the webmaster’s virtual SEO bible.
The questions are never-ending, repetitive, and sometimes, downright boring. Yet, he trudges on, politely addressing some rather colorful internet debates. In one of his most recent videos, Matt tackles copied content, and despite the number of times and different ways the issue is addressed in Google’s guidelines, his latest response is sure to ruffle some feathers.
The Question at Hand and the Crux of the Problem
The question in the video asks whether it is acceptable to publish “stitched content” – small bits of copyrighted content presented as a one-page collection of information. Specifically, it asks, “Can a site still do well in Google if I copy only a small portion of content from different websites and create my own article by combining all, considering I will mention the source of that content (by giving their URLs in the article)?”
You’ve probably seen this strategy before, and there’s no doubt from this video that Google has seen it as well. Stitched content is the result of third-party content aggregation obtained through an automated process like Really Simple Syndication (RSS). Webmasters will not only use it to achieve massive amounts of content in a short amount of time, they’ll also defend it as “content syndication,” or “content curation.”
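The automated process the paragraph above describes can be sketched in a few lines. This is a toy illustration only – the feed is an inline sample, not a real endpoint, and the function names are my own – but it shows how “stitched” pages get assembled from RSS items with almost no human effort:

```python
# Toy sketch of how "stitched content" is produced from an RSS feed.
# Uses only the Python standard library; the feed below is an inline
# sample, not a real site's content.
import xml.etree.ElementTree as ET

SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel>
  <item><title>Post A</title><link>http://example.com/a</link>
        <description>First excerpt of copied text.</description></item>
  <item><title>Post B</title><link>http://example.com/b</link>
        <description>Second excerpt of copied text.</description></item>
</channel></rss>"""

def stitch(feed_xml: str) -> str:
    """Concatenate feed excerpts into one page, citing each source URL."""
    root = ET.fromstring(feed_xml)
    parts = []
    for item in root.iter("item"):
        excerpt = item.findtext("description", default="")
        source = item.findtext("link", default="")
        parts.append(f"{excerpt} (source: {source})")
    return "\n".join(parts)

print(stitch(SAMPLE_FEED))
```

Note that the script dutifully cites every source URL – exactly the defense raised in the question Matt answers – yet it adds no original value, which is the point of his criticism.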
The reasons they defend it as such are simple. For one, Google doesn’t directly penalize syndicated or curated content, because those are legitimate content strategies used by professional organizations (news outlets, libraries, museums, published magazines, etc.). Second, Google does penalize aggregated content, under either the “Automatically generated content” penalty or the “Scraped content” penalty. The punishment is a low search engine position – a webmaster’s worst nightmare.
The problem is this: Because both methods use similar processes, and because both methods can produce similar results, the line between punishable, stitched content (i.e. aggregated content) and non-punishable, syndicated or curated content isn’t as clear as we’d like it to be. That is, until now.
Matt Points to Yahoo for an Answer
Pointing to Yahoo’s long-standing stance against stitched content, Matt suggests this type of content creation has always been a form of spam because it doesn’t add value. At Yahoo, spam is content that’s substantially the same across sites and pages, as well as any collection of automatically generated webpages. And if Matt’s reference to Yahoo’s attitude toward spam is any indication of Google’s stance on the subject, we can begin to understand why it isn’t welcome at either search engine.
Both Google and Yahoo clearly prefer content that is original, unique and helpful. In Matt’s latest discourse, he stops just short of calling a collection of referenced material nothing more than an incorrectly formatted bibliography page. He compared it to a “clip show” on television, in fact, and insisted people prefer to see new content over bits and pieces of existing information plucked from a multitude of websites. He even physically motioned the act of stitching content, presumably so there’s no confusion about what he’s describing.
Stitched Content is High Risk Content
<iframe width="550" height="360" src="//www.youtube.com/embed/Z13-yP3Zhns?feature=player_detailpage" frameborder="0" allowfullscreen></iframe>
In Matt’s words, people “don’t want to just see an excerpt, and then a one line, and then an excerpt, and then a one line and that sort of thing.” He added Google will probably see this type of content as “high risk” content, or, content that is likely to violate Google’s guidelines. If we turn to Google’s guidelines, in particular, the section regarding automatically generated content, we can see that it’s defined several ways. More important to this issue, it’s specifically defined as:
• Text translated by an automated tool without human review or curation before publishing
• Text generated from scraping Atom/RSS feeds or search results
• Stitching or combining content from different web pages without adding sufficient value
Similarly, the section regarding scraped content associates the practice with thin affiliate content. It’s defined in several different ways as well; however, in the context of this article, it is:
• Content that copies and republishes content from other sites without adding any original content or value
• Content that reproduces content feeds from other sites without providing some type of unique organization or benefit to the user
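Yahoo’s definition of spam – content that’s “substantially the same across sites and pages” – raises the question of how overlap might be measured at all. One textbook approach (an assumption for illustration only; search engines use far more sophisticated signals) is word-shingle Jaccard similarity:

```python
# Toy illustration of near-duplicate detection using word-shingle
# Jaccard similarity. Illustrative only -- not how Google or Yahoo
# actually score duplicate content.

def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word shingles (overlapping word windows)."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a: str, b: str, k: int = 3) -> float:
    """Jaccard similarity of the two texts' shingle sets (0.0 to 1.0)."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)

original = "search engines prefer content that is original unique and helpful"
copied   = "search engines prefer content that is original unique and helpful to readers"
fresh    = "write your own analysis and cite sources with proper attribution"

print(jaccard(original, copied))  # high overlap: near-duplicate
print(jaccard(original, fresh))   # low overlap: original writing
```

A stitched page built from excerpts scores high against its sources under a measure like this, while genuinely rewritten analysis scores low – which is why “adding sufficient value” is the dividing line in both guideline excerpts above.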
Research Writing Gets an Exemption (Sort of…)
While Matt attempts to clear the confusion surrounding what is and isn’t acceptable, he’s careful to exempt research writing and its classic format. Research writing quotes its references and cites its sources. It presents a position on an issue, and it uses existing content to substantiate that position. Wikipedia articles, the type of acceptable content Matt references in his video, are a great example of research writing and, more importantly, of the kind of content Google rewards with a higher search engine position.
Yes, it’s true that Wikipedia’s strategy of synthesizing content extracts information from different sources, but it’s careful to properly cite that information with footnotes, primary and secondary sources and external links. As a webmaster, you’re going to need the same strategy in your own content plan of action if you want to maintain an existing search engine ranking or improve the one that you already have.
A New Type of Content – A New Frontier
Content that’s built from RSS feeds or copied without context and attribution is no longer acceptable. Matt’s video makes that clear. Google’s guidelines make that clear, and so do Yahoo’s. Yesterday’s curation and syndication buzzwords no longer disguise aggregated pages as something they’re not. So it’s time to become a real thought leader with our own engaging content. It may be harder to do, and it may take longer to accomplish, but as Booker T. Washington once wrote, “Nothing ever comes to one, that is worth having, except as a result of hard work.”
The benefits of following through with this approach are twofold. As webmasters, we gain major SEO and SEM points with Google, Yahoo and every other search engine that enforces similar content rules. As authors, we gain respect from an engaged and growing audience. No one loses here, so there’s no reason to continue using a strategy that works against us.
Fortunately, there are plenty of resources that can help. Professional writing guides and tips are only a simple search away – online, and at no cost. Start at Google’s “Webmaster Guidelines.” There, you can find out what turns mediocre, unengaging content into quality that people will want to read and share on social networks. Then visit Yahoo’s “Content quality guidelines.” Yahoo’s definition of quality content is a bit more explicit than Google’s, even though both tend to favor and punish sites by similar criteria. Even Bing offers webmaster guidelines, although they tend to stress SEO mechanics more than writing quality.
Other places that can help are sites that address professional writing exclusively. College and university websites are great resources for learning required techniques and formats.