Schema.org is a shared vocabulary for structured data, launched by Google, Microsoft, Yahoo, and Yandex. Since it’s supported by so many big players online, it’s the de facto standard for structuring data on the web.
The idea is that by adding tags to different types of data on your page, you make that data machine-readable. Any software coming in, like a search engine crawler, can see and know that a bit of data is a review, or a blog post, or a title, or a caption, or whatever else. Rather than having to infer it from patterns, implications, and semantics, Google and the others can know without a doubt what it is meant to be.
For reference, you can read along and look up documentation yourself on the official website, schema.org. It may look a little daunting, but it’s not all that bad; 90% of what you see will apply to sites using a template different from yours, and can safely be ignored.
Do a quick Google search for a big-name movie or new release. You’ll see a variety of search results, but look for the inevitable Wikipedia entry. You’ll see some additional links beneath the description, like links to actor pages. These links are part of Google’s rich snippets, which typically come from schema implementation.
As far as website SEO goes, that’s what we’re shooting for with schema implementation. There are a lot of benefits to schema, but they’re generally all on the mechanical end of things. They aren’t direct benefits for your users, and they aren’t necessarily going to be directly visible at all. However, schema builds a foundation for a lot of potential benefit and for clarity of information.
Schema, essentially, tells search engines what your data is and what it means, not just what it says. It tells Google that a particular bit of data is, for example, written by a specific author, covering a specific topic, in the format of a blog post, and so forth. Think of it like HTML formatting tags, but for meaning. Instead of an <h1> tag indicating a heading, it’s a microdata attribute indicating the author.
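To make that concrete, here’s a minimal sketch of what microdata looks like on a blog post, written as a small Python helper that prints the marked-up HTML. The function name and the sample values are made up for illustration; the itemtype and itemprop names (BlogPosting, Person, headline, author, name, datePublished, articleBody) come from the schema.org vocabulary.

```python
from html import escape


def blog_post_microdata(headline: str, author: str, date_published: str, body: str) -> str:
    """Render a blog post wrapped in schema.org BlogPosting microdata.

    Hypothetical helper: the HTML layout is one possible arrangement,
    not the output of any particular tool.
    """
    return (
        '<article itemscope itemtype="https://schema.org/BlogPosting">\n'
        f'  <h1 itemprop="headline">{escape(headline)}</h1>\n'
        '  <p>By <span itemprop="author" itemscope itemtype="https://schema.org/Person">'
        f'<span itemprop="name">{escape(author)}</span></span></p>\n'
        f'  <time itemprop="datePublished" datetime="{escape(date_published)}">'
        f'{escape(date_published)}</time>\n'
        f'  <div itemprop="articleBody">{escape(body)}</div>\n'
        '</article>'
    )


# Sample values are placeholders.
snippet = blog_post_microdata("My First Post", "Jane Doe", "2015-03-01", "Hello, world.")
print(snippet)
```

Notice that the visible text is unchanged. The itemscope, itemtype, and itemprop attributes ride along on ordinary HTML tags, which is why the page still renders the same for readers.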
Schema microdata is still valid HTML code. If you’re into labyrinthine developer documentation and code specifics, you can read about it here. If your eyes glaze over about that stuff, just recognize that you aren’t going to have to learn a new form of coding or anything like that. In fact, the majority of the implementation we’ll be doing is through a tool that eliminates the need for much of any manual coding.
Another way to think about schema is that normal web metadata is a sort of infant form of it. Metadata gives search engines, and thus users, a general summary of the content of the page. Your meta title and description are bits of information users can read to figure out what is on the page. Schema is the same sort of thing, except it’s metadata for every bit of information on your page.
This is important because it allows you to properly categorize your content and, thus, show up more prominently in searches related to that kind of content. If you’re using the schema markup for movie data, you’ll compete more directly with Wikipedia movie pages and IMDb search results. If you’re using the book review schema markup, your results will look more like Goodreads. If you’re using the product data markup, you’ll show up as a robust product search result, rather than a half-formed, poorly scraped product listing.
Even if you’re not a specialized type of content, there’s a schema for you. The basic blog articles schema works fine for most blogs, and there’s no penalty for adding it unless you’re implementing it improperly. If you’re following this post, you should be fine on that count.
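For a rough idea of how content types map to schema, here’s a small lookup sketch. The dictionary keys and the function name are my own labels; the URLs are real schema.org types. Falling back to the general Article type for anything unlisted is an assumption made for this example, not a rule from schema.org.

```python
# Content kinds (hypothetical labels) mapped to schema.org type URLs.
SCHEMA_TYPES = {
    "blog post": "https://schema.org/BlogPosting",
    "movie": "https://schema.org/Movie",
    "book review": "https://schema.org/Review",
    "product": "https://schema.org/Product",
    "recipe": "https://schema.org/Recipe",
}

# Assumed fallback for content that fits none of the kinds above.
DEFAULT_TYPE = "https://schema.org/Article"


def itemtype_for(kind: str) -> str:
    """Return the itemtype URL to use for a given kind of page."""
    return SCHEMA_TYPES.get(kind.strip().lower(), DEFAULT_TYPE)


print(itemtype_for("Movie"))
print(itemtype_for("newsletter"))
```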
In the vast majority of cases, you’ll get some search benefit from implementing schema markup. A lot of the time it will be very minor, but occasionally you’ll be able to get rich snippets and, with them, more visibility in the search results. It’s a potentially huge source of visibility and traffic.
Schema says you should mark up every bit of data you can, but cautions that you should only mark up the data that is visible to a user. If it’s information in a hidden div, or otherwise invisible to the user, it’s information they won’t see. The disconnect between seeing it in search results and seeing it on the page is a bad thing as far as SEO goes. So, uh, don’t do that.
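If you want a quick way to spot that mistake, here’s a rough sketch using Python’s built-in html.parser. It flags any itemprop that sits on, or inside, an element hidden with the hidden attribute or an inline display:none or visibility:hidden style. The class and function names are made up for this example, and it’s only a heuristic: it can’t see anything hidden through an external stylesheet or JavaScript.

```python
from html.parser import HTMLParser

# Elements that never get a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


def _is_hidden(attrs: dict) -> bool:
    """True if the element hides itself via attribute or inline style."""
    style = (attrs.get("style") or "").replace(" ", "").lower()
    return ("hidden" in attrs
            or "display:none" in style
            or "visibility:hidden" in style)


class HiddenItempropFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self._stack = []         # one flag per open element: hidden (incl. ancestors)?
        self.hidden_props = []   # itemprop values found in hidden markup

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        inherited = self._stack[-1] if self._stack else False
        hidden = inherited or _is_hidden(attrs)
        if hidden and "itemprop" in attrs:
            self.hidden_props.append(attrs["itemprop"])
        if tag not in VOID_TAGS:
            self._stack.append(hidden)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()


def find_hidden_itemprops(html: str) -> list:
    finder = HiddenItempropFinder()
    finder.feed(html)
    finder.close()
    return finder.hidden_props


sample = ('<div itemscope><h1 itemprop="name">Title</h1>'
          '<div style="display: none"><span itemprop="author">X</span></div></div>')
print(find_hidden_itemprops(sample))
```

Anything this prints is worth a second look before you publish.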
Now, all of that sounds pretty complex, and it is. There’s a lot to schema. Just take a look at how much you need to drill down just to find your category. Thankfully, we’re going to be using a structured data tool from Google to make this all about 1,000 times easier to implement.
The first thing you’re going to do is navigate to Google’s structured data markup helper. You can find it over here. It allows you to flag data on your site and will apply the proper schema code to it, and then give you the data you need to put in your page. You’ll see.
For a standard blog, you’ll probably want to select the “articles” category. If your page – this goes by page, not by site as a whole – covers different categories, choose the appropriate one. If your category isn’t on the list, don’t worry; you can still implement the process, you’ll just have to play around with data a little more directly.
Plug in the URL of the page you’re going to mark up, or paste the raw HTML code if you’re working with the data directly. This feeds in the base data from which you can work.
At this point, the markup tool will load a sample version of the page on the left hand side, and will show data items on the right. This is the window we’ll be working on.
The process will basically go like this. Select part of the article, like the title. Highlight the whole thing. When you do, a menu pops up with elements from the right pane. Choose the right one. Repeat until everything is flagged.
The only piece of data that is actually required to be flagged here is the name, which is the title of your blog article in this case. In other cases it might be the name of the review, the name of the product, or something else. You’ll probably want to fill out everything, though. Name is important. Author is important for author associations, though it may be more beneficial to multi-site authors. It’s like what Google Authorship was trying to be, before it was killed.
Some data, like a related posts widget, a comments section, social sharing buttons, and navigation, doesn’t matter. Breadcrumbs can matter, but implementing them will be a different process, one we covered over here.
Continue tagging as much of the post as you can, but make sure it’s accurate. Misinformation and mis-tagging in schema is what leads to errors and Google ignoring your markup. It’s better to leave something minor untagged than it is to tag it improperly.
When you’ve finished tagging everything, notice the red button in the upper right of the page, the one that says “create html.” Click this, and the data pane will turn into a pane of your html code. It’s a lot of messy code if you aren’t familiar with HTML, but don’t worry. You may notice that some sections of the code are highlighted in yellow; these are the important bits.
Now you have two options.
In either case, I highly recommend backing up the original file before you upload or change it. This will make sure that if something goes wrong, you will be able to restore a backup and fix the problem in a local testing environment rather than your live server.
In both cases, you’re going to have to make changes to your files, and there’s no easy way to automatically do it to every page on your site. If you’re using a CMS like WordPress and you’re not comfortable editing files, you can use a plugin to do schema instead.
There are others, of course, so feel free to take your pick and find one that works in a way you appreciate.
Now, once you’ve implemented your schema and uploaded the new file (or just prepared the new file for upload), you want to test it and make sure the schema is well-formed and not broken. If you’ve ever coded before, you know that a stray comma or other symbol can get in the way and break things. Google, of course, provides a testing tool as well.
What you want to do is copy your html, or upload the file, into this tool. Paste the data and click “validate.” It will take a moment for Google to scan and parse the code, and then you will be presented with a list of the tags and whether or not they’re good.
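Before you paste anything into Google’s tool, you can run a rough local check of your own. The sketch below uses Python’s built-in html.parser to list every itemprop it finds and the tag it sits on, so you can eyeball whether each one landed where you meant it to. The class and function names are hypothetical, and this is not a substitute for Google’s validator; it only shows you what you tagged.

```python
from html.parser import HTMLParser


class ItempropCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.found = []  # (property, tag) pairs in document order

    def handle_starttag(self, tag, attrs):
        value = dict(attrs).get("itemprop")
        if value:
            # One itemprop attribute may hold several space-separated names.
            for prop in value.split():
                self.found.append((prop, tag))


def list_itemprops(html: str) -> list[tuple[str, str]]:
    collector = ItempropCollector()
    collector.feed(html)
    collector.close()
    return collector.found


sample = ('<article itemscope itemtype="https://schema.org/Article">'
          '<h1 itemprop="name">Title</h1>'
          '<img itemprop="image" src="cover.png">'
          '<div itemprop="articleBody">Text</div></article>')
for prop, tag in list_itemprops(sample):
    print(f"{prop:12} on <{tag}>")
```

An articleBody sitting on an img tag, or an image sitting on a paragraph, is exactly the kind of mix-up this makes easy to spot.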
For my example document, I intentionally tagged some items incorrectly. I had some images tagged as part of the article body, which throws an error because an image is not text. I also had part of the article body tagged as an image, to throw a similar error.
You may see some errors even if you tagged everything properly. For example, one “error” Google gave me was that I didn’t tag a publisher. It’s “missing and recommended” but not a game breaking error. If you have publisher data visible, or want to add it to schema directly, feel free to do so. Otherwise, you can safely ignore those sorts of errors.
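Here’s a small sketch of that distinction between errors you need to fix and warnings you can live with. The two property sets are assumptions drawn from this walkthrough (name is required, author and publisher are nice to have), not Google’s official list, and the function name is made up.

```python
# Assumed for illustration, based on this walkthrough rather than Google's documentation.
REQUIRED = {"name"}
RECOMMENDED = {"author", "publisher"}


def triage(found: set[str]) -> dict[str, list[str]]:
    """Split missing properties into must-fix errors and ignorable warnings."""
    return {
        "errors": sorted(REQUIRED - found),
        "warnings": sorted(RECOMMENDED - found),
    }


report = triage({"name", "author", "articleBody"})
print(report)
```

An empty errors list with a warning or two is the “missing and recommended” situation described above: worth fixing if the data is on the page, safe to leave if it isn’t.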
There’s a lot more to schema if you want to really get into it. You can essentially find a tag for every element on your site, which helps Google and the other search engines learn specifically what every bit of data is and what it means. However, it’s really just the top-level basic stuff that gives you the most benefit. Schema encourages you to mark up as much as you can, with one caveat: “You should mark up only the content that is visible to people who visit the web page.” In other words, only mark up what a user can see.
Article-based blogs can benefit from schema by flagging important data like author, title, and so forth. Don’t think you’re free to ignore it just because you’re not getting a ton of specific benefit. Meanwhile, product-based sites get a ton of value out of flagging product information. Recipe blogs get a lot out of it as well, as do review sites.
Alternatively, you can talk to the people you have developing your website and get them to plug in the schema data. It’s a bit of a pain to do manually, particularly if you have a huge old site or are using an exotic CMS. If you’re doing it yourself, you can see some examples of what to do here. Just keep good backups and make sure you’re plugging in the code properly.