In most circumstances, a 404 error page on your site is a bad thing. It means a page didn’t load, a page moved without a redirect or a page is broken in some way. In some cases, it means a mistyped link is pointing to a page that doesn’t exist. All of these are a hindrance to your site, but you can use them as an opportunity.
Think about it like this. When a user visits a URL attached to your domain, it’s either a page that exists or a page that doesn’t. When it’s a page that exists, good. You’ve drawn in a new user. If it’s a page that doesn’t exist, you have two options. You can display a 404 page – the default in most web setups – or you can automatically redirect them to the homepage.
Consider the practically infinite combinations of letters and numbers, in varying lengths, that can make up a URL. When you see how few of them are associated with a real page on your site, you begin to see the opportunity presented to you. Every possible typo, malformed URL or broken link is a chance to keep a visitor.
The first thing you should know is that a passive redirect to your homepage is the wrong way to go. Sure, it looks a little better than a basic 404 page, but only a little. Often, a user will click to open your link in a new tab while they read another page. When they finish and tab over, and find themselves on your homepage, they’re going to have forgotten why they were there. Unless your homepage is incredibly compelling, they’re probably going to bounce.
Instead of allowing that to happen, you can customize your 404 page to give users a better chance of sticking around. Here’s how.
There are two types of 404 error page: the soft 404 and the hard 404. Soft 404 pages look like error pages to a visitor, but they don’t return the actual 404 status code; the server typically reports success (200 OK) instead. Hard 404 pages return the real 404 status, so Google and other web crawlers register them for what they are: broken or missing pages.
The primary reason to make sure your 404 page is a hard 404 is search indexing. At some point, if a search crawler is crawling the URL, that URL is or was in the search index. Returning a soft 404 will keep the page in the index, despite its low value. A hard 404 will tell Google that, if the page remains broken, the link should be removed from the index.
While having a page removed from the index normally sounds like a bad thing, in the case of the 404, it’s a good idea. After all, the page doesn’t exist normally, so having an entry in the search rankings for it doesn’t help you.
To make sure your 404 is a hard 404, request a URL on your site that doesn’t exist and check the HTTP status code in the server’s response headers. If the server doesn’t return a 404 status by default, you will need to change its configuration; on an Apache server, that means modifying your .htaccess file.
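As a rough illustration of that header check, here is a minimal Python sketch using only the standard library. The function names fetch_status and classify_missing_url are made up for this example, and the rule that any 2xx answer for a nonexistent URL counts as a soft 404 is a simplifying assumption, not an official definition.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from silently following redirects."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError carrying the 3xx code


def fetch_status(url, timeout=10.0):
    """Return the raw HTTP status code the server sends for url."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 3xx, 4xx and 5xx responses arrive here


def classify_missing_url(status):
    """Interpret the status returned for a URL that should NOT exist."""
    if status == 404:
        return "hard 404"
    if 300 <= status < 400:
        return "redirect"  # e.g. a blanket redirect to the homepage
    if 200 <= status < 300:
        return "soft 404"  # error page served with a success code
    return "other"
```

To try it, pass a deliberately nonsensical address on your own domain to fetch_status and feed the result to classify_missing_url. Anything other than "hard 404" suggests the server configuration needs attention.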
You can check Google’s Webmaster Tools (since renamed Google Search Console) for any incoming links that point to URLs that don’t exist. When a broken link like this occurs, it’s for one of two reasons. Either the page exists and the URL is wrong, or the page doesn’t exist, either because it never existed or because it was removed.
If the page exists but the link is incorrect, there are two fixes, and you should use both where you can. The first is to contact the webmaster of the originating site, tell them about the broken URL and give them the correct one. This fixes the link at its source. If you cannot get the link corrected, or if the wrong URL is linked from too many places, you should implement a 301 redirect from the broken URL to the real destination page. This will send both users and search crawlers to the real page.
For URLs that don’t exist, never have existed or have been removed, you will present your hard 404 page.
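The decision described above, a 301 for known-bad URLs and a hard 404 for everything else, can be sketched in a framework-neutral way. This is only an illustration: the REDIRECTS entries are invented placeholders, and respond_to_unmatched_path is a hypothetical helper you would call only after your site has failed to find a real page for the path.

```python
# Hypothetical map of known-bad URLs to the pages they were meant to reach.
REDIRECTS = {
    "/abuot": "/about",
    "/old-pricing.html": "/pricing",
}


def respond_to_unmatched_path(path):
    """Return (status_line, headers, body) for a path with no real page."""
    target = REDIRECTS.get(path)
    if target is not None:
        # Permanent redirect: users and crawlers are sent to the real page.
        return "301 Moved Permanently", [("Location", target)], b""

    # No known correction, so answer with a genuine (hard) 404 status.
    body = b"<h1>Page not found</h1>"
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    return "404 Not Found", headers, body
```

Keeping the redirect map as plain data makes it easy to add an entry each time a new broken inbound link turns up in your reports.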
All of this has been preparation for creating your optimized 404 page. This will essentially be a real webpage on your site that every broken link not otherwise redirected will point to.
404 pages on your site will have different URLs, unless you redirect all traffic to one static 404 page. In either case, it’s an error page, not a piece of content. Add the noindex and nofollow directives, for example in a robots meta tag in the page’s head, to avoid any issues with duplicate content or other SEO problems.
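One way to picture such a page is the short Python sketch below, which builds the HTML for a 404 page carrying a robots meta tag and a few suggested links. The function name render_404_page, the wording of the page and the suggestion list are all assumptions made for illustration; your own template will differ.

```python
from html import escape


def render_404_page(requested_path, suggestions):
    """Build 404 page HTML; suggestions is a list of (url, title) pairs."""
    links = "\n".join(
        f'      <li><a href="{escape(url, quote=True)}">{escape(title)}</a></li>'
        for url, title in suggestions
    )
    # The robots meta tag asks crawlers not to index the error page
    # and not to follow the links on it.
    return f"""<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <meta name="robots" content="noindex, nofollow">
    <title>Page not found</title>
  </head>
  <body>
    <h1>Sorry, we could not find {escape(requested_path)}</h1>
    <p>These pages might help:</p>
    <ul>
{links}
    </ul>
  </body>
</html>
"""
```

The requested path is escaped before it is echoed back, since it comes straight from the visitor and could otherwise inject markup into the page.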
Once you have your 404 page set up, you can go back to your Webmaster Tools and gather a list of all pages people have attempted to access and have been served a 404, hard or soft. You can also find this information in cPanel. Essentially, what you’re looking for are any pages that users have attempted to access that may be sensitive. For example, users trying to access admin.php, login.php or /administrator are probably poking around, guessing, trying to find a hidden login page.
When you find these attempts, check what IP addresses are making them. It’s generally a good idea to block these IPs, to make sure the hacker doesn’t actually find a login page and gain unwanted access to your site.
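If you have access to your raw server logs, a short script can surface these probes. The sketch below assumes the common/combined access log format used by default on Apache and Nginx; the SENSITIVE_PREFIXES tuple simply reuses the example paths above, and count_probes is a hypothetical name. It only reports IP addresses; whether and how to block them is left to you.

```python
import re
from collections import Counter

# Captures client IP, request path and status from a common-format log line.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "\S+ (\S+)[^"]*" (\d{3})')

# Paths that ordinary visitors have little reason to guess at.
SENSITIVE_PREFIXES = ("/admin.php", "/login.php", "/administrator")


def count_probes(lines):
    """Count 404 responses for sensitive-looking paths, grouped by IP."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        ip, path, status = match.groups()
        path = path.split("?", 1)[0].lower()  # ignore query string and case
        if status == "404" and path.startswith(SENSITIVE_PREFIXES):
            hits[ip] += 1
    return hits
```

Calling count_probes on an open log file and then most_common() on the result lists the most persistent addresses first. A single hit may be an innocent mistake, so it is reasonable to look for repeated attempts before blocking anyone.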