Google’s index is huge and sprawling, and sometimes Google needs a little help keeping it up to date. If you want to get a list of URLs removed from the index, there are a few ways to go about it, but they all center on the Google Remove URLs tool.
Options for Removing URLs
There are a few different options to weigh when you want to remove content from Google’s index. Let’s go through them.
First up are two questions you have to answer.
- Do you own the content you want removed?
- Do you want the content removed from the search index, or from the web entirely?
The answers to these questions will guide the process you have to follow. I’ll explore each one in its own section.
You Own the Content and Want it Removed from the Index
If you own or control the content you want removed, and you want it removed from the Google search index, you can use the official Google URL Removal tool. That tool can be found here. For the removal to be permanent, however, you need to remove the content from your website first.
There are a few ways you can do this. The first is to fill out a removal request without removing it from your site permanently. We covered the repercussions and process in more detail here, but there are a few quirks.
For a temporary removal, you can use this tool instead. The other tool is for a permanent removal. Temporary removals last about 90 days, after which the content will be re-indexed if it still exists on the web in visible form. Once a page has been removed, you can undo the removal by submitting a reinclude request.
Google has a few guidelines for the use of the tool.
- Never use it to clean up broken pages, outdated URLs, or system pages. Google is smart enough to remove those from its index when it crawls them again, particularly if you’ve implemented the right HTTP codes.
- Never use it to clean up pages that are penalizing your site. Those pages still exist, so the penalty will still exist.
- Never use it to try to scrap your site and start over. If your site is changing entirely, Google will reindex it from scratch on its own. If you’re purchasing and cleaning up a site, implement your changes and then file a reconsideration request.
- Never use it to hide your site after being hacked. There are more appropriate ways to deal with a compromised site.
- Never use it to “craft” an ideal version of your site. Canonicalization is the proper tool to use.
If you want the content to be removed permanently, you need to remove it from your site or restrict Google’s access to it. Putting it behind a login screen, using meta directives to implement NoIndex, or deleting the page will accomplish this task.
Note that using NoIndex is not a guaranteed way to have content hidden from the index. Google will not index the content on your own site, but they might still create a search result if another site links to the noindexed page with enough descriptive text to create a result. As such, you generally want to hide the content entirely.
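As a rough illustration of the NoIndex option, here is a minimal sketch of a Python WSGI app that sends the X-Robots-Tag response header, which is the HTTP-header equivalent of the robots meta tag. The paths in NOINDEX_PATHS and the function names are hypothetical, made up for this example; adapt the idea to whatever server or CMS you actually run.

```python
# Hypothetical paths that should be kept out of the index.
NOINDEX_PATHS = {"/private-report", "/old-landing-page"}


def robots_headers(path):
    """Return extra response headers for the given request path."""
    if path in NOINDEX_PATHS:
        # Same effect as <meta name="robots" content="noindex"> in the page head.
        return [("X-Robots-Tag", "noindex")]
    return []


def app(environ, start_response):
    """Tiny WSGI app that adds the noindex header on selected paths."""
    path = environ.get("PATH_INFO", "/")
    headers = [("Content-Type", "text/html; charset=utf-8")]
    headers += robots_headers(path)
    start_response("200 OK", headers)
    return [b"<html><body>Hello</body></html>"]
```

Keep in mind that Google has to be able to crawl the page to see the directive, so this only helps on pages that are not otherwise blocked.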
You Own the Content and Want it Removed from the Web
If you want the content removed from the web entirely, so that no one can access it even if they have the URL, regardless of whether or not it comes from Google, you can do so. Since you have control over the page, this is within your power. You have several options.
- Put the content behind a password protected login screen. If users and Google cannot log in to see it, it’s effectively gone.
- Delete the content from your site. This will eventually remove it from the index, and immediately remove it from the web, with the exception of cached versions in the Google index or on sites like the Web Archive. You will need to contact the Web Archive or whatever other site is hosting a cached version to ask for removal. Google will remove it from the cache on their own after a while, or you can use a URL removal request with “remove from cache” checked.
If you control the page but do not own it, like if your content is showing up in Google+ or Google Shopping, you can still delete the page or remove it from that Google property. Each Google property has its own process, so you will need to look up whichever one is most relevant.
You Don’t Own the Content and Want it Removed from the Index
If you do not own the content you want removed from the search index, you may or may not be able to get it removed. The Google URL Removal tool only works if you are the owner of the site, so you will not be able to use it. Your options are limited.
If the content does not exist on the web, but still exists in the search index or in the cached index, you can use the Remove Outdated Content tool to submit it for removal. This essentially just assists Google in crawling, notifying them that a page is gone and should be removed from the search results.
If the content still exists and you want it removed, you might not be able to do so. Google will not remove content from a site you do not control just because you ask, unless you are the actual owner of the content and it is being hosted without your permission.
- If the content violates a Google policy, you can have it removed by submitting a policy removal request. This applies to illegal content or content that violates their site policies as a whole.
- If the content is hosted illegally, you can ask Google to remove it with a legal request. This can happen for a number of reasons, including legal proceedings that are ongoing, or violations of copyright.
- If you don’t have an ongoing legal dispute but you own the content, you can file a DMCA takedown request. This can get the content removed from the index, but not from the offending page.
If, on the other hand, you simply object to the content, you don’t have a leg to stand on. Adult content that is not properly filtered by the SafeSearch filter can be reported and hidden, but it will not be removed, only recategorized. Objecting to content on religious grounds, for example, won’t get it removed, as that would infringe on the rights of others. The only case where this can happen is in heavily regulated countries like Iran or China, and even then, it’s usually the nation’s own firewalls filtering the content, not Google.
You Don’t Own the Content and Want it Removed from the Web
If you do not own or control the content that you want removed, but still want it removed from the web, Google will not help you. Google doesn’t own the web; it merely indexes it, no matter how powerful it is.
If the content is simply objectionable to you, you’re probably out of luck. You can sometimes contact a site owner and have the content removed, but often they’ll just laugh. “This offends me, remove it!” is not a valid complaint.
If you own the content in some way, for example if the site scraped your blog post wholesale, stole your copyrighted images, or used your content to build a phishing site, you can contact the site owner and inform them that their use of your content is illegal. This is a copyright violation, meaning you can file a DMCA request with Google to get the content removed from the index, as well as threaten legal action against whoever is hosting the content.
If the site owner does not respond to your threats or requests to have the content removed, you can proceed to other avenues. If you have the legal right to have the content removed, like a copyright claim, you can talk to the web host that hosts the site. They will often take down content that is in violation of copyright. From there, you can use the Google “remove outdated content” tool linked above to get it removed from the index and cache as well.
If the web host also does not respond, or if they rule against you, you will need to consult with a lawyer. There’s not really anything more I can do for you, and Google, again, doesn’t own the web.
Bulk Use of the Removal Tool
This post, of course, is about the BULK removal of URLs from the index. All of the preceding content is there to help you determine whether you actually want content removed from the index rather than the web, and which process to use.
If you control the content and you want it removed from the index, in bulk, you have pretty much two options.
The first option is to simply remove the content from your site and get Google to re-crawl it. You can use the removed content tools listed above to hasten the process, or simply submit a brand new sitemap to get Google to re-crawl the content faster.
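If you go the sitemap route, generating the file by hand gets tedious fast. Here is a minimal sketch, using only Python’s standard library, that builds a bare-bones sitemap from whatever list of URLs you want Google to re-crawl. The function name and the example URLs are my own placeholders; the namespace is the standard one from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Standard namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal sitemap XML document listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters like & in the URL for us.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Placeholder URLs; swap in your own list.
    print(build_sitemap(["https://example.com/", "https://example.com/about"]))
```

Save the output as an XML file on your site and submit it through the sitemaps section of your Google account as you normally would.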
If you have properly removed the content with a 301 redirect, a 410 gone, or other HTTP status code, Google will remove it from their index quickly. If you have a 404 in place, Google will retain at least the cache until it is determined that the content is gone for good, at which point it will be purged.
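Before you submit anything, it helps to confirm what your server actually returns for each removed page. The sketch below, written with Python’s standard library, sends a HEAD request without following redirects and labels the result the way this section describes it. The labels are my own shorthand for the behavior described above, not anything Google publishes, and the function names are made up for this example.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, timeout=10):
    """Return the raw HTTP status code for a URL, without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.hostname, parts.port, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        return conn.getresponse().status
    finally:
        conn.close()


def removal_outlook(status_code):
    """Label a status code according to the behavior described in this post."""
    if status_code in (301, 410):
        return "removed quickly"
    if status_code == 404:
        return "kept until Google decides it is gone for good"
    if status_code == 200:
        return "still live, so it stays indexed"
    return "check manually"
```

You could loop over your list with something like removal_outlook(fetch_status(url)) and fix any page that still comes back as a 200 before filing requests.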
If, for some reason, you want to bulk remove URLs from the index without removing the content, first understand that this will be a temporary removal. Google will remove it from the index for roughly 90 days, after which the “ban” on indexing it will be lifted and it will return to the index upon the next site crawl. You will have to regularly resubmit your list.
If you are sure a bulk temporary removal – or a bulk permanent removal submission in conjunction with removal from your site itself – is what you want, then you can use this tool.
The tool I just linked is a Chrome extension created by Lih Chen, aka noitcudni on GitHub. Simply download the extension and unzip it. Go to Chrome and enable developer mode (in chrome://extensions/). Click to load an unpacked extension and load the unzipped extension from wherever you left it on your computer.
To use the tool, you will need to create a plain text file listing the URLs you want removed, with one URL per line. In other words, each URL is separated by a newline (\n).
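If your URLs live in a spreadsheet export or a crawl report, a few lines of Python can produce that file. This is a minimal sketch: the function names and the urls_to_remove.txt filename are my own, and I’m assuming that blank lines and duplicates should be dropped before upload.

```python
from pathlib import Path


def build_url_list(urls):
    """Return the URLs as newline-separated text, minus blanks and duplicates."""
    seen = set()
    cleaned = []
    for url in urls:
        url = url.strip()
        if url and url not in seen:
            seen.add(url)
            cleaned.append(url)
    return "\n".join(cleaned)


def write_url_list(path, urls):
    """Write the cleaned, newline-separated list to a text file."""
    Path(path).write_text(build_url_list(urls), encoding="utf-8")


if __name__ == "__main__":
    # Placeholder URLs and filename; swap in your own.
    write_url_list(
        "urls_to_remove.txt",
        ["https://example.com/old-page", "https://example.com/old-post"],
    )
```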
Go to Google’s Search Console (formerly Webmaster Tools) and log into the appropriate property. Click on Optimization – Remove URLs. The tool will now have a new button for “choose file”, which will allow you to upload your list of URLs. Upload it and the script will execute, submitting each URL as a removal request according to your settings.
Note that you may have to execute this in batches if you’re trying to remove too many URLs at a time. Google has rate limits for many forms to protect against bot submissions, and this one is no different. Just pay attention to it and break up your files as necessary.
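Breaking up a long list by hand is error-prone, so here is a minimal sketch for splitting it into batches. The batch size is an assumption you will have to tune yourself, since I don’t have a documented figure for the limit, and the function name is just a placeholder.

```python
def split_into_batches(urls, batch_size):
    """Split a list of URLs into consecutive chunks of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [urls[i:i + batch_size] for i in range(0, len(urls), batch_size)]


if __name__ == "__main__":
    # Placeholder URLs and an arbitrary batch size; tune both for your site.
    urls = [f"https://example.com/page-{n}" for n in range(1, 8)]
    for number, batch in enumerate(split_into_batches(urls, 3), start=1):
        print(f"batch {number}: {len(batch)} URLs")
```

Write each batch out as its own newline-separated file and upload them one at a time, leaving a pause between uploads if you start seeing failures.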