Hello Friends,
Sometimes we need to keep URLs, pages, or directories out of the index of various search engines. In this blog we will discuss several methods to stop the crawlers of the major search engines, such as Googlebot for Google and Bingbot for Bing, from indexing them.
Robots.txt Method:
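It should be your primary method of keeping URLs out of search engines. Thoroughly check the content of the pages you want to exclude before you block them. You must place the robots.txt file in the root folder of your website. Keep in mind that robots.txt only stops compliant crawlers from fetching a URL, so a blocked URL that other sites link to can still appear in search results.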
For example:
http://findnerd.com/robots.txt
The file content will look like this:
# See http://www.robotstxt.org/wc/norobots.html for documentation on how to use the robots.txt file
#
# To ban all spiders from the entire site uncomment the next two lines:
# User-Agent: *
# Disallow: /
User-agent: *
Disallow: /account
Disallow: /cache/
Disallow: /components/
Disallow: /installation/
Disallow: /language/
Disallow: /libraries/
Disallow: /tmp
If you want to block a particular file, you can use syntax like this:
# To ban all spiders from the entire site uncomment the next two lines:
# User-Agent: *
# Disallow: /
User-agent: *
Disallow: /account
Disallow: /user/data.php
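Before uploading the file, it can help to check that the rules block what you expect. Here is a minimal sketch using Python's standard urllib.robotparser module. The helper name is_blocked and the /blog path are assumptions made up for illustration, and real crawlers may interpret rules slightly differently than this parser does.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the example above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /account
Disallow: /user/data.php
"""


def is_blocked(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given rules disallow `agent` from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


def report() -> None:
    # "/blog" is a hypothetical path used only to show an allowed URL.
    for path in ("/account", "/user/data.php", "/blog"):
        url = "http://findnerd.com" + path
        state = "blocked" if is_blocked(ROBOTS_TXT, url) else "allowed"
        print(path, state)
```

Calling report() prints one line per path, so you can quickly see which URLs a compliant crawler would skip.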
Google Webmaster Method:
Use Google Webmaster Tools to remove specific pages:
- Open the Google link: https://www.google.com/webmasters/tools/
- Sign in to Google Webmaster Tools with your credentials.
- Click Site configuration.
- Go to Crawler access.
- Click on the URL removals tab.
- Click on Create a new removal request.
- Type or paste the URL to be removed, and then click Continue.
- From the dropdown list, select the type of data removal you want (cache only, cache and SERP, or entire directory).
- Click Submit Request.
You can check the status of your request in the listing.
Meta Tag Method:
If you are a little familiar with meta tags or HTML, you can use this method. You just need to add a robots meta tag to the head section of each page that you want to remove.
Example:
<meta name="robots" content="noindex, noarchive">
noindex - Prevents the bot from indexing the content of the page, but links on the page can still be followed.
noarchive - Prevents a cached link for the page from being shown in the SERP.
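If you want to confirm that a page really carries the tag, a small script can read it back. The following is only a sketch using Python's standard html.parser module. The class name RobotsMetaParser and the function robots_directives are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Attribute names arrive lowercased; values keep their original case.
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            for part in content.split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def robots_directives(html: str) -> set:
    """Return the set of robots directives found in the given HTML text."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives
```

Passing the HTML of a page to robots_directives returns the directives it declares, so an empty set means the page has no robots meta tag at all.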
X-Robots-Tag Method:
If you want to stop search engines from indexing non-HTML files such as PDF or DOC files, meta tags cannot help, because those files have no head section. In that case you can send the X-Robots-Tag HTTP header instead. When you generate or serve the file with PHP, you can set the header like this:
header("X-Robots-Tag: noindex, nofollow", true);
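The same header can be sent from other server-side languages too. As a rough sketch, here is a Python counterpart built on the standard wsgiref module. The app function, the placeholder PDF bytes, and the port number are all assumptions for illustration and are not part of the PHP setup described above.

```python
from wsgiref.simple_server import make_server

# Placeholder bytes standing in for a generated file (hypothetical content).
PDF_BYTES = b"%PDF-1.4 placeholder"


def app(environ, start_response):
    """Serve the file and ask crawlers not to index it or follow its links."""
    headers = [
        ("Content-Type", "application/pdf"),
        ("Content-Length", str(len(PDF_BYTES))),
        ("X-Robots-Tag", "noindex, nofollow"),
    ]
    start_response("200 OK", headers)
    return [PDF_BYTES]


def serve(port: int = 8000) -> None:
    """Run a local test server; the port is an arbitrary choice."""
    with make_server("127.0.0.1", port, app) as server:
        server.serve_forever()
```

Running serve() and requesting the file with any HTTP client should show the X-Robots-Tag header in the response, which is the part search engine crawlers look at.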
Conclusion:
You can use whichever method suits your needs and skill set. Hopefully you enjoyed reading this blog.