 
  • How to Stop Search Engines from Indexing Particular Pages


    Hello Friends,

    Sometimes we need to keep a URL, page, or directory out of the index of the various search engines. In this blog we will discuss several methods to stop the crawlers of the major search engines, such as Googlebot for Google and Bingbot for Bing, from indexing that content.

    Robots.txt Method:
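    It should be your primary method of keeping URLs out of search engines. Thoroughly check the content of the pages you want to exclude before you list them. You must place the robots.txt file in the root folder of your website. Keep in mind that robots.txt only stops compliant crawlers from fetching a URL; a blocked URL can still appear in results without a description if other sites link to it.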
    For example:
    http://findnerd.com/robots.txt
    and the file content will look like this:

    # See http://www.robotstxt.org/wc/norobots.html for documentation on how to use the robots.txt file
    #
    # To ban all spiders from the entire site uncomment the next two lines:
    # User-Agent: *
    # Disallow: /
    User-agent: *
    Disallow: /account
    Disallow: /cache/
    Disallow: /components/
    Disallow: /installation/
    Disallow: /language/
    Disallow: /libraries/
    Disallow: /tmp
    

    If you want to block a particular file, you can use syntax like this:

    # To ban all spiders from the entire site uncomment the next two lines:
    # User-Agent: *
    # Disallow: /
    User-agent: *
    Disallow: /account
    Disallow: /user/data.php
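
Before uploading the file, it can help to check that the rules block what you expect. The following is a minimal sketch using Python's standard `urllib.robotparser` module, with the rules copied from the example above. The helper name `is_allowed` and the `/blog` URL are assumptions made for illustration only.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the robots.txt example above.
RULES = """\
User-agent: *
Disallow: /account
Disallow: /user/data.php
"""


def is_allowed(rules_text, user_agent, url):
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # /blog is a hypothetical path that no rule covers.
    for url in (
        "http://findnerd.com/account",
        "http://findnerd.com/user/data.php",
        "http://findnerd.com/blog",
    ):
        print(url, "allowed" if is_allowed(RULES, "Googlebot", url) else "blocked")
```

Rules are matched by path prefix, so `Disallow: /account` also covers URLs such as `/account/settings`. Real crawlers may interpret edge cases differently from this parser, so treat the result as a sanity check rather than a guarantee.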
    


    Google Webmaster Method:

    Use Google Webmaster Tools to remove specific pages:

    • Open Google Webmaster Tools: https://www.google.com/webmasters/tools/

    • Sign in to Google Webmaster Tools with your credentials.

    • Click Site configuration.

    • Go to Crawler access.

    • Click on the URL removals tab.

    • Click on Create a new removal request.

    • Type or paste the URL to be removed, and then click Continue.

    • From the dropdown list, select the type of removal you want (cache only, cache and SERP, or entire directory).

    • Click Submit Request.


    You can check the status of your request in the list of removal requests.


     

    Meta Tag robots Method:


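    If you are a little familiar with meta tags or HTML tags, you can use this method. You just need to add a meta tag to the head section of each page that you want to remove. The crawler has to be able to fetch the page to see the tag, so do not also block that page in robots.txt.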

    Example:

    <meta name="robots" content="noindex, noarchive">
    

    noindex: Tells the bot not to index the contents of the page; links on the page can still be followed.
    noarchive: Prevents the display of a cached link for that page in the SERP.
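
To confirm that a page really carries the tag, you can parse its HTML and collect the robots directives. This is a minimal sketch using Python's standard `html.parser` module; the class name `RobotsMetaParser` and the function `robots_directives` are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma separated, e.g. "noindex, noarchive".
        for part in content.split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html):
    """Return the set of robots directives declared in the given HTML text."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, noarchive"></head></html>'
    print(sorted(robots_directives(page)))
```

The sketch only looks at the generic `robots` name. Tags aimed at a single crawler, such as `<meta name="googlebot" ...>`, would need an extra check.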

    X-Robots-Tag Method:

    If you want to stop the indexing of files such as PDF, DOC, or other non-HTML files, meta tags cannot be used, because those files have no head section. In that case you can send the X-Robots-Tag HTTP header instead. When you generate or serve the file with PHP, you can set the header before any output is sent, like this:
    header("X-Robots-Tag: noindex, nofollow", true);
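
The same idea applies outside PHP: the server adds the header to the response for the files you want to keep out of the index. Below is a minimal sketch in Python using a plain WSGI application. The extension list, the function names `robots_headers` and `app`, and the placeholder response body are all assumptions for illustration, not part of the original setup.

```python
# File types that should not be indexed (an assumed list for this sketch).
NOINDEX_EXTENSIONS = (".pdf", ".doc", ".docx")


def robots_headers(path):
    """Return the extra response headers for the requested path."""
    if path.lower().endswith(NOINDEX_EXTENSIONS):
        return [("X-Robots-Tag", "noindex, nofollow")]
    return []


def app(environ, start_response):
    """Tiny WSGI app that adds X-Robots-Tag for matching paths."""
    path = environ.get("PATH_INFO", "/")
    headers = [("Content-Type", "text/plain; charset=utf-8")]
    headers += robots_headers(path)
    start_response("200 OK", headers)
    # Placeholder body; a real application would stream the file here.
    return [b"placeholder body\n"]


if __name__ == "__main__":
    from wsgiref.simple_server import make_server

    # Serves on http://localhost:8000/ until interrupted.
    with make_server("", 8000, app) as server:
        server.serve_forever()
```

Requesting a path such as `/files/report.pdf` from this server returns the `X-Robots-Tag: noindex, nofollow` header, while other paths are served without it.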

    Conclusion:

    You can use whichever method suits your needs and skill set. Hopefully you enjoyed reading this blog.
