Tame Google Spider for Blogger

Monday, June 1, 2009


The Google spider crawls the whole internet and indexes every page it comes across. That is not always a good thing!

In the case of Blogger blogs, the archive pages and label pages are also indexed, leading to duplicate content. Duplicate content can make Google think something is wrong with your site...

Here is how to stop Google from indexing unwanted pages in Blogger...

1. Backup your layout

Go to Layout -> Edit HTML and click on Download Full Template. Save the file under a name that reminds you what you are about to change.

2. Expand Widgets

Check the box named Expand Widget Templates.

3. Placing code

Add the following code to your template, immediately after the <b:include data='blog' name='all-head-content'/> line or the <head> tag:

<!-- Google Indexing -->
<b:if cond='data:blog.pageType == "item"'>
<!-- Post Pages : all -->
<meta content='all' name='robots'/>
<b:else/>
<b:if cond='data:blog.url == data:blog.homepageUrl'>
<!-- Main Page : all -->
<meta content='all' name='robots'/>
<b:else/>
<!-- Archive, Labels : noindex,follow -->
<meta content='noindex,follow' name='robots'/>
</b:if>
</b:if>


Now, only the main blog page and each post page will be indexed. Archive and label pages will not be indexed, but the links on those pages will still be followed.
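The decision the template conditions express can be sketched in plain Python. This is only an illustration of the logic, not Blogger code; the page-type strings "item" and "archive" follow Blogger's data:blog.pageType values, and the URLs are made-up examples.

```python
def robots_meta(page_type: str, url: str, homepage_url: str) -> str:
    """Return the robots meta content the template would emit for a page."""
    if page_type == "item":
        # Individual post pages: index everything
        return "all"
    if url == homepage_url:
        # The main blog page: index everything
        return "all"
    # Everything else (archive and label pages): don't index,
    # but do follow the links on the page
    return "noindex,follow"

print(robots_meta("item", "http://blog.example.com/2009/06/post.html",
                  "http://blog.example.com/"))  # all
print(robots_meta("archive", "http://blog.example.com/2009_06_01_archive.html",
                  "http://blog.example.com/"))  # noindex,follow
```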

 See also... » Ripping off Blog Name from Blog Post Title

» Custom Icon for your Website



Unknown said...

Nice info. Thanks mate.
For Blogger, the label pages (/search) are already disallowed by default.


atoztoa said...

@Jadu Saikia:

A correction:

My robots.txt file says,
User-agent: Mediapartners-Google

User-agent: *
Disallow: /search

This means nothing is disallowed for Mediapartners-Google: the User-agent: * section is overridden by the more specific User-agent section. So /search will not be indexed by any other bot, but Mediapartners-Google will still crawl it...
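This precedence can be checked with Python's standard-library robots.txt parser. One caveat in this sketch: urllib.robotparser drops a group that contains no rules, so an explicit empty Disallow: line is added to the Mediapartners-Google record to express "allow everything" (Google interprets a bare group the same way); example.com is a placeholder host.

```python
from urllib.robotparser import RobotFileParser

# Same shape as Blogger's default robots.txt, with an explicit empty
# Disallow: so the stdlib parser keeps the Mediapartners-Google group.
ROBOTS_TXT = """\
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
"""

rp = RobotFileParser()
rp.modified()  # mark as fetched so can_fetch() works without a network read
rp.parse(ROBOTS_TXT.splitlines())

# Mediapartners-Google matches its own group, which allows everything
print(rp.can_fetch("Mediapartners-Google", "http://example.com/search/label/python"))  # True
# All other bots fall back to the * group and are blocked from /search
print(rp.can_fetch("Googlebot", "http://example.com/search/label/python"))  # False
# Ordinary post URLs stay crawlable for everyone
print(rp.can_fetch("Googlebot", "http://example.com/2009/06/post.html"))  # True
```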

Hope this helps :)
