Understanding Robots.txt Files For Blogspot Blogs

We begin by looking at what robots.txt files are and how they influence search crawlers. A robots.txt file, set up by the webmaster, tells search engine spiders which pages of a domain they may crawl and which they should skip. This in turn affects which pages end up in the search index.

By default, Blogspot restricts search engines from indexing the Label pages of Blogspot blogs. It does this through the blog's robots.txt file.

Blogspot Labels

Labels help bloggers categorize their posts easily. For example, all my posts related to 'Blogging' can be found here:
http://www.inkjam.org/search/label/Blogging

An interesting thing to note here is that a Blogspot blog does not have a separate page assigned to each of its Labels. Instead, posts categorized under a label can be accessed only through a 'search' URL, as in the example above.

Robots.txt for Blogspot Blogs 

As stated earlier, Blogspot by default prevents web spiders from crawling its Label pages. This is what the default robots.txt file of a Blogspot blog looks like:
User-agent: *
Disallow: /search
Allow: /

Sitemap: http://abc.blogspot.com/feeds/posts/default?orderby=updated 

The "User-agent: *" line means that this section applies to all robots.
The "Disallow: /search" line tells web spiders not to crawl any URL whose path begins with /search.
The "Allow: /" line permits everything else on the blog.

Note that the slash (/) stands for the root of your blog, that is, your homepage URL. The effect of the above robots.txt file is easier to see with an example. Let us imagine that your homepage URL is http://abc.blogspot.com/

Taken together, the above rules let search engines crawl all your pages except those whose URLs begin with:
http://abc.blogspot.com/search

Thus label pages, which can be accessed only through a search URL, are kept out of the search engine index.
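
The effect of the default rules can be checked locally. The following is a minimal sketch using Python's standard urllib.robotparser module. The post URL below is a made-up example, and abc.blogspot.com is the imaginary blog from this article. Python's parser applies rules in file order, which for this particular file should give the same answer as the "most specific rule wins" approach that some search engines use.

```python
from urllib.robotparser import RobotFileParser

# Default Blogspot rules as quoted in this post (abc.blogspot.com is the example blog).
DEFAULT_RULES = """\
User-agent: *
Disallow: /search
Allow: /

Sitemap: http://abc.blogspot.com/feeds/posts/default?orderby=updated
"""


def build_parser(rules: str) -> RobotFileParser:
    """Parse robots.txt text held in a string instead of fetching it over HTTP."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser


parser = build_parser(DEFAULT_RULES)

# Hypothetical URLs used only for illustration.
label_url = "http://abc.blogspot.com/search/label/Blogging"
post_url = "http://abc.blogspot.com/2013/01/sample-post.html"

for url in (label_url, post_url):
    verdict = "allowed" if parser.can_fetch("*", url) else "blocked"
    print(f"{url}: {verdict}")
```

Under these assumptions the label URL is reported as blocked and the post URL as allowed, which matches the behaviour described above.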

To get search engines to index your Blogspot blog's label pages, the above robots.txt file can be modified as follows:
User-agent: *
Disallow: 
Allow: /

Sitemap: http://abc.blogspot.com/feeds/posts/default?orderby=updated 
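
An empty "Disallow:" line blocks nothing, so every URL becomes crawlable. As a rough check, and assuming Python's standard urllib.robotparser module is an acceptable stand-in for a real crawler, the modified rules can be tested like this (the label URL is the illustrative one for the imaginary abc.blogspot.com blog):

```python
from urllib.robotparser import RobotFileParser

# Modified rules from this post: the Disallow value is left empty.
MODIFIED_RULES = """\
User-agent: *
Disallow:
Allow: /
"""

open_parser = RobotFileParser()
open_parser.parse(MODIFIED_RULES.splitlines())

# Hypothetical label page on the example blog.
label_url = "http://abc.blogspot.com/search/label/Blogging"

print("allowed" if open_parser.can_fetch("*", label_url) else "blocked")
```

With the empty Disallow line, the label page is no longer blocked.
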

Accessing your Robots.txt file

The robots.txt file for your Blogspot blog can be accessed as follows:

1. Go to 'Settings' and then click on 'Search Preferences'.
2. Under 'Crawlers and Indexing', click 'Edit' next to 'Custom robots.txt'.
3. Select 'Enable custom robots.txt content' to access your Blogspot blog's robots.txt file.

Important Note: I would strongly recommend that you do not modify your robots.txt file to allow search engines to index your Label pages, as this might lead to duplicate content issues.

Setting up robots.txt to prevent Search Engines from indexing Blogspot Archive pages

Archive pages, if indexed by search engines, can lead to duplicate content worries and push your search rankings down. You can, however, set up your robots.txt file to prevent search engines from indexing your Blogspot blog's Archive pages.

Continuing with our example blog - http://abc.blogspot.com

The monthly archive pages for the said blog would look something like this:
Month             URL
January, 2013     http://abc.blogspot.com/2013_01_01_archive.html
February, 2013    http://abc.blogspot.com/2013_02_01_archive.html
March, 2013       http://abc.blogspot.com/2013_03_01_archive.html
and so forth...

You can keep Blogspot archive pages out of search results by modifying the robots.txt file as follows:
User-agent: *
Disallow: /search
Disallow: /2013_01_01_archive.html 
Disallow: /2013_02_01_archive.html 
Disallow: /2013_03_01_archive.html 
Allow: /

Sitemap: http://abc.blogspot.com/feeds/posts/default?orderby=updated

Note that you will have to manually add a Disallow line for each of your archive pages, in a manner similar to what is shown above.
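
Typing one line per month gets tedious for an older blog. The following is a small sketch that generates the Disallow lines for a range of months. It assumes every monthly archive follows the /YYYY_MM_01_archive.html pattern shown in the examples above. The function name archive_disallow_lines is made up for this illustration, and the output still has to be pasted into the custom robots.txt box by hand.

```python
from datetime import date


def archive_disallow_lines(start: date, end: date) -> list[str]:
    """Return one Disallow line per month from start to end, inclusive.

    Assumes the /YYYY_MM_01_archive.html pattern used in this post's examples.
    """
    lines = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        lines.append(f"Disallow: /{year}_{month:02d}_01_archive.html")
        month += 1
        if month > 12:  # roll over into January of the next year
            year, month = year + 1, 1
    return lines


# Rebuild the rule block from this post for January to March 2013.
rules = ["User-agent: *", "Disallow: /search"]
rules += archive_disallow_lines(date(2013, 1, 1), date(2013, 3, 1))
rules += ["Allow: /"]

print("\n".join(rules))
```

Changing the two dates covers a longer period. Only the year and month of each date are used.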
