
Power of Robots.txt File

The robots.txt file was designed to tell bots how to behave on your site: what information they may retrieve and what they may not. It is a simple text file that is very easy to create once you understand the proper format. This system is called the Robots Exclusion Standard.


To create your robots.txt file, use Notepad or another plain-text editor. DO NOT create your robots.txt file in an HTML editor such as Dreamweaver, GoLive or FrontPage. When you upload the file, FTP clients usually convert it into Unix mode (Unix line endings), but there are occasions when the conversion fails.
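
If you would rather generate the file from a script than type it in an editor, a minimal sketch like the one below writes plain text with Unix line endings, so no conversion is needed at upload time. The function name write_robots_txt is an assumption made for illustration, not part of any standard tool.

```python
from pathlib import Path


def write_robots_txt(path, lines):
    """Write robots.txt as plain ASCII text with Unix (LF) line endings."""
    text = "\n".join(lines) + "\n"
    # newline="\n" stops Python from translating LF to CRLF on Windows.
    with open(path, "w", encoding="ascii", newline="\n") as handle:
        handle.write(text)
    return Path(path)
```

Calling write_robots_txt("robots.txt", ["User-agent: *", "Disallow: /images/"]) would produce a two-line file ready to upload.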


The two parts of a robots.txt file:

  1. User-agent - This line specifies the robot.



    For example:

    User-agent: googlebot



    You may also use the wildcard character " * " to specify all robots.

    For example:

    User-agent: *



    You can find user agent names in your own logs by checking for requests to robots.txt. Most major search engines have names for their spiders.


  2. Disallow - This part consists of one or more Disallow: directive lines, which can specify files and/or directories. The mere presence of a Disallow line does not mean that the bot(s) are completely shut out of the site; what is blocked depends on the path that follows it.
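
The tip in item 1 about finding user agent names in your own logs can be sketched roughly as follows. This assumes your server writes the common "combined" access log format; the sample lines, the ExampleBot name and the function name are made up for illustration.

```python
import re
from collections import Counter

# Assumes the "combined" access log format, in which the user agent
# is the last quoted field on each line.
ROBOTS_REQUEST = re.compile(
    r'"(?:GET|HEAD) /robots\.txt(?:\?\S*)? HTTP/[\d.]+" '
    r'\d{3} \S+ "[^"]*" "([^"]*)"'
)


def robots_txt_user_agents(log_lines):
    """Count the user agents that requested /robots.txt."""
    counts = Counter()
    for line in log_lines:
        match = ROBOTS_REQUEST.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Made-up sample lines; a real log would be read from a file.
SAMPLE_LOG = [
    '192.0.2.1 - - [10/Oct/2023:13:55:36 +0000] "GET /robots.txt HTTP/1.1" '
    '200 68 "-" "ExampleBot/1.0"',
    '192.0.2.2 - - [10/Oct/2023:13:56:01 +0000] "GET /index.html HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0"',
]

for agent, hits in robots_txt_user_agents(SAMPLE_LOG).items():
    print(agent, hits)
```

Only the first sample line is counted, because the second one requests a page other than robots.txt.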


The example below allows all robots to visit all files: the wildcard asterisk (*) selects all robots, and the empty Disallow line blocks nothing.



User-agent: *
Disallow:



To keep all robots out, use this:

User-agent: *
Disallow: /
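
One way to check the difference between the allow-all and deny-all files is to feed them to the robots.txt parser in Python's standard library. This is only a sketch: the parser_for helper, the AnyBot name and the example.com page are assumptions, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def parser_for(robots_txt):
    """Build a parser from robots.txt text instead of fetching a URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


ALLOW_ALL = "User-agent: *\nDisallow:\n"
DENY_ALL = "User-agent: *\nDisallow: /\n"

# Hypothetical page on a placeholder host.
PAGE = "http://www.example.com/index.html"

print(parser_for(ALLOW_ALL).can_fetch("AnyBot", PAGE))  # True
print(parser_for(DENY_ALL).can_fetch("AnyBot", PAGE))   # False
```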



To keep all bots from downloading a particular web page, use this (the path must begin with a slash):

User-agent: *
Disallow: /secret.html



To deny a single bot, name it in the User-agent line. This example prohibits the bot named GYMbot:

User-agent: GYMbot
Disallow: /
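
A quick way to see that a rule like this affects only the named bot is the sketch below, again using Python's standard-library parser. SomeOtherBot and the example.com page are placeholders, not real crawlers or sites.

```python
from urllib.robotparser import RobotFileParser

GYMBOT_ONLY = "User-agent: GYMbot\nDisallow: /\n"

gym_parser = RobotFileParser()
gym_parser.parse(GYMBOT_ONLY.splitlines())

# Hypothetical page on a placeholder host.
HOME_PAGE = "http://www.example.com/index.html"

# GYMbot is shut out; a bot with no matching record is not restricted.
print(gym_parser.can_fetch("GYMbot", HOME_PAGE))        # False
print(gym_parser.can_fetch("SomeOtherBot", HOME_PAGE))  # True
```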



To keep bots out of your images folder, do this:

User-agent: *
Disallow: /images/
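
Disallow paths are matched as prefixes of the requested path, which is why the trailing slash matters for a folder. The sketch below illustrates this with Python's standard-library parser; the file names and the example.com host are invented for the demonstration.

```python
from urllib.robotparser import RobotFileParser

PATH_RULES = "User-agent: *\nDisallow: /secret.html\nDisallow: /images/\n"

path_parser = RobotFileParser()
path_parser.parse(PATH_RULES.splitlines())


def allowed(path):
    """Return True if a generic bot may fetch this path (placeholder host)."""
    return path_parser.can_fetch("AnyBot", "http://www.example.com" + path)


print(allowed("/secret.html"))      # False: the page itself is blocked
print(allowed("/images/logo.gif"))  # False: everything under /images/ is blocked
print(allowed("/images.html"))      # True: does not start with /images/
print(allowed("/index.html"))       # True: no rule matches
```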



You can place comments in your robots.txt file. Any line that begins with # is considered a comment line and is ignored. I recommend putting each comment on its own line, directly above the directive it describes, like this:



User-agent: *
# Disallowing access to the scripts folder
Disallow: /scripts/
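
To confirm that a comment line does not change the rules, you can run a commented file through Python's standard-library parser. This is a sketch under assumptions: the User-agent line, the run.cgi file name and the example.com host are added for illustration.

```python
from urllib.robotparser import RobotFileParser

COMMENTED = (
    "User-agent: *\n"
    "# Disallowing access to the scripts folder\n"
    "Disallow: /scripts/\n"
)

comment_parser = RobotFileParser()
comment_parser.parse(COMMENTED.splitlines())

# The comment is skipped; only the Disallow rule takes effect.
print(comment_parser.can_fetch("AnyBot", "http://www.example.com/scripts/run.cgi"))  # False
print(comment_parser.can_fetch("AnyBot", "http://www.example.com/index.html"))       # True
```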



The robots.txt file should be placed in the root directory of your server, in other words, in the same place as the index.html file for your home page.
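
Because the file lives in the root, its address can be derived from any page URL on the same host. A small sketch, with robots_txt_url as a hypothetical helper name and example.com as a placeholder host:

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Return the root-level robots.txt URL for the host of page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host; replace the path and drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("http://www.example.com/products/page.html?id=7"))
# http://www.example.com/robots.txt
```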


In about two weeks, you may begin to see improved spidering, a greater depth of indexing and even a rise in your rankings.


Wondering why you should use a robots.txt file when you can use the meta-robots tag instead?


The meta-robots tag is not honored consistently by search engines, and it is often not read. All the major engines and most of the minor engines look for the robots.txt file and do their best to obey it.


