
Google webmaster help center updates

Vanessa Fox over at Google posted on the Sitemaps blog that they've updated the webmaster help center with new documentation on robots.txt files and HTTP status codes.

Using a robots.txt file
We've added a new section of help topics in the How Google crawls my site section. These topics include information on:

How to create a robots.txt file
Descriptions of each user-agent that Google uses
How to use pattern matching
How often we recrawl your robots.txt file (around once a day)
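For a quick illustration of the first and third topics above, here is a minimal robots.txt sketch (the /internal/ path and the image rule are hypothetical examples, not recommendations for any particular site):

    # Hypothetical example: keep a private directory out of all crawlers
    User-agent: *
    Disallow: /internal/

    # Google-specific pattern matching: * matches any run of characters
    # and $ anchors the end of the URL, so this blocks all .gif files
    # from Google's image crawler
    User-agent: Googlebot-Image
    Disallow: /*.gif$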

Understanding HTTP status codes
This section explains HTTP status codes that your server might return when we request a page of your site. We display HTTP status codes in several places in Google Sitemaps (such as on the robots.txt analysis page and on the crawl errors page) and some site owners have asked us to provide more information about what these mean.
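If you're not sure what status code your server actually returns for a given URL, it's easy to check yourself. Here's a minimal sketch using Python's standard library (the URL is a hypothetical placeholder); it reports the raw status code, including 301s and other redirects, instead of silently following them:

    import urllib.request
    import urllib.error

    # By default urllib follows redirects; returning None here stops it,
    # so a 301/302 surfaces as an HTTPError we can report directly.
    class NoRedirect(urllib.request.HTTPRedirectHandler):
        def redirect_request(self, req, fp, code, msg, headers, newurl):
            return None

    def check_status(url):
        opener = urllib.request.build_opener(NoRedirect)
        try:
            return opener.open(url).getcode()  # e.g. 200
        except urllib.error.HTTPError as err:
            return err.code  # e.g. 301, 404, 500

    print(check_status("http://www.example.com/old-page"))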


Proper use of HTTP status codes can make or break a website. Lots of 404 (File Not Found) errors are a sure sign of a poorly maintained website. 301 (Moved Permanently) redirects should be used to forward multiple domain names to a single main site, because having multiple copies of a website can hurt your rankings on SERPs. (A search engine will see www.sesame-webdesign.com as a separate web site from www.sesamewebdesign.com and treat it as duplicate content unless it is properly forwarded to one place.)
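For example, on an Apache server with mod_rewrite enabled (an assumption; other servers have equivalent mechanisms), a .htaccess rule along these lines would 301-redirect the hyphenless domain to the main site:

    # Sketch only: assumes Apache with mod_rewrite enabled
    RewriteEngine On
    # If the request arrived at the hyphenless domain...
    RewriteCond %{HTTP_HOST} ^(www\.)?sesamewebdesign\.com$ [NC]
    # ...issue a permanent (301) redirect to the main site
    RewriteRule ^(.*)$ http://www.sesame-webdesign.com/$1 [R=301,L]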

An improperly configured robots.txt file can be disastrous: you can accidentally block your whole web site from being indexed! Take the time to review the proper format and usage of the robots.txt file. It can be extremely useful for keeping content you don't want others to find out of search results, such as scripts or internal documentation. (Always be careful about the information you put out on the web. Even if you block it via a robots.txt file, you still run the risk of it being seen by someone, somewhere.)
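The classic mistake comes down to a single path segment. These two alternative files (the /scripts/ directory is a hypothetical example) differ by only a few characters, but the first blocks one directory while the second blocks your entire site:

    # Blocks only the /scripts/ directory (hypothetical path)
    User-agent: *
    Disallow: /scripts/

    # Blocks EVERYTHING -- no page on the site will be indexed
    User-agent: *
    Disallow: /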