robots.txt



michaeledward
2008-11-26, 11:31 AM
Neophyte here ....

There is a file on the root of my site ... 'robots.txt' ...

What does it do?

According to the access log files, it gets quite a few hits. It's a simple enough looking file.
One thread on this board mentions it ... and talks about alterations users might have made to the file prior to backing up.

Thanks for any info.

tima
2008-11-26, 12:44 PM
Good question :) The idea is to tell robots or "search engine spiders" where they should not go. Search engines use robots/spiders to index the internet and produce the search results. If you have an area of your site you want them to stay away from, you list it in the robots.txt.

There's more info here: http://www.robotstxt.org/

ColleenB
2009-03-05, 12:17 PM
Hi,
I am new here and to my website. I am working on SEO and getting indexed.

I checked in Google Webmaster Tools and found that my robots.txt file is preventing Googlebot from indexing my site.

Does anyone know:

How do I change it?
What do I change it to?

I don't know if this matters, but I built my site in FrontPage and am on the old servers.

Like I said, I am new and not as computer literate as I would like to be, but I am learning.

Thank you in advance for your help
ColleenB

tima
2009-03-05, 04:08 PM
Hi Colleen,

The robots.txt is a simple text file which basically tells bots, such as googlebot, where to go or not go. If you connect to your site using FTP or our File Manager, you should see a file called 'robots.txt' in your main directory. You can simply download that file, and open it up with 'notepad' or any other text editor, and make changes to it.

One solution would be to simply remove the robots.txt file altogether. If there are no 'instructions' for Googlebot, it will simply go through and spider your whole site. This should be fine, assuming you don't have anything you want to hide from search engines. You can also check out the site I posted previously. It should give you all sorts of information on what to do next if you want to keep your robots.txt.

Also, if you post the contents of your robots.txt file here, I'm sure we can try to help you further!

ColleenB
2009-03-05, 04:42 PM
Thanks Tim! I want Googlebot to see EVERYTHING..LOL I even want my images to be in Google Images. I want as much exposure as possible.
These are the contents of my robots.txt file:

User-agent: *

Disallow: /_fpclass
Disallow: /_private
Disallow: /_themes
Disallow: /_vti_cnf
Disallow: /_vti_log
Disallow: /_vti_pvt
Disallow: /_vti_script
Disallow: /_vti_txt

Disallow: /cgi-bin
Disallow: /email
Disallow: /fpdb
Disallow: /image

#Disallow: /w3svc? #Change ? with the instance number please.


I don't know what most of these files are so I didn't know if I should delete the entire file. Like I said, I am so new at all this.

My website address is: http://www.ooak-fairy-artdoll-sculptures.com in case that will help:)

I did go to that website as well as study on Google Webmaster's section but I am not confident enough in my knowledge to really know if what I am doing is correct.

Thanks again SO MUCH for your help:)

tima
2009-03-06, 12:00 PM
This part means the rules should apply to all robots:


User-agent: *

And these entries simply say, "do not go here"


Disallow: /_fpclass
Disallow: /_private
Disallow: /_themes
Disallow: /_vti_cnf
Disallow: /_vti_log
Disallow: /_vti_pvt
Disallow: /_vti_script
Disallow: /_vti_txt

Disallow: /cgi-bin
Disallow: /email
Disallow: /fpdb
Disallow: /image

All of the "/_" directories are related to FrontPage Server Extensions, which you generally don't want indexed. Similarly /cgi-bin and /fpdb are where you would normally store scripts or databases, which you would also want to exclude.

The /email directory doesn't matter, since it's empty. You could remove the directory from your site and its entry from the robots.txt file if you want. And finally, I noticed you don't even have a directory called /image on your site, so you could also remove that entry. But these two don't really matter, since there is no content being disallowed here.

Any other file or directory not listed above as a "disallow" (i.e. everything else) would be allowed for indexing.
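
As a rough illustration of how those rules are matched, here is a minimal sketch using Python's standard urllib.robotparser module. The shortened rule list, the www.example.com host, the page paths and the is_allowed helper are all made up for this example, and real crawlers such as Googlebot may treat edge cases differently than this parser does.

```python
from urllib.robotparser import RobotFileParser

# Shortened, hypothetical version of the rules discussed above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /_private
Disallow: /cgi-bin
Disallow: /image
"""


def is_allowed(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical pages on a placeholder host.
    for path in ("/index.htm", "/_private/notes.htm", "/images/doll.jpg"):
        url = "http://www.example.com" + path
        print(path, is_allowed(ROBOTS_TXT, url))
    # With no rules at all, nothing is disallowed.
    print("empty rules:", is_allowed("", "http://www.example.com/index.htm"))
```

One thing this sketch shows is that rules are matched as path prefixes, so "Disallow: /image" would also cover a folder named /images if one existed. That may be worth checking if you want pictures to show up in image search.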

I don't see why Googlebot would be blocked from indexing your site by this robots.txt file. What was the message you got from Google Webmaster Tools?

ColleenB
2009-03-08, 11:14 PM
Thanks for your reply Tim.

Google said: (which referred to my main website address)

We had problems crawling the pages listed here, and as a result they won't be added to our index and will not appear in search results

Error icon: robots.txt unreachable
----------------------------------
I have deleted the file but they have not tried to crawl the site again. It's been since 3/2. I have no clue how long it will take them to try again.

I wrote my entire site in FrontPage, so every page has the same extension.
Thanks again
Colleen

tima
2009-03-09, 02:28 PM
I'm not sure why they couldn't find your robots.txt file, unless it was moved or renamed. Google Webmaster Tools includes a tool which could be useful for finding errors in a robots.txt file.

If you still have your robots.txt located in the root folder of your site you can do the following to check it for errors:

* Sign into Google Webmaster Tools with your Google Account.
* On the Dashboard, click the URL for the site you want.
* Click Tools, and then click Analyze robots.txt.


"We had problems crawling the pages listed here, and as a result they won't be added to our index and will not appear in search results"

The error quoted above could also be related to problems crawling your site. Make sure your pages are all linked together through internal hyperlinks. You could also try adding a sitemap.xml file, which you can submit to search engines to help them crawl your site and to suggest how often they should come back for updates.
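
If the sitemap format is the sticking point, here is a minimal sketch of building one with Python's standard xml.etree.ElementTree module. It follows the sitemaps.org layout as I understand it (a urlset containing one url entry per page, each with a loc and an optional changefreq). The build_sitemap helper, the www.example.com host and the page names are placeholders for this example, not your real pages.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(page_urls, changefreq="monthly"):
    """Return sitemap XML text listing each URL with a suggested revisit rate."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page_url in page_urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = page_url
        # changefreq is only a hint to crawlers, not a command.
        ET.SubElement(entry, "changefreq").text = changefreq
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical page list; replace with the real pages of the site.
    pages = [
        "http://www.example.com/index.htm",
        "http://www.example.com/gallery.htm",
    ]
    print(build_sitemap(pages))
```

The printed text would be saved as a UTF-8 file named sitemap.xml in the root folder of the site and then submitted through Webmaster Tools.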

ColleenB
2009-03-09, 04:22 PM
My internal links are buttons. I had read in the Google Webmaster Tools that they should be text. So I think I am going to add the text links at the bottom of each page to see if that helps.

I also saw the information about the site maps but haven't been able to figure out how to do them, even with the demonstrations in Webmaster Tools. Like I said, I am such a beginner in all of this and am doggone proud that I have got this far. LOL I will try again to understand how to make a site map and go that route and see what happens.

Thanks again, I appreciate your time and efforts with me.
((HUGS))
Colleen

Zath
2009-03-26, 10:32 AM
You can use graphic images as buttons. What you also need to do is give each image an alt attribute and a title attribute. These can hold the name of the image, the button name/action, or anything else you want "found" for SEO purposes.
The alt attribute is often misused as hover (tooltip) text, when that should really be handled by the title attribute. The misuse is widespread, and it still passes strict validation.
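
For anyone who wants to check a page for image buttons that lack this text, here is a minimal sketch using Python's standard html.parser module. The ImageTextAudit class, the find_images_missing_text helper and the sample markup are all invented for this example.

```python
from html.parser import HTMLParser


class ImageTextAudit(HTMLParser):
    """Collect img tags that lack a non-empty alt or title attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []  # list of (src, [names of absent attributes])

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        # An empty value is treated the same as a missing attribute here.
        absent = [name for name in ("alt", "title") if not attributes.get(name)]
        if absent:
            self.missing.append((attributes.get("src"), absent))


def find_images_missing_text(html):
    audit = ImageTextAudit()
    audit.feed(html)
    return audit.missing


# Hypothetical page fragment with three image buttons.
SAMPLE_PAGE = """
<a href="gallery.htm"><img src="buttons/gallery.gif" alt="Fairy doll gallery"></a>
<a href="contact.htm"><img src="buttons/contact.gif" alt="Contact" title="Contact the artist"></a>
<a href="about.htm"><img src="buttons/about.gif"></a>
"""

if __name__ == "__main__":
    for src, absent in find_images_missing_text(SAMPLE_PAGE):
        print(src, "is missing:", ", ".join(absent))
```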