Welcome to the myhosting.com Forums.

Thread: robots.txt

  1. #1
    michaeledward is offline Junior Member
    Join Date
    Nov 2008
    Posts
    3

    Default robots.txt

    Neophyte here ....

    There is a file on the root of my site ... 'robots.txt' ...

    What does it do?

    According to the access log files, it gets quite a few hits. It's a simple enough looking file.
    One thread on this board mentions it ... and talks about alterations users might have made to the file prior to backing up.

    Thanks for any info.

  2. #2
    tima is offline Administrator
    Join Date
    Apr 2008
    Posts
    191

    Default Re: robots.txt

    Good question! The idea is to tell robots, or "search engine spiders", where they should not go. Search engines use these robots to crawl the web, build their index, and produce search results. If there is an area of your site you want them to stay away from, you list it in robots.txt. Well-behaved robots request the file before crawling, which is why it shows up so often in your access logs.
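
    To make that concrete, here is a minimal sketch using Python's standard urllib.robotparser module, which reads the rules roughly the way a well-behaved robot would. The rule, the example.com address, and both page paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ask every robot to stay out of /private/.
rules = [
    "User-agent: *",
    "Disallow: /private/",
]

parser = RobotFileParser()
parser.parse(rules)

# example.com and both paths are placeholders, not a real site.
allowed_home = parser.can_fetch("Googlebot", "http://www.example.com/index.htm")
allowed_private = parser.can_fetch("Googlebot", "http://www.example.com/private/notes.htm")

print(allowed_home)     # True: no rule covers this page
print(allowed_private)  # False: the path starts with /private/
```

    Keep in mind that the file is only a request. Polite robots honour it, but it does not password-protect anything.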

    There's more info here: http://www.robotstxt.org/
    Tim Attwood
    Product Manager
    myhosting.com

  3. #3
    ColleenB is offline Junior Member
    Join Date
    Mar 2009
    Posts
    4

    Default Re: robots.txt

    Hi,
    I am new here and to my website. I am working on SEO and getting indexed.

    I checked in Google Webmaster Tools and found that my robots.txt file is preventing Googlebot from indexing my site.

    Does anyone know:

    How do I change it?
    What do I change it to?

    I don't know if this matters but I built my site in Front Page and am on the old servers.

    Like I said, I am new and not as computer literate as I would like to be, but I am learning.

    Thank you in advance for your help
    ColleenB

  4. #4
    tima is offline Administrator
    Join Date
    Apr 2008
    Posts
    191

    Default Re: robots.txt

    Hi Colleen,

    The robots.txt is a simple text file which basically tells bots, such as Googlebot, where to go or not go. If you connect to your site using FTP or our File Manager, you should see a file called 'robots.txt' in your main directory. You can simply download that file, open it with Notepad or any other text editor, make your changes, and upload it again.

    One solution would be to simply remove the robots.txt file altogether. If there are no 'instructions' for Googlebot, it will simply go through and spider your whole site. This should be fine, assuming you don't have anything you want to hide from search engines. You can also check out the site I posted previously; it should give you all sorts of information on what to do next if you want to keep your robots.txt.
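
    As a rough illustration of the "no instructions means everything is allowed" point, the sketch below runs an empty rule set and an explicit allow-all file through Python's standard urllib.robotparser. The helper name allows_everything and the example.com URLs are invented for this example, and real crawlers may differ in the details.

```python
from urllib.robotparser import RobotFileParser


def allows_everything(robots_lines, sample_urls, agent="Googlebot"):
    """Return True if every sample URL may be fetched under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return all(parser.can_fetch(agent, url) for url in sample_urls)


# Placeholder URLs, not taken from a real site.
samples = [
    "http://www.example.com/",
    "http://www.example.com/gallery.htm",
    "http://www.example.com/images/doll.jpg",
]

# No rules at all, which is what a robot sees when there is nothing to read.
no_rules = allows_everything([], samples)

# An empty Disallow value means "nothing is disallowed".
explicit_allow_all = allows_everything(["User-agent: *", "Disallow:"], samples)

print(no_rules, explicit_allow_all)  # True True
```

    Keeping a two-line allow-all file instead of deleting robots.txt has the same effect on crawling and avoids "file not found" entries in the access log.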

    Also, if you post the contents of your robots.txt file here, I'm sure we can try to help you further!
    Tim Attwood
    Product Manager
    myhosting.com

  5. #5
    ColleenB is offline Junior Member
    Join Date
    Mar 2009
    Posts
    4

    Default Re: robots.txt

    Thanks Tim! I want Googlebot to see EVERYTHING... LOL. I even want my images to be in Google Images. I want as much exposure as possible.
    These are the contents of my robots.txt file:

    User-agent: *

    Disallow: /_fpclass
    Disallow: /_private
    Disallow: /_themes
    Disallow: /_vti_cnf
    Disallow: /_vti_log
    Disallow: /_vti_pvt
    Disallow: /_vti_script
    Disallow: /_vti_txt

    Disallow: /cgi-bin
    Disallow: /email
    Disallow: /fpdb
    Disallow: /image

    #Disallow: /w3svc? #Change ? with the instance number please.


    I don't know what most of these files are so I didn't know if I should delete the entire file. Like I said, I am so new at all this.

    My website address is: http://www.ooak-fairy-artdoll-sculptures.com in case that will help

    I did go to that website as well as study on Google Webmaster's section but I am not confident enough in my knowledge to really know if what I am doing is correct.

    Thanks again SO MUCH for your help

  6. #6
    tima is offline Administrator
    Join Date
    Apr 2008
    Posts
    191

    Default Re: robots.txt

    This part means the rules should apply to all robots:

    User-agent: *

    And these entries simply say, "do not go here":

    Disallow: /_fpclass
    Disallow: /_private
    Disallow: /_themes
    Disallow: /_vti_cnf
    Disallow: /_vti_log
    Disallow: /_vti_pvt
    Disallow: /_vti_script
    Disallow: /_vti_txt

    Disallow: /cgi-bin
    Disallow: /email
    Disallow: /fpdb
    Disallow: /image

    All of the "/_" directories are related to FrontPage Server Extensions, which you generally don't want indexed. Similarly, /cgi-bin and /fpdb are where you would normally store scripts or databases, which you would also want to exclude.

    The /email directory doesn't matter, since it's empty. You could remove the directory from your site and the entry from the robots.txt file if you want. And finally, I noticed you don't even have a directory called /image on your site, so you could also remove that entry. But these two don't really matter, since there is no content being disallowed here.

    Any other file or directory not listed above as a "disallow" (i.e. everything else) would be allowed for indexing.
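
    If you want to test this yourself, the sketch below feeds the rules quoted above to Python's standard urllib.robotparser and asks about a handful of paths. The paths and the example.com address are invented for illustration. One thing it shows is that Disallow matches by prefix, so "Disallow: /image" would also cover a folder named /images if the site had one, which is worth checking if the goal is to appear in Google Images.

```python
from urllib.robotparser import RobotFileParser

# The rules quoted in the thread, written without blank lines: Python's parser
# treats a blank line as the end of a record, so a blank line directly after
# "User-agent: *" would leave the Disallow lines unattached.
RULES = """\
User-agent: *
Disallow: /_fpclass
Disallow: /_private
Disallow: /_themes
Disallow: /_vti_cnf
Disallow: /_vti_log
Disallow: /_vti_pvt
Disallow: /_vti_script
Disallow: /_vti_txt
Disallow: /cgi-bin
Disallow: /email
Disallow: /fpdb
Disallow: /image
""".splitlines()

parser = RobotFileParser()
parser.parse(RULES)


def check(path, site="http://www.example.com"):
    """Return True if Googlebot may fetch the given path (placeholder site)."""
    return parser.can_fetch("Googlebot", site + path)


# Hypothetical paths chosen to exercise the rules.
paths = ["/", "/index.htm", "/_private/form.htm", "/cgi-bin/mail.pl", "/images/fairy.jpg"]
results = {path: check(path) for path in paths}

for path, allowed in results.items():
    print(f"{path:22} {'allowed' if allowed else 'blocked'}")
```

    With these rules the home page and /index.htm come back as allowed, while the FrontPage and script folders are blocked. /images/fairy.jpg is blocked as well, purely because its path begins with /image.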

    I don't see why Googlebot would be blocked from indexing your site by this robots.txt file. What was the message you got from Google Webmaster Tools?
    Tim Attwood
    Product Manager
    myhosting.com

  7. #7
    ColleenB is offline Junior Member
    Join Date
    Mar 2009
    Posts
    4

    Default Re: robots.txt

    Thanks for your reply Tim.

    Google said: (which referred to my main website address)

    We had problems crawling the pages listed here, and as a result they won't be added to our index and will not appear in search results

    Error icon: robots.txt unreachable
    ----------------------------------
    I have deleted the file but they have not tried to crawl the site again. It's been since 3/2. I have no clue how long it will take them to try again.

    I wrote my entire site in Front Page, so every page has the same extension.
    Thanks again
    Colleen

  8. #8
    tima is offline Administrator
    Join Date
    Apr 2008
    Posts
    191

    Default Re: robots.txt

    I'm not sure why they couldn't find your robots.txt file, unless it was moved or renamed. Google Webmaster Tools has a feature that can be useful for finding errors in a robots.txt:

    If you still have your robots.txt located in the root folder of your site you can do the following to check it for errors:

    * Sign into Google Webmaster Tools with your Google Account.
    * On the Dashboard, click the URL for the site you want.
    * Click Tools, and then click Analyze robots.txt.
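
    Separately from the Webmaster Tools check, you can see for yourself how the server answers a request for /robots.txt. The sketch below is only an approximation of how crawlers tend to react and is an assumption on my part, not something confirmed by Google in this thread: a "not found" answer is usually treated as "no restrictions", while a server error or a timeout is what typically leads to an "unreachable" report. The function names and labels are invented for illustration.

```python
import urllib.error
import urllib.request


def classify_robots_status(status):
    """Map an HTTP status (or None for a failed connection) to a rough label."""
    if status is None:
        return "unreachable"   # no answer at all: timeout, DNS failure, refused
    if 200 <= status < 300:
        return "ok"            # file served, rules will be read
    if 400 <= status < 500:
        return "missing"       # usually treated as "no restrictions"
    if status >= 500:
        return "unreachable"   # server error, crawling is usually held back
    return "other"


def fetch_robots_status(site):
    """Request <site>/robots.txt and return the HTTP status, or None on failure."""
    url = site.rstrip("/") + "/robots.txt"
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code        # the server answered with an error status
    except OSError:
        return None            # covers URLError, timeouts, connection errors


if __name__ == "__main__":
    # Replace the placeholder address with your own site before running.
    status = fetch_robots_status("http://www.example.com")
    print(status, classify_robots_status(status))
```

    If this reports "missing" after the file has been deleted, that on its own should not stop a crawl. An "unreachable" result points at the server or the connection rather than at the contents of the file.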

    "We had problems crawling the pages listed here, and as a result they won't be added to our index and will not appear in search results"

    This error could also be related to problems crawling your site. Make sure your pages are all linked together through internal hyperlinks. You could also try adding an XML sitemap (usually named sitemap.xml), which you can submit to search engines to help them crawl your site and to suggest how often they should come back for updates.
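
    For a small site, a sitemap can be generated with a few lines of Python. This is a minimal sketch, assuming the standard sitemaps.org format with just the loc and changefreq fields. The page list uses placeholder example.com addresses and the helper name build_sitemap is made up, so substitute your own pages.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages, changefreq="monthly"):
    """Return sitemap XML listing each page URL with a suggested change frequency."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page
        ET.SubElement(url, "changefreq").text = changefreq
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder pages, not taken from a real site.
pages = [
    "http://www.example.com/",
    "http://www.example.com/gallery.htm",
]

sitemap_xml = build_sitemap(pages)
print(sitemap_xml)

# To save it for upload to the site's root folder:
# with open("sitemap.xml", "w", encoding="utf-8") as handle:
#     handle.write(sitemap_xml)
```

    The changefreq value is only a hint to search engines about how often a page changes. They are free to revisit on their own schedule.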
    Tim Attwood
    Product Manager
    myhosting.com

  9. #9
    ColleenB is offline Junior Member
    Join Date
    Mar 2009
    Posts
    4

    Default Re: robots.txt

    My internal links are buttons. I had read in the Google Webmaster Tools that they should be text. So I think I am going to add the text links at the bottom of each page to see if that helps.

    I also saw the information about the site maps but haven't been able to figure out how to do them, even with the demonstrations in Webmaster Tools. Like I said, I am such a beginner in all of this and am doggone proud that I have got this far. LOL I will try again to understand how to make a site map and go that route and see what happens.

    Thanks again, I appreciate your time and efforts with me.
    ((HUGS))
    Colleen

  10. #10
    Zath is offline Official Member
    Join Date
    Mar 2009
    Posts
    13

    Default Re: robots.txt

    You can use graphic images as buttons. What you also need to do is give each image an alt attribute and a title attribute. These can contain the name of the image, the button name or action, or anything else you want "found" for SEO purposes.
    The alt attribute is often misused as a hover tooltip, when that should really be handled by the title attribute. The misuse is widespread and still passes strict validation.
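
    If there are a lot of button images to go through, a short script can list the ones that still have no alt text. This is a rough sketch using Python's standard html.parser. The class name ImageLinkAudit, the sample markup, and the file names in it are all invented for illustration.

```python
from html.parser import HTMLParser


class ImageLinkAudit(HTMLParser):
    """Collect the src of every image inside a link that has no alt text."""

    def __init__(self):
        super().__init__()
        self.link_depth = 0      # greater than 0 while inside an <a> element
        self.missing_alt = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.link_depth += 1
        elif tag == "img" and self.link_depth > 0:
            attributes = dict(attrs)
            # A missing, valueless, or blank alt all count as "no alt text".
            if not (attributes.get("alt") or "").strip():
                self.missing_alt.append(attributes.get("src"))

    def handle_endtag(self, tag):
        if tag == "a" and self.link_depth > 0:
            self.link_depth -= 1


# Made-up markup: the first button is described, the second is not.
SAMPLE = """
<a href="gallery.htm"><img src="btn_gallery.gif" alt="Fairy doll gallery" title="View the gallery"></a>
<a href="contact.htm"><img src="btn_contact.gif"></a>
"""

audit = ImageLinkAudit()
audit.feed(SAMPLE)
print(audit.missing_alt)  # ['btn_contact.gif']
```

    In the first link, alt describes the button for search engines and for visitors who cannot see the image, while title supplies the hover text.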


 
