Re: infoseeks robot is dumb

DA (k3is@UnbSJ.CA)
Fri, 15 Nov 1996 18:41:34 -0400 (AST)


On Fri, 15 Nov 1996, Otis Gospodnetic wrote:

> Should a well behaved robot do this:
>
> backdraft-bbn.infoseek.com - - [04/Nov/1996:01:27:35 -0500] "GET /robots.txt
> HTTP/1.0" 404 207
> backdraft-bbn.infoseek.com - - [04/Nov/1996:01:27:40 -0500] "GET
> /list_archives/webph/0066.html HTTP/1.0" 200 2642
> backdraft-bbn.infoseek.com - - [04/Nov/1996:01:28:04 -0500] "GET
> /list_archives/webph/0069.html HTTP/1.0" 200 3181
>

First of all, note that your site doesn't have 'robots.txt' :-)

In my opinion, infoseek's robot is not dumb. I just looked at my
access_log and saw the same behaviour from other search engines such as
opentext, lycos, atext, etc. If a document has a link to a document on
your server, then infoseek has to try to get robots.txt first. If there
is another link from another document, then I would say it is natural
that it will try to get robots.txt again.
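
To check your own access_log for this pattern, something like the
following might help. It is only a rough sketch: it assumes the log is in
Common Log Format, like the excerpt quoted above, and the names LOG_LINE
and robots_txt_requests are made up for illustration. The sample lines
are adapted from that excerpt, with the wrapped lines joined.

```python
import re
from collections import Counter

# Common Log Format: host ident user [timestamp] "METHOD path PROTOCOL" status bytes
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)

def robots_txt_requests(lines):
    """Count /robots.txt requests per (host, status) pair."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m.group("path") == "/robots.txt":
            counts[(m.group("host"), int(m.group("status")))] += 1
    return counts

# Sample lines adapted from the quoted excerpt (wrapped lines joined).
sample = [
    'backdraft-bbn.infoseek.com - - [04/Nov/1996:01:27:35 -0500] '
    '"GET /robots.txt HTTP/1.0" 404 207',
    'backdraft-bbn.infoseek.com - - [04/Nov/1996:01:27:40 -0500] '
    '"GET /list_archives/webph/0066.html HTTP/1.0" 200 2642',
]

for (host, status), n in robots_txt_requests(sample).items():
    print(host, status, n)
```

A host that shows up many times with status 404 is a robot that keeps
asking for a file it has already been told is not there.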

The question is: if a search engine or a robot doesn't find 'robots.txt'
(the HTTP status code is 404), should it request 'robots.txt' again?
Certainly not, but this might make the life of a robot writer harder,
since the robot then has to remember which servers have no 'robots.txt'!
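
One way a robot could avoid asking twice is to remember the result of the
first request for each host, including a 404. The following is a minimal
sketch of that idea, not how infoseek or any other robot actually works.
RobotsCache and the fetch callback are hypothetical names; fetch(url) is
assumed to return a (status_code, body_text) pair.

```python
from urllib.parse import urlsplit

class RobotsCache:
    """Remember the outcome of the robots.txt request per host, 404 included."""

    def __init__(self, fetch):
        # fetch(url) -> (status_code, body_text); hypothetical, supplied by the robot.
        self._fetch = fetch
        self._cache = {}  # host -> (status_code, body_text)

    def robots_for(self, page_url):
        parts = urlsplit(page_url)
        host = parts.netloc
        if host not in self._cache:
            # First link seen for this host: ask once and keep the answer,
            # whether it was 200 with a body or 404 with nothing.
            robots_url = f"{parts.scheme}://{host}/robots.txt"
            self._cache[host] = self._fetch(robots_url)
        return self._cache[host]

# Stand-in for a real HTTP client: records each request and always answers 404.
calls = []

def fake_fetch(url):
    calls.append(url)
    return 404, ""

cache = RobotsCache(fake_fetch)
cache.robots_for("http://example.com/list_archives/webph/0066.html")
cache.robots_for("http://example.com/list_archives/webph/0069.html")
print(len(calls))  # robots.txt was requested only once for the two pages
```

In this sketch the entries never expire. A real robot would probably want
to forget them after a while, so that it notices when a site adds a
'robots.txt' later.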

Have fun,
Qusay H. Mahmoud
_________________________________________________
This message was sent by the robots mailing list. To unsubscribe, send mail
to robots-request@webcrawler.com with the word "unsubscribe" in the body.
For more info see http://info.webcrawler.com/mak/projects/robots/robots.html