Re: Broadness of Robots.txt (Re: Washington again !!!)

Rob Hartill (robh@imdb.com)
Thu, 21 Nov 1996 22:32:26 +0000 (GMT)


Erik Selberg wrote:

>Hmm... rather than using a User-Agent, perhaps using a User-Type
>similar to Content-type (someone else made a similar suggestion, now
>that I think about it). For example:
>
>User-Type: robot/webcrawler # WebCrawler's robot
>User-Type: robot/scooter # DEC's robot
>User-Type: watcher/metacrawler # MetaCrawler
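
For anyone inclined to take that half-seriously: a minimal sketch of how a
server might split such a User-Type value, assuming it borrows Content-Type's
type/subtype syntax (the function name is invented, not part of the proposal):

    def parse_user_type(value):
        """Split a value like 'robot/webcrawler' into ('robot', 'webcrawler')."""
        # Drop any trailing comment, then split on the first slash.
        value = value.split('#', 1)[0].strip()
        agent_type, _, agent_subtype = value.partition('/')
        return agent_type.strip(), agent_subtype.strip()

    print(parse_user_type('robot/scooter # DEC robot'))  # ('robot', 'scooter')
    print(parse_user_type('watcher/metacrawler'))        # ('watcher', 'metacrawler')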

Some more new robot headers to consider:

Robot-confidence: 45                # Owner is 45% confident that his code
                                    # won't explode. This could be computed
                                    # from the ratio of sites visited to
                                    # sites trashed.
Days-since-last-accident: 2         # When the robot last screwed up.
Responses-not-understood: 301 302 40*   # Response codes that are a foreign
                                    # language to the robot and not worth
                                    # sending unless you want to see how
                                    # fast the robot can connect.
Janitor-phonenumber: 555 1234 567   # Just in case the power cord needs
                                    # yanking from the wall.
Standard-excuses: 1,5,9             # Hashes to look up text excuses in an
                                    # RFC; this saves the server admin
                                    # contacting the site for an explanation
                                    # and saves the robot owner responding.
                                    # A big bandwidth saver.
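
And for completeness, what a suitably humble robot's request might look like
on the wire, carrying the lot (a sketch only; the host and robot name are
made up):

    import http.client

    conn = http.client.HTTPConnection('www.example.com')  # hypothetical host
    conn.request('HEAD', '/robots.txt', headers={
        'User-Agent': 'ExplodingRobot/0.1',        # hypothetical robot name
        'Robot-confidence': '45',
        'Days-since-last-accident': '2',
        'Responses-not-understood': '301 302 40*',
        'Janitor-phonenumber': '555 1234 567',
        'Standard-excuses': '1,5,9',
    })
    # The response code -- hope it's one the robot understands.
    print(conn.getresponse().status)
    conn.close()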

I hear that Ted Kaczynski used to shave until he had to look after
his brother's web site for a week.

_________________________________________________
This message was sent by the robots mailing list. To unsubscribe, send mail
to robots-request@webcrawler.com with the word "unsubscribe" in the body.
For more info see http://info.webcrawler.com/mak/projects/robots/robots.html