RE: make people use ROBOTS.txt?

Terry Coatta (coatta@opentext.com)
Sun, 24 Nov 1996 16:47:09 -0800


Erik Selberg said:

> So, it seems to me what we want to have is a method whereby there's
> still some openness to the Web, but there's still some accountability
> towards its use. There needs to be some _strong_ method of
> identification (and User-Agents that can be spoofed aren't strong by
> any means) and some _strong_ method of access control.

I am not convinced that this is the best path to pursue. A strong
identification method is going to be difficult to design, and will
almost certainly need to fall back on technologies such as digital
signatures or the like. In the long run, this may be the way to go, but
for the immediate future it would seem like something a little easier
(for both implementors and users alike) is needed.

If we can't have strong identification in the near future, then we have
to live with the fact that some people are going to write programs which
annoy other people. More or less by definition, these people don't care
about the rules, and attempting to create a protocol for controlling the
actions of their programs is futile.

To me, this means two things:

(1) Web site administrators need tools which allow them to block access
to clients whom they consider bothersome. These clients may be browsers,
or robots, or anything in between. All that matters is that they are
accessing someone's site in a manner inconsistent with the desires of
the owner of that site. Ideally these tools could be configured to
identify various patterns of behaviour and automatically block access.
The group which I work with has already constructed some crude tools of
this type, and they are quite effective.

(2) The robots.txt protocol should be designed with the assumption that
the people using it are *a priori* interested in having their software
"behave". If you believe this, then the balance to be struck is between
expressiveness in terms of controlling access to a site (which is what
web site administrators need) and difficulty of implementation (which is
what robot writers have to deal with). It's hard for me to say where the
balance lies because I am more of a robot-writer than a site
administrator, but I'm certainly interested in the issue because we
(Open Text) really do want our robots to be well-behaved.
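
The two points above can be sketched roughly in Python. This is only an
illustration under stated assumptions, not the tools mentioned in (1):
the class name RateBlocker, the function robot_may_fetch, the
"ExampleBot" agent name, and the request thresholds are all
hypothetical. The first half shows one possible "pattern of behaviour"
(too many requests in a time window) triggering an automatic block on
the site side; the second half shows a cooperative robot consulting
robots.txt rules before fetching, using the standard library's
urllib.robotparser.

```python
import time
from collections import defaultdict, deque
from urllib.robotparser import RobotFileParser


class RateBlocker:
    """Site side (hypothetical): block clients that request too quickly."""

    def __init__(self, max_hits=10, window=60.0):
        self.max_hits = max_hits          # assumed threshold, tune per site
        self.window = window              # window length in seconds
        self.hits = defaultdict(deque)    # client address -> recent request times
        self.blocked = set()              # clients blocked until an admin clears them

    def allow(self, client, now=None):
        """Record one request and report whether it should be served."""
        now = time.monotonic() if now is None else now
        if client in self.blocked:
            return False
        recent = self.hits[client]
        recent.append(now)
        # Drop timestamps that have fallen out of the window.
        while recent and now - recent[0] > self.window:
            recent.popleft()
        if len(recent) > self.max_hits:
            self.blocked.add(client)
            return False
        return True


def robot_may_fetch(robots_txt, agent, url):
    """Robot side: check robots.txt rules before requesting a URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    blocker = RateBlocker(max_hits=3, window=10.0)
    for t in range(5):
        print(t, blocker.allow("192.0.2.1", now=float(t)))

    rules = "User-agent: *\nDisallow: /private/\n"
    print(robot_may_fetch(rules, "ExampleBot", "http://example.com/private/x"))
```

Note the asymmetry: the blocker works no matter what the client claims
to be, while the robots.txt check only helps when the robot writer
chooses to call it, which is the assumption argued for in (2).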

Terry.

_________________________________________________
This message was sent by the robots mailing list. To unsubscribe, send mail
to robots-request@webcrawler.com with the word "unsubscribe" in the body.
For more info see http://info.webcrawler.com/mak/projects/robots/robots.html