Re: Washington again !!!

Gregory Lauckhart (gregl@cs.washington.edu)
Wed, 20 Nov 1996 08:11:31 -0800 (PST)


On Wed, 20 Nov 1996, Rob Hartill wrote:

> Now will the real culprit who knowingly let broken software run from
> cs.washington.edu please step forward. You must have a real good explanation
> for this one.

I am the one who let the software run. My fault. I am very sorry about
this.

My "real good explanation" (offered in no way as an excuse) is that
although I had disabled the main form, the faulty binary was still present
on our servers for a time. Once the servers were brought up, users
evidently accessed this CGI program through forms on their own sites.

The communications layer in the old binary had bugs; the library it used
has since been entirely replaced. I believe the real issue here is how to
handle "robotic insanity", rather than which robots should obey
robots.txt, an entirely valid but separate discussion. In this particular
case, the problem would obviously have had much the same effect had it
manifested during a request for robots.txt itself.

For our project, I have installed safeguards just above the transport
layer to prevent this kind of behavior in the future, and have better
defined the process for reacting to such problems.
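
To give a concrete (if purely illustrative) picture, here is a minimal
sketch in Python of the sort of per-host request cap such a safeguard
might impose; the class name HostThrottle and its limits are inventions
for this example, not the actual MetaCrawler code:

    import time
    from collections import defaultdict, deque

    class HostThrottle:
        """Illustrative safeguard: cap requests per host per window."""
        def __init__(self, max_per_window=10, window_seconds=60.0):
            self.max_per_window = max_per_window
            self.window_seconds = window_seconds
            self._recent = defaultdict(deque)  # host -> timestamps

        def allow(self, host):
            now = time.time()
            recent = self._recent[host]
            # Discard timestamps that have aged out of the window.
            while recent and now - recent[0] > self.window_seconds:
                recent.popleft()
            if len(recent) >= self.max_per_window:
                return False  # suspected runaway loop; refuse to send
            recent.append(now)
            return True

A caller checks allow(host) before each outbound request and backs off
when it returns false, so no single site can be hammered even if an
upstream bug starts generating requests in a loop.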

As for robot exclusion: I will be adding robots.txt support to
MetaCrawler. I have not been privy to many of the discussions between
Erik Selberg and
others. However, even considering the technical issues, having user
agents such as MetaCrawler obey robots.txt currently seems to be the only
politically wise move.
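
For the curious, a minimal version of such a check is easy to express;
the sketch below uses Python's standard robotparser interface, with an
assumed user-agent string of "MetaCrawler", and is illustrative only
rather than the actual implementation (which would also want to cache
robots.txt per host instead of refetching it for every URL):

    from urllib.parse import urlsplit
    from urllib.robotparser import RobotFileParser

    def allowed_by_robots(url, user_agent="MetaCrawler"):
        # Fetch and consult the target host's robots.txt before
        # requesting url.
        parts = urlsplit(url)
        parser = RobotFileParser()
        parser.set_url("%s://%s/robots.txt" % (parts.scheme, parts.netloc))
        parser.read()
        return parser.can_fetch(user_agent, url)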

Certainly I personally wish MetaCrawler to be considered a good Net
citizen while continuing to provide a valuable service to others.

Again, I would like to apologize for every byte of Net traffic and every
cycle of CPU time wasted.

Greg

_________________________________________________
This message was sent by the robots mailing list. To unsubscribe, send mail
to robots-request@webcrawler.com with the word "unsubscribe" in the body.
For more info see http://info.webcrawler.com/mak/projects/robots/robots.html