Re: nastygram from xxx.lanl.gov

Istvan (simon@mcs.mcs.csuhayward.edu)
Tue, 9 Jul 1996 10:07:02 +0800


Paul Francis, Tue, 9 Jul 96 12:43:14 JST, writes

>
>If I personally had gone to the trouble of setting up a
>robots.txt file, and a robot of any flavor ignored it, I'd
>also be annoyed.
>

Hmmm... You may be assuming a bit too much of what other people ought to do
just because you spent a few minutes creating a file on your site.

>PF
>
>ps. By the way, I hear that people in America will sue
>at the drop of a cup of coffee, so if I were running a
>robot, I'd think it in my own best interests to honor the
>robots.txt file.
>

Don't worry. They can be countersued for filing frivolous lawsuits,
and in any case, in the US at least, you are innocent until proven guilty.
In other words, you have the right to sue, but that does not mean that
you will win the suit, and you may also lose a great deal of money by suing
somebody.

In a later message (Tue, 9 Jul 96 15:31:16 JST) he writes:

>But, anyway, let's assume the following:
>
>1. You are doing a legitimate experiment.
>2. It gets no more than one HEAD every
> couple minutes, and no more than 400
> HEADs for any given server.
>
>Strictly speaking, I think you are ethically obligated,
>within the limits of practicality (and detectability)
>to get permission to do even this to somebody's site.

I beg to disagree. He is not obligated in any way to get your permission.
When you put up a Web site you are running a daemon that accepts
connections from robots, from humans sitting in front of browsers, from
whatever. It is reasonable to assume (and in fact hardly imaginable
otherwise) that you put it there to be accessed without anyone having
to ask for your prior permission. (I'd guess that if enough people did
ask you for such permission, you would quickly change your mind about
the desirability of being asked.)
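
For concreteness, the limits assumed in the quoted message (one HEAD
every couple of minutes, no more than 400 HEADs per server) could be
enforced by the robot itself, along the lines of the rough sketch below.
The class name HeadBudget, its parameters, and the reading of "a couple
of minutes" as 120 seconds are my own assumptions for illustration, not
anything taken from the experiment under discussion.

```python
import time


class HeadBudget:
    """Per-server politeness budget (hypothetical helper).

    Allows at most one request per `min_interval` seconds to a given
    host, and at most `max_requests` requests to that host in total.
    """

    def __init__(self, min_interval=120.0, max_requests=400,
                 clock=time.monotonic):
        self.min_interval = min_interval  # assumed: "a couple minutes"
        self.max_requests = max_requests  # the 400-HEAD cap from the quote
        self.clock = clock                # injectable so it can be tested
        self._last = {}                   # host -> time of last allowed request
        self._count = {}                  # host -> allowed requests so far

    def may_request(self, host):
        """Return True, and record the request, if it is within budget."""
        now = self.clock()
        if self._count.get(host, 0) >= self.max_requests:
            return False  # total cap for this server is used up
        last = self._last.get(host)
        if last is not None and now - last < self.min_interval:
            return False  # too soon after the previous request
        self._last[host] = now
        self._count[host] = self._count.get(host, 0) + 1
        return True
```

The robot would call may_request(host) before each HEAD and simply skip
or postpone the request whenever it returns False.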

>...
>
>By the same token, I would probably consider a robots.txt
>file that disallowed access to be tacit denial of
>permission to run the experiment (even though the
>administrator of that system might be perfectly happy
>to assist me in my experiment, but just happens to
>hate indexing robots).
>

If a programmer is writing a program that will access your site, you
cannot reasonably assume that she/he is even aware of the Robot
Exclusion Protocol, much less that she/he will decide to implement it
in the program and follow it. You may consider that desirable, the
well-informed and right thing to do, etc. (in general I do, BTW), but
between being a desirable thing to do and being mandatory, or being a
great offense (or even a suable one) if they don't, there is a HUGE
distance.
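
For a programmer who does decide to follow the Robot Exclusion
Protocol, the check itself is small. A minimal sketch, using the
robots.txt parser that ships in the Python standard library; the robot
name "ExperimentBot", the example.com URLs, and the sample file contents
are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, user_agent, url):
    """Return True if `robots_txt` permits `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse already-downloaded text
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that keeps every robot out of /private/.
EXAMPLE = """\
User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for path in ("/index.html", "/private/notes.html"):
        url = "http://www.example.com" + path
        print(path, allowed(EXAMPLE, "ExperimentBot", url))
```

Whether a robot runs such a check before each request is, of course,
still up to whoever wrote it.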

--Steve Simon