RE: nastygram from xxx.lanl.gov

Frank Wales (frank@limitless.co.uk)
Wed, 10 Jul 1996 01:54:25 +0100


Steve Simon:

>Furthermore, I think that it
>is way out of line for a Web administrator to presume that
>just because a robots.txt file exists on a site all robots MUST
>follow it.

Except that such an approach countenances abuse, and
it-doesn't-apply-to-me-ism. Better, to me, to have a
robots.txt standard that allows for all reasonable cases,
with well-defined exceptions. Then the gulf between the
expectations of the administrators and the aims of the
robot-writers isn't filled with bad feeling and abuse. Just because
some robot writers are ignorant or bloody-minded doesn't mean
that we should accept that's how things should be.

>One can reasonably assume that if a robots.txt file is present,
>it is a strong suggestion by the Web administrator for the desired
>behavior from robots. Undoubtedly it is wise, and almost always
>good practice to respect such a request. But no Web administrator
>should expect that it will be followed in all cases.

So it's sometimes good practice to ignore the request, then? As
I said in another message, I feel this is bad manners, since it
seems clear that some people use the robots.txt file to require
that all automated activity is excluded, not just activity that is
expensive or resource-hogging. In part, they probably do this
because there is no other mechanism for them to use.
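(For concreteness: the total exclusion such administrators rely on is
only two lines of robots.txt, and honouring it takes only a few lines
of code. A minimal sketch, in today's Python; the robot name and URL
path are hypothetical examples, not anything from this discussion.)

```python
# A minimal sketch of a robot honouring a site's robots.txt.
from urllib.robotparser import RobotFileParser

# The exclusion some administrators rely on: shut out all robots.
robots_txt = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A well-behaved robot checks before every automated fetch; with
# the file above it is told to stay away from the whole site.
allowed = parser.can_fetch("ExampleRobot/1.0", "/index.html")
# allowed is False
```

The point is that compliance costs the robot-writer almost nothing,
which is why ignoring the file reads as a choice rather than a burden.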

>But I for one will adamantly oppose
>the interpretation that any of this MUST be followed by all robots under
>any circumstances. I believe that such an interpretation was never
>intended for it, and in my opinion would be an intolerable intrusion
>and restriction on other people's freedoms and behavior.
[...]
>The Constitution guarantees you the right to create programs
>that will do useful things for you and/or the Web community
>at large. If your program violates no laws, and in YOUR opinion
>does nothing improper or unethical, go ahead, create it and run it,
>no matter what anybody else thinks.

Now, I have to say that I think these two paragraphs
are unmitigated guff of the highest order. I don't believe that
Ben Franklin et al were countenancing automated disregard
for the published opinions of one's peers when they wrote
the Bill of Rights, nor do I think this is an issue that the
ACLU would spend more than ten seconds thinking about.

Whether or not I agree that someone can put up a web site
that only accepts human-initiated requests, some people seem
to be doing it, and seem to believe that the robots.txt file gives
them the ability to exclude automated processing of their site.
If you want to disabuse them of that notion, then do it through
education or persuasion, not by continuing to automatically
process their site.

If the American Bill of Rights means anything on the net,
it means admins have the freedom to say: "Bog off! This is our
site and we say what happens here," and to have others respect
that, even if they don't agree with it. It doesn't give others
the right to say: "We don't care what you think about how to
run your site, we're going to do what we want to it."

It is, after all, *their* site, and if they don't have final say over
what happens there, who does? Everybody else?
Anybody else? The most vociferous posters on some mailing list?
Or some guy with what he thinks is a harmless program because
it doesn't shag a SparcCenter 10000 to death? I do hope not.

--
Frank Wales [frank@limitless.co.uk]