Re: make people use ROBOTS.txt?

Benjamin Franz (snowhare@netimages.com)
Sat, 23 Nov 1996 06:01:17 -0800 (PST)


On Fri, 22 Nov 1996, HipCrime wrote:

> Let's change the subject.

Let's not.

>
> This attempt to define "robots" has proved futile.

Afraid ActiveAgent might be defined as a robot, are you? Changing the
subject is one way to avoid dealing with the fact that most people
*DON'T LIKE* your robot. This effort is proving far from futile. The fact
that you don't like the answers it is coming up with is one of the *best*
indications I have seen that we are on the right track.

> The deep issue (at least my impression of it), was
> stated by "Thaddeus O. Cooper" <tcooper@mitre.org>:

[...]
> Let's take HipCrime's ActiveAgent as an example.
>
> Q: What is its goal?
> A: To locate as many Email addresses as possible.
>
> Would it be correct for ActiveAgent's author to implement
> something that would inhibit completion of this assignment?

Yes. Because it is not an *ETHICAL* goal to index people's email addresses
for the purpose of generating junk email when they *don't want them
indexed for junk email*.

> Programmers, be truthful to yourselves: didn't you learn
> (early on), that efficiency is tantamount to success?
>
> This is the ONLY reason that ActiveAgent does not observe the
> robot-exclusion "standard": interference with its goal.

This is also why burglars ignore certain sections of the criminal code: It
interferes with their goal of unjustly enriching themselves. This does not
mean they are *RIGHT* to do so.

> We're talking about the WORLD wide web here. Who on this list
> (really) believes that he/she can define the motives of "the world"?
> Please, please, let me know, if you can, because this is the real
> information that I am looking for.
>

You are playing sophistic games.

A) Web site owners tolerate (or sometimes encourage) robots because they
provide *value to the website owners*. Robots that come out negative on
the value scale are not wanted. Robots that are used to generate junk
email are in that unwanted category.

B) Robots that ignore the wishes of site owners as expressed in
robots.txt are rogue and should not be on the net *AT ALL*.

C) If I figure out a way to reliably detect your rogue robot (ActiveAgent
*IS* a robot - the debate is _what class of robot_) it will be the
*LAST* time your robot visits any site I control (I control dozens). And
you can be sure I will publish far and wide how to block it.
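
For reference, honoring robots.txt (point B) costs a robot author very
little. The following is only a minimal sketch in Python, using the
standard library's urllib.robotparser. The robots.txt contents, the agent
names "ExampleBot" and "OtherBot", and the helper may_fetch are all
hypothetical and chosen for illustration; nothing here describes how
ActiveAgent or any real robot is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one named robot entirely and keep
# every other robot out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network access is
    # needed for this illustration.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleBot", "OtherBot"):
        for path in ("/index.html", "/private/list.html"):
            url = "http://example.com" + path
            print(agent, path, may_fetch(ROBOTS_TXT, agent, url))
```

A well-behaved robot would run a check like this before every request
and skip any URL for which it returns False. A rogue robot simply never
asks, which is why a site owner has to fall back on detection and
blocking as in point C.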

-- 
Benjamin Franz

_________________________________________________
This message was sent by the robots mailing list. To unsubscribe, send
mail to robots-request@webcrawler.com with the word "unsubscribe" in the
body. For more info see
http://info.webcrawler.com/mak/projects/robots/robots.html