Re: Broadness of Robots.txt (Re: Washington again !!!)

Brian Clark (bclark@radzone.org)
Wed, 20 Nov 96 21:21:17 -0500


Thaddeus wrote:

Keeping them out of areas that humans have access to doesn't make much sense
if you want people to watch those areas and return to them.

Brian responds:

But what you fail to take into account is that a web site maintainer might
change their mind. How are they to communicate that change to a program that
doesn't actually "read" the contents of that page?
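
For instance, a robots.txt like this one (the path is made up for
illustration) tells every compliant robot to stay out of an area without the
maintainer ever touching the pages themselves ... and removing those two
lines later communicates the change of mind just as easily:

    User-agent: *
    Disallow: /contest/

A program only has to re-fetch /robots.txt periodically to pick up the new
policy; it never needs to "read" the content it's being steered away from.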

Seems like an academic argument? I've got a very real example:

http://www.vivid.com/pr/articles/ap082895.html

When we built a robot to check the "new question" page of a Silicon Graphics
contest, it probably defeated the designer's purpose of making you come back
and check in person to keep up with the game. In effect, it was just a "Page
Watcher" ... fortunately, they had more options open to them (as designers of
the contest) than most webmasters would have had:

* they could have banned the IP (but what if the "page watcher" was from
aol.com instead of radzone.org???)

* they could have disqualified us from the contest (a context-specific
issue)

* they could have changed the file name to generate a 404 (but our spider
would have kept hitting it anyway ... it was pretty dumb.)

The point being: in another situation, robots.txt might have been the only
thing to save them. Now, imagine 10,000 players in that contest all using
different "Page Watchers" ... *brrrr* I'm sure Vivid Studios would have been
banging their shoe on the podium like Robert does sometimes.

I'd still maintain that even a "Page Watcher" is a robot (anything that
doesn't think about what it is doing and does it either repeatedly or
broadly). I don't think "a user told it to do it" fairly qualifies the
extent of this protocol's usage.
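
To show how little it takes for even a "Page Watcher" to behave like a
well-mannered robot, here's a minimal sketch in modern Python (the URLs and
user-agent string are hypothetical; the standard library's robots.txt parser
does the real work). The key point is that it re-reads robots.txt on every
pass, so a maintainer's change of mind gets respected on the next cycle:

    # Minimal polite "Page Watcher" sketch -- hypothetical URLs and agent.
    import time
    import urllib.request
    from urllib.robotparser import RobotFileParser

    AGENT = "PageWatcher/1.0"                                  # hypothetical
    PAGE = "http://www.example.com/contest/new-question.html"  # hypothetical
    ROBOTS = "http://www.example.com/robots.txt"

    last_copy = None
    while True:
        rp = RobotFileParser()
        rp.set_url(ROBOTS)
        rp.read()                          # fetch the *current* robots.txt
        if rp.can_fetch(AGENT, PAGE):
            req = urllib.request.Request(PAGE,
                                         headers={"User-Agent": AGENT})
            with urllib.request.urlopen(req) as resp:
                body = resp.read()
            if body != last_copy:          # page changed since last look
                print("Page changed!")     # ... or mail the user, etc.
                last_copy = body
        time.sleep(3600)                   # poll hourly, not continuously

If every one of those 10,000 hypothetical players had run something like
this, the contest designers could have shut them all off with two lines in
robots.txt.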

Brian

--

------------------------------------------------------------------
Brian Clark - President - bclark@radzone.org
http://www.radzone.org/gmd/
------------------------------------------------------------------