Re: Broadness of Robots.txt (Re: Washington again !!!)

Brian Clark (bclark@radzone.org)
Wed, 20 Nov 96 21:21:10 -0500



Erik wrote:

Perhaps what may be more effective is to adjust the standard to include
CATEGORIES (rathole alert!) of access purposes, and regulate those.

Brian jumps up and down, screaming:

This is exactly what I've been trying to point to ... a "purpose" addition
to this standard that allows webmasters to restrict BEHAVIOR and LOCATION,
not just location. If that "behavior list" were maintained as a MIME-style
extendable standard, you might have something durable enough for crap people
will be building for the next few years.

Ideally, a webmaster should be able to code behavior restrictions similar
to:

BEHAVIOR: databasing/public
Allow: /

BEHAVIOR: databasing/email
Disallow: /

BEHAVIOR: archiving/*
Disallow: /

BEHAVIOR: personal/pagewatcher
Allow: /
Disallow: /special_contest

BEHAVIOR: *
Disallow: /internal/private

That's sketchy as a spec, and it needs the attention of someone more
analytical, but it serves to prove the point. I don't see any reason why you
shouldn't be able to
mix AGENT+LOCATION and BEHAVIOR+LOCATION restrictions as well. Seems like it
would cover a broad number of "robot variants" and help define "robot" as a
broad term for automated data transfer systems of any sort (agents, page
watchers, archivers, indexers, exploratory spiders, etc.)
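
To make the idea concrete, here is a minimal sketch of how a robot might read
such BEHAVIOR records and decide whether it may fetch a path. Nothing here is
part of any agreed standard. The function names parse_behavior_rules and
is_allowed are hypothetical, and the matching semantics are assumptions the
proposal above does not settle: every record whose behavior pattern matches
is applied (so "BEHAVIOR: *" adds to the more specific records rather than
being a fallback), the longest matching path prefix wins, and Disallow wins a
tie.

```python
from fnmatch import fnmatchcase

# The example records from the message above, unchanged.
SAMPLE = """\
BEHAVIOR: databasing/public
Allow: /

BEHAVIOR: databasing/email
Disallow: /

BEHAVIOR: archiving/*
Disallow: /

BEHAVIOR: personal/pagewatcher
Allow: /
Disallow: /special_contest

BEHAVIOR: *
Disallow: /internal/private
"""


def parse_behavior_rules(text):
    """Return a list of (behavior_pattern, [(directive, path_prefix), ...])."""
    records = []
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "behavior":
            current = (value.lower(), [])
            records.append(current)
        elif field in ("allow", "disallow") and current is not None:
            if value:  # assumption: an empty path restricts nothing
                current[1].append((field, value))
    return records


def is_allowed(records, behavior, path):
    """Decide whether a robot declaring `behavior` may fetch `path`.

    Assumed semantics (not defined by the proposal): all matching records
    apply, the longest matching path prefix wins, Disallow wins a tie, and
    a path matched by no rule is allowed.
    """
    behavior = behavior.lower()
    best_len = -1
    allowed = True
    for pattern, rules in records:
        # MIME-style wildcard match, e.g. "archiving/*" or "*"
        if not fnmatchcase(behavior, pattern):
            continue
        for directive, prefix in rules:
            if not path.startswith(prefix):
                continue
            longer = len(prefix) > best_len
            tie_disallow = len(prefix) == best_len and directive == "disallow"
            if longer or tie_disallow:
                best_len = len(prefix)
                allowed = directive == "allow"
    return allowed


RULES = parse_behavior_rules(SAMPLE)
print(is_allowed(RULES, "personal/pagewatcher", "/news.html"))
print(is_allowed(RULES, "archiving/mirror", "/news.html"))
```

Under these assumptions a page watcher can read /news.html but not anything
under /special_contest, and any archiver is turned away everywhere. A
different choice (for instance, treating "BEHAVIOR: *" as a fallback only, the
way "User-agent: *" works today) would change the answers, which is exactly
the sort of detail a real spec would have to pin down.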

Brian

--

------------------------------------------------------------------
Brian Clark - President - bclark@radzone.org
http://www.radzone.org/gmd/
------------------------------------------------------------------
_________________________________________________
This message was sent by the robots mailing list. To unsubscribe, send mail
to robots-request@webcrawler.com with the word "unsubscribe" in the body.
For more info see http://info.webcrawler.com/mak/projects/robots/robots.html