robot clusion; was Re: we should help spiders and not say NO!

John D. Pritchard (jdp@cs.columbia.edu)
Sat, 29 Jun 1996 10:20:38 -0400


> The current robot standard says
>
> we want so and so bot, and don't go here or there!

my own suggestion would be to complement the "Disallow" operator with an
"Allow" operator that supplements the root html document with additional
root html documents to index. this would remove the need for webmasters
to go to a major index bot more than once to ask it to index a site with
multiple document hierarchy roots. it may also make robots.txt easier to
write if your allow list would be shorter than your disallow list.
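
to make the idea concrete, here is a rough python sketch of how a robot
might read such a file. everything in it is an assumption on my part: the
"Allow" field is not in the current exclusion draft, the names parse_rules,
may_fetch and extra_roots are made up for illustration, the "longest
matching prefix wins" rule is just one possible way to resolve conflicts,
and User-agent grouping is ignored to keep it short.

```python
# hypothetical sketch: "Allow" is a proposed operator, not part of the
# current robot exclusion draft. User-agent grouping is ignored here.

EXAMPLE = """
User-agent: *
Disallow: /
Allow: /docs/
Allow: /projects/index.html
"""

def parse_rules(text):
    """Collect (operator, path_prefix) pairs from robots.txt-style lines."""
    rules = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blanks
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        field, value = field.strip().lower(), value.strip()
        if field in ("allow", "disallow") and value:
            rules.append((field, value))
    return rules

def may_fetch(rules, path):
    """Assumed precedence: the longest matching prefix decides.
    A path that matches no rule is allowed."""
    best = ("allow", "")
    for op, prefix in rules:
        if path.startswith(prefix) and len(prefix) > len(best[1]):
            best = (op, prefix)
    return best[0] == "allow"

def extra_roots(rules):
    """The "Allow" entries double as additional roots to start indexing from."""
    return [prefix for op, prefix in rules if op == "allow"]

rules = parse_rules(EXAMPLE)
```

with the example file above, the allow list is two lines long, where the
equivalent disallow list would have to name every other directory on the
site. may_fetch(rules, "/docs/a.html") is true, may_fetch(rules,
"/private/x") is false, and extra_roots(rules) hands the robot the two
extra starting points without the webmaster registering each one by hand.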

is robots.txt a meta information entity which should include only negative
and not positive information? can meta information itself require meta
information, e.g., "expires"? would robots.txt need "expires" more if it
included "Allow" than it does with only "Disallow"?

perhaps most importantly, the grammatical primacy of regexp-enhanced bot
types in the current robot exclusion draft implies that the "Disallow"
operator is binary, and so the grammatical logic of the exclusion draft is
incomplete, in the sense of being only half as expressive as it could be,
without "Allow" in addition to "Disallow".

so would Dis/Allow make robots.txt the "robots' clusion standard" entity? :)
maybe robot cooperation or robot direction is better.

-john