Again, let me comment. Presumably, the administrator is the one worried
about robots, and presumably will be willing to do something at least
reasonably dynamic to keep control over them. The existing robots.txt
can either forbid all robots from everything (simple) or be more
elaborate for more dynamic and specific restrictions (i.e., any
solution can be designed for both).
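For instance (assuming the User-agent/Disallow form), a robots.txt
could read either way:

    # Forbid all robots from everything
    User-agent: *
    Disallow: /

    # Or spell out specific restrictions
    User-agent: *
    Disallow: /tmp/
    Disallow: /cgi-bin/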
If separate instruction files were used, additional instructions could
be added to let the robot know whether it should look for robots.txt
files in deeper directories. That might make the robot writer's job
harder, but it would give users more control while still allowing the
site administrator to prevent robots from endlessly looking for
robots.txt files in directories where none exist. Simpler yet, the root
of the server could include information telling the robot specifically
where it may or may not find additional instructions, giving the
administrator complete control (e.g., look for additional instructions
under any ~username/).
Hairy, but effective...
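Something like the following in the root file would do it (the
Instructions-under field is purely hypothetical, invented here just to
illustrate the idea):

    # Hypothetical sketch only
    User-agent: *
    Disallow: /private/
    # Robots may look for further robots.txt files under any ~username/
    Instructions-under: /~*/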
Personally, I would like to see support for a single file that would
not only include simple instructions/restrictions for robots, but could
also be much more elaborate, carrying actual index information so the
robot never needs to retrieve any resource other than that file. It
would also give the admin control over what is indexed and how it
reads. Adding a META key at the beginning of a 100K+ file may make the
robot happy and give me control over keywords and descriptions, but it
still means the server is going to send out 100K+ to a robot that does
not know better.
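As a rough, purely hypothetical sketch (field names invented just for
illustration), such a single file might look like:

    # Restrictions
    User-agent: *
    Disallow: /private/

    # Index entries, so the robot never fetches the pages themselves
    Document: /reports/annual-1994.html
    Title: Annual Report
    Keywords: reports, statistics
    Description: Summary of the year's activity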
(Perhaps someone can fill me in on whether robots can simply request
the HEAD - I have not started coding my internal test robot yet.)
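For reference, HTTP does define a HEAD method that returns only the
status line and headers, with no body. A minimal sketch of the idea
(hypothetical host and path, shown in Python purely for illustration):

    import http.client

    # Hypothetical host and path, for illustration only
    conn = http.client.HTTPConnection("www.example.com")
    conn.request("HEAD", "/reports/annual-1994.html")
    resp = conn.getresponse()
    # Only headers come back - the 100K+ body is never sent
    print(resp.status, resp.reason)
    print(resp.getheader("Content-Type"))
    print(resp.getheader("Content-Length"))
    conn.close()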
Scott