I think (and hope) that you're reading way too much into the words "safe"
and "retrieval" in order to justify potentially irresponsible actions.
For example, if a search (i.e. a retrieval) consumes 90% of the CPU time of a
server which "protects" that URL via robots.txt, then there's no justification
from the HTTP spec for you to hit it hard and fast.
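For the sake of illustration, the robots.txt protecting such a search might
look something like this (the path name here is made up; any robot following
the exclusion standard would keep out of it):

    User-agent: *
    Disallow: /search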
Maybe you'd like to applaud the robot owner in this scenario: a server on a
slow network/system offers plain HTML files that are only of interest to a
specific group of people. The robot comes along, decides to index the entire
site, and in so doing consumes all the server's resources after ignoring the
guidelines, the robots.txt, or both. Three cheers for the HTTP spec... the
license to act irresponsibly. I think not.
"safe" in this context presumably relates to file upload type operations
and not "safe" = static HTML files.
If you need written 'justification' for acting against other people's wishes
in the way you use *their* services, then shame on you.
The only reason POST is "safe" from robots at the moment is that the
bloody-minded and clueless section of the robot community is too incompetent
to work out how to handle POST correctly. I'm pretty sure those morons would
be doing POST too if they could figure it out.
Would the people who disagree with me please list their robots' IP numbers on
this list, so that adequate protection can be put in place before they visit
my site... mine's obviously an 'unsafe' site because it's designed for those
funny things called 'humans'.
--
Rob Hartill (robh@imdb.com)
The Internet Movie Database (IMDb)   http://www.imdb.com/
           ...more movie info than you can poke a stick at.