1) 'Blind' probing of their site...
2) Indexing of certain pages...
3) 'Privacy' of certain information in the page (such as email
addresses perhaps, or onward links).
4) Monopolisation of bandwidth and/or server resources...
(have I missed anything?)
The discussion at the moment seems to have moved to categorising the
type, activity or intentions of the robot itself. We are talking about
things such as 'do we allow access to this/that/other indexing robot' or
to 'this/that/other cache_refreshing robot' etc., which seems like
reverse engineering of a robot's or agent's motives.
What may happen in the future? A court case against a robot/agent which
is blatantly transgressing the express wishes of the site owner?
Something along the 'junk fax' lines? However, for this to happen the
'express wishes' have to be clearly and unambiguously presented and the
transgression easily proven. That can't be the case if 'robots.txt'
tries to define the controls on guesswork about the robot's (or robot
author's) intentions...
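
As a rough illustration of what 'clear and unambiguous' might look like,
here is a minimal sketch in Python using the standard library's
urllib.robotparser. The robots.txt contents, the robot name 'AnyRobot',
the example.com URLs and the helper is_allowed() are all made up for the
example. The rules are stated per path for every robot, so whether a
given fetch transgressed them is a yes/no question that does not depend
on what the robot was for.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the owner's wishes are stated per path and
# apply to every robot, with no guess about a robot's purpose.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /private/
"""


def is_allowed(robot_name, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt permits robot_name to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines; no network access needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(robot_name, url)


if __name__ == "__main__":
    for url in ("http://www.example.com/index.html",
                "http://www.example.com/private/staff.html"):
        print(url, is_allowed("AnyRobot", url))
```

A server log entry showing a fetch of a path for which is_allowed()
returns False would then be the sort of evidence that is easy to check.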
Regards,
Martin Kiff
mgk@webfeet.co.uk