What I am proposing is that we re-evaluate the reasoning behind
robots.txt. The proposals I have seen on this list seem to rely on the
assumption that robots.txt is enforceable, when it quite clearly is not.
One of the primary reasons to restrict access to a site is to prevent
abuse of CPU cycles and/or bandwidth by unscrupulous robots and agents.
There is no way that robots.txt can ever hope to achieve this, certainly
not while clients can choose to ignore it. What is needed is more
intelligent servers to counter the intelligent agents of the future. The
servers should be able to anticipate load based on experience and
acquired knowledge, and dynamically adjust their delivery rates so as not
to overload the network. These servers are going to have to learn how to
prioritise different types of clients based on previous access patterns.
There is a strong evolutionary pressure at work here, and the only way to
survive is to adapt.
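
As a rough illustration of what I mean (and nothing more), a server could
keep a short history of each client's requests and slow down the clients
that hammer it, while leaving polite ones untouched. The sketch below is
hypothetical Python; the class, parameter names, and thresholds are my
own invention, not any existing server's API.

    # Sketch of a server-side adaptive throttle: each client's recent
    # request rate is tracked, and aggressive clients are served more
    # slowly. Names and thresholds are illustrative only.
    import time
    from collections import defaultdict, deque

    class AdaptiveThrottle:
        def __init__(self, window_seconds=60.0, polite_rate=0.5, max_delay=30.0):
            self.window = window_seconds        # how much history to remember
            self.polite_rate = polite_rate      # requests/sec considered polite
            self.max_delay = max_delay          # cap on imposed delay (seconds)
            self.history = defaultdict(deque)   # client id -> recent request times

        def record(self, client_id, now=None):
            """Note a request and return the delay to impose before serving it."""
            now = time.time() if now is None else now
            times = self.history[client_id]
            times.append(now)
            # Forget requests that have fallen outside the observation window.
            while times and now - times[0] > self.window:
                times.popleft()
            rate = len(times) / self.window     # observed requests per second
            if rate <= self.polite_rate:
                return 0.0                      # well-behaved client: no delay
            # Delay grows with how far the client exceeds the polite rate.
            excess = rate / self.polite_rate
            return min(self.max_delay, excess - 1.0)

    if __name__ == "__main__":
        throttle = AdaptiveThrottle()
        # A robot hammering the server sees its delay climb request by request.
        t = 0.0
        for _ in range(100):
            delay = throttle.record("greedy-robot", now=t)
            t += 0.1                            # ten requests per second
        print("delay for greedy robot: %.2f seconds" % delay)
        print("delay for polite client: %.2f seconds"
              % throttle.record("polite-client", now=t))

The point is not this particular scheme, but that the server learns from
observed behaviour and adjusts, rather than trusting clients to obey a
file they are free to ignore.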
Ian Davis
iand@fdc.co.uk
--
Figuredata Computing Ltd. | email: info@fdc.co.uk
Making Science Compute | Tel: +44 (0)1932 350554
_________________________________________________
This message was sent by the robots mailing list. To unsubscribe, send mail
to robots-request@webcrawler.com with the word "unsubscribe" in the body.
For more info see http://info.webcrawler.com/mak/projects/robots/robots.html