> I have been following this list for a while now and recent discussions
> have prompted me to add my small voice to the fray. As I see it, there is
> no way to force robots/agents to use robots.txt. We are on the brink of a
> Cambrian explosion of intelligent agents,
I believe you could be right.
> and it is futile to expect every
> agent developer to implement support for robots.txt. In addition to the
> arrogant few who ignore robots.txt by choice, there will be
> countless more who will not support it because they are unaware of its
> existence, its importance or its function.
>
> What I am proposing is that we re-evaluate the reasoning behind
> robots.txt. The proposals I have seen on this list seem to rely on the
> assumption that robots.txt is enforceable when it is quite clearly not.
I am not so sure about unenforceability, but I think the RES is valuable
even if it is not enforceable.
> One of the primary reasons to restrict access to a site is to prevent
> abuse of cpu cycles and/or bandwidth by unscrupulous robots and agents.
> There is no way that robots.txt can ever hope to achieve this, certainly
> not while clients can choose to ignore it.
Let's not forget that there is also a non-zero-sum cooperative advantage
to using the RES. Developers of _intelligent_ agents should recognize that
the robots.txt file contains information that can be used to improve the
efficiency and effectiveness of their programs. Let me describe an example
that is perhaps not typical. I am responsible for a search engine that
lists "official" web pages of my university (see
http://www.usf.edu/Search_USF/). With robots.txt it is very easy for Web
admins to exclude their student pages from this database. This arrangement
helps me as well as the Web admins (and the students).
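To make this concrete, here is a minimal sketch of what honoring the RES
costs an agent developer. It uses Python and its standard robotparser
module; the robots.txt content, the agent name, and the URLs are all made
up for illustration. The point is that the can_fetch() check is also a
saving for the agent: it skips requests for pages the admins have already
said are not worth fetching.

from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a Web admin might publish to keep student
# pages out of crawlers while leaving official pages crawlable.
ROBOTS_TXT = """\
User-agent: *
Disallow: /students/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

for url in ("http://www.example.edu/admissions/",
            "http://www.example.edu/students/~jdoe/"):
    if parser.can_fetch("MyAgent/1.0", url):
        print("fetch:", url)   # official page: worth indexing
    else:
        print("skip: ", url)   # excluded page: don't waste the request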
> What is needed are more
> intelligent servers to counter the intelligent agents of the future. The
> servers should be able to anticipate the load based on experience and
> acquired knowledge and dynamically adjust their delivery rates so as not
> to overload the network. These servers are going to have to learn how to
> prioritise different types of clients based on previous access patterns.
> There is a strong evolutionary pressure at work here and the only way to
> survive is to adapt.
I believe that sooner or later this kind of server-side control will come,
but I hope you are not suggesting that we just abandon the existing RES.
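To illustrate what such a server-side control might look like, here is a
purely hypothetical sketch in Python: a per-client token bucket whose
refill rate adapts to observed behavior, so well-behaved clients are
served faster and aggressive ones are throttled. Every name and number in
it is invented for illustration; it is not how any existing server works.

import time
from collections import defaultdict

class AdaptiveThrottle:
    def __init__(self, base_rate=1.0, burst=5.0):
        self.base_rate = base_rate          # tokens (requests) per second
        self.burst = burst                  # maximum bucket size
        self.tokens = defaultdict(lambda: burst)
        self.last = defaultdict(time.monotonic)
        self.priority = defaultdict(lambda: 1.0)  # learned per-client weight

    def record_behavior(self, client, well_behaved):
        # Crude "learning": nudge the client's priority up or down based
        # on past access patterns (e.g., whether it honored robots.txt).
        factor = 1.1 if well_behaved else 0.5
        self.priority[client] = min(4.0, max(0.1, self.priority[client] * factor))

    def allow(self, client):
        now = time.monotonic()
        elapsed = now - self.last[client]
        self.last[client] = now
        rate = self.base_rate * self.priority[client]
        self.tokens[client] = min(self.burst, self.tokens[client] + elapsed * rate)
        if self.tokens[client] >= 1.0:
            self.tokens[client] -= 1.0
            return True
        return False                        # defer or refuse this request

throttle = AdaptiveThrottle()
throttle.record_behavior("rude-bot", well_behaved=False)
throttle.record_behavior("polite-bot", well_behaved=True)
# Both start with a full bucket, but rude-bot's bucket now refills at
# half speed while polite-bot's refills 10% faster.
print(throttle.allow("polite-bot"), throttle.allow("rude-bot"))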
--
Arthur Matheny                      LIB 612
Academic Computing                  University of South Florida
matheny@usf.edu                     Tampa, FL 33620
813-974-1795                        FAX: 813-974-1799
http://www.acomp.usf.edu/