Bad agent...A *very* bad agent.

Benjamin Franz (snowhare@netimages.com)
Wed, 18 Sep 1996 10:45:22 -0700 (PDT)


It was brought to my attention that there is a 'personal agent' robot
available at <URL:http://www.agentware.com/>. I was asked my opinion on it
and its impact on the web in general, so I downloaded and reviewed it.

Not good.

To be net friendly, this class of robot should be closely associated with
a central, general-purpose robot that caches frequently requested items.
This one does not do that - which makes it a long-term loss for the net as
a whole - especially since this robot doesn't even play by the rules most
robots abide by: it doesn't even identify itself to a server.
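For comparison, identifying a robot to a server takes only a couple of
request headers. A minimal sketch using Python's standard urllib.request
(the agent name and contact address here are hypothetical, not taken from
the product under review):

```python
import urllib.request

def identified_request(url):
    """Build a request that tells the server who the robot is and how
    to reach its operator - the bare minimum of robot etiquette."""
    return urllib.request.Request(url, headers={
        # Hypothetical agent name and contact; a real robot supplies its own.
        "User-Agent": "ExampleAgent/0.1 (+http://example.com/agent-info)",
        "From": "agent-operator@example.com",
    })

req = identified_request("http://www.agentware.com/")
```

A server administrator watching the logs can then see what is crawling the
site and whom to contact about it - neither of which is possible with an
anonymous agent.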

This *particular* agent has a *severe* problem in that it ignores
robots.txt files and will walk right into infinite tree spaces at high
speed. I set an agent to look for 'devilbunnies rabbits' - and it
attempted to index an archive with in excess of 50,000 saved articles that
was firewalled with a robots.txt file precisely to *prevent* such an
occurrence. It crashed the agent program a few hundred articles into the
attempt, after firing a 'machine gun' series of requests at the server.
Not server friendly at all. Lastly - rather than being self-limiting by
default (halting when it stops finding high-relevance pages or when it
reaches a certain amount of found material) - its default is to try to
explore the entire net, starting from its 'high grade' list drawn from
existing general-purpose search engines. To top it all off - it has a
'fire and forget' mode where you can leave it running for
hours...days...weeks...
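The self-limiting behavior described above - consult robots.txt before
fetching, and pause between requests instead of machine-gunning the
server - is simple to sketch. A minimal illustration using Python's
standard urllib.robotparser; the agent name, delay value, and URLs are
assumptions for the example, not details of the reviewed product:

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical agent name and pacing - any real robot picks its own.
USER_AGENT = "ExampleAgent/0.1"
REQUEST_DELAY = 1.0  # seconds between requests; illustrative value

def allowed_by_robots(robots_txt_lines, url, user_agent=USER_AGENT):
    """Return True only if the site's robots.txt permits fetching url."""
    rp = RobotFileParser()
    rp.parse(robots_txt_lines)  # lines of an already-fetched robots.txt
    return rp.can_fetch(user_agent, url)

def polite_fetch_order(robots_txt_lines, urls, delay=REQUEST_DELAY):
    """Yield only the URLs a polite robot would fetch, one at a time,
    pausing between requests rather than firing them back to back."""
    for url in urls:
        if allowed_by_robots(robots_txt_lines, url):
            yield url
            time.sleep(delay)  # throttle: one slow request at a time
```

With a robots.txt of "User-agent: *" / "Disallow: /archive/", a robot
built this way would never have entered the 50,000-article archive at
all, let alone at machine-gun speed.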

This is very close to a 'worst case' robot - the only things that could
make it worse would be integrating it directly into a web browser or
allowing it to fire parallel requests rather than the 'one request at a
time' approach it uses now.

--
Benjamin Franz