Re: It's not only robots we have to worry about ...

Rob Hartill (robh@imdb.com)
Thu, 26 Dec 1996 13:43:24 +0000 (GMT)


Captain Napalm wrote:

> With some of the recent talk about broken robots, I thought I'd share
>some data I've collected that shows that robots aren't the only things that
>are broken.
>
> In trying to track down problems and bugs, I've enabled extensive logging
>into the meta-search engine at Cyber411. Frankly, I'm amazed at what I'm
>getting.

It's not until you look that you discover the volume of junk coming from
clients... robots, agents and browsers.

Your comment about HEAD and CGI: I had to block all HEAD requests to
CGI some time ago. The accidental abuse just got too much with every man
and his dog wanting to re-check old CGI urls at alarming rates.
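
As a rough sketch of that kind of block (not the actual server
configuration), the check amounts to rejecting any HEAD request whose path
points at a CGI script. The "/cgi-bin/" prefix and the function name are
assumptions made for illustration only.

```python
def is_blocked_head(method, path, cgi_prefixes=("/cgi-bin/",)):
    """Return True for a HEAD request aimed at a CGI path.

    cgi_prefixes is a hypothetical list of URL prefixes that map to CGI
    scripts; adjust it to wherever CGI actually lives on the server.
    """
    is_head = method.upper() == "HEAD"
    is_cgi = path.startswith(tuple(cgi_prefixes))
    return is_head and is_cgi


if __name__ == "__main__":
    # HEAD to a CGI script is refused; GET to the same script is allowed.
    print(is_blocked_head("HEAD", "/cgi-bin/search"))
    print(is_blocked_head("GET", "/cgi-bin/search"))
```

HEAD requests for static files pass through untouched, so well-behaved
link checkers can still validate ordinary pages.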

A day doesn't go by where I don't add a new USER-AGENT to be blocked or
even an entire site/domain.
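
A block list like that can be sketched as two checks, one on the USER-AGENT
string and one on the client's host name. The entries below are made-up
placeholders, not the real list, and the function name is hypothetical.

```python
# Hypothetical entries for illustration only.
BLOCKED_AGENT_SUBSTRINGS = ["badbot-example"]
BLOCKED_DOMAINS = ["abusive.example"]


def is_blocked_client(user_agent, remote_host):
    """Return True if the client matches a blocked agent or site/domain."""
    agent = (user_agent or "").lower()
    if any(fragment in agent for fragment in BLOCKED_AGENT_SUBSTRINGS):
        return True
    host = (remote_host or "").lower().rstrip(".")
    # Match the domain itself and any host underneath it.
    return any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS)


if __name__ == "__main__":
    print(is_blocked_client("BadBot-Example/1.0", "dialup7.isp.example"))
    print(is_blocked_client("Mozilla/3.0", "proxy.abusive.example"))
    print(is_blocked_client("Mozilla/3.0", "dialup7.isp.example"))
```

Matching on a domain suffix rather than a single host is what makes it
possible to shut out an entire site in one entry.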

What continues to baffle me is the lack of systematic abuse checking
that goes on (or is reported). I've also seen reports such as "NetJet
only accounts for 4% of our traffic so isn't a problem", when that
4% of traffic is coming from 0.01% of their users. Duh, wake up people,
you're going to have to double your server capacity just to keep up with
your existing userbase if more of them start using NetJet and other
"buy now, let everyone else pay later" devices.

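To put rough numbers on that, here is a back-of-the-envelope calculation
using only the 4% and 0.01% figures quoted above. It assumes a fixed user
base, unchanged per-user behaviour, and that new NetJet users are existing
users who switch; the function names are made up for illustration.

```python
def per_user_ratio(traffic_share, user_share):
    """How many times more traffic a heavy user generates than an ordinary one."""
    heavy = traffic_share / user_share
    ordinary = (1 - traffic_share) / (1 - user_share)
    return heavy / ordinary


def user_share_for_traffic_multiple(traffic_share, user_share, multiple):
    """Fraction of users who must be heavy for total traffic to reach
    `multiple` times the current level, with the user base held fixed."""
    heavy = traffic_share / user_share
    ordinary = (1 - traffic_share) / (1 - user_share)
    # Total traffic as a function of heavy fraction f:
    #   ordinary + f * (heavy - ordinary)
    return (multiple - ordinary) / (heavy - ordinary)


if __name__ == "__main__":
    print(per_user_ratio(0.04, 0.0001))
    print(user_share_for_traffic_multiple(0.04, 0.0001, 2))
```

Under those assumptions each such user generates a bit over 400 times the
traffic of an ordinary one, and total traffic doubles once roughly 0.26% of
the existing userbase has switched.
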
rob
_________________________________________________
This message was sent by the robots mailing list. To unsubscribe, send mail
to robots-request@webcrawler.com with the word "unsubscribe" in the body.
For more info see http://info.webcrawler.com/mak/projects/robots/robots.html