Re: RFC, draft 1

Denis McKeon (dmckeon@swcp.com)
Sat, 16 Nov 1996 07:31:39 -0700


In <199611160416.XAA05928@armigeron.com>,
Captain Napalm <spc@armigeron.com> wrote:
>It was thus said that the Great Martijn Koster once stated:
>...
>> Robots need to be aware that the amount of resources spent on dealing
>> with the /robots.txt is a function of the file contents, which is not
>> under the control of the robot. To prevent denial-of-service attacks,
>> robots are therefore encouraged to place limits on the resources
>> spent on processing of /robots.txt.
>>
> Could you clarify what you mean by this?

I stumbled over it also - at first I thought that the agent was making
many requests for /robots.txt - but I think what Martijn means is this:

Scenario: a server admin notices that some agents fetch /robots.txt but
fail to honor its contents - so the admin has the server answer requests
for /robots.txt from those agents with the output of a script, arranged
to encourage a poorly written agent to spend a lot of time parsing the
output. The web server is trying to deny the agent access to its
services.
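
The defense the draft suggests is for the robot to put a fixed ceiling
on every unbounded quantity - connection time, bytes read, lines parsed -
so that ceiling is chosen by the robot, not by the server. A minimal
sketch in modern Python, purely for illustration; the caps (MAX_BYTES,
MAX_LINES, FETCH_TIMEOUT) are arbitrary values I picked, not anything
from the draft:

    import urllib.request

    MAX_BYTES = 64 * 1024  # stop reading after 64 KB, however much the server sends
    MAX_LINES = 1000       # stop parsing after 1000 lines
    FETCH_TIMEOUT = 10     # seconds to wait on the network

    def fetch_robots_txt(host):
        # Fetch /robots.txt with hard limits on time and size, so a
        # hostile or broken server can't tie the robot up indefinitely.
        url = "http://%s/robots.txt" % host
        try:
            with urllib.request.urlopen(url, timeout=FETCH_TIMEOUT) as resp:
                # read() with a size argument returns at the cap even if
                # the server keeps streaming data
                body = resp.read(MAX_BYTES)
        except OSError:
            return None  # fetch failed; apply whatever fallback policy you use
        lines = body.decode("latin-1", "replace").splitlines()
        return lines[:MAX_LINES]  # ignore anything past the line cap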

It would be like calling an 800 number for tech support and finding
that the music on hold seems designed to encourage many callers to hang
up - much cheaper than hiring more humans to answer calls, but still
possible to advertise "free tech support."

-- 
Denis McKeon 
dmckeon@swcp.com
_________________________________________________
This message was sent by the robots mailing list. To unsubscribe, send mail
to robots-request@webcrawler.com with the word "unsubscribe" in the body.
For more info see http://info.webcrawler.com/mak/projects/robots/robots.html