Good question. For short-lived experiments like this, I
usually just use argv[0]. Probably something like
"mycheckurls".
> And, if
> you don't think it is a robot, a cache, or a proxy,
> what do you think it is? If you have some animal that
> is none of these things, then we either must expand
> our definition of the animals we have, or define some
> new animals.
Why call it an "animal" at all? Because it's automated? It
was me checking the validity of (and gathering some statistics
about) some URLs with a program.
Lemme pose a slightly more complete scenario for you, just to see
if it (my program) and its policy make more sense in context.
Imagine you are writing a cache server, like the Netscape proxy
server, and you let it run for a while at a busy site, caching up a
few gigs of documents.
Then, to test your caching policy, you want to evaluate how many of
the cached documents are valid _right now_. You might use a program
like mine, one that simply records whether each document is available
and when it was last changed, and run it against all of the URLs in
your cache.
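To make the idea concrete, here's a rough Python sketch of that kind
of check (names and details are made up, not the actual program):

    import urllib.error
    import urllib.request

    def check(url, agent="mycheckurls"):
        """HEAD one URL; report whether it is reachable and when it
        last changed."""
        req = urllib.request.Request(url, method="HEAD",
                                     headers={"User-Agent": agent})
        try:
            with urllib.request.urlopen(req, timeout=30) as resp:
                # Last-Modified is what the cache would compare
                # against its own copy of the document.
                return True, resp.status, resp.headers.get("Last-Modified")
        except (urllib.error.URLError, OSError) as err:
            return False, None, str(err)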
Well, would you then want that program to obey the exclusion standard? You
wouldn't want it zapping servers with thousands of requests in a few seconds,
but isn't one HEAD every couple of minutes slow enough? Does such an application
really necessitate any 'exclusion' at all?
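The whole run would just be that check() paced out over the cache's
URL list, roughly:

    import sys
    import time

    # Read one cached URL per line on stdin and check each in turn,
    # pausing two minutes between HEADs so no server ever sees a burst.
    for line in sys.stdin:
        url = line.strip()
        if url:
            print(url, *check(url), sep="\t")
            time.sleep(120)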
-- nabil@i.net http://www.i.net/aaron/