>
> Paul Francis writes...
> > Well, though I'm curious about your answer, I think
> > even this question is besides the point. I think
> > anything that hits a site automatically and frequently
> > enough to be noticed can legitimately be considered
> > a robot.
...
> > you ran was not a robot), and am curious as to whether
> > or not other people feel the same way.
I'm not really sure about the "description" of a robot above or the
"definition" below; both make sense in their own way.
Looking at it from another angle, I would suggest not calling a program a
robot if it "automatically" and regularly (perhaps every few days) accesses a
site its user is particularly interested in, to check whether that site has
changed. This could be "frequently enough to be noticed", but since the user
himself would/could do the same, I would say it is simply an agent acting on
the user's behalf, not a robot.
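Such an agent can be a very small program. The sketch below is one possible
shape for it, assuming a hypothetical target URL and check interval; it
fetches exactly one page per interval, follows no links, and merely compares
a hash of the page against the previous fetch:

```python
import hashlib
import time
import urllib.request

# Hypothetical page the user cares about, and a "every few days" interval.
URL = "http://example.com/index.html"
CHECK_INTERVAL = 3 * 24 * 60 * 60  # seconds

def fingerprint(content):
    """Reduce a page to a short hash so two fetches can be compared."""
    return hashlib.md5(content).hexdigest()

def has_changed(previous, content):
    """True when the page differs from the last fetch (or was never seen)."""
    return previous is None or fingerprint(content) != previous

def watch(url):
    """One fetch per interval -- no link-following, no recursion, no index."""
    last = None
    while True:
        with urllib.request.urlopen(url) as resp:
            body = resp.read()
        if has_changed(last, body):
            print(url, "has changed")
            last = fingerprint(body)
        time.sleep(CHECK_INTERVAL)
```

Nothing here "traverses many pages recursively", which is why it seems more
like a stand-in for the user than a robot in the quoted sense.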
>
> I guess the only thing I can say is that every definition of robot
> always reads something like (from the 'standard for robot exclusion')
>
> WWW Robots (also called wanderers or spiders) are programs that
> traverse many pages in the World Wide Web by recursively retrieving
> linked pages.
>
> which isn't what I was doing (taking a fixed list of URL's and testing
> their reachability). You seem to be advocating not only expanding the
> definition of robot to "any automated access to a site" but "any access
> that gets noticed by the admin." My test, again, didn't "learn" new
> URLs, engage in any recursion, or even index anything.
Testing URLs in the way discussed here might not be equivalent to what I
proposed above, because a human cannot access several hundred URLs in, let's
say, half an hour; or if he can access them in that time, he probably can't
do anything else but load pages.
So I think the comparison to a human user is significant for the distinction
between robots and other programs (agents) that automatically access Web
pages.
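For concreteness, the reachability test described above could look roughly
like this (the URL list is a placeholder; the real list and client are not
given in the thread). It issues a single HEAD request per URL from a fixed
list, learns no new URLs, and indexes nothing:

```python
import urllib.error
import urllib.request

# Hypothetical fixed list -- no new URLs are discovered along the way.
URLS = [
    "http://example.com/",
    "http://example.org/missing.html",
]

def check(url, timeout=10.0):
    """Classify one URL as reachable or not with a single HEAD request."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return "OK %d" % resp.status
    except urllib.error.HTTPError as e:
        return "HTTP %d" % e.code
    except (urllib.error.URLError, OSError):
        return "unreachable"

def run(urls):
    """Hundreds of these finish in minutes -- far faster than a human could."""
    return {u: check(u) for u in urls}
```

The speed is exactly the point of contention: the program does nothing a
human with a browser couldn't do, but it does it at a rate no human could
sustain.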
Isn't that, in some sense, what makes us see a difference between a browser
or a cache and a robot?
Mike