BackRub robot

Larry Page (page@CS.Stanford.EDU)
Thu, 13 Jun 1996 11:44:13 -0400


I've fixed the rate problem. I believe it was only happening rarely, which
is why I didn't notice it sooner.

However, I don't have an easy way to determine whether a response has come
back before firing off a new request, other than making the time between
hits large. This is because of the way my robot is distributed. Do people
think this is reasonable?
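
A minimal sketch of the "large time between hits" idea, for illustration
only. It assumes all requests to a given host pass through one shared
throttle, which a distributed robot would have to arrange somehow. The names
HostThrottle, wait_time and record_hit are hypothetical and not taken from
BackRub.

```python
import time


class HostThrottle:
    """Tracks the earliest time each host may be hit again (hypothetical helper)."""

    def __init__(self, min_interval_s, clock=time.monotonic):
        self.min_interval_s = min_interval_s  # minimum gap between hits to one host
        self.clock = clock                    # injectable so the logic can be tested
        self.next_allowed = {}                # host -> earliest time of next request

    def wait_time(self, host):
        """Seconds the caller should still wait before hitting this host."""
        return max(0.0, self.next_allowed.get(host, 0.0) - self.clock())

    def record_hit(self, host):
        """Note that a request to this host is being sent now."""
        self.next_allowed[host] = self.clock() + self.min_interval_s
```

A fetcher would sleep for wait_time(host), call record_hit(host), and then
send the request. This does not confirm that the previous response arrived;
it only makes the gap long enough that overlap becomes unlikely.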

I was also thinking that something we all might collaborate on would be a
list of sites that a robot shouldn't index at all, or should index only at
the top levels. This might save the net a lot of bandwidth, and everyone a
lot of hassle. We could develop a database to hold this information. Besides
being useful to us, it might provide a good forum to show site designers,
with real examples, why they should or shouldn't do certain things.
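
One way such a shared list might be consulted, as a rough sketch. The table
layout, the host names, and the functions path_depth and may_index are all
assumptions made for illustration; no such database exists yet.

```python
from urllib.parse import urlsplit

# Hypothetical entries: host -> maximum path depth a robot should index.
# None means "do not index this site at all".
SITE_LIMITS = {
    "noindex.example": None,
    "toponly.example": 1,
}


def path_depth(url):
    """Number of non-empty path segments in the URL."""
    return len([seg for seg in urlsplit(url).path.split("/") if seg])


def may_index(url, limits=SITE_LIMITS):
    """True if the list allows a robot to index this URL."""
    host = urlsplit(url).hostname
    if host not in limits:
        return True               # unlisted sites are unrestricted
    max_depth = limits[host]
    if max_depth is None:
        return False              # site asked not to be indexed
    return path_depth(url) <= max_depth
```

With the sample table, may_index("http://toponly.example/a") is true while
may_index("http://toponly.example/a/b") is false, and nothing on
noindex.example is indexed.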

I am using the data to do clustering and some economic models of the web.
I'll send mail to this list when I have my query engine up.

Sorry for any problems,
-Larry

>FYI, the following robot
>
>huron.stanford.edu backrub@pcd.stanford.edu:BackRub/0.5
> and
>grand.stanford.edu backrub@pcd.stanford.edu:BackRub/0.5
>
>is hitting a site once a second and isn't waiting for responses
>before firing off new requests. The owner has been notified.
>