[...]Re: Cache Filler

Benjamin Franz (snowhare@netimages.com)
Fri, 29 Nov 1996 08:34:40 -0800 (PST)


On Fri, 29 Nov 1996, Nigel Rantor wrote:

>
> Well I have read a lot of the archived stuff on this group, and consumed
> Martijn Koster's pages. I expect to conform to robots.txt, deal with
> relative links including the '.' and '..' directories, use raw IP
> addresses to index previously visited servers to get around aliasing,
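
As an aside on the relative-link handling mentioned above: '.' and '..'
segments come down to an ordinary URL join against the page the link was
found on. A minimal sketch in Python, purely illustrative and not anyone's
actual crawler code (the example URL is made up):

    from urllib.parse import urljoin, urldefrag

    def resolve_link(base_url, href):
        # Resolve a possibly-relative href (including '.' and '..'
        # segments) against the page it appeared on, dropping #fragments.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        return absolute

    # '..' climbs out of the current directory:
    resolve_link("http://example.com/docs/tools/index.html", "../faq.html")
    # -> "http://example.com/docs/faq.html"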

While keying on raw IP addresses was a good idea when every server had a
unique IP address, it no longer is. Newer servers can host several distinct
virtual servers on a single IP address, so an IP address no longer
identifies one site.
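
To make that concrete, here is a minimal sketch (Python, with made-up
hostnames) of why keying visited servers on the resolved IP address
conflates distinct virtual hosts, and what a safer key might look like:

    import socket
    from urllib.parse import urlsplit

    def ip_key(url):
        # The old approach: identify a server by its resolved IP address.
        # Two name-based virtual hosts on one machine get the same key.
        return socket.gethostbyname(urlsplit(url).hostname)

    def server_key(url):
        # Safer: identify a server by hostname and port, so distinct
        # sites that happen to share an IP address stay distinct.
        parts = urlsplit(url)
        return (parts.hostname.lower(), parts.port or 80)

    # Hypothetically, www.foo.example and www.bar.example might both
    # resolve to 192.0.2.1 yet serve completely different content:
    # ip_key() would say "already visited"; server_key() keeps them apart.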

> possibly limit depth of searches (although this depends on the site), and
> all that other stuff to make it a 'nice' robot...
>
> I'll be on this group from now on to catch any other ideas or proposals
> for 'bots and if they apply I guess I'll try to stick to it.
>
> Apart from that I'll accept any suggestions of what [not] to do,

I am concerned that this proposed robot would increase network traffic by
prefetching pages that may never be viewed at all, solely to improve the
response time for the *first* person to visit the site. Subsequent visitors
would see no improvement over what they would get without prefetching,
since the first visitor will already have caused those pages to be loaded
into your caching proxy's cache.
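
To put rough numbers on that (a back-of-the-envelope sketch; the figures
are invented): if the robot prefetches every link on a page but visitors
only ever follow a few of them, the origin server pays for every fetch
while the hit rate for later visitors is unchanged.

    def origin_fetches(pages_linked, pages_viewed, prefetch):
        # Origin-server fetches for one site behind the cache.
        # Without prefetching, each distinct page is fetched once, on its
        # first view (a miss); every later view is a hit.  With
        # prefetching, every linked page is fetched whether viewed or not.
        return pages_linked if prefetch else pages_viewed

    # A hypothetical page with 50 links of which visitors only ever view 5:
    origin_fetches(50, 5, prefetch=False)  # 5 fetches, all demand-driven
    origin_fetches(50, 5, prefetch=True)   # 50 fetches, 45 never viewed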

-- 
Benjamin Franz

_________________________________________________
This message was sent by the robots mailing list. To unsubscribe, send mail
to robots-request@webcrawler.com with the word "unsubscribe" in the body.
For more info see http://info.webcrawler.com/mak/projects/robots/robots.html