Re: Inktomi & large scale spidering

Otis Gospodnetic (otisg@panther.middlebury.edu)
Sun, 26 Jan 1997 12:03:59 -0500 (EST)


> Date: Sun, 26 Jan 1997 12:37:04 +0000
> From: Martin Hamilton <martin@mrrl.lut.ac.uk>
> To: robots@webcrawler.com
> Subject: Re: Inktomi & large scale spidering
>
> andy@andy.net writes:
>
> | I think we can all agree the aliweb model was not the greatest idea.
> | If webmasters have to create the index manually it will not gain
> | widespread support. If somebody were to invent a nice clean Java
> | app that would form the index and took virtually no time to set up
> | perhaps you could get some support. However, there were a number
> | of great Perl scripts to form site indices for aliweb and very few
> | sites used them. Would an applet be enough less configuration
> | and hassle that people might actually use it??? Nobody knows. I
> | guess there is only one way to find out, but I'm not going to rush
> | out and write it.
>
> I suspect the most effective approach would be to make this automatic
> index generation a feature of the WWW server itself - i.e. something
> which the server admin doesn't have to install & configure separately.
> Let them add manually generated metadata, and let the indexer harvest
> author-supplied metadata (e.g. embedded in HTML docs), but don't rely
> on either of these. Oh, and ship the HTTP daemon with this feature
> turned on by default ;-)

You took the words right out of my mouth.
No separate applet, script, or whatever.
Here, we have Rob Hartill on this mailing list, and we all know how much he
loves robots roaming around his sites. Since he works for/with the Apache
Group, and since Apache is the most popular Web server, maybe he could do
something to get this feature included in future versions of Apache.
Netscape's Web servers already have tons of configurable options accessed via
a Web GUI, so adding something like an On/Off switch for this local index
would be easy; persuading Netscape's developers would be the only 'difficult'
task.
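
To make the idea a bit more concrete, here is a rough sketch of what such a
built-in indexer might do: walk the document root, pull the title and any
author-supplied META tags out of each HTML file, and keep one record per
document. It is written in Python purely for illustration. The names
(build_site_index, MetaExtractor, the record fields) are made up for this
example and do not correspond to any existing Apache or Netscape feature.

```python
import os
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collects the <title> text and <meta name=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            a = dict(attrs)
            name, content = a.get("name"), a.get("content")
            if name and content:
                # Author-supplied metadata, e.g. description or keywords.
                self.meta[name.lower()] = content

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def build_site_index(doc_root):
    """Walk doc_root and return one record per HTML file, sorted by URL path.

    Hypothetical helper: a real server would presumably run something
    like this on its own, with no separate install or configuration.
    """
    records = []
    for dirpath, _dirnames, filenames in os.walk(doc_root):
        for fname in filenames:
            if not fname.lower().endswith((".html", ".htm")):
                continue
            full = os.path.join(dirpath, fname)
            parser = MetaExtractor()
            with open(full, encoding="utf-8", errors="replace") as fh:
                parser.feed(fh.read())
            parser.close()
            rel = os.path.relpath(full, doc_root).replace(os.sep, "/")
            records.append({
                "url": "/" + rel,
                "title": parser.title.strip(),
                # Metadata is optional; the index does not rely on it.
                "description": parser.meta.get("description", ""),
                "keywords": parser.meta.get("keywords", ""),
            })
    return sorted(records, key=lambda r: r["url"])
```

Documents without META tags still get an entry, so the index does not depend
on authors supplying metadata. A robot could then fetch this one index from
the server instead of crawling every page.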

Am I missing an obvious obstacle to this? Comments?

Otis

_________________________________________________
This message was sent by the robots mailing list. To unsubscribe, send mail
to robots-request@webcrawler.com with the word "unsubscribe" in the body.
For more info see http://info.webcrawler.com/mak/projects/robots/robots.html