Re: Single tar (Re: Inktomi & large scale spidering)

Martin Hamilton (martin@mrrl.lut.ac.uk)
Tue, 11 Feb 1997 18:44:22 +0000



Erik Selberg writes:

| I think the only reasonable solution in a lot of ways is to make a
| spider which is attached to a server. This spider would then create
| the tarballs (and you could have one big file, as well as a week's
| worth of changes in another for incremental updating). The spider
| could also do other useful things, like make sure you got all your
| scripting done right, you didn't forget any links, etc. etc. etc.

I think there are two sides to this - one is how you choose to build
your local index data for exporting to the outside world, and the
other is the format(s) and protocol(s) you use to do the exporting.

What I'm really interested in is whether robot authors can be
persuaded to pick up index data in a small number (ideally one?) of
common formats and via a small number of common protocols (one?!).
For example: SOIF and the Harvest Gatherer protocol, or RDM over HTTP.
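
To make the "common format" idea concrete, here is a minimal sketch of
writing and reading one SOIF-style record. It assumes the usual SOIF
layout of "@TYPE { URL", then "Name{size}:<TAB>value" attribute lines,
then a closing "}". The function names, the example URL and the
attribute names are all hypothetical, and this is not the Harvest
implementation.

```python
# Sketch only: a SOIF-like record writer and reader.
# to_soif/from_soif are hypothetical names, not part of Harvest.

def to_soif(template_type, url, attributes):
    """Serialize a dict of string attributes as one SOIF-style record."""
    lines = ["@%s { %s" % (template_type, url)]
    for name, value in attributes.items():
        size = len(value.encode("utf-8"))  # size counts bytes, not characters
        lines.append("%s{%d}:\t%s" % (name, size, value))
    lines.append("}")
    return "\n".join(lines) + "\n"


def from_soif(text):
    """Parse one record written by to_soif into (type, url, attributes)."""
    data = text.encode("utf-8")
    header, _, rest = data.partition(b"\n")
    template_type, _, url = header[1:].partition(b" { ")
    attributes = {}
    while not rest.startswith(b"}"):
        name, _, rest = rest.partition(b"{")
        size, _, rest = rest.partition(b"}:\t")
        n = int(size)
        # Read exactly n bytes, so values may themselves contain newlines.
        attributes[name.decode("utf-8")] = rest[:n].decode("utf-8")
        rest = rest[n + 1:]  # skip the value and its trailing newline
    return template_type.decode("utf-8"), url.decode("utf-8"), attributes


record = to_soif("FILE", "http://www.example.com/",
                 {"Title": "Hello", "Type": "HTML"})
parsed = from_soif(record)
```

Because each value carries its byte count, a robot can read records
without escaping rules, which is part of the appeal of agreeing on a
single format.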

How about it?

Cheerio,

Martin

_________________________________________________
This message was sent by the robots mailing list. To unsubscribe, send mail
to robots-request@webcrawler.com with the word "unsubscribe" in the body.
For more info see http://info.webcrawler.com/mak/projects/robots/robots.html