RE: The Internet Archive robot (fwd)

Brewster Kahle (brewster@archive.org)
Fri, 06 Sep 1996 20:41:38 -0700

Next message: Kevin Hoogheem: "Re: Looking for subcontracting spider-programmers"
Previous message: Robert Stober: "(no subject)"
Next in thread: Fred Douglis: "Re: The Internet Archive robot (fwd)"
Reply: Fred Douglis: "Re: The Internet Archive robot (fwd)"

Ted,

I am the president of the Internet Archive. A detailed background on what
we are doing is at www.archive.org. The San Jose Mercury (a local paper)
wrote a piece last week:
http://www.sjmercury.com/business/gillmor/dg090196.htm

We are funding this through donations of equipment and data, and I am
funding the cash part.

I think the net is important, and there are reasons to save it for historians
and scholars. I also think there are things we can do with this data to
understand the internet phenomenon that could lead to better internet tools.

If you have any further questions, I would be happy to answer.

-brewster

>---------- Forwarded message ----------
>Date: Fri, 6 Sep 1996 16:42:04 -0700
>From: Ted Sullivan <tsullivan@snowymtn.com>
>Reply-To: robots@webcrawler.com
>To: "'robots@webcrawler.com'" <robots@webcrawler.com>
>Subject: RE: The Internet Archive robot
>
>Can you tell us who you are collecting this information for? Surely
>nobody except governments or related agencies would be interested or
>able to fund such an effort. Not that I can stop you I suppose from
>grabbing and indexing my site ..... but I would wonder at your liability
>for giving erroneous data collected at mine or my customers' sites to
>others after it has been corrected at the original source.
>
>Ted Sullivan
>
>>----------
>>From: Mike Burner[SMTP:burner@archive.org]
>>Sent: Thursday, September 05, 1996 8:46 PM
>>To: robots@webcrawler.com
>>Subject: The Internet Archive robot
>>
>>Hello World,
>>
>>The Internet Archive robot, which will identify itself as "ia_archiver" in
>>the "User-Agent:" HTTP header field, will begin archiving the Web over the
>>next few days.
>>
>>In the short term, the archiver will focus only on images; the lists of
>>these have been derived from our existing HTML feeds, kindly donated to us
>>by several of the commercial search engines.
>>
>>The archiver will obey the Standard for Robot Exclusion, and will take pains
>>to tread softly on the surface of the Net. If anyone feels we have failed
>>in either case, please let us know so that we can rectify the problem as
>>quickly as possible.
>>
>>Internet Archive is gathering, storing, and providing access to public
>>materials on the Internet such as the World Wide Web, Netnews, and
>>downloadable software. The collection, reaching ten terabytes, will provide
>>historians, researchers, scholars, and others access to this vast collection
>>of data, and ensure the longevity of the information.
>>
>>For more information, including how you can help the Archive, please visit
>>our web site (http://www.archive.org).
>>
>>Mike Burner
>>
>>
>
>
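
The announcement quoted above says the robot identifies itself as
"ia_archiver" in the "User-Agent:" header and obeys the Standard for Robot
Exclusion. As a rough illustration of what that means for a site operator,
here is a minimal sketch in Python using the standard library's
urllib.robotparser. The robots.txt contents, the /private/ path, the
www.example.com host, and the is_allowed helper are all assumptions made up
for the example; this is not the Archive's own code, only one way to check
how a compliant robot would be expected to read such a file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a site might serve to keep ia_archiver
# out of one directory while leaving the rest of the site open.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/images/logo.gif", "/private/logo.gif"):
        url = "http://www.example.com" + path
        print(path, is_allowed(ROBOTS_TXT, "ia_archiver", url))
```

A robot that honors the exclusion standard would run a check like this
before each request and skip any URL for which the answer is False.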
