[Fwd: Re: To: ???? Robot]

Rob Turk (rturk@austin.ibm.com)
Mon, 29 Apr 1996 10:13:21 -0500


This is a multi-part message in MIME format.

--------------3F54FF6ABD
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Well, this has to do with robots only in that what they want to do
shouldn't be done with robots. I'm offering this to the developer
community for opinions.

What do you guys think?

--------------3F54FF6ABD
Content-Type: message/rfc822
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

Sender: rturk@austin.ibm.com
Message-Id: <3184DA93.59E2@austin.ibm.com>
Date: Mon, 29 Apr 1996 10:04:51 -0500
From: Rob Turk <rturk@austin.ibm.com>
Organization: IBM Worldwide AIX Support Tools Development
X-Mailer: Mozilla 3.0b2 (X11; I; AIX 1)
Mime-Version: 1.0
To: Mitchell Elster <elsterm@bwcc.com>
Cc: rturk@megalith.com
Subject: Re: To: ???? Robot
References: <19960427233806.0493ef80.in@BitMaster>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Mitchell Elster wrote:
>
> I said:
> >> Goal: To create a searchable database of e-mail addresses

The web is already searchable, and sometimes the documents that make it up contain <LINK..>
tags that refer back to the author of the document. Most of the e-mail addresses found in a
given search would simply belong to web freaks...not the kind of people involved in things that
P.I.s care about...though that could make a mighty "fresh" episode of yet another cop show.
Don't you think?

Most of the e-mail-like strings of text in web docs relate to the author of the document.
Others go to mailing lists or are forwarded to different people depending on the content
of the message. This leads me to believe that searching the web for e-mail addresses
would not be the best way to accomplish your client's goals.
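
To make that point concrete, here is a rough sketch of how one might sort the addresses on
a page into "author" and "everything else". It assumes the author is declared with the
<LINK REV="made" HREF="mailto:..."> convention; the class name, function name, and sample
page are made up for illustration, and real pages are a lot messier than this.

```python
from html.parser import HTMLParser


class AuthorLinkParser(HTMLParser):
    """Collect mailto: targets, split by whether they come from a
    <LINK REV="made"> element (assumed author) or from anywhere else."""

    def __init__(self):
        super().__init__()
        self.author = []  # addresses declared via <LINK REV="made">
        self.other = []   # any other mailto: HREF found in the page

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attrs = dict(attrs)
        href = attrs.get("href") or ""
        if not href.lower().startswith("mailto:"):
            return
        address = href[len("mailto:"):]
        rev = (attrs.get("rev") or "").lower()
        if tag == "link" and rev == "made":
            self.author.append(address)
        else:
            self.other.append(address)


def classify_addresses(html_text):
    """Return (author_addresses, other_addresses) for one HTML document."""
    parser = AuthorLinkParser()
    parser.feed(html_text)
    parser.close()
    return parser.author, parser.other


# Hypothetical sample page, not taken from any real site.
SAMPLE = """<HTML><HEAD>
<LINK REV="made" HREF="mailto:author@example.com">
</HEAD><BODY>
<A HREF="mailto:list@example.org">Join the list</A>
</BODY></HTML>"""

if __name__ == "__main__":
    print(classify_addresses(SAMPLE))
```

Either way, what you learn is who wrote the page or which list it points to, which is not
the same as finding a particular person.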
>
> I Said:
> >> I've got a robot I'm thinking of creating. Only I don't care about indexing
> >>HTML Docs. I'm looking for people. Any help will be appreciated.

>
> I reply:
> I'm looking to index only in the U.S.A. starting with the East Coast. For
> the amount of information that I am looking to index, I plan to start with a
> 500meg SQL Server database, scaling up to 2-4gig, as necessary. As I have NO
> idea how many records would be generated (millions I'm sure), I plan to
> start with only the east coast, monitoring closely the data being collected.
>

Okay, the web is this *distributed* network of networks. It has few geographical references
that would enable one to say "This occurred in _____________" about any given document. A
good many of the documents that a search for one of my old e-mail addresses would turn up
refer to Austin, TX. Any particular e-mail address you found in one of those documents
may belong to someone who lives in Austin, TX. Perhaps not.
For a P.I., the way to get names is to offer to buy them from websites willing to sell in the
market that you're pursuing.

See, if people are filling in a form, say, to register for the Annual Private Investigators
Ball in Baltimore, Maryland, then you'd be able to say "These people are going to be in
Maryland (East Coast, remember?) on this or that date." Now, that information you could
BROKER to the P.I.s themselves, or anyone else who wants to know. Kinda scary, isn't it? See,
this kind of thing would make users less willing to fill out CGI forms that ask for
personal information, a commodity that must be respected by merchants in the new economic
frontier. If the merchants abuse the personal information of their customers, then they
will lose customers. That's basic business offline, but for some reason it is mostly ignored by
all the P.I.s with $$$ in their eyes, and all their insurance-salesperson ilk.

But really, you don't need a robot to do what you're asking, although the process of making
a website's CGIs manage visitor information is VERY similar to writing a robot.

-- 
Rob Turk <mailto:rturk@austin.ibm.com> Unofficially Speaking.

--------------3F54FF6ABD--