Re: in-document directive to discourage indexing ?

Rob Turk (rturk@austin.ibm.com)
Tue, 25 Jun 1996 12:16:53 -0500


Kevin Hoogheem wrote:
>
> Well, you'll notice I didn't say it was a mandatory thing each robot
> should have to do. I was merely suggesting it as something some robots
> might want to do, or it would probably come down to a robot that
> was meant just for this type of activity.

I said it would be a breach of netiquette for bot authors to have their
programs notify ANYONE but the author or user of that bot about anything
they find. Perhaps it wouldn't be. But for a bot to be a viable
surrogate for a person browsing lots and lots of webspace, each agent
should report its findings to one person: its user/author/master, not
every e-mail destination it can get its grubby little mittens on.
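
To make that concrete -- purely as a sketch, not anything anyone is
actually running -- a well-behaved link checker keeps what it finds to
itself and hands the report, if at all, to its own operator. Everything
in the snippet below (the addresses, the URLs, the choice of Python) is
made up for illustration:

    # Sketch: a link checker that reports only to its own operator.
    # OPERATOR and SEED_URLS are made-up placeholders.
    import urllib.request
    import urllib.error

    OPERATOR = "bot-owner@example.com"   # the bot's own user, nobody else
    SEED_URLS = [
        "http://www.example.com/",
        "http://www.example.com/missing.html",
    ]

    def check(url):
        """Return an error string for a dead link, or None if it looks OK."""
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                return None if resp.status < 400 else "HTTP %d" % resp.status
        except (urllib.error.URLError, OSError) as exc:
            return str(exc)

    if __name__ == "__main__":
        # The whole point: the report goes to the operator and nobody else.
        print("Report for %s only:" % OPERATOR)
        for url in SEED_URLS:
            err = check(url)
            if err:
                print("  %s -> %s" % (url, err))

The point of the sketch is the single, fixed destination: nothing in it
ever goes hunting for a webmaster address at the sites it visits.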

> Oh well, it was just a thought.
> Also a serious breach of netiquette? Come on -- I would say it is more
> of a breach to write a crappy, poorly written robot that bombs
> someone's sites trying to collect data, or to write an OS like
> windoze ;)-

Yes, that seems like an uncool thing to do, which is why, if you read up
on bots and agents, everyone says that bot authors should control their
creations and not release them on the net until they have been proven to
perform well on a private network or machine. It's not just netiquette;
it's common sense.
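
If it helps, here is the sort of thing I mean by proving a bot on a
private machine first -- again just a rough Python sketch with made-up
names: spin up a throwaway local server, point the bot at 127.0.0.1, and
give it a hard request budget so a runaway bug can't hammer anything
real.

    # Sketch: a private test run against a local server, with a hard cap.
    import http.server
    import threading
    import urllib.request

    MAX_REQUESTS = 5   # hard budget for the test run

    def serve_locally(port=8000):
        """Start a throwaway local server to crawl against during testing."""
        server = http.server.HTTPServer(
            ("127.0.0.1", port), http.server.SimpleHTTPRequestHandler)
        threading.Thread(target=server.serve_forever, daemon=True).start()
        return server

    if __name__ == "__main__":
        server = serve_locally()
        try:
            for _ in range(MAX_REQUESTS):
                # A real bot would follow links; this just exercises the
                # request path against the local machine.
                urllib.request.urlopen("http://127.0.0.1:8000/",
                                       timeout=5).read()
        finally:
            server.shutdown()

Only once something like that behaves itself would I let it anywhere
near the live net.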

> Also, like I said in a previous letter, it is up to the admin
> to have a site that works properly. If I am a customer, paying
> or nonpaying, and visit IBM's website, I want links that work,
> not ones that were broken during a move, although I should understand
> that at 3am people don't, and then they think your service is always
> like this.

The good people who administer the IBM website keep the links good and
fresh on a pretty consistent basis (more than once a day), and if one of
the admins of the webspace notices a problem, or receives notification
of a problem, that person can fix it. I think that if every user's bot
reported problems to every site it hit and had trouble with, there could
be thousands of messages flooding httpd servers all over the place. This
would be a bandwidth hog for the network, for the hard drives of each
server, and for the unfortunate people responsible for the site.

Imagine waking up to find your e-mail box filled with 4,876 new
messages, with approximately 0.76% of them coming from humans, perhaps
1% coming from listserv or majordomo servers, and a few more from the
machines that you're responsible for... and then there are 4,700 emails
from 2,500 automated programs telling you that they had this or that
problem with your site(s).

You'd be freaked out, because it would take you until 4pm to go through
all the messages (who knows, one could be from a company that wants to
pay you $1,000,000 for your BlabberBot, The Bot That Tells All To
Everyone(tm), so you can't just bulk-delete them all, right?)

I suspect the problem here is that you're not a sysadmin, have never
been one, have never been friends with one, and have somehow never
received more emails than you know what to do with.

> What is the difference between 1,000 people mailing you saying your
> site is broken and an efficient robot doing it?

1,000 people wouldn't NECESSARILY write e-mails; most would just be on
their way. After the first couple of messages arrived, the problem could
be solved. The bots would have no such restraint or patience. If you set
them up to report problems to people other than their
owner/master/programmer/user, they'll do it WAY MORE than you expect.

> WHY WERE robots made to index? Couldn't a human do it? Was this a
> serious breach of netiquette?

Good luck, Kevin. Hit the books, or hit the web, but READ darnit!

<end discussion, I hope>

-- 
Rob Turk <mailto:rturk@austin.ibm.com> Unofficially Speaking.