Re: define a page?

Rob Hartill (robh@imdb.com)
Fri, 22 Nov 1996 14:32:28 +0000 (GMT)


HipCrime wrote:
>
>Rob Hartill <robh@imdb.com> wrote:
>
>> Define a "page" !
>
>A single HTML document, returned by a single request to
>a server's hypertext transport daemon via a URL of the
>form:
> http://some_server/some_file

Only HTML now?

What about the gigabytes of PostScript, PDF and DVI that can be found
on a single site that we all love?
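
A robot can't even tell what it has fetched until it looks at the
response headers. As a rough sketch (in modern Python, with the
placeholder URL from above), a HEAD request is enough to see the media
type without pulling the whole document:

  import urllib.request

  # Placeholder URL -- any document on a server, not necessarily HTML.
  url = "http://some_server/some_file"

  # A HEAD request returns only the headers, not the body, so the
  # media type can be checked before committing to a full download.
  req = urllib.request.Request(url, method="HEAD")
  with urllib.request.urlopen(req) as resp:
      # e.g. text/html, application/postscript, application/pdf
      print(resp.headers.get("Content-Type", "unknown"))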

>> click <here> to turn the satellite 0.5 degrees, map the surface
>> of the earth in a 100 mile radius and image process the results
>> to determine how many trees have fallen this week...
>
>Does anybody really believe there's a page somewhere like this?

That's not the point. The point is that a robot does not understand
what it's doing. All it knows is that if it sends a request, something
should come back a little later. There are plenty of URLs (pages if you
prefer) out there that start a considerable amount of processing or
consume lots of bandwidth, or both. The people providing the server services
will have configured them to respond well to reasonable usage patterns.
When a robot/agent comes along and behaves in a way unlike a human,
the balance can be upset. Maybe nobody really notices until another one
joins in, then another... You're saying that I must buy more hardware
to allow robots to access my sites, to compensate for lost CPU cycles,
or that I must pay for another T3 so you can rapid-fire requests looking
for changes. If your users are so in need of sending your agent to my
site, then you pay for my upgrades.
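
To make the point concrete: nothing stops an agent from pacing itself.
A rough sketch (Python; the URL list and the delay value are made-up
illustrations, not any kind of standard) of the difference between
polite and rapid-fire fetching:

  import time
  import urllib.request

  # Hypothetical URLs on one site; the delay is an assumption -- the
  # point is simply not to rapid-fire requests at someone else's server.
  urls = ["http://some_server/page1", "http://some_server/page2"]
  DELAY_SECONDS = 60  # one request a minute, not ten a second

  for url in urls:
      with urllib.request.urlopen(url) as resp:
          data = resp.read()  # index or diff the document here
      time.sleep(DELAY_SECONDS)  # give the server breathing room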

I'm arguing that if I say "no robots" then that should be respected.
If that causes inconvenience to users of "agents" then that's too bad, but
I can live with it, and I am willing to listen on a case-by-case basis to
requests to ease a restriction or two. Concentrate on working within the
areas that are open to robots; there are plenty of them out there.
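
The mechanism for saying "no robots" already exists: the robot
exclusion file, /robots.txt. A minimal sketch (using Python's standard
robots.txt parser; the host and agent name are placeholders) of an
agent that checks before it fetches:

  import urllib.robotparser

  # Hypothetical host and user-agent name, for illustration only.
  rp = urllib.robotparser.RobotFileParser()
  rp.set_url("http://some_server/robots.txt")
  rp.read()  # fetch and parse the site's robots.txt

  if rp.can_fetch("ExampleAgent", "http://some_server/some_file"):
      print("allowed -- go ahead, politely")
  else:
      print("disallowed -- the site said no robots; respect that")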

rob
_________________________________________________
This message was sent by the robots mailing list. To unsubscribe, send mail
to robots-request@webcrawler.com with the word "unsubscribe" in the body.
For more info see http://info.webcrawler.com/mak/projects/robots/robots.html