Re: PS

Rob Hartill (robh@imdb.com)
Mon, 15 Jul 1996 18:06:43 -0600 (MDT)


>>How about this: instead of trying to make an HTTP protocol change,
>>simply ask all robots to identify themselves as a robot via a (robot) in
>>the User-Agent header?
>
>Great idea, but again it begs the question, "What qualifies as a robot?" For
>example, does WebCompass (which retrieves two levels deep to summarize
>findings) count as a robot? What about other types of "user agent"
>interaction?
>
>Getting everyone to agree to put (robot) in their User-Agent will pretty
>much require (again) an active definition of what should and shouldn't
>count as a robot. In
>my opinion, this is even less a cut-and-dried issue than "Should it adhere
>to the robot exclusion standard?"

A robot is anything that's not directly human-controlled and that has not
been given permission to visit the site unregulated.

Both requirements are there to protect sites that have made an adequate
attempt to guide automated systems around their site.
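
As a rough sketch of how that two-part test could combine with the
proposed (robot) marker in the User-Agent header: the Agent class, its
field names and the helper functions below are hypothetical, made up
purely for illustration, and the substring check is only one possible
way a server might look for the marker.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    """Hypothetical description of a client, for illustration only."""
    name: str
    human_controlled: bool         # a person directly triggers each request
    unregulated_permission: bool   # the site has allowed unregulated visits


def is_robot(agent: Agent) -> bool:
    # Both conditions must hold: not human controlled AND no permission.
    return not agent.human_controlled and not agent.unregulated_permission


def user_agent_header(agent: Agent) -> str:
    # Append the proposed "(robot)" comment only when the test says robot.
    return f"{agent.name} (robot)" if is_robot(agent) else agent.name


def server_sees_robot(user_agent: str) -> bool:
    # Assumed server-side check: case-insensitive search for the marker.
    return "(robot)" in user_agent.lower()
```

Under this reading, a crawler that the site has explicitly allowed to
visit unregulated would not carry the marker, and neither would a
browser that a person is driving one request at a time.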

I'm not sure how Netscape's "what's new" bookmarks checker fits into this
but I'd like it to show some sense if it needs to visit a site multiple
times. Rapid fire requests to the same server should be avoided even if
there's a human present.
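
One way a client might avoid rapid-fire requests is to keep a per-host
timestamp and pause before contacting the same server again. The sketch
below is only an illustration: the HostThrottle name and the one-second
default interval are assumptions, not values anyone has agreed on.

```python
import time
from urllib.parse import urlsplit


class HostThrottle:
    """Hypothetical helper that spaces out requests to the same host."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # assumed minimum gap, in seconds
        self._clock = clock
        self._sleep = sleep
        self._last = {}                   # host -> time of the last request

    def wait(self, url):
        """Sleep if the host was contacted too recently; return the delay."""
        host = urlsplit(url).netloc.lower()
        last = self._last.get(host)
        delay = 0.0
        if last is not None:
            delay = max(0.0, self.min_interval - (self._clock() - last))
            if delay > 0:
                self._sleep(delay)
        # Record the time at which the request is now allowed to go out.
        self._last[host] = self._clock()
        return delay
```

Calling wait(url) before each fetch delays only repeat visits to the
same server; requests to different hosts pass through untouched.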

-- 
Rob Hartill (robh@imdb.com)
The Internet Movie Database (IMDb)  http://www.imdb.com/
           ...more movie info than you can poke a stick at.