> How about if the USER_AGENT matches one of the list of known browsers,
> then it is a person. Else, it is a robot.
Nasty thing to do to new browsers, or browsers you don't know about.
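
To make the problem concrete, here is a rough Python sketch of the check
as I understand the proposal. The browser list and the function name are
made up for illustration. Note what happens to a browser that is not on
the list.

```python
# Hypothetical list of known browser name prefixes; a real list would
# have to be maintained by hand from whatever shows up in the logs.
KNOWN_BROWSERS = ("Mozilla", "Lynx", "NCSA Mosaic")


def is_person(user_agent):
    """Return True if the User-Agent starts with a known browser name."""
    # str.startswith accepts a tuple and matches any of its entries.
    return user_agent.startswith(KNOWN_BROWSERS)


print(is_person("Mozilla/2.0 (X11; I; SunOS 5.4)"))  # True
print(is_person("BrandNewBrowser/0.1"))              # False: treated as a robot
```
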
> I know that there are a number of browsers out there, but I think it
> would be fairly easy to check the server logs to pull out the vast
> majority of them and to adjust the list as necessary.
Check the server logs for what? How do you see from the server logs
whether it's a robot or a browser?
I think you'll just have to recognize "robot behaviour" and add that to
the list of robots.
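
What counts as "robot behaviour" is a judgment call. As one possible
sketch in Python, assume the log has already been parsed into
(host, unix_time, path) tuples, and treat a host as a robot if it fetches
/robots.txt or makes a lot of requests within one minute. The tuple
format, the function name and the threshold of 30 are all my own
assumptions, not anything standard.

```python
from collections import defaultdict


def find_robot_hosts(requests, max_per_minute=30):
    """Return the set of hosts that look like robots.

    requests: iterable of (host, unix_time, path) tuples (assumed format).
    A host is flagged if it fetched /robots.txt, or if it made more than
    max_per_minute requests within a single clock minute.
    """
    per_minute = defaultdict(int)
    robots = set()
    for host, when, path in requests:
        if path == "/robots.txt":
            robots.add(host)
        bucket = (host, int(when) // 60)  # one counter per host per minute
        per_minute[bucket] += 1
        if per_minute[bucket] > max_per_minute:
            robots.add(host)
    return robots
```

The resulting hosts (or their User-Agent strings) could then be added to
the list of robots by hand.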
I also think that we need a new version of robots.txt, which tells
robots to send a `Robot:' line, or let USER_AGENT start with `robot', or
whatever. And an official way to tell robots to not follow links from a
given page.
(And also Allow: plus regexps in robots.txt :-)
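
If robots did announce themselves that way, the server-side check would
become trivial. A sketch in Python, assuming the request headers are
available as a dict. The `Robot:' header is only the proposal above, not
an existing standard, and the function name is made up.

```python
def announces_itself_as_robot(headers):
    """Return True if the request identifies itself as coming from a robot.

    headers: dict mapping request header names to values.
    Checks for the proposed (hypothetical) `Robot:' header, or for a
    User-Agent value that starts with `robot'.
    """
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    if "robot" in lowered:
        return True
    return lowered.get("user-agent", "").lower().startswith("robot")
```
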
Also, I think Carthage should be destroyed.
Hallvard