> [suggestion of server-side /robots.txt parsing, and a 'Robot' header]
I think server-side /robots.txt enforcement is a cool-enough idea.
Because it puts the policy-makers in control of applying the policy,
changes take effect immediately, rather than having to wait until
the robot updates its copy of /robots.txt.
I feel less enthusiastic about a 'Robot' header. The quick-and-dirty
robots still won't send that header, so you're still hosed. Anyway,
do we need a new header at all? Why not make it part of the
User-agent:
User-agent: WebCrawler/1.0 Robot/1.0
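A server doing that kind of enforcement might look for a 'Robot'
product token in the User-agent and refuse disallowed paths itself.
A minimal sketch of the idea (the token format, the helper names,
and the simplified robots.txt parsing are all my assumptions, not
anything specified anywhere):

```python
import re

def is_robot(user_agent):
    # Assumed convention: a 'Robot/x.y' product token in User-agent
    return re.search(r"\bRobot/\d+(\.\d+)?\b", user_agent) is not None

def disallowed_paths(robots_txt):
    # Simplified: collect every Disallow path, ignoring per-agent
    # sections and comments (a real parser would honor both)
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()
        if line.lower().startswith("disallow:"):
            path = line.split(":", 1)[1].strip()
            if path:
                paths.append(path)
    return paths

def should_refuse(user_agent, path, robots_txt):
    # Server-side enforcement: turn away self-identified robots
    # requesting a disallowed path (e.g. with a 403)
    if not is_robot(user_agent):
        return False
    return any(path.startswith(p) for p in disallowed_paths(robots_txt))
```

So a request with "User-agent: WebCrawler/1.0 Robot/1.0" for a path
under a Disallow rule would be refused, while an ordinary browser
gets through; quick-and-dirty robots that omit the token still slip
past, which is the same weakness as the proposed header.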
In practice I wonder how many servers would say it's worth the
overhead and extra work.
-- Martijn
Email: m.koster@webcrawler.com
WWW: http://info.webcrawler.com/mak/mak.html