robots.txt extensions

Adam Jack (ajack@corp.micrognosis.com)
Wed, 10 Jan 1996 10:35:10 -0500 (EST)


Hello,

Since this list started I've only ever seen one suggestion
for an extension to robots.txt. That, from Tim Bray,

http://info.webcrawler.com/mailing-lists/robots/0001.html

seemed sensible enough -- to add expiry information for the
robots.txt file itself. No response appears to have been given
-- did people not think it worthwhile? Did people think the
HTTP response field, Expires, should be used for that?

I don't know if this was discussed to death somewhere -- but
are people still considering extensions to robots.txt? I'd be
interested in any pointers to an archive of such a discussion.

If there is any point in discussing additions, please read on --
otherwise bin this mail.

MinRequestInterval: X

Minimum request interval in seconds (0 = no minimum),
with a default of 60 if the field is missing.

This is for those of us lowly enough not to have huge
gathering tasks and the luxury ;-) of a backlog of URLs
over distributed sites. (I.e. those of us doing a
sequential search, exhausting our interest in a site in
one slurp.) Additionally, local admins would have more
control over wanderers that visited.
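
As a rough illustration of how a robot might honour such a field, here is a
minimal sketch. It assumes the field appears on a line of its own as
"MinRequestInterval: X", and it ignores per-User-agent records for brevity.
The function names are hypothetical; nothing here is part of the existing
robots.txt convention.

```python
DEFAULT_MIN_INTERVAL = 60  # seconds; the default proposed when the field is missing


def min_request_interval(robots_txt: str) -> int:
    """Return the proposed MinRequestInterval value, or the default of 60.

    Assumption: the field is written as 'MinRequestInterval: X' on its own
    line; which User-agent record it belongs to is not considered here.
    """
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "minrequestinterval":
            try:
                return max(0, int(value.strip()))  # 0 means no minimum
            except ValueError:
                break  # unparsable value: fall back to the default
    return DEFAULT_MIN_INTERVAL


def seconds_to_wait(last_request: float, now: float, interval: int) -> float:
    """Seconds a robot should still sleep before its next request to the site."""
    return max(0.0, last_request + interval - now)
```

A sequential robot would call seconds_to_wait() before each fetch from the
same site and sleep for the result.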

DefaultIndex: index.html

Stating that XXXX/ and XXXX/index.html are identical.

You can argue that this is lamely inadequate -- or that it
makes a saving. I know the bigger issue is recursion. Here
I am merely hoping to save those single-page recursions.
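
A robot could use such a field to fold both spellings of a URL into one
before checking its list of visited pages. The sketch below is only one way
to do that; canonical_url is a hypothetical helper and the example host is
made up.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str, default_index: str = "index.html") -> str:
    """Map XXXX/index.html onto XXXX/ so both forms compare equal.

    default_index stands for the value of the proposed DefaultIndex field.
    """
    parts = urlsplit(url)
    path = parts.path or "/"  # treat a bare host as its root directory
    if path.endswith("/" + default_index):
        path = path[: -len(default_index)]  # keep the trailing slash
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


# Both forms collapse to a single entry in the visited set.
visited = {
    canonical_url("http://www.example.com/docs/"),
    canonical_url("http://www.example.com/docs/index.html"),
}
```

This only removes the single-page duplicate described above; it does nothing
about deeper recursion.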

CGIMask: *.cgi

Rather than guessing at CGI urls -- why not get the local
admin to answer it? I know that the WN server uses a file
extension to indicate a CGI script -- not /cgi-bin/.
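
A sketch of how a robot might apply such a mask follows. It assumes the mask
is a shell-style pattern matched against the last path segment only; the
proposal does not say whether the whole path should be matched instead.
is_cgi is a hypothetical name and the example host is made up.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def is_cgi(url: str, cgi_mask: str = "*.cgi") -> bool:
    """Return True if the URL's file name matches the proposed CGIMask value.

    Assumption: only the last path segment is compared, and the query
    string is ignored.
    """
    path = urlsplit(url).path
    filename = path.rsplit("/", 1)[-1]
    return fnmatchcase(filename, cgi_mask)


urls = [
    "http://www.example.com/search.cgi?q=robots",
    "http://www.example.com/docs/index.html",
]
# Keep only the URLs the local admin has not marked as scripts.
static_urls = [u for u in urls if not is_cgi(u)]
```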

Q: Are CGI scripts universally avoided in advance -- or do
robots look at the HTTP flags of results to try to work
out whether some content is dynamically generated?

Finally -- I never understood why robots.txt was exclusion only.
Why does it not have some positive hints added? I.e. you are
allowed & welcome to browse XXXX/fred.html. Was this a choice
built upon pragmatism -- thinking that this would open a can of
worms?

Thanks for any feedback,

Adam

--
+1-203-730-5437 | http://www.micrognosis.com/~ajack/index.html