robots.txt , authors of robots , webmasters ....

savron@world-net.sct.fr
Wed, 17 Jan 1996 09:07:02 +0000


A few thoughts about the robots stuff:

-- There should be no need to include a line such as:
Disallow: /cgi-bin/
in robots.txt, because skipping that directory should be standard
behaviour for indexing robots.

The one exception I see is a robot built to query search engines
automatically.
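
A minimal sketch of what such a built-in default might look like, in
Python. The names DEFAULT_EXCLUDED_PREFIXES, may_index and
"ExampleIndexer" are hypothetical and only serve the illustration; the
robots.txt handling uses the standard library's urllib.robotparser.
The use_defaults flag stands in for the exception above, a robot that
has a legitimate reason to query a search engine's CGI scripts.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

# Hypothetical built-in exclusions; this list is an assumption,
# not part of any robots standard.
DEFAULT_EXCLUDED_PREFIXES = ("/cgi-bin/",)


def may_index(url, robots_lines=(), agent="ExampleIndexer", use_defaults=True):
    """Return True if the robot should fetch `url`.

    `robots_lines` holds the lines of the site's robots.txt (may be empty).
    """
    path = urlsplit(url).path or "/"
    # Built-in rule first: skip CGI directories even if robots.txt is silent.
    if use_defaults and path.startswith(DEFAULT_EXCLUDED_PREFIXES):
        return False
    # Then apply whatever the webmaster wrote in robots.txt.
    parser = RobotFileParser()
    parser.parse(list(robots_lines))
    return parser.can_fetch(agent, url)
```

With this, a site whose robots.txt never mentions /cgi-bin/ is still
protected from an indexer, while a robot that sets use_defaults=False
is bound only by what robots.txt says.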

-- Webmasters complain about robots indexing partially built
document trees. So why are those trees linked to the main tree?

-- I agree with the proposed 'positive' extension of robots.txt to
include 'these pages should score higher than the others on my site'.

-- I don't understand why, if a web site is publicly accessible, it
shouldn't be indexable, and so why there is a need for such a thing as
robots.txt.

-- Correct me if I'm wrong on this: if webmasters want to reserve
access to certain pages for specific users, they can do it without
password-protecting the pages, by giving the page names to these
users and not linking the pages to the main tree.
As robots follow the links they find and can't guess page names (well,
as long as you don't choose an obvious page name, which a snooper sort
of robot might try), you are pretty safe. And if you really need more
protection:

   -- set up a password query form (when only a partial tree is
      reserved)
   -- choose a port other than 80 and password-protect it too (in
      case of an HTTP port scanner sort of robot)
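
To illustrate the "non-obvious page name" idea, here is a small Python
sketch that generates a name a robot is unlikely to guess. The function
name unlinked_page_name and the "page-....html" pattern are assumptions
made for the example. An unguessable name only hides a page; it does
not replace the password options listed above.

```python
import secrets


def unlinked_page_name(prefix="page", nbytes=16):
    """Build a page file name containing a random, URL-safe token."""
    # token_urlsafe draws nbytes of randomness from the OS and encodes
    # them as URL-safe base64 text.
    return f"{prefix}-{secrets.token_urlsafe(nbytes)}.html"
```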

-- Why is there no information in the HTTP protocol about the
required delay between two successive queries to the same server (see
the webmasters complaining about rapid-fire queries from robots),
which the web server would send in the header of each response?
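
A rough sketch of how a robot might honour such a header, in Python.
The header name "Robot-Delay" is purely hypothetical (HTTP defines no
such header), as are delay_from_headers and PerHostThrottle. The delay
is assumed to be a number of seconds, with a one-second fallback when
the header is missing or malformed.

```python
import time

# Hypothetical header name; no such header is defined by HTTP.
DELAY_HEADER = "robot-delay"


def delay_from_headers(headers, default=1.0):
    """Return the delay in seconds requested by the server."""
    for name, value in headers.items():
        if name.lower() == DELAY_HEADER:
            try:
                return max(0.0, float(value))
            except ValueError:
                break  # malformed value: fall back to the default
    return default


class PerHostThrottle:
    """Remembers, per host, the earliest time of the next allowed query."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._next_allowed = {}

    def wait_time(self, host):
        """Seconds the robot should still wait before querying `host`."""
        return max(0.0, self._next_allowed.get(host, 0.0) - self._clock())

    def record_response(self, host, headers):
        """Call after each response to schedule the next allowed query."""
        self._next_allowed[host] = self._clock() + delay_from_headers(headers)
```

Before each request the robot would sleep for wait_time(host), and
after each response it would call record_response(host, headers), so
the server controls the pace one response at a time.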

If anyone wants to comment on this, I will be pleased to hear their
opinion.

Bye Bye