Server name in /robots.txt

Martin Kiff (MGK@NEWTON.NPL.CO.UK)
Fri, 19 Jan 1996 10:32 UT


Looking up your site in the indexes is indeed educational... I have found
the same pages appearing under multiple domain names - the canonical DNS
name, various CNAME equivalents and the raw IP address *despite* having a

<BASE HREF="http://xxx.xxx.xxx.xxx/xxx.html">

giving a 'preferred URL' in the header. Obviously indexers (or at least
some indexers) don't recognise this and just build on incorrect,
but currently working, links from other pages.

Would it be an option to include the preferred site name in the
/robots.txt file? It couldn't enforce anything of course, but it would
act as a reminder to the robots.
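
To make the suggestion concrete, here is a rough sketch of how a robot might
honour such a line. Everything in it is an assumption for illustration only:
the field name "Preferred-Host" is hypothetical and not part of any existing
/robots.txt convention, and www.example.org and 192.0.2.10 are placeholder
names. The robot reads the field and rewrites any URL it found under an
alias or raw IP address to use the preferred name before indexing it.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical field name; no existing robots.txt convention defines it.
FIELD = "preferred-host"


def preferred_host(robots_txt):
    """Return the value of the first Preferred-Host line, or None if absent."""
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == FIELD and value.strip():
            return value.strip().lower()
    return None


def canonicalise(url, host):
    """Rewrite url to use the preferred host, keeping scheme, port and path."""
    parts = urlsplit(url)
    if host is None or parts.hostname is None:
        return url  # nothing to rewrite
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )


# Placeholder /robots.txt content with the hypothetical extra line.
SAMPLE = """\
User-agent: *
Disallow: /tmp/
Preferred-Host: www.example.org
"""

host = preferred_host(SAMPLE)
print(canonicalise("http://192.0.2.10/index.html", host))
```

A robot that ignores unknown fields would simply skip the extra line, so the
file would stay usable by existing robots; as said above, nothing here can be
enforced.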

Regards,
Martin Kiff
mgk@newton.npl.co.uk