Re: Server name in /robots.txt

Christopher Penrose (penrose@grasshopper.ucsd.edu)
Fri, 19 Jan 96 10:21:09 -0800


With my agent, I have dealt with the lack of a canonical form fairly
well, though Tim is right: there is no perfect solution.

My agent searches the URL database for each alias attributed to the
found URL. I think alias checking works quite well. For HTTP sites, I
first use the name given in the found URL if that name contains the
text "www". If it does not, I use an alias that does contain "www".
Of course, this fails to select www.netscape.com if the reference was
www.mcom.com, but I think it is a reasonable solution, given that it is
very difficult to infer a preference between two such domain names
without direct specification.
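
The selection rule described above can be sketched roughly as follows.
This is only an illustration, not the agent's actual code: the function
name choose_host, the aliases argument (assumed to come from a DNS
lookup or the agent's own URL database), and the final fallback are all
assumptions.

```python
def choose_host(found_host, aliases):
    """Pick a preferred host name for an HTTP site.

    found_host: the host name as it appeared in the discovered URL.
    aliases:    other names known to refer to the same server
                (hypothetical input; the post does not say how the
                alias list is obtained).
    """
    # Rule 1: keep the name from the found URL if it contains "www".
    if "www" in found_host.lower():
        return found_host

    # Rule 2: otherwise use the first alias that contains "www".
    for alias in aliases:
        if "www" in alias.lower():
            return alias

    # Fallback (an assumption, not stated in the post): keep the found name.
    return found_host


# A non-www name is replaced by a www alias.
print(choose_host("home.mcom.com", ["www.mcom.com"]))
# The limitation noted above: a www name is kept even when another
# alias might be preferred by the site's owner.
print(choose_host("www.mcom.com", ["www.netscape.com"]))
```

The second call returns www.mcom.com rather than www.netscape.com,
which is the failure case mentioned above.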

A good solution would be the inclusion of a new optional HTML tag:
<host="www.anchovies.com">
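
If such a tag existed, an agent might read it along these lines. This
is a rough sketch under the assumption that the tag appears exactly in
the form shown above; the tag is only a proposal, and the names
HOST_TAG and declared_host are made up for illustration.

```python
import re

# Matches the proposed tag form, e.g. <host="www.anchovies.com">.
# The tag is hypothetical; this pattern is not part of any HTML standard.
HOST_TAG = re.compile(r'<host\s*=\s*"([^"]+)"\s*>', re.IGNORECASE)


def declared_host(html):
    """Return the host named by the first <host="..."> tag, or None."""
    match = HOST_TAG.search(html)
    return match.group(1) if match else None


page = '<html><host="www.anchovies.com"><body>...</body></html>'
print(declared_host(page))
```

An agent could prefer this declared name over any alias heuristic and
fall back to alias checking when the tag is absent.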

Christopher Penrose
penrose@ucsd.edu
http://www-crca.ucsd.edu/TajMahal/after.html