This seems like a really useful idea. It certainly isn't robot exclusion, but
robots.txt seems as appropriate a place as any for it. (It could go in another
file as well.) I'd like to see something like this, though:
use-name: myhost.mydomain.edu
This would let a robot know which hostname to use in URLs for a site
when that host goes by many different names.
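To make the idea concrete, here is a minimal sketch of how a robot might pull such a line out of a fetched robots.txt. Note that "use-name:" is only my proposed field, not part of any existing specification, so the parsing shown is purely illustrative:

```python
def preferred_hostname(robots_txt: str):
    """Return the host named by a proposed use-name: line, or None.

    The use-name: field is hypothetical; this just scans each line,
    drops trailing comments, and matches the field case-insensitively.
    """
    for line in robots_txt.splitlines():
        line = line.split('#', 1)[0].strip()
        if line.lower().startswith('use-name:'):
            return line.split(':', 1)[1].strip()
    return None
```

A robot that found a use-name: line could then rewrite the URLs it records for that site to use the preferred host.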
Anyone?
Issac
>
> A robot could then check these domain names against the one it originally
> used to access the site, make sure they all map to the same IP
> address ( ;-) ), and then switch to the preferred name. In principle, it
> could also iterate through existing names, and update those that are
> obsolete.
>
> Admittedly this is a bit outside the bounds of robot restriction rules,
> and perhaps more properly belongs as part of a server meta-information
> document or the HTTP protocol. But, neither of these two solutions
> exist, and the required modifications to robots.txt are both simple
> and quickly beneficial to site maintainers and robot-based indexers.
>
> --
> Ian Graham ......................................ian.graham@utoronto.ca
> Information Commons Tel: 416-978-4548
> University of Toronto Fax: 416-978-0440
>
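The IP-check step Ian describes could be sketched roughly as follows. This only compares the first address each name resolves to; a real robot would have to cope with multi-homed hosts, round-robin DNS, and lookup failures:

```python
import socket

def same_host(name_a: str, name_b: str) -> bool:
    """Crude check: do two hostnames resolve to the same IP address?

    Compares only the first A record of each name; returns False if
    either name fails to resolve.
    """
    try:
        return socket.gethostbyname(name_a) == socket.gethostbyname(name_b)
    except socket.gaierror:
        return False
```

If the preferred name and the originally used name pass this check, the robot can safely switch to the preferred name in its index.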