Re: Possible robots.txt addition

Ian Graham (ianweb@smaug.java.utoronto.ca)
Wed, 6 Nov 1996 12:34:49 -0500 (EST)

Next message: John D. Pritchard: "Re: Back of the envelope computations"
Previous message: Ian Graham: "Re: Possible robots.txt addition"

> In message <Pine.SUN.3.95.961105204518.18054D-100000@airedale.cisco.com>,
> Issac Roth <iroth@cisco.com> writes
>
> >> Unfortunately, it is currently impossible to tell robots which
> >> domain name should be used for a particular site. Consequently, a robot
> >> can continue to index under an obsolete name, until the domain actually
> >> disappears and all the references become invalid.
>
> I asked about this some time ago and was advised to include
>
> <Meta http-equiv="URI" content="http://the.proper/location.html">
>
> in the HEAD. I also used <Base Href.....> to make sure all links pointed
> to the new location. Problem is I'm not sure whether it helped :-)

This works, but it means you have to modify every HTML document
unnecessarily, and it does not help with non-HTML data (PDF, PostScript,
images, etc.). It also does not indicate why the redirection occurs:
nothing says that the original domain *name* is invalid, only that the
particular URL has moved.
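
To make the cost concrete, here is a rough sketch of what the per-document
change amounts to. It is only an illustration, not anything from the thread:
the function name add_location_hints and the example URLs are hypothetical,
and the META tag is written with HTML's "content" attribute. Note that the
function has nothing to work with when the input is not HTML, which is the
limitation described above.

```python
import html
import re


def add_location_hints(document: str, canonical_url: str, base_url: str) -> str:
    """Insert a META URI hint and a BASE tag right after the opening HEAD tag.

    Hypothetical helper for illustration; raises ValueError when the input
    has no HEAD element (for example PDF, PostScript, or image data).
    """
    hints = (
        f'\n<meta http-equiv="URI" content="{html.escape(canonical_url, quote=True)}">'
        f'\n<base href="{html.escape(base_url, quote=True)}">'
    )
    # Match <head> or <HEAD ...> case-insensitively, but not tags that
    # merely start with "head".
    patched, count = re.subn(
        r"<head(?:\s[^>]*)?>",
        lambda match: match.group(0) + hints,
        document,
        count=1,
        flags=re.IGNORECASE,
    )
    if count == 0:
        raise ValueError("no HEAD element found; cannot add location hints")
    return patched


if __name__ == "__main__":
    page = "<HTML><HEAD><TITLE>Old page</TITLE></HEAD><BODY>...</BODY></HTML>"
    print(add_location_hints(page, "http://the.proper/location.html", "http://the.proper/"))
```

Every HTML file on the old site would have to pass through something like
this, and anything that is not HTML is simply skipped.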

> The next step I did was run a CGI script on the old machine which
> reported a 302 to any known robots and a redirect to others.
>
> I still had to email the maintainers of the search engines to drop the
> old pages. I did have my mail address in the header of each page so that
> the maintainers could mail and confirm that the 'delete' request was not
> spoofed email... It still took 6 to 9 months to effectively complete the
> move.

These are among the many reasons I am making this proposal.
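
For reference, the kind of CGI script described in the quoted text could
look roughly like the sketch below. Everything specific in it is an
assumption: the host name new.example.org, the robot name fragments, and
the function names are made up for illustration. The quoted message only
says that known robots got a 302 and that others got "a redirect"; the
sketch assumes the others were shown an HTML page linking to the new
location.

```python
# Hypothetical values, not taken from the thread.
NEW_HOST = "new.example.org"
KNOWN_ROBOTS = ("examplebot", "samplecrawler")


def is_known_robot(user_agent: str) -> bool:
    """Return True if the User-Agent contains a known robot name fragment."""
    agent = user_agent.lower()
    return any(name in agent for name in KNOWN_ROBOTS)


def build_redirect(path: str, user_agent: str):
    """Return (status, headers, body) pointing the client at the new host."""
    new_url = f"http://{NEW_HOST}{path}"
    if is_known_robot(user_agent):
        # Known robots get a 302 status with a Location header.
        return "302 Found", [("Location", new_url)], ""
    # Assumption: other clients get an HTML notice linking to the new URL.
    body = (
        "<HTML><HEAD><TITLE>Moved</TITLE></HEAD>"
        f'<BODY>This page has moved to <A HREF="{new_url}">{new_url}</A>.</BODY></HTML>'
    )
    return "200 OK", [("Content-Type", "text/html")], body


def format_cgi_response(status: str, headers, body: str) -> str:
    """Serialize the response the way a CGI script writes it to stdout."""
    lines = [f"Status: {status}"]
    lines.extend(f"{name}: {value}" for name, value in headers)
    return "\r\n".join(lines) + "\r\n\r\n" + body


if __name__ == "__main__":
    print(format_cgi_response(*build_redirect("/docs/a.html", "ExampleBot/1.0")))
```

Even with a script like this in place, nothing in the response tells a
robot that the old domain name itself is obsolete.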

> Anybody care to comment on these (or other) techniques? The Web
> does seem to need a way of handling moves and renames... What will HTTP
> 1.1 bring?

As far as I can tell, HTTP 1.1 does carefully address the domain name
redirection needed for a given transaction (necessary for proper handling
by HTTP proxy servers and server tunneling), but it does not allow for the
type of general-purpose information I am proposing.

Ian

--
Ian Graham ........................................ ian.graham@utoronto.ca
Information Commons                                      Tel: 416-978-4548
University of Toronto                                    Fax: 416-978-0440
> 
> -- 
> Martin Kiff
>   mgk@webfeet.co.uk
>
