Re: indexing via redirectors

Eric Miller (emiller@oclc.org)
Tue, 21 Jan 1997 08:52:32 -0500


Sigfrid Lundberg writes:
>
>
> > My question is: What do spiders usually do when they come across such a
> > link?
> > Will they follow the redirection? If yes, which URL is then indexed?
> > This seems to be important for commercial purposes (pay-per-hits). If
> > the final (destination) URL is indexed, my program will be useless.
> >
>
> If a spider follows redirects at all, it will delete your URL, and index the
> destination. That's the only sensible way to handle redirects in general,
> isn't it?. Your URL does not deliver any content. This is indeed one of
> the problems with the OCLC PURL schemes.
>
> Sigfrid

Hi Sigfrid,

I missed your concern about the PURL service (and redirection in
general). Could you elaborate? Your point about *how* people use it
is valid; however, it seems that the problem lies in the fact that
browsers and robots treat 301 "Moved Permanently" and 302 "Moved
Temporarily" redirections as the same. On a "Moved Permanently"
redirection, a robot should delete the source URL and use the
redirected target URL and target document, since the response
indicates a permanent change. A "Moved Temporarily", on the other
hand, indicates no such change, and thus (in theory) the robot should
keep the source URL as the locator but follow and index the
redirected target.

"302 Moved Temporarily

The requested resource resides temporarily under a different URL.
Since the redirection may be altered on occasion, the client should
continue to use the Request-URI for future requests."

http://www.w3.org/pub/WWW/Protocols/rfc1945/rfc1945.txt
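
As a rough sketch of the distinction above, a robot could separate the
URL it stores as the locator from the URL whose content it indexes.
The names here (IndexDecision, resolve_redirect) and the example.org
and example.com URLs are made up for illustration; they do not
describe any existing robot or the PURL resolver.

```python
from typing import NamedTuple, Optional
from urllib.parse import urljoin


class IndexDecision(NamedTuple):
    index_url: str  # URL the robot keeps as the locator in its index
    fetch_url: str  # URL whose document is actually retrieved and indexed


def resolve_redirect(source_url: str, status: int,
                     location: Optional[str]) -> IndexDecision:
    """Decide which URL to store and which to fetch for one response.

    Hypothetical policy, following the reading of 301/302 given above.
    """
    if status in (301, 302) and location:
        # A Location value may be relative; resolve it against the source.
        target = urljoin(source_url, location)
        if status == 301:
            # Permanent move: drop the source URL, keep only the target.
            return IndexDecision(index_url=target, fetch_url=target)
        # Temporary move: keep the source URL as the locator,
        # but index the document found at the target.
        return IndexDecision(index_url=source_url, fetch_url=target)
    # No redirect (or no Location header): index the source as-is.
    return IndexDecision(index_url=source_url, fetch_url=source_url)
```

Under this policy a 302 from a redirector keeps the redirector's URL
in the index, while a 301 replaces it with the destination.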

take care,

eric j. miller <URL:http://purl.oclc.org/net/eric>
emiller@oclc.org Office of Research, OCLC, Inc.
emiller@cis.ohio-state.edu Dept. of Geography, The Ohio State University

_________________________________________________
This message was sent by the robots mailing list. To unsubscribe, send mail
to robots-request@webcrawler.com with the word "unsubscribe" in the body.
For more info see http://info.webcrawler.com/mak/projects/robots/robots.html