Re: indexing via redirectors

Captain Napalm (spc@armigeron.com)
Tue, 21 Jan 1997 15:47:22 -0500 (EST)


It was thus said that the Great Sigfrid Lundberg once stated:
>
> > My question is: What do spiders usually do when they come across such a
> > link?
> > Will they follow the redirection? If yes, which URL is then indexed?
> > This seems to be important for commercial purposes (pay-per-hits). If
> > the final (destination) URL is indexed, my program will be useless.
> >
>
> If a spider follows redirects at all, it will delete your URL, and index the
> destination. That's the only sensible way to handle redirects in general,
> isn't it? Your URL does not deliver any content. This is indeed one of
> the problems with the OCLC PURL schemes.
>
Given:

Response 301 - moved permanently
Response 302 - moved temporarily

and the following URLs:

1. http://www.foo.com/bar.html 301 -> http://www.foo.com/snafu.html
2. http://www.foo.com/baz.html 302 -> http://www.foo.com/bazfix.html


I would think (and I'm planning to do this with my robot) the right
thing to do would be to index http://www.foo.com/snafu.html directly, while
indexing the content of http://www.foo.com/bazfix.html but keeping the URL as
http://www.foo.com/baz.html, under the assumption that the redirect is
temporary in nature (the server gave a 302 response, no?). If it's any
other response code in the 300 range ... hmmmmmmmmmmmmmmm ... maybe drop it on
the floor?
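
The policy above can be sketched roughly as follows. This is only an
illustration in Python, not code from any actual robot: the names
IndexDecision and decide are hypothetical, and the sketch assumes the
robot has already read the status code and the Location header from the
response.

```python
from typing import NamedTuple, Optional
from urllib.parse import urljoin


class IndexDecision(NamedTuple):
    """Hypothetical record of what the robot should do with a redirect."""
    index_as: str    # URL stored in the index
    fetch_from: str  # URL whose content is fetched and indexed


def decide(requested_url: str, status: int,
           location: Optional[str]) -> Optional[IndexDecision]:
    """Apply the 301/302 policy described above to one 3xx response.

    Returns None when the response should be dropped on the floor.
    """
    if not 300 <= status <= 399:
        raise ValueError("decide() only handles 3xx responses")

    if status in (301, 302) and location:
        # Resolve the Location value against the requested URL,
        # in case the server sent a relative reference.
        target = urljoin(requested_url, location)
        if status == 301:
            # Moved permanently: forget the old URL, index the new one.
            return IndexDecision(index_as=target, fetch_from=target)
        # Moved temporarily: index the new content under the old URL.
        return IndexDecision(index_as=requested_url, fetch_from=target)

    # Any other code in the 300 range (or a missing Location): drop it.
    return None
```

With the two example URLs, decide("http://www.foo.com/bar.html", 301,
"http://www.foo.com/snafu.html") records snafu.html under its own URL,
while decide("http://www.foo.com/baz.html", 302,
"http://www.foo.com/bazfix.html") records the content of bazfix.html
under baz.html.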

-spc (and what is the purpose for 304?)

_________________________________________________
This message was sent by the robots mailing list. To unsubscribe, send mail
to robots-request@webcrawler.com with the word "unsubscribe" in the body.
For more info see http://info.webcrawler.com/mak/projects/robots/robots.html