Re: Frames ? Lycos ?

Mitch Allen (mitch@webdroid.com)
Wed, 08 Jan 1997 08:50:02 -0500


On Tue, 7 Jan 1997, Mitch Allen wrote:
> I gather you are wondering why it will find http://www-rfi.eunet.fr/,
> but not follow the link on that homepage to http://www.rfi.fr? Most
> commercial engines will only follow relative links under the current
> URL.

Theo Van Dinter wrote:
> Sounds like a good concept, but what about the people who don't believe in
> relative URLs?

Let me rephrase it then: 'Most commercial engines will only follow links
that share the same BASE URL as the current URL.'

A good robot program would deduce that if it's visiting
http://www.foo.com and encounters <A HREF="http://www.foo.com/foo.html">
rather than <A HREF="foo.html">, the two are the same thing: both are
links to the same site.
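
As a rough illustration of that deduction, here is a minimal Python
sketch. It resolves each HREF against the page the robot is currently
visiting and then compares hosts. The function name same_site is my own
invention for this example, and comparing only the host part is an
assumption about what "same BASE URL" should mean; it is not a
description of how any particular engine does it.

```python
from urllib.parse import urljoin, urlparse

def same_site(current_url, href):
    """Return True if href, once resolved against current_url,
    points at the same host as current_url.

    Assumption: "same site" means "same host (and port)"; a real
    robot might use a stricter or looser rule.
    """
    # urljoin leaves absolute URLs alone and resolves relative ones
    # against the page they were found on.
    target = urljoin(current_url, href)
    return urlparse(target).netloc.lower() == urlparse(current_url).netloc.lower()
```

With this, both "foo.html" and "http://www.foo.com/foo.html" found on
http://www.foo.com count as the same site, while a link from
http://www-rfi.eunet.fr/ to http://www.rfi.fr does not.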

Granted, you could write a program to drill down through the other links
you find. You could, say, set a "depth limit" on how far you would go (to
avoid the Yahoo problem). But the idea is to only go where you are
invited. If the robot were welcome at the other sites it finds, it would
already have them in its database. In the original case, Lycos probably
isn't stopping by at http://www.rfi.fr because it wasn't invited.
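
To make the "depth limit" and "only go where you are invited" ideas
concrete, here is a small Python sketch of a breadth-first crawl. All
the names in it (crawl, fetch_links, invited_hosts, max_depth) are
hypothetical. fetch_links stands in for whatever fetches a page and
extracts its HREFs, and invited_hosts stands in for the set of sites
already in the robot's database. I am not claiming this is how Lycos or
any other engine works.

```python
from collections import deque
from urllib.parse import urljoin, urlparse

def crawl(start_url, fetch_links, invited_hosts, max_depth=2):
    """Visit pages breadth-first, following a link only if its host
    is in invited_hosts, and never expanding past max_depth hops.

    fetch_links(url) is a caller-supplied (hypothetical) function
    that returns the HREF strings found on that page.
    """
    seen = {start_url}
    queue = deque([(start_url, 0)])
    visited = []
    while queue:
        url, depth = queue.popleft()
        visited.append(url)
        if depth >= max_depth:
            continue  # depth limit reached: do not expand this page
        for href in fetch_links(url):
            target = urljoin(url, href)  # resolve relative links
            host = urlparse(target).netloc.lower()
            # Only go where the robot has been invited.
            if host in invited_hosts and target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return visited
```

If the start page is http://www-rfi.eunet.fr/ and only www-rfi.eunet.fr
is in invited_hosts, a link to http://www.rfi.fr on that page is simply
skipped, which matches the behaviour described above.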

-- 
==================================================================
Mitch Allen                                               WebDroid
http://www.webdroid.com                              P.O. Box 6569
mailto:mitch@webdroid.com                        Boston, MA. 02114
_________________________________________________
This message was sent by the robots mailing list. To unsubscribe, send mail
to robots-request@webcrawler.com with the word "unsubscribe" in the body.
For more info see http://info.webcrawler.com/mak/projects/robots/robots.html