I've written a robot + search engine (www.il.ft.hse.nl/ilse (Dutch sites only))
and I just implemented the robots.txt 'stay away' rules, but they caused some
problems...
First of all, to be a good (Dutch) search engine, my database must contain
as many pages as possible, but some sysadmins on hse.nl (as well as at other
sites) have put up a robots.txt with:
User-Agent: *
Disallow: /
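For anyone unfamiliar with what those two lines do to a crawler: here is a small sketch (not the actual Ilse code) using Python's standard-library robots.txt parser to show that this file shuts out every robot from every path. The user-agent string "Ilse" is just an example.

```python
from urllib.robotparser import RobotFileParser

# Parse the robots.txt quoted above directly from its lines,
# instead of fetching it over HTTP.
rp = RobotFileParser()
rp.parse([
    "User-Agent: *",
    "Disallow: /",
])

# "Disallow: /" under "User-Agent: *" blocks every path for every
# crawler, so a well-behaved robot may not fetch anything at all.
print(rp.can_fetch("Ilse", "/index.html"))  # False
```

So a site with this file simply cannot appear in an index that honours the standard.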
Anyway, I think robots.txt, if used correctly, can be of great help, but
things like this made me think again: should I honour robots.txt or not?
Sysadmins probably want their sites to be as fast as possible, but the
data on those sites will not be in Ilse, so people will blame Ilse for
being incomplete...
Have others seen the same? If so, what should I do?
I can't go contacting all the webmasters/roots at these machines... And I
don't want to disregard a possible /robots.txt...
Grts
Wiebe
PS - I'm new to this mailing list, so if I've asked a FAQ, don't blame me :-)
--- wiebe@il.ft.hse.nl | Ilse - dutch searchengine http://www.il.ft.hse.nl/~wiebe/ | http://www.il.ft.hse.nl/ilse/