The spider I wrote (Pioneer) refreshes /robots.txt files on each new
execution of the robot. During its run it will try to retrieve a
robots.txt file for each new host it encounters. That policy file, if
it exists, is then considered active for that particular site until
the robot shuts down.
This works fine for me because the longest period of time I've
ever run the robot without interruption is 10 hours.
If you're going to rev up the spider and then go on vacation,
perhaps another method is in order.
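In case it helps, here's a minimal sketch of that per-run caching idea
(not Pioneer's actual code) using Python's urllib.robotparser; the
function name, cache variable, and user-agent string are just
illustrative:

    # Per-run robots.txt cache: filled lazily, discarded at shutdown,
    # so policies are re-fetched on the next execution of the robot.
    from urllib.parse import urlparse
    from urllib.robotparser import RobotFileParser

    _robots_cache = {}

    def allowed(url, user_agent="Pioneer"):
        parts = urlparse(url)
        host = parts.scheme + "://" + parts.netloc
        parser = _robots_cache.get(host)
        if parser is None:
            # First time this host is seen during the run: fetch its
            # robots.txt and keep the parsed policy active until shutdown.
            parser = RobotFileParser(host + "/robots.txt")
            try:
                parser.read()
            except OSError:
                # robots.txt unreachable: treat the site as unrestricted.
                parser.allow_all = True
            _robots_cache[host] = parser
        return parser.can_fetch(user_agent, url)

The cache lives only as long as the process, which matches the
"active until the robot shuts down" behavior described above.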
-Micah
--
============================================================================
Micah A. Williams  |  Computer Science  |  Fayetteville State University
micah@sequent.uncfsu.edu  |  http://sequent.uncfsu.edu/~micah/
Bjork WebPage: http://sequent.uncfsu.edu/~micah/bjork.html
Though we may not realize it, we all, in some capacity, work for Keyser Soze.
============================================================================