I also have a real honest-to-goodness robot running. It does obey
robots.txt. Currently, after transferring it once, it never transfers
it again. That's one extreme of caching robots.txt.
The other extreme would be no caching at all: re-fetch and test
robots.txt before GETting each URL. A slightly less extreme approach
would be a conditional GET (If-Modified-Since) on robots.txt before
each transfer, so an unchanged file costs almost nothing.
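A minimal sketch of that middle option, in Python for illustration
(the robot name is hypothetical): build the request headers for
revalidating a cached copy, where the server answers 304 Not Modified
with no body if robots.txt hasn't changed since our last fetch.

```python
from email.utils import formatdate

def conditional_get_headers(last_fetched):
    """Headers for revalidating a cached robots.txt.

    last_fetched is a Unix timestamp of our previous successful fetch.
    If the file is unchanged, the server replies 304 and sends no body,
    so the per-URL check stays cheap.
    """
    return {
        "If-Modified-Since": formatdate(last_fetched, usegmt=True),
        "User-Agent": "example-robot/0.1",  # hypothetical robot name
    }
```

The `formatdate(..., usegmt=True)` call produces the RFC-mandated GMT
date format that HTTP servers expect in If-Modified-Since.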
My next robot implementation is going to cache robots.txt for a fixed
period, say 1 week. Does this sound reasonable?
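The fixed-period idea might look something like this sketch (Python
for illustration; the class and parameter names are mine, and the
fetch function and clock are injected so the policy is easy to test):

```python
import time

ROBOTS_TTL = 7 * 24 * 3600  # cache robots.txt for one week

class RobotsCache:
    """Cache robots.txt bodies per host, refetching after a fixed TTL."""

    def __init__(self, fetch, ttl=ROBOTS_TTL, now=time.time):
        self.fetch = fetch   # callable: host -> robots.txt text
        self.ttl = ttl       # seconds before a cached copy goes stale
        self.now = now       # injectable clock, for testing
        self.cache = {}      # host -> (fetched_at, body)

    def get(self, host):
        entry = self.cache.get(host)
        if entry is not None and self.now() - entry[0] < self.ttl:
            return entry[1]  # still fresh: no network transfer
        body = self.fetch(host)
        self.cache[host] = (self.now(), body)
        return body
```

With a one-week TTL, a polite robot transfers robots.txt at most once
per host per week, however many URLs it fetches from that host.

```python
fetches = []
clock = [0]
cache = RobotsCache(
    fetch=lambda h: fetches.append(h) or "User-agent: *\n",
    ttl=10,
    now=lambda: clock[0],
)
cache.get("example.com")   # first call fetches
cache.get("example.com")   # second call is served from cache
clock[0] = 11              # advance past the TTL
cache.get("example.com")   # stale, so it refetches
```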
-- Aaron Nabil http://www.i.net/aaron/