>My problem is that none of the major web search robots will
>seem to index the site except for the home page. I assume
>that this is because the URLs look like CGI programs . . .
>Is there any way to convince the robots that it is OK to
>go ahead and traverse these URLs? . . . It seems to me that
>this is a problem that will come up more and more often
>as people start using more complex servers. Has anyone
>figured out a solution?
This probably won't help you much now, but we've run into this problem over
and over again. To avoid it with search engines, we tend to populate the URL
space with HTML files that use server-side includes to call CGIs, which in
turn filter the database. That way, the robot sees just a .html or .shtml
extension, but you still have much of the flexibility of an SQL or other
database-driven dynamic system.
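
As a rough sketch of that setup (not our actual code): the page lives at a
static-looking URL and pulls in CGI output through an include directive. The
directive below uses Apache-style SSI syntax, and the script path, table name,
and sample rows are all made up for illustration.

```python
import html
import sqlite3

# Hypothetical page, e.g. /products.shtml, which a robot sees as a plain URL.
# The include directive is Apache-style SSI; the CGI path is an assumption.
SHTML_PAGE = """<html>
<body>
<h1>Products</h1>
<!--#include virtual="/cgi-bin/product_list.py" -->
</body>
</html>
"""


def render_fragment(conn):
    """Build an HTML list from the rows of a hypothetical 'products' table."""
    rows = conn.execute(
        "SELECT name, price FROM products ORDER BY name"
    ).fetchall()
    # Escape database text so it cannot break the surrounding page markup.
    items = [f"<li>{html.escape(name)}: ${price:.2f}</li>" for name, price in rows]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


def cgi_response(conn):
    """Full CGI output: a header line, a blank line, then the fragment."""
    return "Content-Type: text/html\r\n\r\n" + render_fragment(conn)


def demo_connection():
    """In-memory stand-in for the real database, with invented sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (name TEXT, price REAL)")
    conn.executemany(
        "INSERT INTO products VALUES (?, ?)",
        [("Widget", 9.5), ("Gadget & Co", 20.0)],
    )
    return conn


if __name__ == "__main__":
    # When run as the included CGI, the server splices this output into the page.
    print(cgi_response(demo_connection()), end="")
```

The script only returns a fragment, so the robot never sees the /cgi-bin/ URL
itself; it sees the .shtml page with the list already merged in.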
Of course, this doesn't help much if you've shelled out the big bucks for
something like NetObjects or a complex document management system, as you end
up bypassing all the coolest features of the product to make it
spider-friendly.
Can your server handle parsed HTML with SSI? I'd love to hear whether that
technique works through NSAPI.
Brian
--
Brian Clark (President, GlobalMedia Design - Orlando/NYC)
http://www.radzone.org/gmd/ 407.657.8990
_________________________________________________
This message was sent by the robots mailing list. To unsubscribe, send mail to
robots-request@webcrawler.com with the word "unsubscribe" in the body.
For more info see http://info.webcrawler.com/mak/projects/robots/robots.html