(no subject)

Stacy Cannady (Stacy_Cannady@csg.stercomm.com)
Fri, 06 Sep 96 07:54:33 CST


I am also interested in this. In the event that there are a number of
us lurking around out here, could those of you who reply do so to the
list?

Thanks.
Stacy

______________________________ Reply Separator _________________________________
Subject:
Author: robots@webcrawler.com at Internet-Mail
Date: 9/6/96 1:17 PM

Dear Friends,

I am planning to implement a robot in Java.

If any information or source code is already available, could you
please point me to it? (I do not want to reinvent the wheel, or at
least not all of it.)

To get the data out of a document, I am looking at the contents of
the <body> ... </body> region and the <title> ... </title> region. I
am ignoring the tags themselves and anything inside comments.
Do you think this is enough to get reliable and correct information
about the web page? Please suggest alternatives if you think otherwise.
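
For illustration, the extraction described above might be sketched
roughly as follows. This is only a minimal sketch, not an existing
library: the class name PageText and its methods are hypothetical, it
relies on regular expressions rather than a real HTML parser, and it
assumes the whole page is already in memory as a String.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: pulls plain text out of the <title> and <body> regions.
final class PageText {
    private static final int FLAGS = Pattern.CASE_INSENSITIVE | Pattern.DOTALL;
    private static final Pattern TITLE =
        Pattern.compile("<title[^>]*>(.*?)</title\\s*>", FLAGS);
    private static final Pattern BODY =
        Pattern.compile("<body[^>]*>(.*?)</body\\s*>", FLAGS);
    private static final Pattern COMMENT =
        Pattern.compile("<!--.*?-->", Pattern.DOTALL);
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    /** Text inside <title> ... </title>, or "" if there is no such region. */
    static String title(String html) {
        return clean(region(TITLE, html));
    }

    /** Text inside <body> ... </body>, or "" if there is no such region. */
    static String body(String html) {
        return clean(region(BODY, html));
    }

    private static String region(Pattern p, String html) {
        // Drop comments first so a commented-out closing tag cannot end the region early.
        String noComments = COMMENT.matcher(html).replaceAll(" ");
        Matcher m = p.matcher(noComments);
        return m.find() ? m.group(1) : "";
    }

    private static String clean(String s) {
        // Replace every remaining tag with a space, then collapse whitespace.
        String noTags = TAG.matcher(s).replaceAll(" ");
        return noTags.replaceAll("\\s+", " ").trim();
    }
}
```

Calling PageText.title(html) and PageText.body(html) gives the two
text regions. Note that this sketch returns an empty string for pages
that omit the <body> tags, and it does not decode character entities
such as &amp;.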

I am planning to make the robot engine (the source code of the
classes) freely available once it is finished.
Can you suggest where it would be best to upload it for maximum access?

Are there any Java classes or C code available for implementing the
robot exclusion standard?
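
As a rough idea of what such a class could look like, here is a
minimal sketch of a /robots.txt check. It is not an existing library:
the class name RobotsRules and its methods are hypothetical, and it
assumes only the basic User-agent and Disallow fields with simple
prefix matching. Fetching the file over HTTP is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: holds the Disallow prefixes that apply to one robot.
final class RobotsRules {
    private final List<String> disallowed;

    private RobotsRules(List<String> disallowed) {
        this.disallowed = disallowed;
    }

    /** Parses the text of a robots.txt file for the robot with the given name. */
    static RobotsRules parse(String robotsTxt, String robotName) {
        String name = robotName.toLowerCase(Locale.ROOT);
        List<String> specific = new ArrayList<>();  // rules naming this robot
        List<String> wildcard = new ArrayList<>();  // rules for "User-agent: *"
        boolean sawSpecific = false;
        boolean forUs = false, forAll = false, lastWasAgent = false;

        for (String raw : robotsTxt.split("\r?\n")) {
            if (raw.trim().isEmpty()) {             // blank line ends a record
                forUs = forAll = lastWasAgent = false;
                continue;
            }
            int hash = raw.indexOf('#');            // strip trailing comment
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            int colon = line.indexOf(':');
            if (colon < 0) continue;                // comment-only or malformed line
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();

            if (field.equals("user-agent")) {
                if (!lastWasAgent) forUs = forAll = false;  // a new record begins
                if (value.equals("*")) {
                    forAll = true;
                } else if (!value.isEmpty()
                        && name.contains(value.toLowerCase(Locale.ROOT))) {
                    forUs = true;
                    sawSpecific = true;
                }
                lastWasAgent = true;
            } else if (field.equals("disallow")) {
                lastWasAgent = false;
                if (value.isEmpty()) continue;      // empty Disallow allows everything
                if (forUs) specific.add(value);
                if (forAll) wildcard.add(value);
            }
        }
        // A record that names this robot takes precedence over the "*" record.
        return new RobotsRules(sawSpecific ? specific : wildcard);
    }

    /** True if the URL path does not start with any disallowed prefix. */
    boolean isAllowed(String path) {
        for (String prefix : disallowed) {
            if (path.startsWith(prefix)) return false;
        }
        return true;
    }
}
```

A robot would call RobotsRules.parse(text, "ExampleBot/1.0") once per
host (ExampleBot is a made-up name) and then check isAllowed(path)
before each request.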

Please reply to achaks@inf.com


Angsuman