Re: Anyone doing a Java-based robot yet?

David A Weeks (cs31dw@ee.surrey.ac.uk)
Tue, 20 Feb 1996 11:47:26 +0000 (GMT)


Hi,

I must admit that, as part of my final-year project in computing,
I am writing a Java-based robot called 'KeyWord'.

What I have found is that Java has plenty of predefined classes to use for
web searching and document parsing. For example, the 'InputStream' classes offer
various ways to read in files (and can be extended if you so wish), and the String
class has numerous methods for manipulating the text read in (as I am currently doing).
In addition, Java is general-purpose enough to allow local file saving and an
object-orientated way of constructing a database (although nowadays the term
'knowledge base' is often used).
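
To illustrate the InputStream and String point, here is a minimal sketch of
reading a document from a stream and pulling out its links with plain String
methods. It is not KeyWord's actual code: the class name LinkExtractor and the
methods readAll and extractLinks are hypothetical, it is written against a
current JDK rather than the 1996 one, and it only recognises double-quoted
href attributes.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
class LinkExtractor {

    // Reads the whole stream into a single String, line by line.
    static String readAll(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    // Collects the values of href="..." attributes using only String methods.
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        String marker = "href=\"";
        for (int i = 0; i + marker.length() <= html.length(); i++) {
            // Case-insensitive comparison so that HREF= is matched as well.
            if (html.regionMatches(true, i, marker, 0, marker.length())) {
                int start = i + marker.length();
                int end = html.indexOf('"', start);
                if (end == -1) {
                    break; // unterminated attribute, stop scanning
                }
                links.add(html.substring(start, end));
                i = end; // continue after the closing quote
            }
        }
        return links;
    }

    public static void main(String[] args) throws IOException {
        // An in-memory stream stands in for a file or network stream.
        String page = "<a href=\"a.html\">A</a> <A HREF=\"b.html\">B</A>";
        InputStream in = new ByteArrayInputStream(page.getBytes(StandardCharsets.UTF_8));
        System.out.println(extractLinks(readAll(in)));
    }
}
```

Because readAll takes any InputStream, the same parsing code should work
whether the bytes come from a local file or from a network connection.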

Before questions about this Java robot are asked, there are four things to state:

1. KeyWord obeys robots.txt (again the String class makes this easy)
2. KeyWord is restricted to one request per minute regardless of the location.
3. KeyWord keeps a history of visited URLs to avoid duplication.
4. KeyWord does not search depth-first, so it does not run into 'black holes'.
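
The four points above can be sketched roughly as follows. This is an
illustration under stated assumptions, not KeyWord's real implementation: the
class PoliteFrontier and all of its method names are hypothetical, only the
"User-agent: *" record of robots.txt is honoured (with one User-agent line per
record), the avoidance of depth searching is modelled as a FIFO queue
(breadth-first order), and no network access is performed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Queue;
import java.util.Set;

// Hypothetical sketch of the four rules; no fetching is done here.
class PoliteFrontier {
    static final long MIN_DELAY_MS = 60_000; // rule 2: one request per minute

    private final Queue<String> queue = new ArrayDeque<>(); // rule 4: FIFO order
    private final Set<String> seen = new HashSet<>();       // rule 3: URL history
    private Long lastRequestMs = null;                       // null until first request

    // Rule 1: collect the Disallow prefixes of the "User-agent: *" record.
    static List<String> parseDisallows(String robotsTxt) {
        List<String> rules = new ArrayList<>();
        boolean applies = false;
        for (String raw : robotsTxt.split("\n")) {
            String line = raw;
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash); // strip trailing comment
            }
            line = line.trim();
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // blank or malformed line
            }
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            if (field.equals("user-agent")) {
                applies = value.equals("*");
            } else if (field.equals("disallow") && applies && !value.isEmpty()) {
                rules.add(value);
            }
        }
        return rules;
    }

    // A path is allowed unless it starts with one of the Disallow prefixes.
    static boolean isAllowed(String path, List<String> disallows) {
        for (String prefix : disallows) {
            if (path.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }

    // Queues a URL only if it has never been seen; returns true if queued.
    boolean offer(String url) {
        if (seen.add(url)) {
            queue.add(url);
            return true;
        }
        return false;
    }

    // Returns the oldest queued URL, or null when the queue is empty.
    String next() {
        return queue.poll();
    }

    // Milliseconds still to wait before the next request may be sent.
    long millisToWait(long nowMs) {
        if (lastRequestMs == null) {
            return 0;
        }
        return Math.max(0, lastRequestMs + MIN_DELAY_MS - nowMs);
    }

    // Records the time at which a request was sent.
    void recordRequest(long nowMs) {
        lastRequestMs = nowMs;
    }

    public static void main(String[] args) {
        List<String> rules = parseDisallows("User-agent: *\nDisallow: /cgi-bin/\n");
        PoliteFrontier frontier = new PoliteFrontier();
        frontier.offer("/index.html");
        frontier.offer("/index.html"); // duplicate, ignored
        String url = frontier.next();
        System.out.println(url + " allowed: " + isAllowed(url, rules));
    }
}
```

A crawl loop would call millisToWait before each fetch, sleep for that long,
fetch the page, call recordRequest, and then offer every link found on it.
Since the timer is shared across all hosts, the limit applies regardless of
the location being requested.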

The current documentation can be found at:

http://eeisun2.city.ac.uk/~ftp/Guinness/Hello.html

Regards,
Dave Weeks.