I'm going away for a couple of weeks, but I'll put some thought into it
during that time. Any comments, requests or ideas on any aspect would be
greatly appreciated (after all, that's how the Internet was built).
When I return I'll set up a part of my WWW server dedicated to this
project (think I'll call it Project Asimov - seems appropriate for a
global robot communications system).
The key features I have thought of so far are listed below, so you can
comment on these also (i.e. tell me if I'm being too
stupid/ambitious/etc.):
1. Dedicated port approved as an Internet standard port number. (What
does this require?)
2. Protocol (similar to FTP I think) which allows remote agents to
exchange URLs, perform searches and get the results in a standard
format, database mirroring(?), etc. The idea behind this is that if
Robot A finds a URL handled by another remote Robot (such as by domain
name, keywords(?), etc.), then it can inform the remote robot of its
existence. Similarly, if a user wants to search for something which
happens to be handled by the remote server, a standard data format will
be returned which can then be presented in any format.
3. A method of correlating Robots with specialties (what the robot is
for). An approach similar to DNS may come in handy here - limited
functionality could be obtained by using a "hosts" type file (called
"robots"?), while large scale, transparent functionality would probably
require a centralised site which would maintain a list of all known
robots and their specialties. Remote robots would download the list (or
search parts of it) as required. This could probably be another
protocol command on the port above.
4. A standard set of data, plus some way to extend it for
implementation-specific users. I use the following fields in FunelWeb:
URL
Title (from <TITLE>)
Headings (from <Hx>)
Link Descriptions (from <A HREF="">...</A>)
Keywords (from user entry)
Body Text (from all other non-HTML text)
Document Size (from Content-Length: server field)
Last-Modified Date (from Last-Modified: server field)
Time-To-Live (server dependent)
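As a rough illustration only, the field list above could be carried in a record like the one below. This is a sketch in Python under stated assumptions: the class name DocumentRecord, the "Name: value" line format, the "X-" prefix for extension fields and the use of seconds for Time-To-Live are all hypothetical choices, not anything FunelWeb or an agreed standard defines.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DocumentRecord:
    """One indexed document, using the fields listed above (names are illustrative)."""
    url: str
    title: str = ""
    headings: List[str] = field(default_factory=list)
    link_descriptions: List[str] = field(default_factory=list)
    keywords: List[str] = field(default_factory=list)
    body_text: str = ""
    document_size: Optional[int] = None   # bytes, from Content-Length
    last_modified: Optional[str] = None   # raw Last-Modified header value
    time_to_live: Optional[int] = None    # assumed to be seconds; server dependent
    # Implementation-specific extras, so a robot can extend the standard set.
    extensions: Dict[str, str] = field(default_factory=dict)

    def to_lines(self) -> List[str]:
        """Serialise as "Name: value" lines (an assumed format, not a standard).

        Body text is left out here because it normally spans many lines.
        """
        lines = [f"URL: {self.url}"]
        if self.title:
            lines.append(f"Title: {self.title}")
        for heading in self.headings:
            lines.append(f"Heading: {heading}")
        for description in self.link_descriptions:
            lines.append(f"Link-Description: {description}")
        for keyword in self.keywords:
            lines.append(f"Keyword: {keyword}")
        if self.document_size is not None:
            lines.append(f"Document-Size: {self.document_size}")
        if self.last_modified is not None:
            lines.append(f"Last-Modified: {self.last_modified}")
        if self.time_to_live is not None:
            lines.append(f"Time-To-Live: {self.time_to_live}")
        for name, value in sorted(self.extensions.items()):
            lines.append(f"X-{name}: {value}")
        return lines
```

Keeping the extras in a separate mapping means a robot that does not understand an extension can simply skip any line starting with "X-" and still read the standard fields.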
This also highlights one MAJOR consideration - these fields are
generally only useful to HTML robots. Something needs to be considered
to handle any input format, including FTP, WAIS and Gopher.
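Going back to points 2 and 3, the "hosts" type file idea can be sketched roughly as follows, again in Python. Everything concrete here is an assumption made for illustration: the file layout (one robot host per line followed by the domain suffixes it handles, with "#" comments), the function names, and the rule that the longest matching suffix wins. Matching by keyword, and the centralised list, are not covered.

```python
from typing import Dict, Optional
from urllib.parse import urlsplit


def parse_robots_file(text: str) -> Dict[str, str]:
    """Map each domain suffix to the robot host that handles it.

    Assumed line format: <robot-host> <domain-suffix> [<domain-suffix> ...]
    """
    table: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        robot_host, *domains = line.split()
        for domain in domains:
            table[domain.lower()] = robot_host
    return table


def robot_for_url(url: str, table: Dict[str, str]) -> Optional[str]:
    """Return the robot responsible for the URL's host, or None if unknown.

    The full host name is tried first, then each shorter suffix, so the
    most specific entry wins.
    """
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in table:
            return table[suffix]
    return None
```

A robot that finds a URL it does not handle itself would call robot_for_url() and, if a host comes back, pass the URL on to that robot over the proposed port.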
Well, this is now MUCH longer than I first intended it to be. Sorry to
have wasted your time and bandwidth.
Hope you all had a great Christmas and have a Happy New Year.
Regards,
David