Good luck.
At 10:00 1/01/70 +1000, you wrote:
>At 05:29 PM 6/18/96 -0700, you wrote:
>>>> My problem is deciding exactly *which* words are important to index, and
>>>> how to store such a huge amount of data in a manner that will be easily
>>>> accessible for a search engine.
>>Maintaining a list of important keywords by hand is probably not a good
>>idea. A better idea would be to use machine learning techniques to
>>automatically classify pages as relevant to computer science (or whatever
>>topic). Look at Gerard Salton's work for starters.
>I've never heard of machine learning techniques for classification by subject.
>Is the work of the person you mentioned, Gerard Salton, available on the Web?
>Or is he the author of a book? If you happen to know the name of the book,
>or some way of finding it, I'd be very appreciative.
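
On the earlier question of *which* words are important to index: one
weighting scheme commonly associated with Salton's vector-space retrieval
work is TF-IDF, which scores a word highly when it is frequent in one page
but rare across the collection. The following is only a rough sketch in
Python, assuming plain whitespace tokenization and a tiny in-memory
collection. The names tf_idf and top_terms and the sample pages are made up
for illustration; they do not come from any library or from this thread.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document (hypothetical helper)."""
    # Assumption: lowercase + whitespace split is a good enough tokenizer here.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: the number of documents each term appears in.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights


def top_terms(doc_weights, k=3):
    """Return the k highest-weighted terms, ties broken alphabetically."""
    return sorted(doc_weights, key=lambda t: (-doc_weights[t], t))[:k]


# Made-up sample pages, purely for illustration.
pages = [
    "the compiler parses the source code",
    "the garden needs water and the sun",
    "the parser and the compiler share code",
]
scores = tf_idf(pages)
candidates = top_terms(scores[0], 2)
```

A word such as "the" that occurs in every page gets weight zero, so it drops
out of the candidate index terms without anyone maintaining a stop list by
hand. A real collection would need a better tokenizer and an inverted index
on disk rather than dicts in memory.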
> David Reilly,
> Computer Programmer,