Re: Keyword indexing

Brian Ulicny (ulicny@limbex.com)
Tue, 18 Jun 96 17:29:01 -0700 (PDT)


>> My problem is deciding exactly *which* words are important to index, and
>> how to store such a huge amount of data in a manner that will be easily
>> accessible for a search engine.

Maintaining a list of important keywords by hand is probably not a good
idea. A better approach would be to use machine learning techniques to
automatically classify pages as relevant to computer science (or whatever
topic you care about). Gerard Salton's work on automatic indexing and
term weighting is a good place to start.
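
To give a rough idea of what automatic keyword selection can look like,
here is a minimal sketch in Python of TF-IDF term weighting, one of the
ideas associated with Salton's work. It is only an illustration, not a
full classifier: the function names, the crude tokenizer, and the sample
pages are all assumptions made up for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_keywords(docs, top_n=3):
    """Return up to top_n highest-weighted terms for each document.

    weight(term, doc) = count of term in doc * log(N / number of docs containing term)
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    results = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        weights = {
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        }
        # Sort by descending weight, breaking ties alphabetically.
        ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
        # Terms that occur in every document get weight 0 and are dropped.
        results.append([term for term, w in ranked[:top_n] if w > 0])
    return results


# Hypothetical sample pages, invented for illustration.
pages = [
    "the compiler parses the source and the compiler emits code",
    "the recipe uses flour and the recipe uses sugar",
    "the parser and the lexer feed the compiler",
]

for page, keywords in zip(pages, tfidf_keywords(pages)):
    print(keywords, "<-", page)
```

Words that show up on every page (like "the" and "and" here) get a weight
of zero and drop out on their own, so no hand-maintained keyword or
stopword list is needed. A real index would want a much larger collection
and a better tokenizer than this.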

Best,

Brian Ulicny

Limbex Corporation                  V/mail: (310) 309-4281 x4505
13160 Mindanao Way, Suite 234       Fax:    (310) 309 4282
Marina Del Rey, CA 90292 USA        URL:    http://www.limbex.com/