Re: Content based search engine

Martijn Koster (mak@surfski.webcrawler.com)
Mon, 10 Jun 1996 11:31:35 -0700


In message <199606101619.MAA06463@tre.thewild.com>, "Scott 'Webster' Wood" writes:

> I have not come up with much in the way of responses to my questions about
> search and index programs for HTML documents. I have run across the code
> for Peregrinator in Perl, but I am just starting in Perl, and the programmer
> is not releasing the entire package.

If you're just after Perl, libwww-perl5 is a pretty good starting point,
and it's freely available.

> I am running out of time and have spent too much
> time looking for existing packages to no avail. (time that I could have
> perhaps used for programming)

Well, it depends on how far you want to go, but the time it takes to build
an entire indexing robot from scratch easily outweighs some research :-)

> If anyone knows of any packages that exist - free or commercial - that
> could run from a Linux based server to do small scale searching of this
> type (or is willing to assist my lack of knowledge to help me build one)
> I would really appreciate it.

Have you looked into Harvest? That might do it out of the box.

-- Martijn
__________
Email: m.koster@webcrawler.com
WWW: http://info.webcrawler.com/mak/mak.html