Re: Heuristics....

Nick Arnett (narnett@Verity.COM)
Mon, 1 Apr 1996 09:29:12 -0800


>Do you need this complexity? I guess, and it is only a guess, that people
>assume a 'WAIS' like behaviour in the weighting. I.e. the number of times
>*that* word has appeared over the total number of words in the document.
>(If I've got this wrong you can correct me privately :-). A linear
>relationship therefore...

When you say "people assume..." do you mean users? In that case, I think
they assume that they don't have to worry about the underlying algorithms;
people want relevant results, which shouldn't have to depend on any deep
understanding of what happens behind the scenes.

>But does it need to be linear? How does a
>
> log (*that* word) / total number

Our density operator is not linear... and it gives pretty good results,
although density is rarely the only evidence involved in coming up with a
relevancy score.
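
To make the linear-versus-log question concrete, here is a rough sketch of
the two weightings being discussed. It is not our density operator; the
function names are made up for illustration, and the log form is only one
possible reading of the quoted formula (it uses log(1 + count) so that a
word with zero occurrences gets a weight of zero).

```python
import math


def linear_density(term_count, total_words):
    """WAIS-like guess from the quoted text: occurrences / document length."""
    if total_words <= 0:
        return 0.0
    return term_count / total_words


def log_density(term_count, total_words):
    """Hypothetical log-damped variant: log(1 + occurrences) / document length.

    log1p is an assumption made here so that zero occurrences yields 0
    instead of a math domain error.
    """
    if total_words <= 0:
        return 0.0
    return math.log1p(term_count) / total_words


if __name__ == "__main__":
    # Compare how each weight grows as the same word repeats in a
    # 1000-word document.
    for count in (1, 2, 4, 8, 16):
        print(count, linear_density(count, 1000), log_density(count, 1000))
```

With the linear form, doubling the number of occurrences doubles the
weight; with the log form, each additional occurrence adds less than the
one before it, so a document that merely repeats a word gains less.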

Nick