Re: Should I index all ...

Terry O'Neill (toneill@mariner.com)
Tue, 09 Jul 1996 08:47:09 -0600


Nick Arnett wrote:
>
> >No doubt that you exclude some meaningful information when you
> >use stopwords, but two benefits often outweigh this consideration.
> >First, your database does get smaller, easily 25%, although there are
> >many factors that affect this number.
>
> What's your source for that number? Unless you have a very primitive index
> and a very aggressive stopword list, the size reduction is nowhere near
> that large, I believe.
>

We probably don't want to turn this into too much of a
text-search discussion in this forum, but the 25% estimate
is based on a variety of text engine evaluations we did in
1992-1993 in preparation for an overhaul of the Dow Jones
Text Library.

Terry O'Neill
mariner.com