Re: word spam

Kevin Hoogheem (khooghee@marys.smumn.edu)
Tue, 16 Apr 96 12:33:16 -0500


I was wondering if anyone does this when they index.

This idea is kind of from Dykstra's paper on the indexing problem, which
he wrote a long time ago. Sorry, I don't have the article numbers handy,
but what it did was go through papers and index the articles by
separating keywords from noise words and taking out only the keywords.

Well, what I was thinking is that most people's web robots have, or could
easily take, a list of noise words, and then index only the words that
are not in that list.

Well, if someone puts "sex sex sex sex sex" that many times right in a
row, that should not really get indexed. It might be safe to say that
for every keyword, a noise word must appear either in front of it or
behind it.
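
Here is one way that neighbor rule might look in Python. Again this is
only a sketch: NOISE_WORDS and index_with_neighbor_rule() are
hypothetical names, and the tokenizing is deliberately crude.

```python
# Hypothetical noise-word list (an assumption for this sketch).
NOISE_WORDS = {"a", "an", "and", "the", "of", "in", "is", "to", "on", "for"}

def index_with_neighbor_rule(text):
    """Keep a keyword only if the word right before or right after it
    is a noise word."""
    words = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    words = [w for w in words if w]
    kept = []
    for i, word in enumerate(words):
        if word in NOISE_WORDS:
            continue  # noise words are never indexed
        before = i > 0 and words[i - 1] in NOISE_WORDS
        after = i + 1 < len(words) and words[i + 1] in NOISE_WORDS
        if before or after:
            kept.append(word)
    return kept

print(index_with_neighbor_rule("sex sex sex sex sex"))     # nothing kept
print(index_with_neighbor_rule("the history of the web"))  # normal text
```

One catch with a rule this strict: an honest run of keywords with no
noise word next to any of them (say, a bare list of product names) gets
dropped too, so it may need loosening in practice.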

oh well