Re: Handling keyword repetitions

Alan (psiber@earth.ftech.co.uk)
Fri, 12 Apr 1996 02:11:07 +0000


Come on, think about it. Remember the "red herring page"? I have
spotted its URL in my search results some 10 times in three months,
whilst querying diverse subjects. This on-line dictionary (Not) is a
classic example of how the older robots were open to abuse. Putting
"You've been had" at the top of a dictionary on a web page is a laugh,
OK, but perhaps this answers your question.

So this is the argument against indexing the "word spam". If the page
is about what it is about, there should be little need for key-word
sections. The content should speak louder than key-words. A page about
UFOs and Aliens will use these words frequently in the body of the
document, not to attract visitors, but because that is what it is
about. An author should be able to place most conceivable key-words
for its content in its content.

There are, however, some exceptions to this thinking; graphic sites
are one (although IMO, ALT text is grossly underused, and this
underuse is particularly apparent on Corporate sites that have
the funds to pay someone to know better).

Another exception is the ever popular mispeeling ov werds,
abbreviations, acronyms, or slang. Key-words can help here, but as you
say, do you use them or not?

I for one know that I would exclude pages where there was a "word
spam". This is so unsightly, amateurish, and frankly quite rude! How
to do this? I guess I'd start by excluding all text using the same
<FONT Color=> as the <BODY BGCOLOR>.
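
A rough sketch of that colour check in Python, using only the standard
library's html.parser. The names HiddenTextFinder and hidden_text are
made up for illustration, and the comparison is deliberately naive: it
only catches identical colour strings (ignoring case and a leading #),
so a page that sets BGCOLOR="white" and Color="#FFFFFF" would slip
through.

```python
from html.parser import HTMLParser


class HiddenTextFinder(HTMLParser):
    """Collects text inside <FONT COLOR=x> where x equals <BODY BGCOLOR>."""

    def __init__(self):
        super().__init__()
        self.bgcolor = None    # normalised <BODY BGCOLOR>, if any
        self.font_stack = []   # colours of the currently open <FONT> tags
        self.hidden = []       # text fragments judged invisible

    @staticmethod
    def _norm(colour):
        # Naive normalisation: drop a leading '#', ignore case.
        return colour.strip().lstrip("#").lower() if colour else None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)    # the parser lowercases tag and attribute names
        if tag == "body":
            self.bgcolor = self._norm(attrs.get("bgcolor"))
        elif tag == "font":
            self.font_stack.append(self._norm(attrs.get("color")))

    def handle_endtag(self, tag):
        if tag == "font" and self.font_stack:
            self.font_stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text or self.bgcolor is None:
            return
        # The innermost <FONT> that sets a colour decides the text colour.
        colour = next((c for c in reversed(self.font_stack) if c), None)
        if colour == self.bgcolor:
            self.hidden.append(text)


def hidden_text(html):
    """Return the text fragments drawn in the page's own background colour."""
    finder = HiddenTextFinder()
    finder.feed(html)
    finder.close()
    return finder.hidden
```

A robot could then leave whatever hidden_text() returns out of the
index, or count it against the page.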

I would treat as suspicious, all text within a link to the active
page, which would render invisible in most pages without <BACKGROUND
SRC="blahblahblah.gif">. Pages starting with AARDVARK and ending with
ZYGOTE could also be given a wide berth :-)
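
The AARDVARK-to-ZYGOTE test could be approximated by measuring how much
of a page's word list is already in alphabetical order. This is only a
sketch; the function name sorted_fraction and any cut-off a robot
applied to its result are assumptions, not anything a real robot is
known to use.

```python
def sorted_fraction(text):
    """Fraction of adjacent word pairs that are in alphabetical order."""
    tokens = [w.lower() for w in text.split() if w.isalpha()]
    if len(tokens) < 2:
        return 0.0
    in_order = sum(a <= b for a, b in zip(tokens, tokens[1:]))
    return in_order / (len(tokens) - 1)
```

Ordinary prose should score somewhere around one half, while a pasted
dictionary scores close to 1.0, so a long page with a very high score
deserves a second look.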

I'd say that if you had a sophisticated enough neurology, you could
identify key-word sections, and compare them with page content, and
downgrade pages with poor scores, or repetition within poor grammar
constructs. I think you'd certainly be doing the public a service!
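
Short of anything that clever, a crude version of the same idea can be
sketched in Python: measure how repetitive a suspected key-word block
is, and how many of its words actually turn up in the body of the page.
Everything here is an assumption made for illustration, including the
function names, the 0.5 thresholds, and the premise that the robot has
already separated the key-word block from the body text.

```python
import re
from collections import Counter

# Lower-case words, keeping hyphenated forms such as "t-shirts" together.
WORD = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def words(text):
    return WORD.findall(text.lower())


def repetition_ratio(text):
    """Share of the block taken up by its single most frequent word."""
    tokens = words(text)
    if not tokens:
        return 0.0
    return Counter(tokens).most_common(1)[0][1] / len(tokens)


def keyword_support(keyword_block, body):
    """Fraction of distinct key-words that also occur in the body text."""
    keys = set(words(keyword_block))
    if not keys:
        return 1.0
    return len(keys & set(words(body))) / len(keys)


def looks_like_word_spam(keyword_block, body, max_repeat=0.5, min_support=0.5):
    """Flag blocks that are mostly one word, or mostly unrelated to the body."""
    return (repetition_ratio(keyword_block) > max_repeat
            or keyword_support(keyword_block, body) < min_support)
```

A page flagged this way need not be dropped outright; it could simply
be downgraded in the rankings.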

Hope some of this is useful <g>

T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts
T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts
T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts
T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts
T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts
T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts
T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts
T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts
T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts
T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts
T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts
T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts T-Shirts

Oops, I mean

Alan

> Hi all,
>
> I was wondering what people's views are regarding key-word repetitions
> found in many web documents. Sometimes they're in the <TITLE></TITLE>
> tags and sometimes they form a large block at the bottom of the page.
>
> Obviously the authors of such pages are trying to score higher ratings on the index catalogue searches.
>
> Should catalogues really index these key-word repetitions?
>
>
> Regards,
> Dave Weeks.
>
>
>
> ----------------------------
> cs31dw@surrey.ac.uk
> http://eeisun2.city.ac.uk/~ftp/Guinness/Hello.html
>

-- 
".....UFO Scatters Crash Debris Across UK Web Site........"
...**.....:..*.*....^...*...*..*.*....*............*.*..*..
...*.."..    M y  G o d    ......*..:..*..*....*...*...:*..
.^.*..*...    It's  F u l l  o f   S t a r s   ..*.:....*..
.*.... http://www.visionaries.co.uk/webcat/ufos00.hts :.*..