Re: word spam

(monier@pa.dec.com)
Tue, 16 Apr 96 21:26:45 -0700


Once in a while I need to pipe up and speak for my poor bad-mouthed robot and
search engine. It's one of those weeks.

For the last question about getting the content of a Perl script and not its
output: I don't know, ask the server! Someone published the URL, and the server
gave this back. I promise that we don't use out-of-work KGB psychics to get the
pages, just regular GET requests. And no crystal balls to guess URLs, just
following links...
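
To make "just following links" concrete, here is a minimal sketch in Python of
the link-following step. It is an illustration only, not the robot's actual
code: the class name, the function name, the example.com URLs, and the sample
page are all hypothetical. The point is that the robot only learns URLs that
some page published, and it keeps whatever bytes the server returns for them.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the href targets of <a> tags, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Relative links are resolved against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    """Return every link published on the page, in document order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical page. If the server answers a GET for the .pl URL with the
# script's source instead of its output, that is what the robot receives.
page = '<a href="/cgi-bin/search.pl">search</a> <a href="docs/faq.html">FAQ</a>'
found = extract_links("http://www.example.com/index.html", page)
```

Each URL in `found` would then be fetched with an ordinary GET; nothing is
guessed.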

As for the discussion about the "quality" of returns, the distribution of
responses has been wonderfully bimodal: those of us who think that what you get
should be a strong function of what you put in, and the rest (hmm) who believe
in magic. If a query is vague, say "computers", there are indeed tens of
thousands of pages matching the query, in the simple sense of containing the
word. Now the ranking can help some, but most of the returns will fall in very
crowded buckets of similar ranking: for example, all pages with "computers" in
the title that mention "computers" often enough will compete for the top spot.

Now, what is the "right answer"? Well, if you believe that the game is to guess
the rest of the query in your head, only magic will do. If you think it's some
sort of reference page, Yahoo-style (not a criticism, they play a different
game, and being exhaustive is certainly not their goal), it requires human
intervention, and this is not what search engines are about. So the right
answer is indeed to refine the query.
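
A toy illustration of why refining helps, sketched in Python under loud
assumptions: matching here is only "the page contains every query word", the
four pages are made up, and none of this is the search engine's real matching
or ranking code.

```python
def matches(query, page_text):
    """True if the page contains every word of the query (simple containment)."""
    words = set(page_text.lower().split())
    return all(term in words for term in query.lower().split())


def search(query, pages):
    """Return the URLs of all matching pages, sorted for stable output."""
    return sorted(url for url, text in pages.items() if matches(query, text))


# Hypothetical mini-index of four pages.
pages = {
    "a.html": "computers for sale cheap",
    "b.html": "history of computers and computing",
    "c.html": "repairing laptop computers at home",
    "d.html": "gardening tips",
}

vague = search("computers", pages)            # three pages compete
refined = search("computers history", pages)  # one page left
```

With the vague query, three of the four pages qualify and nothing in the query
tells them apart; one extra word narrows the result to a single page. Scale
the index up to millions of pages and the same effect is what separates tens
of thousands of hits from a usable handful.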

What's so hard about it?

--Louis