Re: META tag standards, search accuracy

Benjamin Franz (snowhare@netimages.com)
Mon, 14 Oct 1996 07:56:41 -0700 (PDT)


On Mon, 14 Oct 1996, Eric Miller wrote:

> > Benjamin Franz writes:
> >
> > Sure. But unlike the (now defunct) HTML-WG, the real world needs something
> > that works *TODAY*. And it must be above all *SIMPLE* and easy for
> > non-experts to use. Library indexing schemes meet neither requirement.
>
> These issues that are outlined here have been one of the underlying
> foundations behind the Metadata Workshop Series. These workshops have...
[...]
> <URL:http://purl.oclc.org/metadata/dublin_core>
>
> Additionally, with regard to encoding this information in HTML, a
> proposed convention that reflected the consensus of a break-out group
> at the W3C Distributed Indexing and Searching Workshop, May 28-29,
> 1996, concerning tagging of meta information in HTML can be found
> <URL:http://www.oclc.org:5046/~weibel/html-meta.html>

Bluntly, the consensus failed to meet its objective. Outside of academia,
no scheme that requires referencing *external* schema for META data is
going to work and achieve general usage. The Dublin core unfortunately
reflects its academic roots much too strongly. What is (to library
specialists) an apparently simple scheme is *too complex* for actual
general use. I can hear the protests already that 'but no fixed schema is
general enough!'. You're right. But that doesn't affect the fact that
external schema will not receive widespread acceptance outside the
academic community. It doesn't matter that fixed and simple META fields
are too simple for good indexing: they are just *barely* simple enough for
real world use by webpage authors.

The Dublin core is unfortunately yet another example of the people
*envisioning* the system forgetting that the people *using* the system
are (A) not technical people and (B) not even aware that a formal
standard exists.

I'll say it again in a different way: If non-technical users cannot
correctly understand how META content markup works *just by reading
someone else's page* - most of them will not use it at all. Of the
remainder, most will use it incorrectly. This is *worse* than not using it
at all since it will in turn spawn even more broken usages in a world wide
game of 'whisper in my ear'. A good web standard must be stable against
misuse by the unknowledgeable: The Dublin core is not. The *first* things
that are going to disappear are the <LINK>s to the schema, since people
aren't going to understand them. Worse, it doesn't even address keywords
- which are at the heart of every search engine query! This is a
ridiculous omission for a META data standard. Lastly, in conflict with
existing widespread usage, it renamed 'description' as 'subject'. That
is just begging to be sandbagged.
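
To make the 'fixed and simple' point concrete, here is a rough sketch of how
little an indexer needs in order to read flat 'description' and 'keywords'
META fields: no schema lookup, just name/content pairs. The sample page, the
class name MetaFieldParser and the helper extract_meta are all hypothetical
and made up for illustration; only Python's standard html.parser module is
assumed.

```python
from html.parser import HTMLParser


class MetaFieldParser(HTMLParser):
    """Collect name/content pairs from META tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling us.
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = attr_map.get("name")
        content = attr_map.get("content")
        if name and content is not None:
            # Field names are compared case-insensitively.
            self.fields[name.lower()] = content


def extract_meta(html_text):
    """Return a dict of META name -> content found in html_text."""
    parser = MetaFieldParser()
    parser.feed(html_text)
    parser.close()
    return parser.fields


# A made-up page using only the two flat fields discussed above.
SAMPLE_PAGE = """<HTML><HEAD>
<TITLE>Example page</TITLE>
<META NAME="description" CONTENT="A hypothetical page about pet care.">
<META NAME="keywords" CONTENT="pets, cats, dogs">
</HEAD><BODY>...</BODY></HTML>"""

fields = extract_meta(SAMPLE_PAGE)
# Keywords are assumed to be comma-separated, as in common usage.
keywords = [k.strip() for k in fields.get("keywords", "").split(",") if k.strip()]
```

A page author can copy those two META lines from someone else's page and
get them right, which is the property argued for above.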

-- 
Benjamin Franz