Re: Practical application from Murray Altheim on 2001-06-22 (www-rdf-interest@w3.org from June 2001)

From: Murray Altheim <altheim@eng.sun.com>
Date: Fri, 22 Jun 2001 10:22:02 -0700
To: Danny Ayers <danny.ayers@btinternet.com>
CC: Fernanda Hembecker <fernanda@ppgia.pucpr.br>, www-rdf-interest <www-rdf-interest@w3.org>
Message-ID: <3B337EBA.4A8EDC5D@eng.sun.com>

Danny Ayers wrote:
> 
> I'm sorry Murray, first impressions to me look like it's just html <meta>
> tag, plus a bit of arbitrary xml. No doubt this is useful for search
> engines. How are these tags interpreted in the body of a document by
> browsers? I may be wrong, but Dublin Core doesn't express everything that
> might be needed, and the link rel we've seen before - maybe this will be
> different? Here's the bottom line - how would a harvester be any the wiser
> with documents following this syntax than it would with arbitrary HTML?

No need to be sorry. It *is* just the HTML <meta> and some XML, though
hardly arbitrary: it's the Dublin Core Metadata Element Set (DCMES). And I
have no doubt that if this became the method used to indicate a subject
classification, it would be very useful for search engines.

And you're correct, Dublin Core might not express "everything that might
be needed," but my point all along is that while some people fritter away
entire lifetimes of man hours developing incredibly complex specifications
that few mortals can figure out, this is a very simple way of performing
a very simple task that currently has no widely-accepted solution in markup.
I thought I made it very clear in the draft (perhaps you didn't read that
part) that this was an attempt to invent *very little*. Yes, the idea of
embedding Dublin Core metadata in <meta> tags was invented by DCMI. Yes,
the <meta> tag already existed. Yes, DCMES already existed.

The value here is that for the first time an author can annotate a very
specific piece of markup (a "document component") with metadata following
an already-established means, can identify the subject (and about a 
dozen other things, such as responsible party, revision date, format,
etc.) according to a controlled vocabulary (which is extensible to fit
their particular vertical industry's needs), and in a way that works with
browsers *today*. I don't expect browsers to "interpret" this information
at all. I expect search engines or metadata harvesters to be able to
locate information about specific content (by subject) without resorting
to a brain-dead keyword search. If I'm searching on "Harvester Ants" for
a bio paper, I can search on Dewey 595 or Library of Congress QL568. If
I'm looking for a particular patent application, I can search on "Patent
73638-398-737". If I'm searching for information on a "polymurrayphase
interaction in pembroke corgification" (which just happens to be part of
the "MCSF-SPT" scheme as classification index "PIPC-32" in my vertical
industry), well, here's a way to do that.
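
As a rough sketch of what such a harvester might do, here is a few lines
of Python. Everything specific in it is an assumption for illustration:
the sample markup (a <meta> placed inside a <div> with name="DC.Subject"
plus scheme and content attributes) is only one plausible layout and not
necessarily the draft's exact syntax, and the names SubjectHarvester and
harvest are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical sample document: each <div> is a "document component"
# carrying its own Dublin Core subject metadata. The layout is assumed.
SAMPLE = """
<html><body>
<div id="ants">
  <meta name="DC.Subject" scheme="DDC" content="595" />
  <meta name="DC.Subject" scheme="LCC" content="QL568" />
  <p>Harvester ants collect and store seeds.</p>
</div>
<div id="corgis">
  <meta name="DC.Subject" scheme="MCSF-SPT" content="PIPC-32" />
  <p>Notes on pembroke corgification.</p>
</div>
</body></html>
"""


class SubjectHarvester(HTMLParser):
    """Index document fragments by (scheme, classification code)."""

    def __init__(self):
        super().__init__()
        self.stack = []  # ids of the <div> elements currently open
        self.index = {}  # (scheme, content) -> list of fragment ids

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "div":
            self.stack.append(a.get("id"))
        elif tag == "meta" and a.get("name") == "DC.Subject" and self.stack:
            # Attach the subject to the innermost enclosing fragment.
            key = (a.get("scheme"), a.get("content"))
            self.index.setdefault(key, []).append(self.stack[-1])

    def handle_endtag(self, tag):
        if tag == "div" and self.stack:
            self.stack.pop()


def harvest(html):
    """Return a dict mapping (scheme, code) to the fragment ids found."""
    parser = SubjectHarvester()
    parser.feed(html)
    parser.close()
    return parser.index
```

With an index like this, a lookup on ("DDC", "595") or on
("MCSF-SPT", "PIPC-32") points at the fragment itself rather than at
whatever pages happen to contain a matching keyword.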

It doesn't slice your bread or solve world hunger. It does perform one
of the fundamental things I've read about in descriptions of the "Semantic
Web": allow an author to clearly identify the subject of a particular 
document fragment, and not just the entire document.

Murray

...........................................................................
Murray Altheim                                 <mailto:altheim@eng.sun.com>
XML Technology Center
Sun Microsystems, Inc., MS MPK17-102, 1601 Willow Rd., Menlo Park, CA 94025

      In the evening
      The rice leaves in the garden
      Rustle in the autumn wind
      That blows through my reed hut.  -- Minamoto no Tsunenobu

Received on Friday, 22 June 2001 13:23:19 UTC