Re: What are the problems with IDML?

Marc Salomon (marc@ckm.ucsf.edu)
Fri, 16 Aug 1996 16:34:37 -0700


From: "Marc Salomon" <marc@ckm.ucsf.edu>
Message-Id: <9608161634.ZM12044@gaia.ckm.ucsf.edu>
Date: Fri, 16 Aug 1996 16:34:37 -0700
In-Reply-To: marym@Finesse.COM (Mary Morris)
To: www-html@w3.org
Subject: Re: What are the problems with IDML?

On Aug 16, 12:47, Mary Morris wrote:
> Subject: Re: What are the problems with IDML?
> Marc Salomon said:
>
> > The issue of classifying metadata for arbitrary objects is a problem
> > already solved by the library community and is orders of magnitude less
> > complex than specifying content models for arbitrary document forms.
>
> While I will agree that the library community has done some interesting
> things with information classification in general, I have yet to see
> a good library taxonomy for products and services, let alone
> a full description of associated metadata. If you know of something
> that I have missed, I would love to hear about it.

I was referring to objects, pieces of (intellectual) content.  There are many
taxonomy schemes in the hard sciences that the library community hasn't been
involved with, so library techniques are not the be-all and end-all of
classification.  Business has always shied away from consumer-driven
firm-to-firm comparisons, so those who want that most have contributed the
least.

If you all are going to play business on the internet at this early date, you
are going to have to expect to break some new ground.  Just don't do it in a
way that re-breaks sown ground for the rest of us. :)

> > But <META> is only going to scale for the most lightweight metadata
> > applications, and for more complex collections, first-class metadata
> > objects will be the answer.
>
> Can you please explain a little more about "first-class metadata
> objects" and how they differ from <META> and who is working on those
> strategies right now?

Suppose you have a collection of documents that all share some common
characteristic, say "document-maintainer".  Does it make sense to embed that
information in each document, so that every doc becomes stale (perhaps forcing
a cache reload) whenever an essentially trivial field changes?

According to the HTTP spec, a HEAD request is identical to a GET request except
that the server must not return an entity body in response to a HEAD.  This
means that if you actually support <META> translated to HTTP headers, in order
to be in spec you would have to send the metadata for each HEAD and GET.
Metadata can become somewhat complex and might rival the document itself as a
share of the bytes transferred per request.
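
To put rough numbers on that, here is a small Python sketch.  It is only an
illustration: it assumes a server that copies every <META HTTP-EQUIV> pair into
a response header, and the function name meta_header_overhead and the sample
header are made up for the example, not taken from any spec or server.

```python
from html.parser import HTMLParser


class MetaHeaderParser(HTMLParser):
    """Collect <META HTTP-EQUIV=... CONTENT=...> pairs as (name, value)."""

    def __init__(self):
        super().__init__()
        self.headers = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = dict(attrs)
        name, value = attr.get("http-equiv"), attr.get("content")
        if name and value is not None:
            self.headers.append((name, value))


def meta_header_overhead(html):
    """Return (headers, header_bytes, body_bytes) for one document.

    header_bytes is what the hypothetical server would add to every
    HEAD and GET response; body_bytes is the document itself.
    """
    parser = MetaHeaderParser()
    parser.feed(html)
    header_bytes = sum(
        len(f"{name}: {value}\r\n".encode("utf-8"))
        for name, value in parser.headers
    )
    body_bytes = len(html.encode("utf-8"))
    return parser.headers, header_bytes, body_bytes


if __name__ == "__main__":
    page = (
        '<HTML><HEAD><META HTTP-EQUIV="Document-Maintainer" '
        'CONTENT="someone@example.org"></HEAD><BODY>hi</BODY></HTML>'
    )
    headers, header_bytes, body_bytes = meta_header_overhead(page)
    print(headers, header_bytes, body_bytes)
```

For a short page with a long list of such elements, header_bytes can approach
body_bytes, and the headers are paid for on every HEAD as well as every GET.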

Metadata can get complex, especially at the collection level, and <META> has
no content model.  It has only attributes to carry its data, and what can be
encoded in an attribute value is limited by SGML.  This is the single bright
observation in the IDML document.  Their example of it, however, is
brain-dead.

The IDML thing is trying to cram into HTML what should exist outside HTML.
META is trying to standardize use of a limited HTML element for a somewhat
complex task.

I could see making IDML metainformation a first-class object (essentially a
peer of a document) using <LINK REV="IDML" HREF="/xxxxx.idml"> to do what
they're saying.  But schemes that glom together documents and their metadata
won't scale for large, high-value-added collections.
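
A rough Python sketch of why the separate object scales better follows.  It is
an illustration under stated assumptions, not anybody's implementation: a
content hash stands in for whatever validator a cache would use, and the
helper names and the path /collection-meta.idml are hypothetical.

```python
import hashlib


def validator(text):
    """Stand-in for a cache validator: a hash of the document bytes."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def render_embedded(body, maintainer):
    # The metadata travels inside every document.
    return f'<META NAME="document-maintainer" CONTENT="{maintainer}">\n{body}'


def render_linked(body, maintainer):
    # The maintainer is deliberately unused here: it lives in a separate,
    # first-class metadata object that each document only points at.
    # The HREF below is a hypothetical path chosen for this example.
    return f'<LINK REV="IDML" HREF="/collection-meta.idml">\n{body}'


def stale_count(render, bodies, old_maintainer, new_maintainer):
    """Count documents whose validator changes when the maintainer changes."""
    return sum(
        validator(render(body, old_maintainer))
        != validator(render(body, new_maintainer))
        for body in bodies
    )


if __name__ == "__main__":
    bodies = [f"<P>Document {i}</P>" for i in range(1000)]
    print("embedded:", stale_count(render_embedded, bodies, "alice", "bob"))
    print("linked:  ", stale_count(render_linked, bodies, "alice", "bob"))
```

With embedded metadata every document in the collection goes stale on a
one-field change; with the linked form only the single metadata object does.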

I know of no research going on to standardize product and service description
on the WWW.  It will be fun, though, to see profit margins drop to nil when
programs we write can do price comparisons across the world in real-time...will
anyone be able to afford to do business on the internet?

-marc

--