Perspective on the metadata / discovery struggle

I had a thought about the TAG definition discovery and metadata
architecture issues that might be helpful.  Probably this is obvious, but
it wasn't to me so I thought it was worth writing down.  This relates
to the fact that whenever the httpRange-14 thing comes up in the TAG
we are confused about what issue to put it under. I was inspired to
think this over by F2F remarks of Larry's about the magnitude of
the problem.

There are two distinct application-level communication needs:

  1. web metadata - when I express information about a document
     (image, etc.), how do I say (especially in RDF) that what I am
     talking about is the content accessed via a particular URI, as
     opposed to other content?

  2. definition discovery - given a vocabulary term (URI),
     how is definition-like information for it discovered?
     (Definitions are not, in general, metadata.)

Described in this way, the needs seem unrelated.  The first falls
under our ISSUE-63 (metadata architecture), the second under ISSUE-57
(definition discovery).  The first need spawned the Resource
Description Framework and the httpRange-14 2xx rule, while the second
spawned linked data, RDF-style fragment ids, and GET+303.
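
For readers who don't have the status-code conventions in their heads,
here is a minimal sketch of how the 2xx rule and GET+303 divide up
responses, assuming the usual reading of the TAG's httpRange-14
resolution.  The function name and the returned strings are invented
for illustration; they are not part of any specification or library.

```python
def httprange14_reading(status: int) -> str:
    """Say what an HTTP GET response status lets us conclude about
    the thing a URI identifies, under the httpRange-14 resolution.

    Illustrative only: the labels returned here are hypothetical.
    """
    if 200 <= status < 300:
        # 2xx rule: the URI identifies an information resource,
        # i.e. the content you just retrieved (the metadata need).
        return "information resource"
    if status == 303:
        # GET+303: the URI could identify anything at all; the
        # redirect target is expected to describe it (the
        # definition discovery need).
        return "any resource"
    if 400 <= status < 500:
        # 4xx: nothing can be concluded about the resource.
        return "unknown"
    # Other statuses are not addressed by the resolution.
    return "not covered"


if __name__ == "__main__":
    for code in (200, 303, 404, 500):
        print(code, "->", httprange14_reading(code))
```

The point of the sketch is only that a single GET on a single URI
lands in exactly one branch, so one URI cannot serve both needs at
once.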

The connection between them is that the same notation and protocol,
namely what I've been calling dereferenceable absolute URIs, has been
advanced as a solution to both problems.  The competition creates a
struggle.  Think of these URIs as a limited natural resource over
which many factions are contending.  Just as a piece of real estate
cannot be used for a wetland and a high-rise at the same time, one of
these URIs can't simultaneously get its meaning according to two rules
that give different answers most of the time.

So the competition over "linguistic real estate" itself begets a third
problem:

  3. to what referential use are dereferenceable absolute URIs best
     put?

One might then give up and say interoperability is not a good goal,
one might try to carve up or overload the linguistic space, one might
try to "win", one might say these URIs are an inadequate solution for
one or the other problem, or that it's hopeless, and so on - all the
arguments we've heard over and over again.

It's not enough to solve the three problems separately. There is
no divide-and-conquer. That is what makes it so annoying: everything
interacts.

Jonathan

Received on Thursday, 30 June 2011 14:32:40 UTC