Re: Perspective on the metadata / discovery struggle from Dave Reynolds on 2011-07-01 (www-tag@w3.org from July 2011)

From: Dave Reynolds <dave.e.reynolds@gmail.com>
Date: Fri, 01 Jul 2011 11:36:00 +0100
To: Jonathan Rees <jar@creativecommons.org>
Cc: www-tag@w3.org
Message-ID: <1309516560.2166.90.camel@Obsidian3>
Hi Jonathan,

[Snipped main point about terminology, since that seems resolved.]

On Thu, 2011-06-30 at 16:29 -0400, Jonathan Rees wrote: 
> On Thu, Jun 30, 2011 at 1:13 PM, Dave Reynolds
> <dave.e.reynolds@gmail.com> wrote:

> > And the Ian Davis "back to basics" proposal is essentially "the owner of
> > the URI gets to choose, it is their real estate, if they say this is a
> > high rise then it is a high rise - if they want to create two related
> > bits of real estate, one for the high rise and one for the wetland
> > that's fine too".
> 
> Well, as you know, there's no general agreement right now on what you
> say, but maybe there will be in the future.

Sure. I was merely noting that your analogy (that the issue is like a
struggle over natural resources) fits with the way that proposal is
framed.

> The publisher-decides rule has many variants, and I've seen its
> proponents embrace a variety of mutually incompatible positions. So
> it's not enough to strengthen or retract httpRange-14 - there has to
> be consensus and a spec of some kind no matter what, if there's to be
> interoperability.

Sure. I was, as you know, just referring to [1] - which is indeed not
sufficiently detailed to constitute a spec and is but one possible variant.

> I don't get what you're saying about real estate. Someone encountering
> a URI in an RDF statement (such as a statement of authorship,
> license, or 'likes') may want to know what it's about. The answer will
> be different under different rules. To avoid getting the wrong answer
> they may need to know which rule applies. Guidance comes from
> specifications and general practice. If you allowed two rules to
> coexist (e.g. it's-the-information-resource and publisher-decides),
> there would be no way for the decoder to know which encoding was used.
> See my xhv:license example in the other thread.

I was just running with your metaphor and stating the "publisher
decides" option in those terms, which may not have been helpful :)

Attempting to clarify (none of this is new) ...

If I publish some RDF at U then by [1] what that RDF says about U takes
precedence. If the returned RDF states (directly or by inference) that U
is rdf:type foaf:Document then I regard it as a document, if it says it
is rdf:type org:Organization then I treat it as denoting some
organization and not a web page. 
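
A minimal sketch of that rule, assuming the RDF returned from U has
already been parsed into plain (subject, predicate, object) string
tuples. The function names and the example URI are hypothetical, and
only direct rdf:type statements are checked (no inference):

```python
# Hypothetical sketch of the "publisher decides" reading of [1].
# Triples are plain (subject, predicate, object) tuples of strings.

RDF_TYPE = "rdf:type"


def declared_types(uri, triples):
    """Return the types that the RDF served at `uri` asserts for `uri` itself."""
    return {o for (s, p, o) in triples if s == uri and p == RDF_TYPE}


def denotes_document(uri, triples):
    """True if the publisher says `uri` is a document, False if it says
    `uri` is something else, None if the RDF is silent about `uri`."""
    types = declared_types(uri, triples)
    if not types:
        return None
    return "foaf:Document" in types


# Example: the publisher types U as an organization, so U is not a page.
U = "http://example.org/acme"  # assumed URI, for illustration only
triples = [
    (U, "rdf:type", "org:Organization"),
    (U, "rdfs:label", "Acme Ltd"),
]
print(denotes_document(U, triples))
```

A real consumer would also have to follow inferred types, which this
sketch deliberately leaves out.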

If I have only the one U then as publisher I can only use it for one
thing - in this case a document or some organization. Which means if I
choose the latter then I don't have a way of talking about the document
carrying the RDF. In particular if I use content negotiation to return a
nice human readable HTML page about the organization then I can't state
anything about the creator or license for that web page, only about the
organization.

One URI (piece of territory) denotes one concept (can be zoned for one
use).

Sometimes (I would argue "lots of times") that is actually just fine.
Often in linked data the HTML page is simply some machine generated
rendering of the exact same RDF and talking about the provenance or
nature of that rendered page separately is not required.

In situations where you do want to talk about both the concept and the
document you get when you browse to that concept (e.g.
legislation.gov.uk), then you need two URIs Upage and Uconcept. In the
"publisher decides" approach that is just fine. You make sure that what
RDF you get back when you dereference either of these uses them
appropriately. Preferably you explain the relationship by using
wdrs:describedby or foaf:isPrimaryTopicOf. Using fragmentIDs or 303 as a
way of relating those two URIs at the protocol level is fine and a
convenience but the essence of [1] is that the relationship is primarily
conveyed in the data.
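
As an illustration of the two-URI case, here is a small sketch in which
the link between Uconcept and Upage is carried in the data itself. The
URIs, the ex: and dct: terms, and the helper names are all assumptions
made for the example; the linking properties are written as
foaf:isPrimaryTopicOf and wdrs:describedby, the spellings the FOAF and
POWDER vocabularies publish:

```python
# Hypothetical data: one URI for the concept, one for the page about it.
U_CONCEPT = "http://example.org/id/act-1"
U_PAGE = "http://example.org/doc/act-1"

triples = [
    (U_CONCEPT, "rdf:type", "ex:Legislation"),
    (U_CONCEPT, "foaf:isPrimaryTopicOf", U_PAGE),
    (U_PAGE, "rdf:type", "foaf:Document"),
    (U_PAGE, "dct:license", "http://example.org/licence"),
]

# Properties taken here to link a concept to a document describing it.
LINKING = {"foaf:isPrimaryTopicOf", "wdrs:describedby"}


def pages_for(concept, triples):
    """Return the URIs of documents the data says are about `concept`."""
    return sorted(o for (s, p, o) in triples if s == concept and p in LINKING)


def statements_about(uri, triples):
    """Return the (predicate, object) pairs whose subject is `uri`."""
    return [(p, o) for (s, p, o) in triples if s == uri]


# The licence is attached to the page, not to the legislation itself.
for page in pages_for(U_CONCEPT, triples):
    print(page, statements_about(page, triples))
```

With separate URIs the licence statement has an unambiguous subject,
which is exactly what the single-URI case cannot offer.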

> It doesn't help for the publisher to decide anything, if (a) it's not
> generally recognized that it has that authority, (b) we don't know how
> to learn from the publisher what it decided, (c) we can't distinguish
> cases in which the publisher decides from those in which it doesn't.

Now you've lost me, we are not in that state.

(a) AWWW defines the notion of URI ownership and authority
(b) you just dereference the URI
(c) is that the thing that needs to be distinguished? Surely the primary
requirement is to know whether I'm talking about the page or the thing
denoted by the URI which returns that page. Existing content that makes
that distinction via the existing mechanisms (303, fragID) would
continue to do that and continue to be supported. Existing content that
doesn't make the distinction is ambiguous anyway. Under [1] new content
would be free to not use 303 but if it does so it needs to be
sufficiently clear in its RDF about what it is using U to denote.
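
One possible reading of (c) as a decision procedure is sketched below.
It is only an assumption about how a consumer might combine the data
with the existing mechanisms, not something [1] specifies: the function
name, the three labels, and the choice to consult the data first are
all hypothetical.

```python
from urllib.parse import urldefrag

# Types taken here to mean "this URI names a document" (an assumption).
DOCUMENT_TYPES = {"foaf:Document"}


def classify(uri, first_status, triples):
    """Return 'document', 'thing', or 'ambiguous' for `uri`.

    `first_status` is the HTTP status of the first response received
    when dereferencing `uri`; `triples` are (subject, predicate, object)
    string tuples parsed from the RDF that was eventually returned.
    """
    # Under [1], what the returned RDF says about the URI comes first.
    types = {o for (s, p, o) in triples if s == uri and p == "rdf:type"}
    if types:
        return "document" if types & DOCUMENT_TYPES else "thing"
    # Existing mechanisms keep working: a fragment ID or a 303 response
    # signals that the URI is not simply the page that came back.
    _, fragment = urldefrag(uri)
    if fragment or first_status == 303:
        return "thing"
    # No statement in the data and no protocol-level hint.
    return "ambiguous"


print(classify("http://example.org/id/x", 303, []))
print(classify("http://example.org/x", 200, []))
```

The last branch corresponds to existing content that never made the
distinction in the first place.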

> Harry Halpin has suggested that ties like this be tolerated,
> recognized, and if necessary broken by providing additional
> information. I'm open to this solution, too, if it leads to some
> procedure that is reliable. It is just another rule (one that combines
> two or three other rules). All I'm trying to say is that there is a
> real coordination problem here, it will take more than just unilateral
> designs to fix it, and that strategy and process discussion may be
> more important at this point than technical debate.

Agreed. I've not said anything about unilateral designs. 

The primary point of my message was a process one about the framing of
"definition discovery". Now that you've clarified it is a terminology
issue, rather than part of the problem being missed out, I'm happy and
should go back to radio silence.

Dave

[1] http://blog.iandavis.com/2010/12/06/back-to-basics/
Received on Friday, 1 July 2011 10:36:31 UTC