Re: URIs in data primer draft updated & httpRange-14 background

Hi David,

Apologies for not replying sooner. I was waiting until I had time to roll in the changes that we discussed at the F2F in March. I've now done that, producing a new editor's draft at:

  http://www.w3.org/2001/tag/doc/urls-in-data-2013-04-27/

This will be published as a First Public Working Draft, but there's plenty of time left in the W3C process for comments and discussion on its contents.

On 20 Mar 2013, at 03:19, David Booth <david@dbooth.org> wrote:
> 1.  This is nicely written and presents a very good example.  Kudos!

Thank you :)

> 3. The document tries to do two different things:
> 
>  - It suggests certain RDF properties that data publishers can use to indicate to data consumers whether a URI in that data will dereference to the thing that the URI denotes, versus dereferencing to a description of the thing that it denotes.
> 
> - It vaguely describes a protocol between data authors/publishers, URI owners and data consumers, for coordinating the provision and use of URI definitions, roughly along the lines of
> http://www.w3.org/wiki/UriDefinitionDiscoveryProtocol
> 
> The first of these goals is achieved very nicely, and I believe the document should be slightly retitled to more tightly convey this goal, and should focus only on this goal.
> 
> There are multiple problems with attempting to achieve the second goal in this document:
> 
> - It is an independent goal, and would be better addressed in its own document.
> 
> - It is very mushy as written.  It does not rise to the level of precision that a protocol specification needs.  For this reason, it would be harmful to publish as is, as it would simply create more confusion rather than adding clarity.  I *do* think it is a worthy goal (and would be happy to help work on it, as I am sure that you and others are aware that I have devoted a great deal of time and thought to figuring out these issues), but it belongs in its own document.
> 
> I imagine that there are those who would claim that it is okay for this part of the document to be mushy, in the belief that such a protocol is impossible or impractical or whatever.  I firmly disagree.  But regardless, that is a question that should be decided on its own merit, rather than by publishing a mushy spec under the *assumption* that nothing better could be done.

I think that you are referring to Sections 5.2 [1] and 5.3 [2], am I correct? I think that there are three choices here:

  1. remove these sections and provide no guidance to publishers or consumers
  2. retain these sections roughly as they are, but include a note or some such to indicate that they are merely restating practices that are defined elsewhere, not tightly defining a protocol
  3. tightly define a protocol

I would prefer we do 2 or 3 as 1 leaves publishers and consumers lacking information that they could rightly expect to find gathered in one place within this document, and because I think the TAG is extremely unlikely to spend time in this area once this document is published.

I do not have a strong feeling either way with 2 or 3. What do you think it would take to do 3? How far is the current "mushy" wording from what would be needed?

> 4. There is an important omission in the Note of Section 3, which reminds the user of the importance of using different URIs for different things:
[snip]
> Returning to the Note in Section 3, I would suggest explicitly acknowledging that there can be different viewpoints about what constitutes the same or different resources, and architecturally it is up to the URI owner to decide whether two resources are the same resource (at a more abstract level) or different.  This would align with the AWWW's existing guidance on the meaning of a URI containing a fragment identifier, when different media types are served via content negotiation.  As AWWW section 3.2.2 states:
> http://www.w3.org/TR/webarch/#p137
> "The representation provider decides when definitions of fragment identifier semantics are sufficiently consistent."
> Ultimately, the decision about whether two resources need to be considered the same or different depends on the applications that will use them, as different distinctions matter to different applications.
> 
> Perhaps it would be enough to add a sentence at the end of the Note in Section 3, roughly along these lines: "However, it is up to the data publisher to decide whether these resources are similar enough to be considered the same at a more abstract level, in which case the URI that identifies is really identifying that abstract resource."

I have expanded the note a little. Please see whether that's satisfactory.

> 5.  Regarding the JSON examples, although the JSON and Turtle examples are asserted to be equivalent, JSON, when interpreted as a serialization of RDF, is not self-describing
> http://www.w3.org/2001/tag/doc/selfDescribingDocuments.html
> because a recipient knowing only that it was JSON would not know the conventions for interpreting it as an RDF serialization.  Section 2 does mention JSON-LD, which hopefully will become a full-fledged serialization of RDF (though last I knew it was at risk of becoming a competing language), in which case JSON-LD would be self-describing, because of its JSON-LD media type.  My suggestion: make clear that the examples are JSON-LD (and pray that JSON-LD is standardized to be an RDF serialization) and thus should be delivered with a JSON-LD media type -- not merely generic JSON.

I'm not sure why it matters whether the syntaxes used in the examples are self-describing or not, for the purpose of this document. Whether the description of the vocabulary is found through follow-your-nose or through a search against the internet, the same principle (of documenting what properties actually apply to) holds true.

Can you expand on why you think it matters?

> 6. It may be helpful to have the examples in both JSON-LD and Turtle throughout, because of the different audiences for this document.  I don't feel strongly about this though.  It's a judgement call.

My goal in primarily using JSON in this document is to avoid the misapprehension that this is a purely Semantic Web problem. I'm inclined to keep it as it is.

> Finally, on the meta level, I would politely suggest that you and other members of the TAG be more active in attempting to include me in such work in the future.  Given how much time and thought I have put into these topics over the years -- as I am sure you are aware -- and given how important it is to reach community consensus, it is disappointing that you and others in the TAG did not reach out to include me.


This document is the result of a long process that you have been involved in. There has been a gap between the last set of discussions and the publication of this document -- down to my own lack of availability, sorry -- which I guess has made you feel it's come out of the blue, but in fact your input, and everyone else's, to the process that led to this document has been instrumental in shaping its scope and its content.

Thanks for your comments. As I said at the start of this mail, there's still plenty of time to iterate on the content of this document as it moves through the W3C publication process, and any comments that haven't been addressed in the current draft will be rolled forward as comments on the FPWD.

Thanks again,

Jeni

[1] http://www.w3.org/2001/tag/doc/urls-in-data-2013-04-27/#consuming-data
[2] http://www.w3.org/2001/tag/doc/urls-in-data-2013-04-27/#publishing-data
-- 
Jeni Tennison
http://www.jenitennison.com/

Received on Saturday, 27 April 2013 16:03:11 UTC