Standardizing linked data - was Re: URI Comparisons: RFC 2616 vs. RDF from Nathan on 2011-01-20 (public-lod@w3.org from January 2011)

From: Nathan <nathan@webr3.org>
Date: Thu, 20 Jan 2011 22:30:45 +0000
To: Dave Reynolds <dave.e.reynolds@gmail.com>
CC: "public-lod@w3.org" <public-lod@W3.ORG>, Tim Berners-Lee <timbl@w3.org>, Sandro Hawke <sandro@w3.org>, Ivan Herman <ivan@w3.org>
Message-ID: <4D38B795.3020200@webr3.org>

Dave Reynolds wrote:
>> Okay, I agree, and I'm really not looking to create a lot of work here,
>> the general gist of what I'm hoping for is along the lines of:
>>
>> RDF Publishers MUST perform Case Normalization and Percent-Encoding
>> Normalization on all URIs prior to publishing. When using relative URIs
>> publishers SHOULD include a well-defined base using a
>> serialization-specific mechanism. Publishers are advised to perform
>> additional normalization steps as specified by URI (RFC 3986) where possible.
>>
>> RDF Consumers MAY normalize URIs they encounter and SHOULD perform Case
>> Normalization and Percent-Encoding Normalization.
>>
>> Two RDF URIs are equal if and only if they compare as equal, character
>> by character, as Unicode strings.
> 
> I'm sort of OK with that but ...
> 
> Terms like "RDF Publisher" and "RDF Consumer" need to be defined in 
> order to make formal statements like these. The RDF/OWL/RIF specs are 
> careful to define what sort of processors are subject to conformance 
> statements and I don't think RDF Publisher is a conformance point for 
> the existing specs.
> 
> This may sound like nit-picking, but that's life with specifications. You 
> need to be clear how the last para about "RDF URIs" relates to notions 
> like "RDF Consumer".
> 
> I wonder whether you might want to instead define notions of Linked Data 
> Publisher and Linked Data Consumer to which these MUST/MAY/SHOULD 
> conformance statements apply. That way it is clear that a component such 
> as an RDF store or RDF parser is correct in following the existing RDF 
> specs and not doing any of these transformations but that in order to 
> construct a Linked Data Consumer/Publisher some other component can be 
> introduced to perform the normalizations. Linked Data as a set of 
> constraints and conventions layered on top of the RDF/OWL specs.

Fully agree. I had the same conversation with DanC this afternoon, and he 
too immediately suggested changing RDF Publisher/Consumer to Linked Data 
Publisher/Consumer. It also ties in with earlier comments about 
standardizing Linked Data. However it's done or worded, my only concern 
here is that it positively impacts the current situation and doesn't 
negatively impact anybody else.

> The specific point on the normalization ladder would have to be defined, of 
> course, and you would need to define how to handle schemes unknown to 
> the consumer.
> 
> All this presupposes some work to formalize and specify linked data. Is 
> there anything like that planned?  In some ways Linked Data is an 
> engineering experiment and benefits from that freedom to experiment. On 
> the other hand interoperability eventually needs clear specifications.

Unsure, but I'll also ask the question: is there anything planned? I'd 
certainly +1 standardization and do anything I could to help the process 
along.

>> For many reasons it would be good to solve this at the publishing phase,
>> allow normalization at the consuming phase (can't be precluded as
>> intermediary components may normalize), and keep simple case-sensitive
>> string comparison throughout the stack and specs (so implementations
>> remain simple and fast.)
> 
> Agreed.
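
To make the split concrete, here is a rough sketch of what a publisher-side 
normalization step could look like, with comparison left as a plain string 
test. It is only an illustration under my own assumptions: the names 
normalize_uri and same_uri are made up for this mail, it covers only Case 
Normalization and Percent-Encoding Normalization as described in RFC 3986 
(sections 6.2.2.1 and 6.2.2.2), it uses the generic regular expression from 
RFC 3986 Appendix B to split the URI, and it does nothing scheme-specific.

```python
import re

# Unreserved characters per RFC 3986 section 2.3.
UNRESERVED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)

# Generic URI splitter adapted from RFC 3986 Appendix B.
_URI = re.compile(
    r"^(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*)(\?[^#]*)?(#.*)?$",
    re.DOTALL,
)
_PCT = re.compile(r"%([0-9A-Fa-f]{2})")


def _fix_pct(match):
    """Decode %XX if it encodes an unreserved character, else uppercase the hex."""
    char = chr(int(match.group(1), 16))
    if char in UNRESERVED:
        return char
    return "%" + match.group(1).upper()


def normalize_uri(uri):
    """Apply Case and Percent-Encoding Normalization (hypothetical helper)."""
    scheme, authority, path, query, fragment = _URI.match(uri).groups()
    out = []
    if scheme is not None:
        out.append(scheme.lower() + ":")
    if authority is not None:
        # Only the host (and port) is case-insensitive; userinfo is left alone.
        userinfo, at, hostport = authority.rpartition("@")
        # Decode unreserved escapes first so decoded letters are lowercased too.
        hostport = _PCT.sub(_fix_pct, hostport).lower()
        out.append("//" + userinfo + at + hostport)
    out.append(path)
    out.append(query or "")
    out.append(fragment or "")
    # Final pass: decode unreserved escapes, uppercase the remaining hex digits.
    return _PCT.sub(_fix_pct, "".join(out))


def same_uri(a, b):
    """RDF URI equality stays a simple character-by-character comparison."""
    return a == b


if __name__ == "__main__":
    published = normalize_uri("HTTP://Example.ORG/%7euser/a%2fb")
    print(published)
    print(same_uri(published, "http://example.org/~user/a%2Fb"))
```

The point of the sketch is that all the work sits in normalize_uri, which a 
Linked Data Publisher (or some intermediary) would run once, while same_uri 
is the only thing the rest of the stack needs. Path case is left untouched, 
and a URI in a scheme the code knows nothing about only gets the generic 
treatment.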

cool, thanks again Dave,

Nathan

Received on Thursday, 20 January 2011 22:31:54 UTC