Re: RDF-ISSUE-13 (RDF XMLLiterals): Review RDF XML Literals [Cleanup tasks] from Jeremy Carroll on 2011-03-09 (public-rdf-wg@w3.org from March 2011)

From: Jeremy Carroll <jeremy@topquadrant.com>
Date: Tue, 08 Mar 2011 17:38:15 -0800
To: Ivan Herman <ivan@w3.org>
CC: RDF Working Group WG <public-rdf-wg@w3.org>
Message-ID: <4D76DA07.6000201@topquadrant.com>
I realized I should have given a simple answer.

The decision was based on not wanting to require an XML subsystem within 
an RDF reasoner. Therefore the XML processing is confined to the RDF/XML 
parser that has to handle XML anyway.
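
A minimal sketch of that division of labour, under stated assumptions: the 
function names are hypothetical, and Python's 
xml.etree.ElementTree.canonicalize implements C14N 2.0, used here only as a 
stand-in for the Exclusive XML Canonicalization that the 2004 Recs refer to.

```python
from xml.etree.ElementTree import canonicalize


def parse_time_literal(xml_fragment: str) -> str:
    """Hypothetical RDF/XML parser hook: canonicalize once, on input.

    Assumption: C14N 2.0 stands in for Exclusive XML Canonicalization.
    """
    return canonicalize(xml_fragment)


def same_literal(a: str, b: str) -> bool:
    """Hypothetical reasoner-side check: plain string comparison, no XML code."""
    return a == b


# Two spellings of the same fragment: attribute order and empty-tag form differ.
raw_1 = '<span b="1" a="2"/>'
raw_2 = '<span a="2" b="1"></span>'

# The parser does the XML work; everything downstream only sees strings.
lit_1 = parse_time_literal(raw_1)
lit_2 = parse_time_literal(raw_2)

print(same_literal(raw_1, raw_2))  # raw strings, before canonicalization
print(same_literal(lit_1, lit_2))  # canonical lexical forms
```

Only parse_time_literal needs an XML library; same_literal could live in a 
reasoner that has never heard of XML.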

Jeremy


On 3/8/2011 5:27 PM, Jeremy Carroll wrote:
> Hi Ivan
> http://www.w3.org/TR/2003/WD-rdf-concepts-20031010/#section-substantive-Revisions 
>
> Under "XMLLiteral simplification" gives the blow by blow account.
>
> There is text embedded in
> http://lists.w3.org/Archives/Public/www-rdf-comments/2003JanMar/0170.html
>
> Joe Reagle says:
> [[
>
> I presume that the reason you even care how the xml-literal is 
> represented is that you will want to compare RDF instances (which might 
> contain xml-literals) to see if they are identical at some point?
> ]]
>
> The current design is intended to make that easy, and put the burden 
> of XML processing within the RDF/XML parser.
> A Turtle or N3 parser is not required to have an XML subsystem, 
> whereas the older design, which canonicalized as part of the lex2value 
> mapping, required all RDF implementations to be able to do that.
>
> Notice also point i in
> http://lists.w3.org/Archives/Public/www-rdf-comments/2003JanMar/0335.html
> [[
>
>  An example fix would be
> to require an RDF/XML parser to use a specific canonicalization on
> input.
>
> ]]
> A proposal that was accepted in full.
>
> Jeremy
>
>
> On 3/8/2011 12:41 AM, Ivan Herman wrote:
>> Jeremy,
>>
>> just want to understand... what was the reason xc14n was required on 
>> the lexical space? I would expect that xc14n is important to be able 
>> to compare XML literals, but that is a value space issue, just as 
>> 123.456 is identical, in value space, to 123.4560000.
>>
>> Thanks
>>
>> Ivan
>>
>> On Mar 7, 2011, at 20:28 , Jeremy Carroll wrote:
>>
>>> The motivation for XML Literals in the 1999 M&S spec and the 2004 
>>> Recs was to do with I18N use cases involving HTML (and in some 
>>> of them Ruby).
>>>
>>> I believe that for at least some of these use cases we would now 
>>> recommend RDFa.
>>>
>>> I think there are some use cases that are not addressed by RDFa.
>>>
>>> Once you take the use cases seriously, then you end up somewhere not 
>>> a million miles away from the current specs, with all their problems.
>>>
>>> I suspect an underlying error in the 2002-2004 work was the 
>>> following incorrect reasoning:
>>> - it is important for RDF to carry rich text literals (e.g. 
>>> involving Ruby markup)
>>> - it is important to be able to tell if two RDF fragments are the same
>>> Hence:
>>> - it is important to be able to compare two rich text literals in 
>>> RDF [It is this that leads to the XC14N dance]
>>>
>>> Jeremy
>>>
>>>
>>> On 3/7/2011 5:35 AM, Ivan Herman wrote:
>>>> On Mar 7, 2011, at 14:25 , RDF Working Group Issue Tracker wrote:
>>>>
>>>>> RDF-ISSUE-13 (RDF XMLLiterals): Review RDF XML Literals [Cleanup 
>>>>> tasks]
>>>>>
>>>>> http://www.w3.org/2011/rdf-wg/track/issues/13
>>>>>
>>>>> Raised by: Andy Seaborne
>>>>> On product: Cleanup tasks
>>>>>
>>>>> RDF Concepts:
>>>>> http://www.w3.org/TR/rdf-concepts/#section-XMLLiteral
>>>>>
>>>>> RDF Syntax:
>>>>> http://www.w3.org/TR/REC-rdf-syntax/#section-Syntax-XML-literals
>>>>>
>>>>> The lexical space of RDF XML Literals is XML fragments which are 
>>>>> required to be "exclusive canonical XML".  The lexical space and 
>>>>> the value space are in 1-1 correspondence. The rules are quite 
>>>>> complicated. These rules for canonicalization apply to the lexical 
>>>>> form; equality testing can be done using string compare.
>>>>>
>>>>> Canonicalization rules include no use of <tag/> and that 
>>>>> attributes must be in sorted order (this is not an exhaustive list).
>>>>>
>>>>> A consequence of this is that many correct XML fragments are not 
>>>>> legal as XML Literals because they do not correspond to exclusive 
>>>>> canonicalization.
>>>>>
>>>>> Possible cleanup includes partially relaxing the lexical space 
>>>>> restrictions while retaining the value space so that fragments can 
>>>>> be used as XML literals without complex processing.
>>>>>
>>>> +10^infinite
>>>>
>>>> I know of no RDF serializers around that would produce correct XML 
>>>> Literals in this sense. They all produce valid XML, with hopefully 
>>>> the right namespace declarations (though that does not always 
>>>> happen either) but they certainly do not necessarily go the 
>>>> extra mile of canonicalization. And there is no reason for that 
>>>> either: canonicalization comes into play when two XML fragments 
>>>> must be compared as strings; but this should be done in value space 
>>>> and not in lexical space...
>>>>
>>>> Ivan
>>>>
>>>>> RDF XML Literals are the only datatype hard wired into RDF.
>>>>>
>>>>> If a Turtle document is to be validated, will that require access 
>>>>> to an XML parser and canonicalization engine?
>>>>>
>>>>>
>>>>>
>>>>>
>>>> ----
>>>> Ivan Herman, W3C Semantic Web Activity Lead
>>>> Home: http://www.w3.org/People/Ivan/
>>>> mobile: +31-641044153
>>>> PGP Key: http://www.ivan-herman.net/pgpkey.html
>>>> FOAF: http://www.ivan-herman.net/foaf.rdf
>>>>
>>>>
>>>>
>>>>
>>>>
>>
>
Received on Wednesday, 9 March 2011 01:38:38 UTC