Re: RDF-ISSUE-13 (RDF XMLLiterals): Review RDF XML Literals [Cleanup tasks] from Jeremy Carroll on 2011-03-09 (public-rdf-wg@w3.org from March 2011)

From: Jeremy Carroll <jeremy@topquadrant.com>
Date: Tue, 08 Mar 2011 17:27:26 -0800
To: Ivan Herman <ivan@w3.org>
CC: RDF Working Group WG <public-rdf-wg@w3.org>
Message-ID: <4D76D77E.1050803@topquadrant.com>
Hi Ivan
http://www.w3.org/TR/2003/WD-rdf-concepts-20031010/#section-substantive-Revisions
The entry there under "XMLLiteral simplification" gives the blow-by-blow account.

There is text embedded in
http://lists.w3.org/Archives/Public/www-rdf-comments/2003JanMar/0170.html

Joe Reagle says:
[[

I presume that the reason you even care how the xml-literal is represented
is that you will want to compare RDF instances (which might contain
xml-literals) to see if they are identical at some point?
]]

The current design is intended to make that easy, and put the burden of 
XML processing within the RDF/XML parser.
A Turtle or N3 parser is not required to have an XML subsystem, whereas 
the older design, which canonicalized as part of the lex2value mapping, 
required all RDF implementations to be able to do that.
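
As a rough illustration of that division of labour (a sketch only, not taken from any spec): the helper names below are hypothetical, and Python's standard-library `canonicalize` implements C14N 2.0, used here merely as a stand-in for the Exclusive XML Canonicalization that the Recs actually name. The RDF/XML parser canonicalizes once on input; everything downstream compares lexical forms as plain strings.

```python
from xml.etree.ElementTree import canonicalize


def rdfxml_parser_lexical_form(fragment: str) -> str:
    # Hypothetical helper: the one place where XML processing happens.
    # Assumption: C14N 2.0 stands in for Exclusive XML Canonicalization.
    return canonicalize(fragment)


def xml_literals_equal(lexical_a: str, lexical_b: str) -> bool:
    # No XML subsystem needed here: canonical lexical forms
    # can be compared as ordinary strings.
    return lexical_a == lexical_b


# Two spellings of the same fragment: attribute order and the
# empty-element syntax differ, the content does not.
a = rdfxml_parser_lexical_form('<b id="y" class="x">hi<br/></b>')
b = rdfxml_parser_lexical_form('<b class="x" id="y">hi<br></br></b>')

print(xml_literals_equal(a, b))
```

Under the older design, the equivalent of `rdfxml_parser_lexical_form` would have had to run inside every implementation's lex2value mapping, whatever the concrete syntax.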

Notice also point i in
http://lists.w3.org/Archives/Public/www-rdf-comments/2003JanMar/0335.html
[[

  An example fix would be
to require an RDF/XML parser to use a specific canonicalization on
input.

]]
A proposal that was accepted in full.
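
To illustrate the lexical space versus value space distinction behind the question quoted below, here is a small sketch using the decimal example from that question. The function name is hypothetical, and Python's `Decimal` is only an approximation of the xsd:decimal value space.

```python
from decimal import Decimal


def decimal_lex2value(lexical: str) -> Decimal:
    # Hypothetical lexical-to-value mapping for a decimal datatype.
    return Decimal(lexical)


x, y = "123.456", "123.4560000"

# Different lexical forms ...
print(x == y)
# ... that map to the same value.
print(decimal_lex2value(x) == decimal_lex2value(y))
```

For XMLLiterals the current design restricts the lexical space to canonical forms, so the first comparison already gives the answer and no mapping step is needed at comparison time.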

Jeremy


On 3/8/2011 12:41 AM, Ivan Herman wrote:
> Jeremy,
>
> just want to understand... what was the reason xc14n was required on the lexical space? I would expect that xc14n is important to be able to compare xml literals, but that is a value space issue, just as 123.456 is identical, in value space, to 123.4560000.
>
> Thanks
>
> Ivan
>
> On Mar 7, 2011, at 20:28 , Jeremy Carroll wrote:
>
>> The motivation for XML Literals in the 1999 M&S spec and the 2004 Recs was to do with I18N use cases involving HTML (and in some of them Ruby).
>>
>> I believe that for at least some of these use cases we would now recommend RDFa.
>>
>> I think there are some use cases that are not addressed by RDFa.
>>
>> Once you take the use cases seriously, then you end up somewhere not a million miles away from the current specs, with all their problems.
>>
>> I suspect an underlying error in the 2002-2004 work was the following incorrect reasoning:
>> - it is important for RDF to carry rich text literals (e.g. involving Ruby markup)
>> - it is important to be able to tell if two RDF fragments are the same
>> Hence:
>> - it is important to be able to compare two rich text literals in RDF [It is this that leads to the XC14N dance]
>>
>> Jeremy
>>
>>
>> On 3/7/2011 5:35 AM, Ivan Herman wrote:
>>> On Mar 7, 2011, at 14:25 , RDF Working Group Issue Tracker wrote:
>>>
>>>> RDF-ISSUE-13 (RDF XMLLiterals): Review RDF XML Literals [Cleanup tasks]
>>>>
>>>> http://www.w3.org/2011/rdf-wg/track/issues/13
>>>>
>>>> Raised by: Andy Seaborne
>>>> On product: Cleanup tasks
>>>>
>>>> RDF Concepts:
>>>> http://www.w3.org/TR/rdf-concepts/#section-XMLLiteral
>>>>
>>>> RDF Syntax:
>>>> http://www.w3.org/TR/REC-rdf-syntax/#section-Syntax-XML-literals
>>>>
>>>> The lexical space of RDF XML Literals is XML fragments which are required to be "exclusive canonical XML".  The lexical space and the value space are in 1-1 correspondence. The rules are quite complicated. These rules for canonicalization apply to the lexical form; equality testing can be done using string compare.
>>>>
>>>> Canonicalization rules include no use of <tag/> and that attributes must be in sorted order (this is not an exhaustive list).
>>>>
>>>> A consequence of this is that many correct XML fragments are not legal as XML Literals because they do not correspond to exclusive canonicalization.
>>>>
>>>> Possible cleanup includes partially relaxing the lexical space restrictions while retaining the value space so that fragments can be used as XML literals without complex processing.
>>>>
>>> +10^infinite
>>>
>>> I know of no RDF serializers around that would produce correct XML Literals in this sense. They all produce valid XML, with hopefully the right namespace declarations (though that does not always happen either), but they certainly do not necessarily go the extra mile of canonicalization. And there is no reason for that either: canonicalization comes into play when two XML fragments must be compared as strings; but this should be done in value space and not in lexical space...
>>>
>>> Ivan
>>>
>>>> RDF XML Literals are the only datatype hard wired into RDF.
>>>>
>>>> If a Turtle document is to be validated, will that require access to an XML parser and canonicalization engine?
>>>>
>>> ----
>>> Ivan Herman, W3C Semantic Web Activity Lead
>>> Home: http://www.w3.org/People/Ivan/
>>> mobile: +31-641044153
>>> PGP Key: http://www.ivan-herman.net/pgpkey.html
>>> FOAF: http://www.ivan-herman.net/foaf.rdf
Received on Wednesday, 9 March 2011 01:27:49 UTC