Re: The value space of XMLLiteral (was: Re: XML literals poll) from Ivan Herman on 2011-11-24 (public-rdf-wg@w3.org from November 2011)

From: Ivan Herman <ivan@w3.org>
Date: Thu, 24 Nov 2011 09:54:07 +0100
To: Richard Cyganiak <richard@cyganiak.de>
Cc: Gavin Carothers <gavin@carothers.name>, RDF Working Group WG <public-rdf-wg@w3.org>
Message-Id: <1B9A4FBA-C269-4DC3-ACB2-836DE2D47EB5@w3.org>

On Nov 23, 2011, at 23:24 , Richard Cyganiak wrote:

> Hi Gavin,
> 
> Thanks for the comments! A comment and question below.
> 
> On 21 Nov 2011, at 19:44, Gavin Carothers wrote:
>>> Q1. Should the specs define a way to compare XML literals based on value?
>>> 
>>> In other words, in the same way that integers 7 and 007 have the same value, should <foo/> and <foo></foo> be defined as having the same value?
>> 
>> No. The value space of an XML fragment or document is far too complex
>> for our WG to deal with (schema annotation, DTD parse additions,
>> white-space rules, etc). There are too many special cases, and too
>> little value. XPath and XQuery stop short of doing this, that should
>> be a hint to us.
> 
> I note that RDF 2004 does this already, and it doesn't look too complex – they just referred to Exclusive XML Canonicalization spec:
> http://dvcs.w3.org/hg/rdf/raw-file/default/rdf-concepts/index.html#XMLLiteral-lexical-space
> 
> (This is a link to the lexical space definition – the value space is 1:1 with the lexical space in the 2004 design.)
> 

Actually... my claim is that if we take the XML Infoset as the value space, we can stay silent on canonicalization altogether in our specs and scrap all references to it. How two valid XML fragments could or should be compared for equality in terms of infosets is not something this WG has to solve; it is an implementation issue that has to be based on what the XML community defines. Not our job.

That has the advantage of making the syntax issues clearer, i.e., what a parser does or doesn't have to do.
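
To make the "implementation issue" concrete, here is a minimal sketch of how an implementation might compare two XML literals by value rather than by lexical form. It assumes Python 3.8+ and its standard xml.etree.ElementTree.canonicalize(), which implements C14N 2.0 (not the Exclusive XML Canonicalization that RDF 2004 refers to), and it uses canonical serialization only as a stand-in for infoset equality. The helper name same_xml_value is made up for this illustration; nothing here is proposed spec text.

```python
from xml.etree.ElementTree import canonicalize


def same_xml_value(a: str, b: str) -> bool:
    """Hypothetical helper: treat two XML fragments as equal when their
    canonical (C14N 2.0) serializations are identical.

    Assumption: canonical form is used as a proxy for infoset equality.
    Each fragment must be a single well-formed element.
    """
    return canonicalize(a) == canonicalize(b)


if __name__ == "__main__":
    # The example from Q1: an empty-element tag vs. a start/end tag pair.
    print(same_xml_value("<foo/>", "<foo></foo>"))
    # Attribute order is not significant in canonical form.
    print(same_xml_value('<a y="1" x="2"/>', '<a x="2" y="1"/>'))
    # Different element names remain different values.
    print(same_xml_value("<foo/>", "<bar/>"))
```

Under this approach the parser only has to hand over well-formed XML; whether and how values are compared is left to whatever library the implementation chooses.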

Ivan


> Saying that the value space is an XML infoset doesn't seem to be too complicated. I believe there is a 1:1 relationship between canonicalized XML documents and XML infosets:
> http://www.w3.org/TR/xml-infoset/
> 
> So, in terms of handling all the special cases and so on, isn't this a solved problem?
> 
> Best,
> Richard


----
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
FOAF: http://www.ivan-herman.net/foaf.rdf

Received on Thursday, 24 November 2011 08:51:30 UTC