Re: Denotation of XMLLiterals: poll from Patrick Stickler on 2003-08-07 (w3c-rdfcore-wg@w3.org from August 2003)

From: Patrick Stickler <patrick.stickler@nokia.com>
Date: Thu, 7 Aug 2003 09:27:00 +0300
To: "ext Brian McBride" <bwm@hplb.hpl.hp.com>
Cc: "rdf core" <w3c-rdfcore-wg@w3.org>
Message-ID: <001501c35cac$e8a69f30$f89216ac@NOE.Nokia.com>

  ----- Original Message ----- 
  From: ext Brian McBride 
  To: Patrick Stickler 
  Cc: rdf core 
  Sent: 06 August, 2003 16:13
  Subject: Re: Denotation of XMLLiterals: poll

  Patrick Stickler wrote:
  > Whatever solution we choose, it should provide enough information
  > to test equality of values.
  >  
  > Option A does not do that.

  Sorry, if I've not been clear.  With option A, I had in mind something 
  close to what Pat suggested.  This includes the notion that the mapping 
  from the lex space to the value space of xml literals is 1:1.

  Thus it is possible to test whether xml literal values are equal by 
  comparing their lexical forms.

OK. I can see how that would be sufficient, albeit a little odd.

Perhaps we can use that approach in conjunction with XML Infosets,
specifying that the value space consists of Infosets which are
serializable as canonical XML according to the defined lexical
space of rdf:XMLLiteral, in 1:1 correlation with the canonical 
lexical forms, and that the comparison function for the members 
of the value space is character sequence equality of their 
canonicalized form.

So, testing for equality utilizes the canonicalized lexical forms, 
but the values are Infosets, with all that implies, and not just 
character sequences.

This would solve the primary deficiency of the Infoset spec, that
it fails to provide a method of comparison, yet still capture
the desired result that we are dealing with something much
richer than strings.
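
A minimal sketch of that comparison function, in Python, purely for
illustration. It assumes Python 3.8+ and uses the standard library's
xml.etree.ElementTree.canonicalize, which implements C14N 2.0 rather than
the exact canonicalization that the rdf:XMLLiteral lexical space is
defined against, so treat it as an approximation. The names
canonical_form, xml_values_equal and the wrapper element are hypothetical
and not taken from any spec.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper element: an XML literal may be a fragment with
# mixed content and no single root, so we wrap it before parsing.
_WRAPPER = "wrapper"


def canonical_form(fragment: str) -> str:
    """Return a canonical serialization of the (wrapped) XML fragment."""
    wrapped = f"<{_WRAPPER}>{fragment}</{_WRAPPER}>"
    # C14N 2.0 canonicalization (an approximation of the RDF requirement).
    return ET.canonicalize(wrapped)


def xml_values_equal(a: str, b: str) -> bool:
    """Compare two XML literal values by the character sequence of
    their canonicalized forms, as proposed above."""
    return canonical_form(a) == canonical_form(b)


if __name__ == "__main__":
    print(xml_values_equal("<br/>", "<br></br>"))
    print(xml_values_equal('<a y="2" x="1"/>', '<a x="1" y="2"></a>'))
    print(xml_values_equal("<b>x</b>", "<b>y</b>"))
```

The values themselves would still be Infosets; only the equality test
goes through the canonical character sequence.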

???

> >  
> > Option C is completely unacceptable to me. It again introduces
> > a unique treatment for the rdf:XMLLiteral datatype, among other
> > shortcomings that I've detailed before and won't repeat here.

> Thanks for being brief Patrick, 

Yeah. What a nice change, eh? ;-)

> but in this case I could do with a 
> reminder.  

In a nutshell, a URIref denotes some resource. That URIref is not
an inherent part of that resource. Positing some value that pairs
a specific URIref with a lexical form strikes me as a layering
error. It also precludes using other URIrefs to denote the same
datatype, or using mechanisms such as owl:sameIndividualAs
or rdfs:subClassOf to equate or relate other, proprietary
vocabularies or specialized datatypes to the core of RDF.

In short, it goes against what I see as fundamental aspects of
RDF and the (still overly vague but emerging) SW architecture.

> > If none of the above seem to work, then there is the fourth
> > option which is to say that XML literals are self denoting,
> > being canonicalized XML fragments, and those fragments are
> > comparable by character sequence, and may be mapped by XML
> > applications to other things, such as XML Infosets,
> > DOM trees, XPath nodesets, whatever.
> > 
> > The trouble with that seems to be that it fails to distinguish between 
> > markup and text, e.g.
> > 
>    _:a eg:prop "&lt;br&gt;&lt;/br&gt;" .
> 
> rdf entails
> 
>    _:a eg:prop "<br></br>"^^rdf:XMLLiteral .
> 
> I think there is general agreement that is a bad thing.

Agreed. I think that my proposed approach of marrying XML Infosets
with canonicalized comparison may avoid that problem (hopefully
without creating too many others).
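
To illustrate why that avoids the problem, here is a small Python sketch,
under my own assumptions: the class names PlainLiteral and
XMLLiteralValue are hypothetical, and the canonical string merely stands
in for an Infoset. The point is only that a plain string and an XML
literal value sharing the same characters are different kinds of thing,
so neither entails the other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlainLiteral:
    """A plain literal: just a character sequence."""
    text: str


@dataclass(frozen=True)
class XMLLiteralValue:
    """Stand-in for an Infoset value, identified for comparison
    purposes by its canonical serialization."""
    canonical: str


# The plain literal whose RDF/XML escaped form is "&lt;br&gt;&lt;/br&gt;".
plain = PlainLiteral("<br></br>")
# The XML literal "<br></br>"^^rdf:XMLLiteral.
markup = XMLLiteralValue("<br></br>")

if __name__ == "__main__":
    # Same characters, but the values are of different types.
    print(plain.text == markup.canonical)
    print(plain == markup)
```

Two XMLLiteralValue instances with the same canonical form still compare
equal, which is the equality test discussed earlier.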

Patrick

Received on Thursday, 7 August 2003 02:27:04 UTC