RE: XML literals from Martin Duerst on 2003-08-07 (w3c-rdfcore-wg@w3.org from August 2003)

From: Martin Duerst <duerst@w3.org>
Date: Thu, 07 Aug 2003 16:51:39 -0400
To: <Patrick.Stickler@nokia.com>
Cc: <w3c-rdfcore-wg@w3.org>
Message-Id: <4.2.0.58.J.20030807164410.0308f3b0@localhost>
At 14:34 03/08/05 +0300, Patrick.Stickler@nokia.com wrote:

>         -----Original Message-----
>         From: ext Martin Duerst [mailto:duerst@w3.org]

>         At Mon, 4 Aug 2003 10:53:45 +0300, Patrick.Stickler@nokia.com wrote:
>
>         > Fine, then we define it ourselves. Let the lexical form itself
>         > be the UTF8 encoded canonical form
>
>         No other lexical form is UTF-8 encoded. Lexical space is always
>         on strings of characters, irrespective of encoding.

>Then let it be a Unicode string. I'm not particular on that point. Only 
>that it be consistently and explicitly defined.

Just saying that it is a Unicode string in the denotation raises
all kinds of problems. If done the right way (making sure markup
is kept separate from characters), that would be very nice, but getting
this wrong would be much worse than leaving it undefined.


>         > And being canonicalized XML fragments, implementors know what
>         > the values are and what to do with them.
>
>         Overall, I'm wondering why you are opposed to having them 'different
>         from everything else',
>
>I was simply not sure that Pat's claim that they were different from any 
>other XSD value space is necessarily true. There are a lot of question marks 
>there, and we may later *want* there to be some intersection somewhere 
>with XSD, so if it doesn't absolutely need to be said, why say it?

I agree, in particular for equivalence with plain literals, and
therefore xsd:string.


>         but are okay with 'same as octet sequences'
>         if you claim that implementers anyway know what to do with them,
>         i.e. treat them as canonicalized XML fragments.
>
>         Maybe this is some basic unease at having something undefined.
>         This would indeed allow others to write an implementation that
>         would treat them as pumpkins. But who would seriously do that?
>
>         Neither equating XML fragments with octet sequences nor equating
>         them with pumpkins seems very adequate. And it seems strange to
>         me that we would have to do something rather inadequate just to
>         avoid something maybe even more inadequate, in particular if
>         that other thing (the pumpkins) is utter nonsense.
>
>
>
>What exactly are you proposing we say the value space of XML literals 
>contains?

Here is my list of preferences (best first):

1) XML Literals are sequences of characters and markup (so that
    XML Literals not containing markup are the same as corresponding
    plain literals).

2) XML Literals are not equal to any xsd type except xsd:string.
    Their relationship to xsd:string and plain literals is currently
    undefined.

3) XML Literals are not equal to any xsd:string or plain literal
    (current proposal by Pat)

---- cutoff, below here doesn't seem right at all

4) XML Literals are sequences of characters (taking markup as
    just plain characters)

5) XML Literals are sequences of octets
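
A minimal sketch of how options 1, 4 and 5 differ in practice, assuming
Python's standard xml.etree parser. The function names and the tuple
representation of markup are hypothetical, chosen only for illustration;
they do not come from any RDF or XSD specification.

```python
import xml.etree.ElementTree as ET


def as_chars_and_markup(fragment):
    """Option 1 (hypothetical model): characters and markup kept apart.

    Text becomes individual characters; element boundaries become
    ("start", tag) / ("end", tag) tuples. Attributes are ignored here
    to keep the sketch short.
    """
    # Wrap the fragment so the parser sees a single root element.
    root = ET.fromstring("<wrapper>" + fragment + "</wrapper>")
    items = []

    def walk(elem):
        if elem.text:
            items.extend(elem.text)
        for child in elem:
            items.append(("start", child.tag))
            walk(child)
            items.append(("end", child.tag))
            if child.tail:
                items.extend(child.tail)

    walk(root)
    return items


def as_chars(fragment):
    """Option 4: markup is taken as just plain characters."""
    return list(fragment)


def as_octets(fragment):
    """Option 5: the UTF-8 octets of the lexical form."""
    return fragment.encode("utf-8")


# Without markup, option 1 matches the plain literal's characters.
print(as_chars_and_markup("abc") == list("abc"))
# With an escaped "<", options 1 and 4 disagree on the length.
print(len(as_chars_and_markup("a&lt;b")), len(as_chars("a&lt;b")))
```

Under option 1 the escaped fragment "a&lt;b" is the three characters
a, <, b, just like the corresponding plain literal; under option 4 it
is six characters, and under option 5 it is not characters at all.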


Regards,   Martin.
Received on Thursday, 7 August 2003 16:52:00 UTC