W3C home > Mailing lists > Public > w3c-rdfcore-wg@w3.org > August 2003

Re: FW: XML literals

From: Martin Duerst <duerst@w3.org>
Date: Wed, 13 Aug 2003 17:41:24 -0400
Message-Id: <>
To: "Jeremy Carroll" <jjc@hplb.hpl.hp.com>, <w3c-rdfcore-wg@w3.org>, <w3c-i18n-ig@w3.org>

Hello Jeremy,

As I already said in another context, this looks like a lot of
good work and progress. Some more comments below.

At 18:05 03/08/13 +0200, Jeremy Carroll wrote:

>(Resend - forgot I18N first time)
>The main planks of Pat's text from
>seemed to get support at the RDF Core WG telecon on Friday, I was actioned
>to move the conversation forward, and ensure that Martin and I18N were in on
>My understanding is that the main goal was to avoid any possibility of
>confusing XMLLiteral with xsd:hexBinary as in Martin's test case.
>I also am trying to adequately address Patrick's concerns while changing
>Pat's text as little as possible.
>Brian used the term "XML fragment" at the telecon, I am however sticking
>with Pat's "XML value" because of the existence of
>which makes Brian's preferred term misleading.
>I would be happy to consider other words for XML value.
>For completeness I also include stuff on the lexical space, since there was
>some concern that the wording is not about Unicde strings ... and the word
>"corresponding" ...
>I have numbered the notes for the sake of this e-mail, further discussion
>Patrick - please indicate whether the last two notes (2,3) adequately
>address your concerns. (3) ended up perhaps more geared towards some of
>Martin's concerns.
>I ended up unclear as to whether note 2 was wanted by the WG or not.
>The lexical space
>   is the set of all strings which:
>   + are well-balanced, self-contained XML data [XML];

I think you can directly refer to production [43] of the
XML specification (http://www.w3.org/TR/REC-xml#NT-content).
This allows some stuff like CDATA sections that canonical
XML will exclude, but otherwise is exactly what you need.

>   + correspond  under [UTF-8] encoding to exclusive Canonical
>     XML (with comments, with empty InclusiveNamespaces
>     PrefixList ) [XML-XC14N];

I think 'when encoded as [UTF-8]' would be slightly easier
to understand than 'under [UTF-8] encoding'.

>   + when embedded between an arbitrary XML start tag and an end tag
>     correspond to a document conforming to XML Namespaces [XML-NS]
>The value space is a set of entities, called XML values, which is:
>   + disjoint from the lexical space
>   + disjoint from the value space of any XML schema datatype [XML-SCHEMA2]
>   + disjoint from the set of Unicode character strings [Unicode]
>   + in 1:1 correspondence with the lexical space.
>The lexical-to-value mapping
>    is a one-one mapping from the lexical space onto the value space,
>    i.e. it is both injective and surjective.
>Note (1): Not all lexical forms of this datatype are compliant with XML 1.1
>[XML 1.1]. If compliance with XML 1.1 is desired, then only those that are
>fully normalized according to XML 1.1 should be used.
>Note (2): XML values can be thought of as the [XML Infoset] or
>the [XPath] nodeset corresponding to the lexical form, with an appropriate
>equality function.
>Note (3): RDF applications may use additional equivalence relations, such as
>that which relates an xsd:string with an rdf:XMLLiteral corresponding to a
>single text node of the same string.
>I seem to recall concern about putting too much into notes.

Some people don't like notes. But they often help understand
a spec better, and we all have made experiences about how not
understanding a spec (or understanding it differently) has hurt us.

>Either the stuff
>is sufficiently important to go into the design, or it isn't.
>This may be sufficient to kill notes (2) and (3). I am reluctant to drop
>note (1) since the RDF specs have largely followed charmod on NFC which puts
>us into a somewhat anomolous position between XML 1.0 and XML 1.1 ...

I'm confused here. Graham said that RDF allowed control characters,
but XML 1.1 doesn't, and I showed him that it did. But XML 1.0

>If the notes add clarity then it is probably best to keep them.

I think both notes add clarity. From an i18n viewpoint, note (3)
is definitely helpful. For XML people, note (2) will be helpful,
although it may be better to change 'can' to 'may', i.e.
"XML values may be thought of as...". But that's a minor detail.

Regards,    Martin.
Received on Wednesday, 13 August 2003 17:58:14 UTC

This archive was generated by hypermail 2.3.1 : Wednesday, 7 January 2015 14:54:07 UTC