Re: ISSUE: XMLLiteral and xml:lang from Jeremy Carroll on 2003-01-30 (www-webont-wg@w3.org from January 2003)

From: Jeremy Carroll <jjc@hpl.hp.com>
Date: Thu, 30 Jan 2003 11:36:54 +0100
To: www-webont-wg@w3.org
Message-Id: <200301301136.54883.jjc@hpl.hp.com>
(This message is too long. I will send a shorter follow-up consisting only 
of test cases for the glitch; you may prefer to read only that one.)


>   Let me see if I understand - this one, like the annotations 
> question is with respect to whether these features of our language, 
> which are in OWL Full (by the RDF inclusion principle :->) should 
> also be in Owl Lite and Owl DL.

Correct. Since both are critical for I18N objectives and requirements, I 
would be very unhappy if they were not in OWL Lite.

>    Is there any implication to our model, complexity, etc. which is 
> affected by this?  That is, if we were to add these in Lite, can you 
> give some examples of what we could or couldn't do that would effect 
> reasoning?

I believe RDF Core has made every effort to make this usable in OWL.
If we think not, we should make this clear in a last call comment to RDF.

There is one glitch that I am aware of, to do with XMLLiteral.
I will first describe the functioning of XMLLiteral, then the glitch, then 
suggest three possible courses of action on that glitch.

XMLLiteral works as follows.

The document author uses the parseType="Literal" construction; see example 9 
in
http://www.w3.org/TR/2003/WD-rdf-syntax-grammar-20030123/#section-Syntax-XML-literals

This is read by the parser and some string is produced which can form an XML 
document when enclosed between a start and end tag.
In example 9, a possible such string is:
'<a:Box required="true" xmlns:a="http://example.org/a#">
         <a:widget size="10"  />
         <a:grommit id="23" /></a:Box>
    '
This string is given the datatype rdf:XMLLiteral, which maps it to a specific 
Canonical XML document in the domain of discourse.
As specified in:
http://www.w3.org/TR/2003/WD-rdf-concepts-20030123/#section-XMLLiteral
viz.:
[[
The mapping
is defined as the function that maps a pair or string to the canonical form 
[XML-C14N] (with comments) of the corresponding XML document.
]]
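
To make the effect of this mapping concrete, here is a minimal Python sketch. 
It is not the RDF Core definition: it assumes the standard library's 
xml.etree.ElementTree.canonicalize (which implements C14N 2.0) is an 
acceptable stand-in for [XML-C14N], and FORM_A, FORM_B and literal_value are 
names invented for the illustration. It only shows that two different lexical 
forms of the same XML map to one canonical value.

```python
from xml.etree.ElementTree import canonicalize  # available from Python 3.8

# Two hypothetical lexical forms a parser might produce for the same literal:
# they differ in attribute order, quoting, and empty-element syntax.
FORM_A = ('<a:Box required="true" xmlns:a="http://example.org/a#">'
          '<a:widget size="10"  /></a:Box>')
FORM_B = ("<a:Box xmlns:a='http://example.org/a#' required='true'>"
          "<a:widget size='10'></a:widget></a:Box>")


def literal_value(lexical_form: str) -> str:
    """Map a lexical form to a canonical string (C14N 2.0 as a stand-in)."""
    return canonicalize(lexical_form, with_comments=True)


# Both lexical forms denote the same canonical document.
same_value = literal_value(FORM_A) == literal_value(FORM_B)
```

Under these assumptions same_value is true, which is the sense in which the 
datatype mapping itself is deterministic.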


The glitch
========
While the mapping through XML-C14N is wholly deterministic and well-defined, 
its input is not: the lexical form has deliberately been left (slightly) 
implementation dependent.
See:
http://www.w3.org/TR/2003/WD-rdf-syntax-grammar-20030123/#parseTypeLiteralPropertyElt

[[
This specification allows some freedom to choose exactly what string is used 
as the lexical form of an XML Literal. Whatever string is used, it MUST 
correspond to an XML document when enclosed within a start and end element 
tag, and its canonicalization (without comments, as defined in Exclusive XML 
Canonicalization [XML-XC14N]) MUST be the same as the same canonicalization 
of the literal text l. It is often acceptable to use l without any changes 
but this is incorrect if, for example, l uses entity references or namespace 
prefixes defined in the outer XML document.
]]

Note that example 9 is a case in which the final sentence does not apply.

The reason for this glitch is that:
- most usages (e.g. any based on XHTML) are fully addressed by the Exclusive 
XML Canonicalization without comments and with an empty  InclusiveNamespaces 
PrefixList
(see
http://www.w3.org/TR/2002/REC-xml-exc-c14n-20020718/#def-InclusiveNamespaces-PrefixList
)
- some potential usages use namespaces in a more advanced way, for example 
those that include XPath expressions within attribute values. We could not 
work out a general solution for such usages. We believed that what we had 
done was a significant improvement on M&S, but that we should not exclude 
later work that addressed such uses.

An example problem would be:

...
<rdf:Description
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:eg="http://www.example.org/"
>
  <eg:prop rdf:parseType="Literal">
<!-- this is part of an XSLT transform -->
     <xsl:template match="eg:foo"/>
  </eg:prop>
</rdf:Description>

Using the Exclusive Canonicalization, without comments, and with an empty 
InclusiveNamespaces PrefixList, this delivers something like the following 
string:
'

     <xsl:template xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
match="eg:foo"></xsl:template>
 ' 
(there should not be a new line in the start element tag, but my mailer has 
stuck it in).
The significant thing is that the namespace declaration for eg has got lost, 
despite it being significant to what the XSLT transform does.
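
The loss can be checked mechanically. The following Python sketch is not an 
implementation of Exclusive Canonicalization; it only applies its rule that a 
namespace is kept when its prefix is visibly used in an element or attribute 
name, and flags declared prefixes that appear only inside attribute values. 
The rdf:RDF wrapper, the regular expression used to spot prefixes in values, 
and the names RDF_XML, namespace_uri and lost_prefixes are assumptions made 
for the illustration. Namespace scoping and prefix shadowing are ignored.

```python
import io
import re
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The example above, wrapped in an assumed rdf:RDF root so that it parses.
RDF_XML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:eg="http://www.example.org/"
>
  <eg:prop rdf:parseType="Literal">
<!-- this is part of an XSLT transform -->
     <xsl:template match="eg:foo"/>
  </eg:prop>
</rdf:Description>
</rdf:RDF>"""

# Heuristic: something that looks like "prefix:name" inside an attribute value.
PREFIX_IN_VALUE = re.compile(r"([A-Za-z_][\w.\-]*):[A-Za-z_]")


def namespace_uri(expanded_name: str) -> str:
    """Return the URI part of an ElementTree '{uri}local' name, or ''."""
    if expanded_name.startswith("{"):
        return expanded_name[1:].split("}", 1)[0]
    return ""


def lost_prefixes(rdf_xml: str) -> list:
    """Prefixes used only in attribute values of parseType="Literal" content."""
    # Pass 1: collect every declared prefix (scoping is ignored in this sketch).
    declared = {}
    source = io.BytesIO(rdf_xml.encode("utf-8"))
    for _event, (prefix, uri) in ET.iterparse(source, events=("start-ns",)):
        declared[prefix] = uri

    # Pass 2: inspect the content of each parseType="Literal" property element.
    lost = set()
    for prop in ET.fromstring(rdf_xml).iter():
        if prop.get("{%s}parseType" % RDF_NS) != "Literal":
            continue
        visible, referenced = set(), set()
        for node in prop.iter():
            if node is prop:
                continue  # the property element is not part of the literal
            visible.add(namespace_uri(node.tag))
            for name, value in node.attrib.items():
                visible.add(namespace_uri(name))
                referenced.update(
                    p for p in PREFIX_IN_VALUE.findall(value) if p in declared)
        lost |= {p for p in referenced if declared[p] not in visible}
    return sorted(lost)


lost = lost_prefixes(RDF_XML)
```

For the document above, lost contains only the eg prefix: xsl is visibly used 
in an element name, whereas eg occurs only inside the match attribute value.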


Proposed Solutions
===============
1: accept the RDF Core position and leave this as implementation dependent - 
This is my preference. I would be surprised if Stanton, for example, were 
happy with this.

2: decide that the OWL requirements (support of XHTML and friends) are wholly 
met using the "Exclusive Canonicalization, without comments, and with empty 
InclusiveNamespaces PrefixList" (we could change that to with comments if 
people preferred), and add text like the following somewhere (don't know 
where).

[[
When reading RDF/XML documents OWL processors SHOULD use the freedom granted 
them under para 7.2.17 of RDF Syntax by using the Exclusive Canonicalization, 
without comments, and with empty InclusiveNamespaces PrefixList.
Other variations may only be used on specific user instruction.
]]

3: Make the above comment to RDF Core, suggesting that they be more specific.
Received on Thursday, 30 January 2003 11:55:54 UTC