Dissenting Opinion: xml:lang and rdf:parseType="Literal"

@@ draft jjc - for consideration by WG @@

@@ maybe this should go in the issue list under rdfms-literal-is-xml-structure. Maybe this should go in a call to advance, how useful is it to have this available along with the 2nd last call to review? Maybe retitle as "Detailed Rationale for Decision rdfms-literal-is-xml-structure", and link off issues list. @@

The Internationalization Working Group has registered a dissenting opinion on the treatment of rdf:parseType="Literal" agreed by RDF Core. The dissent concerns changes made by the RDF Core Working Group in response to comments on the last call design, particularly comments on the datatype rdf:XMLLiteral. These changes are reflected in the September 5th publication, particularly in RDF Concepts, RDF/XML Syntax, and RDF Semantics.

Rationale For Current Design

Comment Before or During 1st Last Call (early 2003)

This feature of RDF attracted more comments than any other, both before and during last call. These included comments from Reagle (on use of canonicalization and use of an XML wrapper), Prud'hommeaux, the Web Ontology WG, Patel-Schneider (concerning: language tag in canonical XML; malformed literals of type rdf:XMLLiteral; typed literals and language tags; aliases of rdf:XMLLiteral; language tags in rdf:XMLLiteral in the LBase appendix), Berners-Lee, and Marchiori.

Comment During Consideration of 1st Last Call Comments

Resolving the last call comments to the satisfaction of the WG and the commentators involved changes affecting aspects known to be important to the Internationalization Working Group, which was informed of them. Dürst then commented further (regarding language tagging and rdf:XMLLiteral, XMLLiteral and octets, using rdf:datatype="&rdf;XMLLiteral"). A detailed analysis was provided by Ishida.

The Working Group gave further consideration to the comments of Dürst and Ishida. Changes were made to avoid the problems with octets, and these were agreed by Dürst. The other arguments were not found to be compelling; see, for example, Carroll's response to Ishida. Most of the substantive arguments had already been made in the WG decision of 9th May.

Options Considered

Before that decision, the WG had considered four different designs for the result of an rdf:parseType="Literal":

A special sort of (untyped) literal
Such as in the 29th August 2002 Working Draft.
A special sort of typed literal.
Similar to the last call design. This would remain the only datatype that can have a language identifier.
A normal typed literal, with an XML wrapper
The wrapper carries an xml:lang attribute.
A normal typed literal, without an XML wrapper
This follows Exclusive XML Canonicalization, and loses the xml:lang attribute. This is the chosen design in the current editors' drafts.
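The loss of xml:lang in the unwrapped design can be illustrated with a rough sketch. It is not an implementation of Exclusive XML Canonicalization: it only uses Python's ElementTree serializer to show that the content of the property element does not carry an xml:lang declared on an ancestor. The document, the ex: vocabulary and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical document; the ex: vocabulary is invented for illustration.
DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/terms#" xml:lang="fr">
  <rdf:Description rdf:about="http://example.org/doc">
    <ex:title rdf:parseType="Literal">Le <em>chat</em></ex:title>
  </rdf:Description>
</rdf:RDF>"""


def in_scope_lang(elem, parents):
    """Walk up the tree and return the nearest xml:lang value, or None."""
    while elem is not None:
        if XML_LANG in elem.attrib:
            return elem.attrib[XML_LANG]
        elem = parents.get(elem)
    return None


def literal_lexical_form(prop):
    """Serialize only the content of the property element (no wrapper)."""
    parts = [prop.text or ""]
    for child in prop:
        # ET.tostring also emits the child's tail text, if any.
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)


root = ET.fromstring(DOC)
# ElementTree has no parent pointers, so build a child -> parent map.
parents = {child: parent for parent in root.iter() for child in parent}
prop = next(e for e in root.iter() if e.get(f"{{{RDF_NS}}}parseType") == "Literal")

lexical = literal_lexical_form(prop)
lang = in_scope_lang(prop, parents)
print(repr(lexical), lang)
```

In the source document the language in scope for the property element is "fr", but nothing in the serialized content records it.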

In addition, further designs have been discussed as a result of the I18N comment. The one that appears closest to the position of the I18N group was the proposal to unify plain and XML literals.

The essence of this proposal is that plain and XML literals are the same and must both be in exclusive C14N XML. A plain literal is converted by the parser into C14N XML by escaping the special characters (such as "<") as entities.
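A minimal sketch of the escaping step, assuming the parser applies the text-node escaping rules of Canonical XML (ampersand, both angle brackets, and carriage return). The function name is hypothetical, and the proposal itself names only "<" as an example of a special character.

```python
def plain_to_xml_text(value: str) -> str:
    """Escape a plain literal so it can stand as canonical XML character data."""
    # '&' is replaced first so the entities added below are not re-escaped.
    return (
        value.replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
        .replace("\r", "&#xD;")
    )


print(plain_to_xml_text("a < b & c"))
```

A plain literal with no special characters is unchanged by this step. That is why, under this proposal, only legacy literals containing such characters would change form.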

Various Considerations

Prior to the WG decision of 9th May, participants in the WG argued that:

Further concerning the designs considered in more depth after the decision, the following points were made.

An important consideration, reflected most in the comments from the Web Ontology WG and Patel-Schneider's concerns, is that unless rdf:XMLLiteral is a normal datatype with no special treatment of language, OWL Lite and OWL DL do not support it. No version of the OWL Abstract Syntax has permitted literals other than plain literals (with or without language tags) or typed literals (without a language tag). Thus, many possible solutions would require substantive changes to OWL DL and OWL Lite.
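The constraint on literals described above can be sketched as a small data structure. This is an illustrative model only, not taken from the OWL Abstract Syntax or from any RDF library; the class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

XML_LITERAL = "http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral"


@dataclass(frozen=True)
class Literal:
    lexical: str
    datatype: Optional[str] = None  # datatype URI; None means a plain literal
    lang: Optional[str] = None      # language tag; only allowed on plain literals

    def __post_init__(self):
        # A literal is either plain (optional language tag) or typed (no tag).
        if self.datatype is not None and self.lang is not None:
            raise ValueError("a typed literal cannot carry a language tag")


plain = Literal("chat", lang="fr")
typed = Literal("Le <em>chat</em>", datatype=XML_LITERAL)
```

In this model an rdf:XMLLiteral with a language tag cannot be constructed at all. That is the sense in which designs giving rdf:XMLLiteral special language treatment fall outside what OWL Lite and OWL DL allow.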

Summary

To summarize:

                                                              Special   Special   Wrapped normal  Normal typed  All literals
                                                              untyped   typed     typed literal   literal,      are XML
                                                              literal   literal                   no wrapping
Uses generic datatyping                                       No        No        Yes             Yes           N/A
Easily permit non-XML RDF                                     No        No        Yes             Yes           No
Permit non-built-in datatype like rdf:XMLLiteral              No        No        Yes             Yes           No
Avoid an RDF-specific solution to the problem of XML context  Yes       Yes       No              Yes           ?
Relative Simplicity                                           Yes       No        No              Yes           No
Inherit xml:lang                                              Yes       Yes       Yes             No            Yes
Works with OWL Candidate Rec                                  No        No        Yes             Yes           No
Legacy plain literals are OK                                  Yes       Yes       Yes             Yes           No
Legacy XML literals are OK                                    No        No        No              No            No

Comment On September 5th Working Drafts

@@ to be completed @@ We have received further comment concerning this aspect of our design as reflected in the 5th September 2003 Working Drafts:

The Working Group did accept an @@what concession do we make - add 'at risk' part, add exit criteria@@

Points Raised by I18N WG

This section briefly indicates how the points raised by the I18N WG in their formal objection relate to this rationale, and the discussion within the RDF Core Working Group.

Conflicting with XML 1.0 and general expectations
As pointed out by Borden, Tim Bray states in the Annotated XML 1.0 Specification that xml:lang Has No Required Effect. Given a view that the XMLLiteral elements in the graph are self-contained, it is natural for xml:lang to have no effect.
In terms of outreach, the RDF Core WG finds the graph view of RDF the compelling one. This is seen in the overall approach taken by the RDF Primer. It is also reflected in detail in the discussion in the archives initiated by Pat Hayes' message. (See messages in that thread such as July-0083 and July-0098.)
Creating new RDF/XML documents
@@ suggest we add 'SHOULD xml:lang="" with rdf:parseType="Literal"'@@ There is some merit in the case made here. Counterarguments include:
Reasoning and query
This problem is difficult to understand. A dummy element such as <span> or <div> does not carry meaning, other than in its attributes. Queries designed for querying XHTML documents and the like need to be able to ignore such dummy elements anyway.
Change of interpretation of xml:lang for existing RDF/XML documents
The argument here requires that the existing recommendation be sufficiently clear that the use of xml:lang and rdf:parseType="Literal" already gives interoperable results. The RDF Core Working Group does not believe this, which is why we have spent time redesigning this aspect of RDF. As seen in the summary table, any design of this area will cause problems for legacy data and systems.
Availability of other Solutions
These alternatives are discussed above. RDF Core has weighed their advantages and disadvantages and made a decision to the best of its ability.
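For the "Reasoning and query" point above, the following sketch shows one way a query layer might look through a dummy wrapper while still reading its xml:lang attribute. It assumes the XML literal is a single, un-namespaced <span> or <div> element; the function name and the sample literals are hypothetical.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
DUMMY_TAGS = {"span", "div"}  # assumed set of wrappers that carry no meaning


def text_and_lang(xml_literal: str):
    """Return (text content, xml:lang of the outermost dummy wrapper or None)."""
    elem = ET.fromstring(xml_literal)
    lang = elem.get(XML_LANG) if elem.tag in DUMMY_TAGS else None
    # itertext() concatenates character data and ignores element boundaries.
    return "".join(elem.itertext()), lang


print(text_and_lang('<span xml:lang="fr">Le <em>chat</em></span>'))
```

Comparing literals on the extracted text, with the language read from the wrapper's attribute, treats the wrapper itself as transparent.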

A further point is that the I18N objection links to some messages in the RDF Core archive, specifically from Stickler, Hayes and Carroll. These messages reflect the deliberations of the Working Group, and not the final decisions. The archives show that RDF Core has been open to other designs, has been aware of the I18N-related problems caused by the current design, and has made a considered decision. On the 9th May, when the current design was decided, Carroll (alone) was recorded as abstaining, an abstention he later informally retracted.