Proposal for rdf:XMLLiteral (ISSUE-13)

From: Richard Cyganiak <richard@cyganiak.de>
Date: Tue, 1 May 2012 13:30:26 +0100
Message-Id: <1FA1A9DF-6277-4831-B347-538317F1AA7E@cyganiak.de>
To: RDF Working Group WG <public-rdf-wg@w3.org>

Dear All,

Below is a PROPOSAL to resolve ISSUE-13. This completes ACTION-126. The design is based on a poll we held back in November. My working notes, including the poll results, a discussion of the results, and the justification for the design proposed below, are available on the wiki:
http://www.w3.org/2011/rdf-wg/wiki/XML_Literals

I propose to put this to the vote at tomorrow's call.

Best,
Richard


== Executive summary ==

 * Make rdf:XMLLiteral optional in the datatype map
 * Change rdf:XMLLiteral lexical space to allow
   non-canonical but well-formed XML
 * Define a canonical lexical form for rdf:XMLLiteral
   that is equivalent to the old lexical space
 * Re-define the value space in terms of XML infosets (this
   should be in 1:1 correspondence to the old value space
   and old lexical space)


== Normative changes to RDF Concepts ==

All changes are relative to the current RDF Concepts ED:
http://dvcs.w3.org/hg/rdf/raw-file/default/rdf-concepts/index.html#section-XMLLiteral

The current definition of the rdf:XMLLiteral lexical space is:

[[
The lexical space is the set of all strings:

   * which are well-balanced, self-contained XML content [XML10];
   * for which encoding as UTF-8 [UTF-8] yields exclusive Canonical
     XML (with comments, with empty InclusiveNamespaces PrefixList)
     [XML-EXC-C14N];
   * for which embedding between an arbitrary XML start tag and an
     end tag yields a document conforming to XML Namespaces
     [XML-NAMES]
]]

DELETE the second bullet point of the definition above.

REPLACE the current definition of the rdf:XMLLiteral value space with this definition:

[[
The value space is the set of all ordered lists of information items [XML-INFOSET] that contain only character information items, element information items, comment information items, and processing instruction information items.
]]

REPLACE the current definition of the rdf:XMLLiteral L2V mapping with this definition:

[[
The lexical-to-value mapping is defined as follows:

   * Wrap the lexical form between an arbitrary XML start-tag and
     matching end-tag, yielding an XML document [XML10]
   * Take the XML infoset corresponding to the XML document
   * Return the list of children of its document element information item
]]
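
To illustrate the three steps, here is a minimal Python sketch. It is not part of the proposal. It assumes that DOM nodes from the standard library's xml.dom.minidom are an acceptable stand-in for infoset information items, and the function name xml_literal_value and the wrapper element name are hypothetical.

```python
from xml.dom import minidom


def xml_literal_value(lexical_form: str) -> list:
    """Sketch of the proposed L2V mapping; DOM nodes stand in for infoset items."""
    # Step 1: wrap the lexical form in an arbitrary start-tag and matching end-tag.
    # "wrapper" is a hypothetical name; any element name would do.
    document = minidom.parseString("<wrapper>" + lexical_form + "</wrapper>")
    # Step 2: the parsed DOM tree approximates the infoset of that document.
    # Step 3: return the children of the document element, in document order.
    return list(document.documentElement.childNodes)


nodes = xml_literal_value("Hello <b>world</b><!-- note -->")
print([node.nodeName for node in nodes])  # ['#text', 'b', '#comment']
```

A lexical form that is not well-balanced, such as "<b>", makes the parser raise an exception in this sketch, which roughly corresponds to the literal being ill-typed.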

ADD the following definition for an rdf:XMLLiteral canonical mapping:

[[
The canonical mapping is defined as Exclusive XML Canonicalization [XML-EXC-C14N] (with comments, with empty InclusiveNamespaces PrefixList).
]]
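
As a rough illustration of what a canonical mapping does to a lexical form, the sketch below uses xml.etree.ElementTree.canonicalize from the Python standard library (3.8+). Note the assumption: that function implements C14N 2.0, which is a different algorithm from the Exclusive XML Canonicalization named above, so this is only a stand-in that behaves similarly on simple inputs. The function name canonical_form and the wrapper element name are hypothetical.

```python
from xml.etree import ElementTree as ET

WRAPPER = "w"  # hypothetical wrapper element, needed only to obtain a parseable document


def canonical_form(lexical_form: str) -> str:
    """Stand-in canonical mapping (C14N 2.0, NOT Exclusive C14N as in the proposal)."""
    wrapped = f"<{WRAPPER}>{lexical_form}</{WRAPPER}>"
    # Comments are kept, matching the "with comments" option in the proposal.
    canonical = ET.canonicalize(wrapped, with_comments=True)
    # Strip the wrapper's start-tag ("<w>") and end-tag ("</w>") again.
    return canonical[len(WRAPPER) + 2 : -(len(WRAPPER) + 3)]


print(canonical_form("<foo/>"))            # <foo></foo>
print(canonical_form("<a y='2' x='1'/>"))  # <a x="1" y="2"></a>
```

Empty-element tags are expanded, attributes are sorted, and attribute values are double-quoted, so different well-formed spellings of the same content end up as a single string.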


== Informative changes to RDF Concepts ==

REMOVE the following sentence:

[[
This allows the inclusion of text that contains markup, such as XHTML [XHTML11].
]]

Instead, ADD the following sentence:

[[
This allows the inclusion of XML payloads in RDF graphs, as well as text that contains markup, such as XHTML [XHTML11].
]]

ADD the following informative Note:

[[
Any XML namespace declarations (xmlns) and language annotations (xml:lang) desired in the XML content must be included explicitly in the XML literal. Note that some concrete RDF syntaxes may define mechanisms for inheriting them from the context (for example, rdf:parseType="Literal" in RDF/XML [RDFXML]).
]]

REMOVE the following three informative notes:

[[
XML values can be thought of as the [XML-INFOSET] or the [XPATH] nodeset corresponding to the lexical form, with an appropriate equality function.

RDF applications may use additional equivalence relations, such as that which relates an xsd:string with an rdf:XMLLiteral corresponding to a single text node of the same string.

If language annotation of XML literals is required, it must be explicitly included as markup, usually by means of an xml:lang attribute.
]]

Possibly add an example or two for rdf:XMLLiteral, using editorial discretion.


== Changes to other documents ==

Change RDF Semantics so that rdf:XMLLiteral is no longer interpreted in RDF-Entailment, but only in D-Entailment.

Change the definition of datatype map in RDF Semantics to make the presence of rdf:XMLLiteral in the map optional (or drop the definition altogether in favour of a reference to RDF Concepts, which already includes a definition of datatype maps in accordance with ACTION-122).

No changes to RDF/XML or Turtle.


== Notable consequences of this design ==

 * Nothing changes for RDF/XML
 * RDF/XML parsers that violate the RDF/XML spec by
   not canonicalizing rdf:parseType="Literal" on input
   are still non-conforming, but no longer produce
   ill-typed literals
 * Turtle "<foo/>"^^rdf:XMLLiteral is no longer ill-typed
 * Turtle "<foo/>"^^rdf:XMLLiteral and
   RDF/XML <x:y rdf:parseType="Literal"><foo/></x:y> now
   result in different triples (but with the same meaning). This
   can be unexpected where content negotiation is involved,
   because the choice of serialization syntax for the same
   graph now means we potentially get different triples.
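
The last point can be made concrete with a small Python sketch. It models a literal as a hypothetical (lexical form, datatype IRI) pair, and it assumes the standard library's C14N 2.0 implementation as a stand-in for a proper value comparison. None of the names below come from the proposal or from any RDF library.

```python
from xml.etree import ElementTree as ET

RDF_XMLLITERAL = "http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral"


def same_value(lexical_a: str, lexical_b: str) -> bool:
    """Approximate value equality by comparing canonicalized, wrapped content."""
    def canon(lexical: str) -> str:
        # "w" is a hypothetical wrapper element; C14N 2.0 is a stand-in algorithm.
        return ET.canonicalize(f"<w>{lexical}</w>", with_comments=True)
    return canon(lexical_a) == canon(lexical_b)


# Turtle keeps the lexical form as written.
turtle_literal = ("<foo/>", RDF_XMLLITERAL)
# A conforming RDF/XML parser canonicalizes, which expands the empty-element tag.
rdfxml_literal = ("<foo></foo>", RDF_XMLLITERAL)

print(turtle_literal == rdfxml_literal)                  # False: different RDF terms
print(same_value(turtle_literal[0], rdfxml_literal[0]))  # True: same XML value
```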

Received on Tuesday, 1 May 2012 12:30:57 GMT
