Re: Proposal for rdf:XMLLiteral (ISSUE-13)

Richard,

Thanks for doing this.

It looks good to me...

Ivan


On May 1, 2012, at 14:30, Richard Cyganiak wrote:

> Dear All,
> 
> Below is a PROPOSAL to resolve ISSUE-13. This completes ACTION-126. The design is based on a poll we had back in November. My working notes, including the poll results, discussion of the results, and justification for the design proposed below, are available on the wiki:
> http://www.w3.org/2011/rdf-wg/wiki/XML_Literals
> 
> I propose to put this to the vote at tomorrow's call.
> 
> Best,
> Richard
> 
> 
> == Executive summary ==
> 
> • Make rdf:XMLLiteral optional in the datatype map
> • Change rdf:XMLLiteral lexical space to allow
>  non-canonical but well-formed XML
> • Define a canonical lexical form for rdf:XMLLiteral
>  that is equivalent to the old lexical space
> • Re-define the value space in terms of XML infosets (this
>  should be in 1:1 correspondence with the old value space
>  and the old lexical space)
> 
> 
> == Normative changes to RDF Concepts ==
> 
> All changes are relative to the current RDF Concepts ED:
> http://dvcs.w3.org/hg/rdf/raw-file/default/rdf-concepts/index.html#section-XMLLiteral
> 
> The current definition of the rdf:XMLLiteral lexical space is:
> 
> [[
> The lexical space is the set of all strings:
> 
>  • which are well-balanced, self-contained XML content [XML10];
>  • for which encoding as UTF-8 [UTF-8] yields exclusive Canonical
>    XML (with comments, with empty InclusiveNamespaces PrefixList)
>    [XML-EXC-C14N];
>  • for which embedding between an arbitrary XML start tag and an
>    end tag yields a document conforming to XML Namespaces
>    [XML-NAMES]
> ]]
> 
> DELETE the second bullet point of the definition above.
> 
> REPLACE the current definition of the rdf:XMLLiteral value space with this definition:
> 
> [[
> The value space is the set of all ordered lists of information items [XML Infoset] that contain only character information items, element information items, comment information items, and processing instruction information items.
> ]]
> 
> REPLACE the current definition of the rdf:XMLLiteral L2V mapping with this definition:
> 
> [[
> The lexical-to-value mapping is defined as follows:
> 
>  • Wrap the lexical form between an arbitrary XML start-tag and 
>    matching end-tag, yielding an XML document [XML10]
>  • Take the XML infoset corresponding to the XML document
>  • Return the list of children of its document element information item
> ]]
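
The three steps of this mapping can be sketched roughly as follows. This is only an illustration, assuming that DOM nodes from Python's standard xml.dom.minidom are an acceptable stand-in for infoset information items; the helper name xml_literal_value and the wrapper element name are hypothetical and not part of the proposal.

```python
from xml.dom import minidom


def xml_literal_value(lexical_form: str):
    """Hypothetical helper: approximate the proposed L2V mapping with DOM nodes."""
    # Step 1: wrap the lexical form between an arbitrary start-tag and
    # matching end-tag, yielding an XML document.
    document = minidom.parseString(f"<wrapper>{lexical_form}</wrapper>")
    # Steps 2 and 3: take the parsed document (standing in for the infoset)
    # and return the list of children of its document element.
    return list(document.documentElement.childNodes)


# Character data, an element, a comment and a processing instruction:
# the four kinds of information item the proposed value space allows.
nodes = xml_literal_value("Hello <b>world</b><!-- note --><?pi data?>")
kinds = [node.nodeName for node in nodes]
```

A lexical form that is not well-formed makes the parse step fail, which corresponds to an ill-typed literal in this sketch.
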
> 
> ADD the following definition for an rdf:XMLLiteral canonical mapping:
> 
> [[
> The canonical mapping is defined as Exclusive XML Canonicalization [XML-EXC-C14N] (with comments, with empty InclusiveNamespaces PrefixList).
> ]]
> 
> 
> == Informative changes to RDF Concepts ==
> 
> REMOVE the following sentence:
> 
> [[
> This allows the inclusion of text that contains markup, such as XHTML [XHTML11].
> ]]
> 
> Instead, ADD the following sentence:
> 
> [[
> This allows the inclusion of XML payloads in RDF graphs, as well as text that contains markup, such as XHTML [XHTML11].
> ]]
> 
> ADD the following informative Note:
> 
> [[
> Any XML namespace declarations (xmlns) and language annotations (xml:lang) desired in the XML content must be included explicitly in the XML literal. Note that some concrete RDF syntaxes may define mechanisms for inheriting them from the context (for example, @parseType="Literal" in RDF/XML [RDFXML]).
> ]]
> 
> REMOVE the following three informative notes:
> 
> [[
> XML values can be thought of as the [XML-INFOSET] or the [XPATH] nodeset corresponding to the lexical form, with an appropriate equality function.
> 
> RDF applications may use additional equivalence relations, such as that which relates an xsd:string with an rdf:XMLLiteral corresponding to a single text node of the same string.
> 
> If language annotation of XML literals is required, it must be explicitly included as markup, usually by means of an xml:lang attribute.
> ]]
> 
> Possibly add an example or two for rdf:XMLLiteral, using editorial discretion.
> 
> 
> == Changes to other documents ==
> 
> Change RDF Semantics so that rdf:XMLLiteral is no longer interpreted in RDF-Entailment, but only in D-Entailment.
> 
> Change the definition of datatype map in RDF Semantics to make the presence of rdf:XMLLiteral in the map optional (or drop the definition altogether in favour of a reference to RDF Concepts, which already includes a definition of datatype maps in accordance with ACTION-122).
> 
> No changes to RDF/XML or Turtle.
> 
> 
> == Notable consequences of this design ==
> 
> • Nothing changes for RDF/XML
> • RDF/XML parsers that violate the RDF/XML spec by
>  not canonicalizing @parseType="Literal" on input
>  are still non-conforming, but no longer produce
>  ill-typed literals
> • Turtle "<foo/>"^^rdf:XMLLiteral is no longer ill-typed
> • Turtle "<foo/>"^^rdf:XMLLiteral and
>  RDF/XML <x:y rdf:parseType="Literal"><foo/></x:y> now
>  result in different triples (but with the same meaning).
>  This can be unexpected where content negotiation is involved,
>  because the choice of serialization syntax for the same
>  graph now means we potentially get different triples.
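
To make the last consequence concrete, here is a rough sketch of the difference between term identity and value equality. It rests on several assumptions: the Literal class and canonical_form helper are hypothetical, and the standard library's xml.etree.ElementTree.canonicalize implements C14N 2.0, which is used here only as a stand-in for Exclusive XML Canonicalization (the two agree on this simple case, but are not the same algorithm).

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET

RDF_XMLLITERAL = "http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral"


@dataclass(frozen=True)
class Literal:
    """Hypothetical RDF literal term: identity is lexical form plus datatype IRI."""
    lexical_form: str
    datatype: str = RDF_XMLLITERAL


def canonical_form(literal: Literal) -> str:
    # Stand-in only: C14N 2.0 from the standard library, NOT Exclusive C14N.
    # The content is wrapped in a throwaway element so it parses as a document,
    # and the wrapper tags are stripped again afterwards.
    wrapped = ET.canonicalize(f"<w>{literal.lexical_form}</w>", with_comments=True)
    return wrapped[len("<w>"):-len("</w>")]


from_turtle = Literal("<foo/>")        # lexical form as written in Turtle
from_rdfxml = Literal("<foo></foo>")   # canonicalized by an RDF/XML parser

same_term = from_turtle == from_rdfxml
same_value = canonical_form(from_turtle) == canonical_form(from_rdfxml)
```

In this sketch same_term is False while same_value is True, which is the "different triples, same meaning" situation described in the last bullet.
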


----
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
FOAF: http://www.ivan-herman.net/foaf.rdf

Received on Tuesday, 1 May 2012 12:41:32 UTC