Exact format for XML Literals?

From: Ivan Herman <ivan@w3.org>
Date: Tue, 08 Sep 2009 09:24:57 +0200
Message-ID: <4AA606C9.1020102@w3.org>
To: W3C SPARQL Working Group <public-rdf-dawg@w3.org>

An issue came up in the RDFa task force that is relevant to the SPARQL
syntax. It may lead to a need to tighten the wording of the SPARQL
language specification (no new feature here). It concerns the way XML
Literals are represented in the query language (well, essentially, in
Turtle...). The question is whether the following extract is valid or
not:

a:bla b:blabla
  "<bla   b='something' a='else'>and else</bla>"^^rdf:XMLLiteral.

The lexical space of XML Literal is defined by the RDF Concepts
document[1], which says:

The lexical space is the set of all strings:
 - which are well-balanced, self-contained XML content [XML];
 - for which encoding as UTF-8 [RFC 2279] yields exclusive Canonical XML
 - for which embedding between an arbitrary XML start tag and an end tag
yields a document conforming to XML Namespaces [XML-NS]

The important point is the use of XC14N (exclusive canonical XML). A
cursory reading of this text suggests that, in SPARQL, one would have to
write canonical XML in an XML Literal (which the extract above is not).
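
To make the concern concrete, here is a rough Python sketch of a check
against the three conditions quoted above. It rests on assumptions: it
uses the standard library's xml.etree.ElementTree.canonicalize (Python
3.8+), which implements C14N 2.0 rather than the exclusive C14N that RDF
Concepts cites, so it is only an approximation; the function name
in_lexical_space and the "wrapper" tag are hypothetical.

```python
import xml.etree.ElementTree as ET


def in_lexical_space(lexical: str) -> bool:
    """Approximate membership test for the rdf:XMLLiteral lexical space."""
    # Embed the content between an arbitrary start tag and end tag.
    # The tag name "wrapper" is a hypothetical choice for this sketch.
    wrapped = f"<wrapper>{lexical}</wrapper>"
    try:
        # Parsing fails on unbalanced content or unbound namespace prefixes.
        # NOTE: this is C14N 2.0, used here as a stand-in for exclusive C14N.
        canonical = ET.canonicalize(wrapped)
    except ET.ParseError:
        return False
    # Content already in canonical form survives canonicalization unchanged.
    return canonical == wrapped


# The extract from this message: well-formed, but not canonical.
print(in_lexical_space("<bla   b='something' a='else'>and else</bla>"))
# The same element with sorted, double-quoted attributes.
print(in_lexical_space('<bla a="else" b="something">and else</bla>'))
```

Under these assumptions the first call reports False and the second
True, which is exactly the gap between what an author would naturally
type and what the lexical space admits.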

Note that the RDF/XML specification goes a little further: point
7.2.17 of the RDF/XML spec[2] explicitly says that

l is transformed into the lexical form of an XML literal in the RDF graph

and refers to the XC14N algorithm explicitly. That is, the XML extract
above is perfectly valid in RDF/XML. However, the current SPARQL spec is
silent about this.

It is fairly obvious that the same should happen in SPARQL (and in
Turtle): the parser should, conceptually, apply a canonicalization
algorithm to the XML content of the literal. But it may be better to say
so explicitly in the document, as RDF/XML does...
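
As an illustration of what "conceptually apply a canonicalization
algorithm" could look like, here is a minimal Python sketch. It is not
what any SPARQL or Turtle parser is known to do: it assumes Python 3.8+
and the standard library's xml.etree.ElementTree.canonicalize, which
implements C14N 2.0 rather than XC14N, and the helper name
canonical_lexical_form and the "wrapper" tag are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper tags, so that mixed content such as "a <b>c</b>"
# can be parsed as a single XML document.
_OPEN, _CLOSE = "<wrapper>", "</wrapper>"


def canonical_lexical_form(content: str) -> str:
    """Return a canonicalized form of the XML content of a literal."""
    # NOTE: C14N 2.0 is used here as a stand-in for exclusive C14N.
    canonical = ET.canonicalize(f"{_OPEN}{content}{_CLOSE}")
    # Strip the wrapper tags again to recover the content alone.
    return canonical[len(_OPEN):-len(_CLOSE)]


source = "<bla   b='something' a='else'>and else</bla>"
print(canonical_lexical_form(source))
# prints: <bla a="else" b="something">and else</bla>
```

The attributes come out sorted and double-quoted, and the extra
whitespace inside the start tag is dropped, so the literal as written in
the query and the literal stored in the graph would differ only in
lexical form.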

Am I missing something?


[1] http://www.w3.org/TR/rdf-concepts/#section-XMLLiteral
[2] http://www.w3.org/TR/rdf-syntax-grammar/#section-grammar-productions


Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf

Received on Tuesday, 8 September 2009 07:25:34 UTC
