Possible syntax restriction on encoding rdf:XMLLiteral

Martin Duerst suggests

[[It seems much simpler to have the spec simply forbid using
&rdf;XMLLiteral in a datatype attribute in RDF/XML.]]
-- http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2003Jul/0410.html

I explained our design, in which exc-C14N is applied in the parser and
described in RDF Concepts rather than in Semantics:
[[
We didn't want to require people to have an XML parser
for handling RDF's abstract syntax so all the XML checking
belongs in the mapping from RDF/XML to the triples.
]] - http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2003Aug/0038.html

Martin later replied, discussing the impact on software and users:
[[
Here is some thoughts I have gone through:
- It makes things somewhat different for software writing RDF/XML:
   It can't just write out all types with rdf:datatype. But this is
   probably a desirable effect.
- For users, there are really a lot of users out there, and it's
   not very easy to say in general. But in my view, it very much helps
   them understanding XML Literals if they see these literals always
   at the same level of escaping. Most people seem to get confused
   very quickly with different escaping levels. Using only
   rdf:parseType='Literal' would mean that the basic escaping is
   the same in the RDF/XML syntax, in the abstract syntax, and,
   as far as I understand, in most implementations. I think
   that is a serious benefit.
]]
-- http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2003Aug/0045.html


I think this is a plausible set of reasons.  Entity-encoded XML is also
a bit ugly, particularly when shown to end users; keeping that encoding
internal would be better for usability, I feel.
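
To make the two escaping levels concrete, here is a small Python sketch
(standard library only).  The helper names datatype_form and
parsetype_form are mine, purely for illustration, and the sample literal
is assumed to be in exc-C14N form already:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
NS_DECLS = f'xmlns:eg="http://example.org/" xmlns:rdf="{RDF_NS}"'

# Lexical form of the XML literal as it would appear in the graph
# (assumed here to be in exc-C14N form already).
lexical_form = "<em>foo</em>"


def datatype_form(xml_text):
    """Hypothetical helper: write the literal using rdf:datatype.

    The XML has to be escaped, because it is plain character data here.
    """
    return (f'<eg:prop {NS_DECLS} rdf:datatype="{RDF_NS}XMLLiteral">'
            f'{escape(xml_text)}</eg:prop>')


def parsetype_form(xml_text):
    """Hypothetical helper: write the literal using rdf:parseType="Literal".

    The XML is embedded as-is, at the same escaping level as the graph.
    """
    return f'<eg:prop {NS_DECLS} rdf:parseType="Literal">{xml_text}</eg:prop>'


print(datatype_form(lexical_form))
print(parsetype_form(lexical_form))

# Both forms carry the same content; only the escaping level differs.
escaped_text = ET.fromstring(datatype_form(lexical_form)).text
embedded_child = ET.fromstring(parsetype_form(lexical_form))[0]
```

In the rdf:datatype form the markup appears as &lt;em&gt;foo&lt;/em&gt;
in the document, while the rdf:parseType="Literal" form shows the same
<em>foo</em> that ends up in the literal.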

So the proposal is to forbid (in the SYNTAX, not the semantics or abstract syntax):
  rdf:datatype="http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral"
and require XML literals to be written in the rdf:parseType="Literal" form.

Proposed change to 7.2.16 Production literalPropertyElt: add the sentence
[[
d.string-value cannot be rdf:XMLLiteral.  XML Literals are written using the form
given in 7.2.17 Production parseTypeLiteralPropertyElt.
]]
-- changing http://www.w3.org/2001/sw/RDFCore/TR/WD-rdf-syntax-grammar-20030117/#literalPropertyElt

Also add a negative parser test case, datatypes/bad001.rdf, that shows this:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
          xmlns:eg="http://example.org/">
<rdf:Description>
   <eg:prop rdf:datatype="http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral">&lt;em>foo&lt;/em></eg:prop>
</rdf:Description>
</rdf:RDF>
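
As a rough illustration of the kind of check a parser would need in
order to reject this test case, here is a minimal Python sketch.  The
function name find_forbidden_datatypes is my own, and it only inspects
attributes using the standard-library ElementTree; it is not a full
RDF/XML parser and does not follow the grammar productions:

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_LITERAL = RDF_NS + "XMLLiteral"
DATATYPE_ATTR = "{%s}datatype" % RDF_NS
PARSETYPE_ATTR = "{%s}parseType" % RDF_NS


def find_forbidden_datatypes(element, found=None):
    """Collect tags of elements that carry rdf:datatype="...#XMLLiteral"."""
    if found is None:
        found = []
    if element.get(DATATYPE_ATTR) == XML_LITERAL:
        found.append(element.tag)
    # Content under rdf:parseType="Literal" is literal XML, not RDF/XML
    # syntax, so do not descend into it.
    if element.get(PARSETYPE_ATTR) != "Literal":
        for child in element:
            find_forbidden_datatypes(child, found)
    return found


# The proposed negative test case, datatypes/bad001.rdf.
BAD001 = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
          xmlns:eg="http://example.org/">
<rdf:Description>
   <eg:prop rdf:datatype="http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral">&lt;em>foo&lt;/em></eg:prop>
</rdf:Description>
</rdf:RDF>"""

# The same literal written in the form the proposal would require.
GOOD = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
          xmlns:eg="http://example.org/">
<rdf:Description>
   <eg:prop rdf:parseType="Literal"><em>foo</em></eg:prop>
</rdf:Description>
</rdf:RDF>"""

bad_hits = find_forbidden_datatypes(ET.fromstring(BAD001))
good_hits = find_forbidden_datatypes(ET.fromstring(GOOD))
print(bad_hits)   # the eg:prop element is reported
print(good_hits)  # nothing is reported
```

A real parser would raise an error at this point rather than collect
tags; the list is only there to make the sketch easy to inspect.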

This has the following benefits:
  * The literal form of rdf:XMLLiteral in the graph will always be in exc-C14N form.
  * No (or at least less) XML C14N checking or wording is needed in the semantics.
  * It ensures there is only one way to encode rdf:XMLLiteral.
  * The non-entity-encoded XML is easier to read.
  * XML transported 'through' RDF will be more like the original form (caveat: exc-C14N).
  * The document impact is small: syntax WD only (the primer doesn't use this form).

Downsides:
  * It is a change
  * Somebody is likely using this rdf:datatype form.
  * It adds a new test case that no existing parser will pass.

Personally I'm favouring the change.

Dave

Received on Wednesday, 6 August 2003 08:44:55 UTC