- From: Brian McBride <bwm@hplb.hpl.hp.com>
- Date: 28 Jul 2003 12:15:25 +0100
- To: pat hayes <phayes@ihmc.us>
- Cc: Dave Beckett <dave.beckett@bristol.ac.uk>, rdf core <w3c-rdfcore-wg@w3.org>
On Sun, 2003-07-27 at 22:39, pat hayes wrote:
> Dave,

Quick reply - Dave to confirm/correct

> can you answer me a quick question about RDF/XML? Sorry I am
> still so behind the curve on this, but I need to get this exactly
> right given our decision about plain literals and xsd:string.
>
> Consider a plain literal in an RDF graph which uses some characters
> which require escaping in XML, eg say "<br/>".
>
> 1. Is it the case that in RDF/XML, this would be rendered using XML
> character escaping? Ie it would look like this
> "&gr;br/&lt;"
> ?

That would be "&lt;br /&gt;", but you have the right idea.

> 2. If so, would it be correct to say that in spite of this, that the
> literal character string itself was the original 5-character Unicode
> sequence? (Or is the character string of the literal an 11-character
> sequence in RDF/XML but a 5-character sequence in the graph? I hope
> not....)

The literal in the graph is "<br />"

> 3. If so, are there any literal character sequences which *cannot* be
> sent through RDF/XML? Or does XML provide an escape for every Unicode
> code point?

We discovered last week that there are some Unicode characters (ASCII
control codes, e.g. BEL) which are not legal in an XML document. We
have to decide whether they are legal in the graph, and thus not
expressible in RDF/XML, or just not legal in the graph. I guess you
would like us to make this decision quickly.

My instincts are to not allow XML special cases to pollute (sorry,
value-laden term) the graph syntax, so I'm for saying that any Unicode
character sequence is legal and noting there might be problems
serializing in RDF/XML.

That said, you (Pat) commented this would make expressing the
semantics more difficult, in that not all plain literals without lang
tags would denote xsd:strings, requiring you to have a more complex
rule in the semantics doc.

I wonder whether we really need that rule. Would it suffice to *note*
that most plain literals without lang tags denote xsd:strings, but
that, because some Unicode sequences are not legal xsd:strings, not
all plain literals without lang tags are xsd:strings? This is
something that should be straightforward to implement in an xsd
reasoner. We could do a couple of simple test cases.

So I'm suggesting no rule and a warning note. As always, the WG
decides.

Brian

ps: test case:

  _:a <rdf:label> "\0007" .

entails?

  _:a <rdf:label> _:v .
  _:v <rdf:type> <xsd:string> .
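
The two points above (escaping does not change the literal in the graph, and some characters cannot be written in an XML 1.0 document at all) can be illustrated with a minimal Python sketch. It is not part of the original message. The helper name `is_xml10_serializable` is hypothetical, and the character ranges are taken from the `Char` production of XML 1.0; treat it as an illustration under those assumptions rather than as what any RDF/XML serializer actually does.

```python
import re
from xml.sax.saxutils import escape, unescape

# Characters outside the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_NOT_XML10_CHAR = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def is_xml10_serializable(literal: str) -> bool:
    """Hypothetical helper: True if every character may appear in XML 1.0."""
    return _NOT_XML10_CHAR.search(literal) is None


# The literal as it exists in the graph: 5 characters.
graph_literal = "<br/>"

# The same literal as it would be written in XML content: 11 characters.
escaped = escape(graph_literal)

# Unescaping on parse recovers the original 5-character literal.
round_tripped = unescape(escaped)

print(len(graph_literal), len(escaped), round_tripped == graph_literal)

# BEL (U+0007) is a Unicode character but is not an XML 1.0 Char,
# so a literal containing it fails the check.
print(is_xml10_serializable(graph_literal), is_xml10_serializable("\u0007"))
```

Under this sketch, escaping affects only the serialized form, while a literal containing U+0007 is flagged as one that could raise the serialization problem discussed above.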
Received on Monday, 28 July 2003 07:18:29 UTC