Re: quick syntax question. from Brian McBride on 2003-07-28 (w3c-rdfcore-wg@w3.org from July 2003)

From: Brian McBride <bwm@hplb.hpl.hp.com>
Date: 28 Jul 2003 12:15:25 +0100
To: pat hayes <phayes@ihmc.us>
Cc: Dave Beckett <dave.beckett@bristol.ac.uk>, rdf core <w3c-rdfcore-wg@w3.org>
Message-Id: <1059390924.2139.30.camel@dhcp-91-3.hpl.hp.com>

On Sun, 2003-07-27 at 22:39, pat hayes wrote:
> Dave,

Quick reply - Dave to confirm/correct

>  can you answer me a quick question about RDF/XML? Sorry I am 
> still so behind the curve on this, but I need to get this exactly 
> right given our decision about plain literals and xsd:string.
> 
> Consider a plain literal in an RDF graph which uses some characters 
> which require escaping in XML, eg say "<br/>".
> 
> 1. Is it the case that in RDF/XML, this would be rendered using XML 
> character escaping? Ie it would look like this
> "&gr;br/&lt;"
> ?

That would be "&lt;br/&gt;", but you have the right idea.

> 
> 2. If so, would it be correct to say that in spite of this, that the 
> literal character string itself was the original 5-character Unicode 
> sequence? (Or is the character string of the literal an 11-character 
> sequence in RDF/XML but a 5-character sequence in the graph? I hope 
> not....)

The literal in the graph is "<br/>", the original 5-character sequence.
The escaping exists only in the RDF/XML serialization.
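
A minimal sketch of that round trip, using Python's standard library
rather than an RDF/XML parser. The element name "label" is a
placeholder for illustration only; the point is just that XML escaping
changes the serialized form and not the character string recovered
from it.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

literal = "<br/>"            # the plain literal as it exists in the graph
escaped = escape(literal)    # the escaped form carried inside the XML element

# "label" is a hypothetical element name, not part of RDF/XML.
document = f"<label>{escaped}</label>"

# Parsing undoes the escaping and yields the original character string.
recovered = ET.fromstring(document).text

print(len(literal), len(escaped))   # lengths in the graph vs. in the XML text
print(recovered == literal)
```

The serialized form is 11 characters long, but the parsed value is the
same 5-character string that went in.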

> 
> 3. If so, are there any literal character sequences which *cannot* be 
> sent through RDF/XML? Or does XML provide an escape for every Unicode 
> code point?

We discovered last week that some Unicode characters (ASCII control
codes, e.g. BEL) are not legal in an XML document.  We have to decide
whether they are legal in the graph, and thus not expressible in
RDF/XML, or simply not legal in the graph.
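
A quick way to see the problem, sketched with Python's standard
(expat-based) XML parser. Again "label" is only a placeholder element
name. Neither the raw character nor a numeric character reference gets
BEL through, whereas TAB, one of the few control characters XML 1.0
does allow, is accepted.

```python
import xml.etree.ElementTree as ET

def parses_as_xml(document: str) -> bool:
    """Return True if the stdlib XML parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False

bel_raw = "<label>\u0007</label>"   # BEL as a literal character
bel_ref = "<label>&#7;</label>"     # BEL as a numeric character reference
tab_ref = "<label>&#9;</label>"     # TAB is a permitted control character

for doc in (bel_raw, bel_ref, tab_ref):
    print(repr(doc), parses_as_xml(doc))
```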

I guess you would like us to make this decision quickly.

My instinct is not to let XML special cases pollute (sorry, value-laden
term) the graph syntax, so I'm for saying that any Unicode character
sequence is legal and noting that there might be problems serializing
in RDF/XML.

That said, you (Pat) commented that this would make expressing the
semantics more difficult: not all plain literals without lang tags
would denote xsd:strings, so you would need a more complex rule in the
semantics doc.

I wonder whether we really need that rule.  Would it suffice to *note*
that most plain literals without lang tags denote xsd:strings, but
that, because some Unicode sequences are not legal xsd:strings, not all
of them do?  This should be straightforward to implement in an xsd
reasoner.  We could do a couple of simple test cases.

So I'm suggesting no rule and a warning note.  As always, the WG
decides.

Brian

ps: test case:

_:a <rdf:label> "\u0007" .

entails?

_:a <rdf:label> _:v .
_:v <rdf:type> <xsd:string> .
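
A rough sketch of the check an xsd reasoner might apply to decide this
test case. It assumes that a plain literal without a lang tag denotes
an xsd:string exactly when every character matches the Char production
of XML 1.0; the function names are made up for illustration.

```python
def is_xml_char(cp: int) -> bool:
    """True if the code point matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )

def denotes_xsd_string(plain_literal: str) -> bool:
    """Assumed rule: every character must be a legal XML character."""
    return all(is_xml_char(ord(ch)) for ch in plain_literal)

print(denotes_xsd_string("<br/>"))    # ordinary text, including markup characters
print(denotes_xsd_string("\u0007"))   # the BEL literal from the test case
```

Under that assumption the BEL literal fails the check, so the
entailment above would not hold, while a literal such as "<br/>"
passes.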

Received on Monday, 28 July 2003 07:18:29 UTC