Re: quick syntax question. from pat hayes on 2003-07-28 (w3c-rdfcore-wg@w3.org from July 2003)

From: pat hayes <phayes@ihmc.us>
Date: Mon, 28 Jul 2003 18:08:59 -0500
To: Dave Beckett <dave.beckett@bristol.ac.uk>, Brian_McBride <bwm@hplb.hpl.hp.com>
Cc: w3c-rdfcore-wg@w3.org
Message-Id: <p06001a34bb4b5c01005a@[10.0.100.23]>

Regarding the below, my current version says this:

"... there may be valid D-entailments for 
particular datatypes which depend on 
idiosyncratic properties of the particular 
datatypes, such as..." (old text at the end of 
section 7.4; the following is now added:)

"In particular, the value space and 
lexical-to-value mapping of the XSD datatype 
xsd:string sanctions the identification of typed 
literals with plain literals without language 
tags for all character strings which are in the 
lexical space of the datatype, since both of them 
denote the Unicode character string which is 
displayed in the literal; so the following 
inference rule is valid in all 
XSD-interpretations. Here, 'sss' indicates any 
string of characters in the lexical space of 
xsd:string.

xsd 1a

uuu aaa "sss".
-->
uuu aaa "sss"^^xsd:string .

xsd 1b

uuu aaa "sss"^^xsd:string .
-->
uuu aaa "sss".

--------
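
As a rough illustration only (not part of the proposed text), rules xsd 1a and 1b can be sketched in Python. The `Literal` class, the tuple encoding of triples, and the `apply_xsd1` name are all assumptions made up for this sketch. The side condition "in the lexical space of xsd:string" is left to a caller-supplied predicate, since what that space excludes is the open question in the thread below.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Set, Tuple

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


@dataclass(frozen=True)
class Literal:
    """Hypothetical literal: plain when datatype is None, typed otherwise."""
    lexical: str
    datatype: Optional[str] = None
    lang: Optional[str] = None


Triple = Tuple[str, str, object]


def apply_xsd1(triples: Set[Triple],
               in_lexical_space: Callable[[str], bool]) -> Set[Triple]:
    """Return the input triples plus everything rules xsd 1a/1b add in one pass."""
    out = set(triples)
    for s, p, o in triples:
        # Both rules only fire on literals whose string is in the lexical space.
        if not isinstance(o, Literal) or not in_lexical_space(o.lexical):
            continue
        if o.datatype is None and o.lang is None:
            # xsd 1a: plain literal without language tag -> xsd:string literal
            out.add((s, p, Literal(o.lexical, XSD_STRING)))
        elif o.datatype == XSD_STRING:
            # xsd 1b: xsd:string literal -> plain literal without language tag
            out.add((s, p, Literal(o.lexical)))
    return out
```

For a quick experiment, `apply_xsd1(graph, str.isprintable)` uses Python's `str.isprintable` as a deliberately crude stand-in for the real lexical-space test; under that stand-in a literal containing U+0007 is left alone, and a literal with a language tag is never rewritten.
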
I think that covers it.

OK??

Pat

>On 28 Jul 2003 12:15:25 +0100
>Brian McBride <bwm@hplb.hpl.hp.com> wrote:
>
>>  On Sun, 2003-07-27 at 22:39, pat hayes wrote:
>>  > Dave,
>>
>>  Quick reply - Dave to confirm/correct
>>
>>  >  can you answer me a quick question about RDF/XML? Sorry I am
>>  > still so behind the curve on this, but I need to get this exactly
>>  > right given our decision about plain literals and xsd:string.
>>  >
>>  > Consider a plain literal in an RDF graph which uses some characters
>>  > which require escaping in XML, eg say "<br/>".
>>  >
>>  > 1. Is it the case that in RDF/XML, this would be rendered using XML
>>  > character escaping? Ie it would look like this
>>  > "&gr;br/&lt;"
>>  > ?
>>
>>  That would be "&lt;br /&gt;", but you have the right idea.
>
>That's one of the encodings, there are several.  How plain
>literals are written into RDF/XML does not involve XML canonicalization.
>In the graph, you get a Unicode string, what Charmod calls a
>Character string: http://www.w3.org/TR/charmod/#def-character-string
>
>>  >
>>  > 2. If so, would it be correct to say that in spite of this, that the
>>  > literal character string itself was the original 5-character Unicode
>>  > sequence? (Or is the character string of the literal an 11-character
>>  > sequence in RDF/XML but a 5-character sequence in the graph? I hope
>>  > not....)
>>
>>  The literal in the graph is "<br />"
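
The point of questions 1 and 2 can be checked with any XML parser. The sketch below uses Python's standard library; the `<label>` wrapper element is a hypothetical stand-in and not real RDF/XML. It shows that the escaped form is only a serialization detail: several different spellings all parse back to the same 5-character string.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

literal = "<br/>"            # the 5-character string in the graph
escaped = escape(literal)    # one possible XML spelling: "&lt;br/&gt;"


def parsed_text(xml_fragment: str) -> str:
    """Parse the fragment as the content of a throwaway element, return its text."""
    return ET.fromstring(f"<label>{xml_fragment}</label>").text


# Several XML spellings of the same character string.
spellings = [
    escaped,                  # predefined entities
    "&#60;br/&#62;",          # decimal character references
    "&#x3C;br/&#x3E;",        # hexadecimal character references
    "<![CDATA[<br/>]]>",      # CDATA section
]

for spelling in spellings:
    print(len(spelling), repr(parsed_text(spelling)))
```

Here `len(escaped)` is 11 while `len(literal)` is 5, which is the 11-versus-5 distinction asked about in question 2: the longer sequence exists only in the document text, never in the parsed value.
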
>>
>>  >
>>  > 3. If so, are there any literal character sequences which *cannot* be
>>  > sent through RDF/XML? Or does XML provide an escape for every Unicode
>>  > code point?
>>
>>  We discovered last week that there are some UNICODE characters (ascii
>>  control codes e.g. bel) which are not legal in an XML document.  We have
>>  to decide whether they are legal in the graph, and thus not expressible
>>  in RDF/XML, or just not legal in the graph.
>
>Yes, these are listed
>[[
>Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
>  /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
>]] -- http://www.w3.org/TR/REC-xml#NT-Char
>
>However, that is for XML 1.0 (2nd edition).
>The draft XML 1.1 proposes replacing the above comment with:
>   [[
>   /* any Unicode character, excluding most ISO controls, the surrogate blocks, FFFE, and FFFF */
>   ]] -- http://www.w3.org/TR/xml11/#NT-Char:
>(ISO controls, I assume, referring to the excluded parts #x0-#x8, #xB, #xC, #xE-#x1F)
>
>RDF/XML is an XML 1.0 (2nd edition) format so the former definition applies.
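
A minimal sketch of how the XML 1.0 Char production quoted above could be turned into a check, assuming Python and its standard `re` module. The function name `first_non_xml10_char` is made up for this illustration. It reports the first code point in a string that falls outside the production, which is one way to find literals that could not be written into RDF/XML.

```python
import re
from typing import Optional, Tuple

# Negated character class built from the XML 1.0 Char production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_NOT_XML10_CHAR = re.compile(
    r"[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def first_non_xml10_char(s: str) -> Optional[Tuple[int, str]]:
    """Return (index, hex code point) of the first character outside Char, or None."""
    m = _NOT_XML10_CHAR.search(s)
    if m is None:
        return None
    return m.start(), hex(ord(m.group()))


print(first_non_xml10_char("<br/>"))     # every character is a legal Char
print(first_non_xml10_char("\u0007"))    # bel, the control code mentioned above
```

The check says nothing about whether such strings should be legal in the graph; it only identifies the ones an XML 1.0 document cannot carry.
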
>
>>  I guess you would like us to make this decision quickly.
>>
>>  My instincts are to not allow XML special cases to pollute (sorry value
>>  laden term) the graph syntax, so I'm for saying that any UNICODE
>>  character sequence is legal and noting there might be problems
>>  serializing in RDF/XML.
>
>The former would be for concepts.  RDF/XML or any XML format would have
>problems serializing such things.
>
>>  That said, you (Pat) commented this would make expressing the semantics
>>  more difficult, in that not all plain literals without lang tags would
>>  denote xsd:string's, requiring you to have a more complex rule in the
>>  semantics doc.
>>
>>  I wonder whether we really need that rule.  Would it suffice to *note*
>>  that most plain literals without lang tags denote xsd:string's, but that
>>  due to the fact that some UNICODE sequences are not legal xsd:string's,
>>  not all plain literals without lang tags are xsd:string's?  This is
>>  something that should be straightforward to implement in an xsd
>>  reasoner.  We could do a couple of simple test cases.
>
>I'm wondering here what's broke - xsd:string allowing illegal Unicode
>or RDF's plain literals?
>
>>  So I'm suggesting no rule and a warning note.  As always, the WG
>>  decides.
>>
>>  Brian
>>
>>  ps: test case:
>>
>>  _:a <rdf:label> "\0007" .
>
>   _:a rdf:label "\u0007" .
>
>>
>>  entails?
>>
>>  _:a <rdf:label> _:v .
>>  _:v <rdf:type> <xsd:string> .
>
>  _:a rdf:label _:v .
>  _:v rdf:type xsd:string .
>
>Dave


-- 
---------------------------------------------------------------------
IHMC	(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32501			(850)291 0667    cell
phayes@ihmc.us       http://www.ihmc.us/users/phayes
Received on Monday, 28 July 2003 19:09:08 UTC