- From: pat hayes <phayes@ihmc.us>
- Date: Mon, 28 Jul 2003 18:08:59 -0500
- To: Dave Beckett <dave.beckett@bristol.ac.uk>, Brian_McBride <bwm@hplb.hpl.hp.com>
- Cc: w3c-rdfcore-wg@w3.org
Regarding the below, my current version says this:

"... there may be valid D-entailments for particular datatypes which depend on idiosyncratic properties of the particular datatypes, such as..."

(old text at end of section 7.4, now following added:)

"In particular, the value space and lexical-to-value mapping of the XSD datatype xsd:string sanctions the identification of typed literals with plain literals without language tags for all character strings which are in the lexical space of the datatype, since both of them denote the Unicode character string which is displayed in the literal; so the following inference rule is valid in all XSD-interpretations. Here, 'sss' indicates any string of characters in the lexical space of xsd:string.

xsd 1a    uuu aaa "sss".                -->  uuu aaa "sss"^^xsd:string .
xsd 1b    uuu aaa "sss"^^xsd:string .   -->  uuu aaa "sss".

--------

I think that covers it. OK??

Pat

>On 28 Jul 2003 12:15:25 +0100
>Brian McBride <bwm@hplb.hpl.hp.com> wrote:
>
>> On Sun, 2003-07-27 at 22:39, pat hayes wrote:
>> > Dave,
>>
>> Quick reply - Dave to confirm/correct
>>
>> > can you answer me a quick question about RDF/XML? Sorry I am
>> > still so behind the curve on this, but I need to get this exactly
>> > right given our decision about plain literals and xsd:string.
>> >
>> > Consider a plain literal in an RDF graph which uses some characters
>> > which require escaping in XML, eg say "<br/>".
>> >
>> > 1. Is it the case that in RDF/XML, this would be rendered using XML
>> > character escaping? Ie it would look like this
>> > "&gr;br/&lt;"
>> > ?
>>
>> That would be "&lt;br /&gt;", but you have the right idea.
>
>That's one of the encodings, there are several. How plain
>literals are written into RDF/XML does not involve XML canonicalization.
>In the graph, you get a Unicode string, what Charmod calls a
>Character string: http://www.w3.org/TR/charmod/#def-character-string
>
>> >
>> > 2. If so, would it be correct to say that in spite of this, that the
>> > literal character string itself was the original 5-character Unicode
>> > sequence? (Or is the character string of the literal an 11-character
>> > sequence in RDF/XML but a 5-character sequence in the graph? I hope
>> > not....)
>>
>> The literal in the graph is "<br />"
>>
>> >
>> > 3. If so, are there any literal character sequences which *cannot* be
>> > sent through RDF/XML? Or does XML provide an escape for every Unicode
>> > code point?
>>
>> We discovered last week that there are some UNICODE characters (ascii
>> control codes e.g. bel) which are not legal in an XML document. We have
>> to decide whether they are legal in the graph, and thus not expressible
>> in RDF/XML, or just not legal in the graph.
>
>Yes, these are listed
>[[
>Char ::= #x9 | #xA | #xD |
>         [#x20-#xD7FF] | [#xE000-#xFFFD] |
>         [#x10000-#x10FFFF]
>         /* any Unicode character, excluding the
>            surrogate blocks, FFFE, and FFFF. */
>]] -- http://www.w3.org/TR/REC-xml#NT-Char
>
>However, that is for XML 1.0 (2nd edition).
>The draft XML 1.1 proposes replacing the above comment with:
>  [[
>  /* any Unicode character, excluding most ISO
>     controls, the surrogate blocks, FFFE, and FFFF */
>  ]] -- http://www.w3.org/TR/xml11/#NT-Char
>(ISO controls I assume referring to the excluded parts #0-#8, #B, #C, #E-#1F)
>
>RDF/XML is an XML 1.0 (2nd edition) format so the former definition applies.
>
>> I guess you would like us to make this decision quickly.
>>
>> My instincts are to not allow XML special cases to pollute (sorry value
>> laden term) the graph syntax, so I'm for saying that any UNICODE
>> character sequence is legal and noting there might be problems
>> serializing in RDF/XML.
>
>The former would be for concepts. RDF/XML or any XML format would have
>problems serializing such things.
>
>> That said, you (Pat) commented this would make expressing the semantics
>> more difficult, in that not all plain literals without lang tags would
>> denote xsd:string's, requiring you to have a more complex rule in the
>> semantics doc.
>>
>> I wonder whether we really need that rule. Would it suffice to *note*
>> that most plain literals without lang tags denote xsd:string's, but that
>> due to the fact that some UNICODE sequences are not legal xsd:string's,
>> not all plain literals without lang tags are xsd:string's. This is
>> something that should be straightforward to implement in an xsd
>> reasoner. We could do a couple of simple test cases.
>
>I'm wondering here what's broke - xsd:string allowing illegal Unicode
>or RDF's plain literals?
>
>> So I'm suggesting no rule and a warning note. As always, the WG
>> decides.
>>
>> Brian
>>
>> ps: test case:
>>
>> _:a <rdf:label> "\0007" .
>
> _:a rdf:label "\u0007" .
>
>>
>> entails?
>>
>> _:a <rdf:label> _:v .
>> _:v <rdf:type> <xsd:string> .
>
> _:a rdf:label _:v .
> _:v rdf:type xsd:string .
>
>Dave

--
---------------------------------------------------------------------
IHMC                    (850)434 8903 or (650)494 3973   home
40 South Alcaniz St.    (850)202 4416   office
Pensacola               (850)202 4440   fax
FL 32501                (850)291 0667   cell
phayes@ihmc.us          http://www.ihmc.us/users/phayes
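
For illustration only, here is a minimal Python sketch of the check this thread turns on: whether a plain literal's character string falls within the XML 1.0 (2nd edition) Char production quoted above, and so whether rule xsd 1a may be applied to it. It assumes that the lexical space of xsd:string coincides with sequences of such characters. The function names and the tuple representation of triples and typed literals are hypothetical, not taken from any RDF library or from the semantics document.

```python
def is_xml10_char(ch: str) -> bool:
    """True if ch matches the XML 1.0 (2nd edition) Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def in_xsd_string_lexical_space(s: str) -> bool:
    """Assumption: xsd:string admits exactly the sequences of XML Chars."""
    return all(is_xml10_char(ch) for ch in s)


def apply_xsd_1a(subject: str, predicate: str, lexical: str):
    """Sketch of rule xsd 1a: a plain literal without a language tag
    yields the corresponding xsd:string typed literal, but only when the
    string is in the lexical space. Returns None when the rule does not
    apply (hypothetical convention)."""
    if not in_xsd_string_lexical_space(lexical):
        return None
    return (subject, predicate, (lexical, "xsd:string"))


if __name__ == "__main__":
    # The 5-character literal "<br/>" needs escaping in RDF/XML but is legal.
    print(apply_xsd_1a("_:a", "rdf:label", "<br/>"))
    # Brian's test case: BEL (U+0007) is outside Char, so the rule is withheld.
    print(apply_xsd_1a("_:a", "rdf:label", "\u0007"))
```

Under this sketch Brian's test case does not yield an xsd:string typing, which matches the wording of the rule, since 'sss' is restricted to strings in the lexical space of xsd:string.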
Received on Monday, 28 July 2003 19:09:08 UTC