- From: Peter F. Patel-Schneider <pfps@research.bell-labs.com>
- Date: Thu, 07 Aug 2003 07:25:55 -0400 (EDT)
- To: dave.beckett@bristol.ac.uk
- Cc: www-rdf-comments@w3.org
From: Dave Beckett <dave.beckett@bristol.ac.uk>
Subject: Re: mismatch between test and RDF/XML syntax
Date: Thu, 7 Aug 2003 11:47:16 +0100

> On Tue, 05 Aug 2003 14:05:09 -0400 (EDT)
> "Peter F. Patel-Schneider" <pfps@research.bell-labs.com> wrote:
>
> >
> > Hi:
> >
> > The RDF/XML Syntax Specification (Revised), draft of 4 August 2003 appears
> > to allow strings that are not in Normal Form C.  This is counter to test
> > rdf-charmod-literals/error001.rdf
> >
> >
> > The relevant productions for this example are
> >
> > 7.2.14 propertyElt which parses <eg:Creater eg:named="..."/>
> > 7.2.21 emptyPropertyElt which parses <eg:Creater eg:named="..."/>
> > 7.2.25 propertyAttr which parses eg:named="..."
> >
> > the last of which allows anyString (defined as ``Any string.'') as the
> > value of the attribute.
>
> Indeed.  I think this would best done with a note next to the actions
> where triples with literal values are added to the graph.  This is
> when literal() event is used in nodeElement (now section 7.2.11 in
> the editor's draft), literalPropertyElt (7.2.16) emptyPropertyElt
> (7.2.21).

I was trying to imagine under what circumstances the empty string would not
be allowed and could not come up with any.  I think that the caution is
thus not needed for emptyPropertyElt.

> For each of these triples additions I will add a note of the form
>
>   The string <em>t</em>.string-value MUST be a Unicode [UNICODE] String
>   in Normal Form C (NFC) [NFC].
>
> before the literal() term is used.  I will also add the two new
> normative references:
>
>   [UNICODE]
>     The Unicode Standard, Version 3, The Unicode Consortium,
>     Addison-Wesley, 2000. ISBN 0-201-61633-5, as updated from time to
>     time by the publication of new versions. (See
>     http://www.unicode.org/unicode/standard/versions/ for the latest
>     version and additional information on versions of the standard
>     and of the Unicode Character Database).
>
>   [NFC]
>     Unicode Normalization Forms, Unicode Standard Annex #15, Mark
>     Davis, Martin Duerst. (See
>     http://www.unicode.org/unicode/reports/tr15/ for the latest
>     version).
>
> Dave

I was actually hoping that it was the Syntax Specification that was
correct, and am disappointed that this is not the case.  I expect that
there are very many XML documents that cannot be handled in RDF because of
this requirement for NFC.  What does this do to the XML-scraping use case
for RDF?

If this requirement sticks, I expect much confusion, particularly as there
is no mention of it in the primer.

It also appears to me that there is an inconsistency with the treatment of
XML Literals, with the direct use of o.string-value being inconsistent with
the last sentence of 7.2.17.  Further, Exclusive XML Canonicalization
results in a sequence of octets, which are probably not allowed as lexical
forms of RDF literals.

Peter F. Patel-Schneider
Bell Labs Research
Lucent Technologies
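
For readers unfamiliar with the NFC requirement discussed in this message, the following is a minimal sketch of what a check on a literal's string value might look like. It uses only Python's standard unicodedata module. The function name is_nfc and the sample strings are illustrative assumptions; they do not come from the RDF/XML specification or from any RDF parser.

```python
import unicodedata


def is_nfc(value: str) -> bool:
    """Return True if value is already in Unicode Normal Form C.

    Hypothetical helper: a string is in NFC exactly when normalizing
    it to NFC leaves it unchanged.
    """
    return unicodedata.normalize("NFC", value) == value


# Two spellings of the same visible text (illustrative data only).
composed = "caf\u00e9"      # precomposed e-acute, already in NFC
decomposed = "cafe\u0301"   # "e" followed by a combining acute accent

print(is_nfc(composed))     # the precomposed form passes the check
print(is_nfc(decomposed))   # the decomposed form does not
print(is_nfc(""))           # the empty string is trivially in NFC
```

The empty-string case corresponds to the observation above that no caution seems needed where the literal is necessarily empty. The decomposed case is the kind of attribute value that an XML parser accepts without complaint but that the proposed note would rule out as an RDF literal.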
Received on Thursday, 7 August 2003 07:26:06 UTC