Re: mismatch between test and RDF/XML syntax

From: Dave Beckett <dave.beckett@bristol.ac.uk>
Subject: Re: mismatch between test and RDF/XML syntax
Date: Thu, 7 Aug 2003 11:47:16 +0100

> On Tue, 05 Aug 2003 14:05:09 -0400 (EDT)
> "Peter F. Patel-Schneider" <pfps@research.bell-labs.com> wrote:
> 
> > 
> > Hi:
> > 
> > The RDF/XML Syntax Specification (Revised), draft of 4 August 2003 appears
> > to allow strings that are not in Normal Form C.  This is counter to test
> > rdf-charmod-literals/error001.rdf
> > 
> > 
> > The relevant productions for this example are
> > 
> > 7.2.14 propertyElt		which parses <eg:Creater eg:named="..."/>
> > 7.2.21 emptyPropertyElt		which parses <eg:Creater eg:named="..."/>
> > 7.2.25 propertyAttr		which parses eg:named="..."
> > 
> > the last of which allows anyString (defined as ``Any string.'') as the
> > value of the attribute.
> 
> Indeed.  I think this would best be done with a note next to the actions
> where triples with literal values are added to the graph.  This is
> when the literal() event is used in nodeElement (now section 7.2.11 in
> the editor's draft), literalPropertyElt (7.2.16) and emptyPropertyElt
> (7.2.21).

I was trying to imagine under what circumstances the empty string would not
be allowed and could not come up with any.  I think that the caution is
thus not needed for emptyPropertyElt.

> For each of these triple additions I will add a note of the form
> 
>   The string <em>t</em>.string-value MUST be a Unicode [UNICODE] String
>   in Normal Form C (NFC) [NFC].
> 
> before the literal() term is used.  I will also add the two new normative references:
> 
>   [UNICODE]
>     The Unicode Standard, Version 3, The Unicode Consortium,
>     Addison-Wesley, 2000. ISBN 0-201-61633-5, as updated from time to
>     time by the publication of new versions. (See
>     http://www.unicode.org/unicode/standard/versions/ for the latest
>     version and additional information on versions of the standard
>     and of the Unicode Character Database).
>
>   [NFC]
>     Unicode Normalization Forms, Unicode Standard Annex #15, Mark
>     Davis, Martin Duerst. (See
>     http://www.unicode.org/unicode/reports/tr15/ for the latest
>     version).
> 
> Dave

I was actually hoping that it was the Syntax Specification that was
correct, and am disappointed that this is not the case.   I expect that
there are very many XML documents that cannot be handled in RDF because of
this requirement for NFC.  What does this do to the XML-scraping use case
for RDF?  If this requirement sticks, I expect much confusion, particularly
as there is no mention of it in the primer.
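
To make the concern concrete, here is a minimal Python sketch of the kind
of NFC check the proposed note would imply.  It is not taken from the
specification or the test suite: the helper name is_nfc and the two sample
strings are assumptions chosen for illustration, and only
unicodedata.normalize from the standard library is used.

```python
import unicodedata


def is_nfc(s: str) -> bool:
    """Return True if s is already in Unicode Normal Form C."""
    # A string is in NFC exactly when normalizing it to NFC changes nothing.
    return unicodedata.normalize("NFC", s) == s


# Hypothetical attribute values, as they might appear in eg:named="...".
composed = "D\u00fcrst"     # u-umlaut as one precomposed code point (NFC)
decomposed = "Du\u0308rst"  # 'u' followed by a combining diaeresis (not NFC)

# The empty string is trivially in NFC, which is why no caution seems
# necessary for the empty literal produced by emptyPropertyElt.
empty_ok = is_nfc("")
```

Both sample strings render identically, yet only the first would satisfy
the requirement; a document scraped from XML that happens to use the second
form would be rejected unless it is normalized first.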

It also appears to me that the treatment of XML Literals is inconsistent:
the direct use of o.string-value conflicts with the last sentence of
7.2.17.  Further, Exclusive XML Canonicalization
results in a sequence of octets, which are probably not allowed as lexical
forms of RDF literals.

Peter F. Patel-Schneider
Bell Labs Research
Lucent Technologies

Received on Thursday, 7 August 2003 07:26:06 UTC