Re: draft response to pfps re nfc

Hello Jeremy,

Sorry to be late with my reply; as I wrote in another mail, I was
traveling.


At 09:32 03/09/09 +0100, Jeremy Carroll wrote:


>Copying to i18n to request help on correct application of charmod. See the 
>two paragraphs between ****.

Are these two paras supposed to go into a spec, or just serve as
the official answer? For the second, they seem quite appropriate
to me.

Regards,    Martin.



>This is a proposed draft. Note that I suggest additional text for Concepts, 
>and still need additional text for Syntax -
>
>[[
>
>Dear Peter
>
>thanks for your comments concerning NFC
>http://lists.w3.org/Archives/Public/www-rdf-comments/2003JulSep/0283
>http://lists.w3.org/Archives/Public/www-rdf-comments/2003JulSep/0225
>
>
>These comments also apply to XSD datatypes derived from xsd:string and 
>xsd:anyURI, so we will respond in full generality.
>
>(For example, the two-character string { e, NON SPACING ACUTE } is a legal 
>xsd:string that can be 'written' in RDF/XML (in XML 1.0) but does not 
>correspond to a legal RDF graph.)
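
To make the example above concrete, here is a minimal sketch, assuming Python and its standard unicodedata module. The helper name is_nfc is purely illustrative and comes from neither spec; NON SPACING ACUTE is taken to be U+0301 (COMBINING ACUTE ACCENT in current Unicode naming).

```python
import unicodedata

def is_nfc(text: str) -> bool:
    """Return True if text is already in Unicode Normal Form C."""
    # A string is in NFC exactly when normalizing it to NFC changes nothing.
    return unicodedata.normalize("NFC", text) == text

# "e" followed by U+0301: a legal xsd:string, but not in NFC.
decomposed = "e\u0301"

# The NFC form of the same text is the single precomposed character U+00E9.
composed = unicodedata.normalize("NFC", decomposed)

print(is_nfc(decomposed))  # the two-character form fails the check
print(is_nfc(composed))    # the precomposed form passes
```

Under the constraint discussed here, only the precomposed form could appear as the lexical form of a literal in an RDF graph.
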
>
>We also agree that there are XML 1.0 fragments that can be written within 
>an rdf:parseType="Literal" element in an XML 1.0 document that conforms 
>to the RDF/XML syntax except that this fragment is not in NFC.
>However, this would not be an RDF/XML document, since there is no 
>corresponding RDF graph.
>
>You are correct to point out that this constraint is not made explicitly 
>in the syntax document, and this is a bug.
>
>
>Concepts places a similar constraint on the lexical form of all datatypes, 
>e.g. xsd:string, whereas Syntax suggests that there is no such constraint, e.g.
>7.2.16
>http://www.w3.org/TR/rdf-syntax-grammar/#literalPropertyElt
>[[
>If the rdf:datatype attribute d is given then o := 
>typed-literal(literal-value := t.string-value, literal-datatype := 
>d.string-value) otherwise t.string-value MUST be a Unicode[UNICODE] string 
>in Normal Form C[NFC], o := literal(literal-value := t.string-value, 
>literal-language := e.language)
>]]
>
>This text needs modifying.
>
>
>
>1. NFC constraint in general
>
>You suggest that RDF should drop the NFC constraint completely.
>This would clearly solve the problems you raise.
>
>However, the RDF Core WG has endeavoured to follow charmod
>(http://www.w3.org/TR/charmod)
>as much as possible, as one of the key inputs from the I18N community.
>
>See
>4.4 Responsibility for Normalization
>http://www.w3.org/TR/charmod/#sec-NormalizationApplication
>
>[[
>[S]  Specifications of text-based formats and protocols SHOULD, as part of 
>their syntax definition, require that the text be in normalized form.
>]]
>[[
>[S]  Specifications of text-based languages and protocols SHOULD define 
>precisely the construct boundaries necessary to obtain a complete 
>definition of full-normalization. These definitions SHOULD include at 
>least the boundaries between markup and character data as well as entity 
>boundaries (if the language has any include mechanism) and SHOULD include 
>any other boundary that may create denormalization when instances of the 
>language are processed.
>]]
>
>****
>The RDF Core WG has previously identified the lexical form of literals as 
>the relevant construct around which NFC should be required.
>While we have been aware of transitional issues, since the specs we build 
>on (XML 1.0 and XSD) do not require NFC, we do not see those issues as 
>sufficient reason not to migrate the RDF recommendation.
>
>It is clear that applications working with XML 1.0 and the current version 
>of XSD datatypes may choose to be more lenient than this part of our 
>specification; what they should then do is also clarified in charmod, 
>i.e. they must not normalize. Since the recommendation is clear that these 
>are errors, the responsibility for fixing them is clear.
>****
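
The behaviour described between the **** markers could be sketched roughly as follows. This is only an illustration under stated assumptions: Python with its standard unicodedata module, and hypothetical names (accept_lexical_form, NotNFCError, the lenient flag) that appear in neither the RDF specs nor charmod. The point it shows is that a strict consumer reports a non-NFC lexical form as an error, while a lenient one passes it through untouched rather than normalizing it.

```python
import unicodedata

class NotNFCError(ValueError):
    """Hypothetical error raised for a lexical form that is not in NFC."""

def accept_lexical_form(lexical_form: str, lenient: bool = False) -> str:
    """Check a literal's lexical form against the NFC constraint.

    Strict mode (the default) treats a non-NFC string as an error.
    Lenient mode accepts it, but returns it unchanged: the consumer
    never normalizes on the producer's behalf.
    """
    if unicodedata.normalize("NFC", lexical_form) == lexical_form:
        return lexical_form
    if lenient:
        # Deliberately no call to normalize() on the returned value.
        return lexical_form
    raise NotNFCError(f"lexical form is not in NFC: {lexical_form!r}")
```

For instance, accept_lexical_form("e\u0301") raises NotNFCError, whereas accept_lexical_form("e\u0301", lenient=True) returns the same two-character string it was given.
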
>
>2. Clarity of RDF Concepts document
>
>We have made the following changes to concepts:
>
>In section 5
>[[
>The lexical space of a datatype is a set of Unicode [UNICODE] strings.
>]]
>to
>[[
>The lexical space of a datatype is a set of Unicode [UNICODE] strings in 
>Normal Form C [NFC].
>]]
>
>and in 5.1
>[[
>The lexical space
>is the set of all strings:
>]]
>to
>
>[[
>The lexical space
>is the set of all strings:
>- in Normal Form C [NFC].
>]]
>
>
>
>3. Syntax document
>
>[TBD]
>
>
>]]

Received on Tuesday, 16 September 2003 19:13:57 UTC