- From: Jeremy Carroll <jjc@hplb.hpl.hp.com>
- Date: Thu, 14 Mar 2002 13:59:36 -0000
- To: "Jeremy Carroll" <jjc@hplb.hpl.hp.com>, "Brian McBride" <bwm@hplb.hpl.hp.com>, "RDF Core" <w3c-rdfcore-wg@w3.org>
> I propose that:
>
> - The Unicode strings within RDF literals are required to be in NFC.
> - We note that literals whose unicode strings start with a combining
> character may not be serializable in an XML document that conforms with
> forthcoming Character Model Recommendations.
> - We include a test case of such a literal as legal, to be reviewed if
> Charmod reaches rec before we do.
>

In talking with Dave, it is clear I have omitted some discussion,
particularly about:
- early uniform normalization
- normalizing transcoders, SHOULD language

I note that in M&S para 219:
http://lists.w3.org/Archives/Public/www-archive/2001Jun/att-0021/00-part#219

[[[
Note: The W3C I18N WG is working on a definition for string identity
matching. This definition will most probably be based on canonical
equivalences according to the Unicode standard and on the principle of
early uniform normalization. Users of RDF should not rely on any
applications matching using the canonical equivalents, but should try to
make sure that their data is in the normalized form according to the
upcoming definitions.
]]]

Early uniform normalization is now clear from charmod.
Taking non-normal strings and making them NFC is the responsibility of the
first Unicode component in the pipeline; later components should reject
stuff that is not NFC.

Thus for a UTF-8 or UTF-16 RDF/XML document, or an N-triple document, it is
the responsibility of the document author.
For a foobar character set RDF/XML document it is the responsibility of the
transcoder that converts into Unicode. A transcoder that meets that
responsibility is called a normalizing transcoder.

The existence of a sufficient number of such transcoders should in my
opinion be an exit criterion for charmod from CR. It should not be an exit
criterion for RDF from CR. Hence I feel happier with SHOULD language for
that part.

I'll send another message with proposed text.

Dave also expressed worry about the code footprint required for NFC
checking.
A small footprint RDF/XML implementation should in my view not implement
this (unless it fits easily in the available space). It could expect UTF-8
input, hence the responsibility is the document author's. The RDF spec does
not divide up responsibilities and we can remain silent on this case.

Jeremy
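
To make the division of responsibilities above concrete, here is a minimal sketch in Python of the three pieces discussed: an NFC check, a test for a literal that starts with a combining character, and a normalizing transcoder. It is illustrative only. The function names are hypothetical and appear in no spec, and the combining-character test uses the Unicode canonical combining class as an approximation, which may be narrower than the definition Charmod ends up with.

```python
import unicodedata


def is_nfc(text: str) -> bool:
    """Return True if text is already in Normalization Form C."""
    # Normalizing and comparing is the simplest portable check.
    return unicodedata.normalize("NFC", text) == text


def starts_with_combining(text: str) -> bool:
    """Return True if the first character has a non-zero combining class.

    Assumption: canonical combining class != 0 stands in for
    "combining character"; the Charmod definition may differ.
    """
    return bool(text) and unicodedata.combining(text[0]) != 0


def normalizing_transcode(raw: bytes, charset: str) -> str:
    """Decode bytes from the given charset and normalize to NFC.

    This plays the role of the first Unicode component in the pipeline.
    """
    return unicodedata.normalize("NFC", raw.decode(charset))


def check_literal(text: str) -> str:
    """Later component: reject a literal that is not in NFC."""
    if not is_nfc(text):
        raise ValueError("literal is not in NFC")
    return text


if __name__ == "__main__":
    decomposed = "e\u0301"  # 'e' followed by COMBINING ACUTE ACCENT
    print(is_nfc(decomposed))
    print(starts_with_combining("\u0301a"))
    print(is_nfc(normalizing_transcode(decomposed.encode("utf-8"), "utf-8")))
```

A small-footprint implementation could leave out `check_literal` entirely and rely on the document author, as suggested above. The cost of the check here is mostly the Unicode data tables behind `unicodedata`, not the few lines of code.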
Received on Thursday, 14 March 2002 09:04:30 UTC