- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Tue, 11 Oct 2011 18:58:50 -0400
- To: David Wood <david@3roundstones.com>
- Cc: "Phillips, Addison" <addison@lab126.com>, Jeremy Carroll <jeremy@topquadrant.com>, Martin J. Dürst <duerst@it.aoyama.ac.jp>, John Cowan <cowan@mercury.ccil.org>, "www-international@w3.org" <www-international@w3.org>, RDF Working Group WG <public-rdf-wg@w3.org>
* David Wood <david@3roundstones.com> [2011-10-11 17:00-0400]
>
> On Oct 11, 2011, at 16:49, "Phillips, Addison" <addison@lab126.com> wrote:
>
> >>> B)
> >>> 2) drop the "SHOULD use NFC" requirement on literals
> >>
> >> I'm good with this one, unless we decide to do something around our ISSUE-63:
> >> http://www.w3.org/2011/rdf-wg/track/issues/63
> >>
> >
> > For reasons I just outlined, I think this would be a mistake. By avoiding denormalized text, RDF users can help ensure interoperability. In practice, this is a no-op for implementers.
>
> Why do you see it as a noop?

I guess it depends on which implementors we're talking about, but most of the current stack (OWL, SPARQL, RIF implementers) are invoked after the implied pre-normalization step. They don't have to do any normalization. Exceptions would be those creating RDF from user input or mapping non-RDF data (e.g. RDBs) to RDF. For those folks, the advice to pre-normalize could help them to converge on one of many possible representations of e.g. product names.

I'm pretty confident that we don't want to rule out having non-normalized forms in the domain of discourse (especially since applying the same codepoint comparison works regardless of normalization), but that we'd like to *advise* folks to converge where it's in their interest to do so, and advising NFKC is a good path to that end. Thus, if we say "It is recommended to use Unicode Normal Form KC [NFKC] for both literals and IRIs when there is no explicit reason to preserve the non-normalized form.", we probably hit the sweet spot (and most present implementors don't have to do anything).

> Regards,
> Dave
>
> >
> > Addison
>
--
-ericP
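To make the pre-normalization point in the message above concrete, here is a minimal sketch in Python using the standard `unicodedata` module. The helper name `prenormalize` and the sample strings are hypothetical, chosen only for illustration; nothing here comes from an RDF specification or toolkit. It shows that plain codepoint comparison runs fine on any input but treats two visually identical spellings as different, while producers that normalize to NFKC first converge on one representation.

```python
import unicodedata


def prenormalize(text: str, form: str = "NFKC") -> str:
    """Hypothetical helper: normalize text before it is emitted as an RDF literal."""
    return unicodedata.normalize(form, text)


# Two spellings of the same product name (illustrative data only).
composed = "Caf\u00e9"      # 'é' as the single code point U+00E9
decomposed = "Cafe\u0301"   # 'e' followed by combining acute accent U+0301

# Codepoint comparison is well defined on both, but sees different strings.
raw_equal = composed == decomposed

# After pre-normalization both producers emit the same code points.
nfkc_equal = prenormalize(composed) == prenormalize(decomposed)

# NFKC also folds compatibility characters, which NFC leaves alone.
ligature = "\ufb01ne"                                # starts with U+FB01 (fi ligature)
folded = prenormalize(ligature)                      # ligature replaced by 'f' + 'i'
nfc_kept = unicodedata.normalize("NFC", ligature)    # ligature preserved

print(raw_equal, nfkc_equal, folded, nfc_kept == ligature)
```

The last two lines hint at why the recommendation carries the "no explicit reason to preserve the non-normalized form" caveat: NFKC is lossy for compatibility characters, so a producer that cares about the distinction would have to skip it.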
Received on Tuesday, 11 October 2011 22:59:24 UTC