- From: David Wood <david@3roundstones.com>
- Date: Tue, 11 Oct 2011 22:49:14 -0400
- To: Eric Prud'hommeaux <eric@w3.org>
- Cc: "Phillips, Addison" <addison@lab126.com>, Jeremy Carroll <jeremy@topquadrant.com>, "Martin J. Dürst" <duerst@it.aoyama.ac.jp>, John Cowan <cowan@mercury.ccil.org>, "www-international@w3.org" <www-international@w3.org>, RDF Working Group WG <public-rdf-wg@w3.org>
On Oct 11, 2011, at 18:58, Eric Prud'hommeaux <eric@w3.org> wrote:

> * David Wood <david@3roundstones.com> [2011-10-11 17:00-0400]
>>
>> On Oct 11, 2011, at 16:49, "Phillips, Addison" <addison@lab126.com> wrote:
>>
>>>>> B)
>>>>> 2) drop the "SHOULD use NFC" requirement on literals
>>>>
>>>> I'm good with this one, unless we decide to do something around our ISSUE-63:
>>>> http://www.w3.org/2011/rdf-wg/track/issues/63
>>>>
>>>
>>> For reasons I just outlined, I think this would be a mistake. By avoiding denormalized text, RDF users can help ensure interoperability. In practice, this is a no-op for implementers.
>>
>> Why do you see it as a noop?
>
> I guess it depends on which implementors we're talking about, but most of the current stack (OWL, SPARQL, RIF implementers) are invoked after the implied pre-normalization step. They don't have to do any normalization. Exceptions would be those creating RDF from user input or mapping non-RDF data (e.g. RDBs) to RDF. For those folks, the advice to pre-normalize could help them to converge on one of many possible representations of e.g. product names.

Well, right, but it seems like normalizing RDF upon ingest to a triple store of any form would hurt, maybe a lot. I don't think we should just dismiss that without some analysis.

Regards,
Dave

>
> I'm pretty confident that we don't want to rule out having non-normalized forms in the domain of discourse (especially since applying the same codepoint comparison works regardless of normalization), but that we'd like to *advise* folks to converge where it's in their interest to do so and advising NFKC is a good path to that end. Thus, if say "It is recommended to use Unicode Normal Form KC [NFKC] for both literals and IRIs when there is no explicit reason to preserve the non-normalized form.", we probably hit the sweet point (and most present implementors don't have to do anything).
>
>
>> Regards,
>> Dave
>>
>>>
>>> Addison
>>
>
> --
> -ericP
>
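
For readers following the thread, here is a minimal sketch of the distinction under discussion: codepoint comparison treats canonically equivalent strings as different unless they are normalized first, and NFKC folds more than NFC does. It uses Python's standard `unicodedata` module. The helper names `codepoint_equal` and `normalize_literal` are hypothetical and are not part of any RDF specification or toolkit. The sketch does not measure the ingest cost raised above.

```python
import unicodedata

# Two canonically equivalent spellings of the same literal.
composed = "caf\u00e9"       # U+00E9, precomposed e with acute accent
decomposed = "cafe\u0301"    # "e" followed by U+0301 COMBINING ACUTE ACCENT


def codepoint_equal(a: str, b: str) -> bool:
    """Hypothetical helper: plain code point comparison, no normalization."""
    return a == b


def normalize_literal(text: str, form: str = "NFC") -> str:
    """Hypothetical helper: normalize a literal's lexical form before storing it."""
    return unicodedata.normalize(form, text)


# Without normalization the two spellings are distinct terms.
raw_match = codepoint_equal(composed, decomposed)

# After normalizing both sides to the same form they converge.
nfc_match = codepoint_equal(normalize_literal(composed),
                            normalize_literal(decomposed))

# NFKC also folds compatibility characters, which NFC leaves alone.
ligature = "\ufb01le"                             # starts with U+FB01 LATIN SMALL LIGATURE FI
nfc_ligature = normalize_literal(ligature, "NFC")
nfkc_ligature = normalize_literal(ligature, "NFKC")

print(raw_match, nfc_match)                       # False True
print(nfc_ligature == ligature, nfkc_ligature)    # True file
```

Because NFKC rewrites compatibility characters such as the ligature, it can change text that NFC would leave as-is, which is one reason a recommendation might carve out cases where there is an explicit reason to preserve the original form.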
Received on Wednesday, 12 October 2011 02:49:53 UTC