- From: Jeremy Carroll <jjc@hplb.hpl.hp.com>
- Date: Fri, 15 Mar 2002 10:10:15 -0000
- To: "Brian McBride" <bwm@hplb.hpl.hp.com>, "RDF Core" <w3c-rdfcore-wg@w3.org>
> >And that in the definition of the RDF graph we use MUST > language, whereas in > >the discussion of RDF/XML we indicate that parsers SHOULD use normalizing > >transcoders, (with a reference to a CHARMOD WD). > > > >Issue: do we want a note saying that non normalized unicode > input MUST NOT > >be normalized. (This is one of the safe guards in charmod, but it assumes > >that specs are defining a processing model, and we are not). > > I'm confused here. Why do we say the a parser should use a normalizing > transcoder whilst at the same time saying it MUST NOT normalize > non-normalized unicode? > > Dumbo of Bristol > Brian > I've got bigger ears than you! (also of Bristol) The early uniform normalization approach to unicode normalization is that the first producer of *Unicode* must normalize (and nobody else). I am an RDF/XML parser. If I read an RDF/XML document and it is in utf-8 then if it's not normalised it's an error. If I read Shift-JIS (no idea whether there is a normalization problem) then I turn it into unicode, and it's my job to make sure that unicode is normalised. To perform this job, I ask Xerces to do it, and they either have or don't have appropriate normalizing transcoders. Once charmod is deployed they will have these things. Since charmod is not deployed I prefer SHOULD language. The motivation for this is to ensure maximum compatibility with the lazy implementor who does nothing. He probably has never heard of shift-JIS and so rejects it. A normalized unicode document is processed identically by me and Mr Lazy. A non-normalized unicode document is rejected by me (in strict conformance mode). (In default mode, a non-normalized unicode document generates a warning from me, but then I do not normalize it but proceed anyway like Mr Lazy). So when we both produce an answer we produce the same answer. Jeremy
Received on Friday, 15 March 2002 05:10:26 UTC