- From: John Cowan <cowan@ccil.org>
- Date: Mon, 24 Jan 2022 09:56:33 -0500
- To: Daphne Preston-Kendal <dpk@nonceword.org>
- Cc: public-microxml@w3.org
- Message-ID: <CAD2gp_SVf+JDN+YagoWm8kWTefOoDf0SV7owFhmNjNbLeyvyQA@mail.gmail.com>
I am reluctant to say James Clark is wrong, but I agree with you. Please
tell the Schemers I had emergency foot surgery on Friday, with follow-up
surgery on Tuesday; prospects are good. Thanks.
On Mon, Jan 24, 2022 at 3:47 AM Daphne Preston-Kendal <dpk@nonceword.org>
wrote:
> The first example of a µXML document given in the spec is
>
> <comment lang="en" date="2012-09-11">
> I <em>love</em> µ<!-- MICRO SIGN -->XML!<br/>
> It's so clean &amp; simple.</comment>
>
> with the JSON equivalent
>
> [ "comment",
> { "date": "2012-09-11", "lang": "en" },
> [ "\nI ",
> ["em", {}, ["love"]],
> " \u03BCXML!",
> ["br", {}, []],
> "\nIt's so clean & simple."
> ]
> ]
>
> The mapping of U+00B5 to U+03BC implies that µXML processors
> can or should do compatibility normalization of their input,
> but this is not actually explicitly stated anywhere. In fact,
> it appears to contradict the recommendation
>
> > [Unicode] says that canonically equivalent sequences of characters ought
> > to be treated as identical. However, documents that are canonically
> > equivalent according to Unicode but that use distinct code point sequences
> > are considered distinct by MicroXML parsers. This gives rise to the
> > possibility that the user might unintentionally create sequences of
> > characters that are canonically equivalent but are treated as distinct by
> > MicroXML parsers. To avoid this possibility, all documents SHOULD be in
> > Normalization Form C as described by [Unicode].
>
> which seems to say that parsers should *not* do any normalization.
> (Also consider that U+00B5 is unaffected by non-compatibility
> normalization.)
>
> Is this an error in the spec (in that example)?
>
> --
> dpk (Daphne Preston-Kendal) ·· 12107 Berlin, Germany ·· http://dpk.io/
> ‘What’s the good of Mercator’s North Poles and Equators,
> Tropics, Zones, and Meridian Lines?’
> So the Bellman would cry: and the crew would reply
> ‘They are merely conventional signs!’ — Carroll, Hunting of the Snark
>
>
Received on Monday, 24 January 2022 14:58:56 UTC
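
The normalization point raised in the quoted message can be checked
directly. The following is a minimal sketch using Python's standard
unicodedata module, not anything taken from the MicroXML spec; the helper
name normalize_all is hypothetical. It shows that the canonical forms (NFC,
NFD) leave U+00B5 MICRO SIGN alone, while the compatibility forms (NFKC,
NFKD) map it to U+03BC GREEK SMALL LETTER MU, which is the character that
appears in the JSON example.

```python
import unicodedata

MICRO_SIGN = "\u00b5"  # MICRO SIGN, as used in the XML example
GREEK_MU = "\u03bc"    # GREEK SMALL LETTER MU, as used in the JSON example


def normalize_all(text):
    """Return the text under each of the four Unicode normalization forms.

    Hypothetical helper for illustration only.
    """
    forms = ("NFC", "NFD", "NFKC", "NFKD")
    return {form: unicodedata.normalize(form, text) for form in forms}


results = normalize_all(MICRO_SIGN)
for form, value in results.items():
    code_points = " ".join(f"U+{ord(ch):04X}" for ch in value)
    print(f"{form}: {code_points}")
```

Under this check, a parser that applied only Normalization Form C, or no
normalization at all, would pass U+00B5 through unchanged, so the U+03BC in
the JSON equivalent is only reachable through compatibility normalization.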