Re: No Character Normalization? from Paul Hoffman / IMC on 2000-06-26 (w3c-ietf-xmldsig@w3.org from April to June 2000)

From: Paul Hoffman / IMC <phoffman@imc.org>
Date: Mon, 26 Jun 2000 14:34:59 -0700
To: John Cowan <jcowan@reutershealth.com>, "w3c-ietf-xmldsig@w3.org" <w3c-ietf-xmldsig@w3.org>
Message-Id: <p04320302b57d7caeb5e7@[165.227.249.13]>

At 4:05 PM -0400 6/26/00, John Cowan wrote:
>Paul Hoffman / IMC wrote:
>
>>  But this is a gross oversimplification of how users might enter
>>  non-canonicalized characters in a document. An easy example from
>>  plane zero is U+00BC (VULGAR FRACTION ONE QUARTER). Microsoft Word
>>  (and other programs) will insert this into a document as its
>>  uncanonicalized form; Word will even do it behind your back unless
>>  you turn off Word's default "helpful" auto-correction feature. U+00BC
>>  canonicalizes into U+0031 followed by U+2044 followed by U+0034.
>
>That is a compatibility decomposition, useful for specialized purposes,
>but not relevant here.

Whoops, you're right. Sorry about that; I'm in the midst of dealing 
with a protocol that uses form KC instead of C right now. Never mind.
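
For illustration only (not part of the original exchange), the distinction
can be checked with a short Python sketch using the standard unicodedata
module. The helper name code_points is hypothetical. It shows that U+00BC
has only a compatibility decomposition, so form C leaves it alone while
form KC expands it.

```python
import unicodedata

def code_points(text):
    """Return the code points of text as a list of 'U+XXXX' strings."""
    return ["U+%04X" % ord(ch) for ch in text]

quarter = "\u00bc"  # VULGAR FRACTION ONE QUARTER

# The <fraction> tag marks this as a compatibility mapping, not a canonical one.
mapping = unicodedata.decomposition(quarter)

# Form C applies canonical mappings only, so U+00BC is unchanged.
form_c = unicodedata.normalize("NFC", quarter)

# Form KC also applies compatibility mappings, so U+00BC is expanded.
form_kc = unicodedata.normalize("NFKC", quarter)

print(mapping)
print(code_points(form_c))
print(code_points(form_kc))
```

Under form C the character survives as a single code point; only under
form KC does it become U+0031 U+2044 U+0034.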

--Paul Hoffman, Director
--Internet Mail Consortium

Received on Monday, 26 June 2000 17:35:05 UTC