Re: [WebCGM2.1][LC Review] i18n comment 6: Unicode normalization

Hello, and thanks for the helpful I18N comments on the WebCGM 2.1 Last Call 
review.

After some research into the details of Comment #6 -- that WebCGM should 
use a Unicode normalization form for font-name-string comparisons -- we see 
the wisdom of it for reliable matching.  But lacking deep expertise on the 
topic, we'd welcome further advice.

Question:  Which of the four Unicode normalization forms (NFC, NFD, NFKC, 
NFKD) would you recommend as best for this purpose?
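
To make the question concrete, here is a minimal Python sketch of how the 
four forms treat one string.  The sample font name and the helper names 
(code_points, all_forms) are hypothetical, chosen only for illustration; 
nothing here is taken from the WebCGM specification.

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")

def code_points(s):
    """Return the string as a list of U+XXXX labels, one per code point."""
    return [f"U+{ord(ch):04X}" for ch in s]

def all_forms(s):
    """Map each normalization form name to the normalized string."""
    return {form: unicodedata.normalize(form, s) for form in FORMS}

# Hypothetical font name containing a precomposed e-acute (U+00E9)
# and the "fi" ligature (U+FB01).
sample = "Caf\u00e9 Of\ufb01ce"

for form, text in all_forms(sample).items():
    print(form, len(text), code_points(text))
```

In this sample the canonical forms (NFC, NFD) differ only in whether the 
e-acute is composed or decomposed, while the compatibility forms (NFKC, 
NFKD) additionally replace the ligature with the two letters "f" and "i".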

For background, recall that the string comparison in question seeks a 
match between, on the one hand, a font-name-string extracted from a 
WebCGM instance and, on the other hand, a font-name-string from the ACI 
file (a separate XML file) that specifies the font name to be matched.

We would expect Unicode normalization to potentially make a difference in 
those cases where the first string (the font name from the WebCGM 
instance) is outside the well-defined core set of thirteen (13) fixed 
names required by the WebCGM standard.  The character encoding in the 
WebCGM instance will be ISO Latin 1, UTF-8, or UTF-16.
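
As a rough illustration of the comparison described above, the following 
Python sketch decodes both names to Unicode and compares them after 
normalization.  The function names (decode_font_name, font_names_match) 
and the sample font names are hypothetical, and the default of NFC is an 
assumption made only so the example runs; which form to use is exactly 
the open question.  Any other normalization the specification applies to 
these strings is not shown.

```python
import unicodedata

def decode_font_name(raw, encoding):
    """Decode font-name bytes, e.g. with 'latin-1', 'utf-8' or 'utf-16'."""
    return raw.decode(encoding)

def font_names_match(instance_name, acl_name, form="NFC"):
    """Compare two names after Unicode normalization.

    The default form (NFC) is an assumption for illustration only.
    """
    a = unicodedata.normalize(form, instance_name)
    b = unicodedata.normalize(form, acl_name)
    return a == b

# Same visible name: precomposed e-acute in ISO Latin 1 on one side,
# "e" followed by a combining acute accent in UTF-8 on the other.
from_instance = decode_font_name(b"Caf\xe9 Sans", "latin-1")
from_config = decode_font_name("Cafe\u0301 Sans".encode("utf-8"), "utf-8")

print(from_instance == from_config)                   # raw comparison
print(font_names_match(from_instance, from_config))   # normalized comparison

# Two combining marks entered in different orders on the same base letter.
dot_above_first = "q\u0307\u0323"
dot_below_first = "q\u0323\u0307"
print(dot_above_first == dot_below_first)
print(font_names_match(dot_above_first, dot_below_first))
```

In both pairs the raw comparison fails and the normalized comparison 
succeeds, which corresponds to the two issues named in the comment: 
precomposed vs. decomposed characters, and the ordering of multiple 
combining characters.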

If the answer is not simple enough to resolve efficiently by email, we 
would welcome your participation in our teleconference on Thursday, 
4 December, at 11am EST (or, alternatively, two weeks later if you can't 
make tomorrow).  Please let me know, and I will send telecon logistics.

Thanks,
-Lofton Henderson
(Chair WebCGM WG)


At 10:29 AM 11/11/2008 +0000, ishida@w3.org wrote:

>Comment from the i18n review of:
>http://www.w3.org/TR/2008/WD-webcgm21-20080917/WebCGM21-Config.html#ACI-fontmap
>
>Comment 6
>At http://www.w3.org/International/reviews/0811-webcgm/
>Editorial/substantive: S
>Tracked by: RI
>
>Location in reviewed document:
>9.3.2.2 
>[http://www.w3.org/TR/2008/WD-webcgm21-20080917/WebCGM21-Config.html#ACI-maplist]
>
>Comment:
>Normalization for string comparison should include conversion to a Unicode 
>normalization form, to eliminate issues related to precomposed vs. 
>decomposed characters and issues related to ordering of multiple combining 
>characters.
>
>

Received on Wednesday, 3 December 2008 18:58:52 UTC