
Re: [CSS21][css3-namespace][css3-page][css3-selectors][css3-content] Unicode Normalization

From: Andrew Cunningham <andrewc@vicnet.net.au>
Date: Tue, 3 Feb 2009 02:18:43 +1100 (EST)
Message-ID: <9069b54aaacd6a0f982ca9c45d4988cc.squirrel@mail.vicnet.net.au>
To: "Boris Zbarsky" <bzbarsky@MIT.EDU>
Cc: "Andrew Cunningham" <andrewc@vicnet.net.au>, public-i18n-core@w3.org, "W3C Style List" <www-style@w3.org>


On Tue, February 3, 2009 1:42 am, Boris Zbarsky wrote:
> Andrew, forgive my ignorance, but does this mean that normalizing
> everything in the UA at parse-time (and normalizing the content of form
> fields that the user types in at typing time, of course) is not in fact
> a viable option?
The normalisation of form fields should be determined by the web developer.
Normalisation in some contexts may violate standards in some industries.
One that comes to mind is libraries. Many of the newer integrated library
management systems use a web browser as the client for their cataloguing
modules. Normalising form fields would result in violating the MARC21
character model.

If I were working on content in a language like Igbo, and wanted to
include tone marks for use in an alternative display of the data, it is
easier to work with NFD data and filter the tone marks out when applying
standard orthographic views.
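
As a rough illustration of that filtering step, here is a minimal Python
sketch. The set of code points treated as tone marks (grave, acute, macron)
and the function name are assumptions made for the example, not something
taken from a standard; the point is only that in NFD the tone marks are
separate code points, so they can be dropped without touching the dot below
that belongs to the base orthography.

```python
import unicodedata

# Assumption for illustration: these combining marks are the tone marks.
# The combining dot below (U+0323) is deliberately NOT listed, because it
# is part of the standard orthography rather than a tone mark.
TONE_MARKS = {
    "\u0300",  # COMBINING GRAVE ACCENT
    "\u0301",  # COMBINING ACUTE ACCENT
    "\u0304",  # COMBINING MACRON
}


def strip_tone_marks(text: str) -> str:
    """Return text in NFD with the assumed tone marks removed."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if ch not in TONE_MARKS)


# "o with dot below" plus an acute tone mark
tone_marked = "\u1ecd\u0301"
plain = strip_tone_marks(tone_marked)

# The dot below survives, the tone mark does not.
assert plain == "o\u0323"
assert unicodedata.normalize("NFC", plain) == "\u1ecd"
```

The result is left in NFD; whether to recompose it afterwards is exactly
the kind of decision that stays with the developer.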

Selectors should be consistently normalised, and NFC would be appropriate.
Search strings are also a good example of where normalisation is
important. You just have to look at the mess the BBC World Service had when
they migrated to the Windows Vietnamese keyboard layout.
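
To make the search-string problem concrete, here is a small Python sketch.
It assumes, purely for illustration, one string stored with a fully
precomposed Vietnamese vowel and one typed as a vowel followed by a
combining tone mark; the helper name nfc_equal is hypothetical.

```python
import unicodedata


def nfc_equal(a: str, b: str) -> bool:
    """Compare two strings after normalising both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# The same word, encoded two ways (assumed inputs for the example).
precomposed = "Vi\u1ec7t"       # single precomposed code point U+1EC7
combining = "Vi\u00ea\u0323t"   # U+00EA followed by COMBINING DOT BELOW

# A naive code point comparison treats them as different strings...
assert precomposed != combining

# ...while comparing the NFC forms treats them as the same text.
assert nfc_equal(precomposed, combining)
```

The same comparison would apply to selectors matched against element or
attribute names, if both sides were normalised to NFC first.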

But form data is a very different story: there the web developer
should have full control of what is happening. If the browser normalises
to NFC, the web developer then has to renormalise the data to NFD or, in
the case of MARC21, build a completely new normalisation routine that
matches the MARC21 character model (which is nearly, but not quite, NFD).
That creates a burden for the web developer in question.
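
One way to picture "full control" is a per-field policy applied by the
application rather than by the browser. The sketch below is only that, a
sketch: the field names and the policy table are hypothetical, and a value
of None simply means "leave the submitted text untouched" so that a
system-specific routine (such as one for the MARC21 character model, not
shown here) can handle it.

```python
import unicodedata
from typing import Optional

# Hypothetical policy table: which normalisation form, if any, the
# application wants for each form field.
FIELD_POLICY: dict[str, Optional[str]] = {
    "search": "NFC",            # search strings compared in NFC
    "tone_marked_text": "NFD",  # kept decomposed so marks can be filtered
    "catalogue_record": None,   # untouched; handled by a custom routine
}


def apply_policy(field: str, value: str) -> str:
    """Normalise a submitted value according to the assumed field policy."""
    form = FIELD_POLICY.get(field)
    if form is None:
        return value  # no normalisation requested for this field
    return unicodedata.normalize(form, value)


assert apply_policy("search", "e\u0301") == "\u00e9"
assert apply_policy("tone_marked_text", "\u00e9") == "e\u0301"
assert apply_policy("catalogue_record", "e\u0301") == "e\u0301"
```

This only works if the text reaches the application as the user entered
it; a browser that had already normalised every field to NFC would force
the extra renormalisation described above.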

>> A text editor that doesn't normalise to NFC isn't broken. An ideal text
>> editor gives the user the choice on what normalisation form to use.
>
> This in particular worries me, in light of the form input issue.

Certain things require normalisation; certain things should be at the
discretion of the developer.


-- 
Andrew Cunningham
Research and Development Coordinator
Vicnet
State Library of Victoria
Australia

andrewc@vicnet.net.au
Received on Monday, 2 February 2009 15:19:25 GMT
