[Bug 8456] Behaviour of 'FULLY-NORMALIZED' not well defined in fn:normalize-unicode from bugzilla@wiggum.w3.org on 2009-12-08 (public-qt-comments@w3.org from December 2009)

From: <bugzilla@wiggum.w3.org>
Date: Tue, 08 Dec 2009 15:35:40 +0000
To: public-qt-comments@w3.org
Message-Id: <E1NI26G-00068M-G3@wiggum.w3.org>

http://www.w3.org/Bugs/Public/show_bug.cgi?id=8456

--- Comment #1 from Michael Kay <mike@saxonica.com>  2009-12-08 15:35:40 ---
This issue is discussed in the thread that starts here:

http://lists.w3.org/Archives/Public/public-qt-comments/2003Oct/0198.html

The discussion started with my observation:

"It's not at all clear to me that supporting "fully-normalized" form
makes any sense at all. Whereas the Unicode normalization forms all describe
an algorithm for normalizing data, the "fully-normalized" form is described
only as a property of a string. There is no algorithm provided for making a
string fully-normalized, and the only algorithms that one might come up with
involve losing information."

The next message in the thread summarizes what we concluded about the
algorithm:

"... a check that the first character in the string being normalized is
a base character (e.g. has a combining class of 0). If the last test
fails, a space is inserted at the start of the data to carry the
combining mark."
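
As a rough illustration only, the check quoted above could be sketched in
Python as follows. This is not taken from any spec: the function name
fully_normalize is hypothetical, applying NFC first is an assumption, and
"base character" is approximated by a canonical combining class of 0, which
is narrower than CharMod's notion of a composing character.

```python
import unicodedata


def fully_normalize(text: str, form: str = "NFC") -> str:
    """Hypothetical sketch: Unicode-normalize, then guard a leading combining mark."""
    # Assumption: start from an ordinary Unicode normalization form.
    normalized = unicodedata.normalize(form, text)
    # Approximate "is a base character" as "canonical combining class is 0".
    if normalized and unicodedata.combining(normalized[0]) != 0:
        # Insert a space at the start to carry the combining mark.
        normalized = " " + normalized
    return normalized
```

The inserted space changes the data, which is the kind of modification the
first quoted observation was worried about.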

If my memory serves me right, we were assured that the algorithm would be
properly described in a future version of CharMod, and we felt that it needed
to be fixed in CharMod rather than in our specs. Perhaps that was wishful
thinking (many things related to I18N are).

For my own part, if I remember right, I decided not to support this optional
feature until it was better specified.



Received on Tuesday, 8 December 2009 15:35:49 UTC