RE: [F&O] 7.4.6 fn:normalize-unicode

From: Ashok Malhotra <ashokma@microsoft.com>
Date: Fri, 28 Nov 2003 11:52:18 -0800
Message-ID: <EDB607C8AC991F40BE646533A1A673E8B6FDA2@RED-MSG-42.redmond.corp.microsoft.com>
To: "David Carlisle" <davidc@nag.co.uk>, <public-qt-comments@w3.org>

David:
Thank you for your comment.  The algorithm for taking a string and
normalizing it according to the 'fully-normalized' form has recently
been clarified in discussions with the I18N folks.

If I've understood it correctly, you first normalize according to NFC.
You then look at the first character of the string.  If this is one of
the accents listed in the charmod spec, then you add a space at the
start of the string.
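
A minimal sketch of the two steps above, as I currently understand them.
The function name fully_normalize is hypothetical, and the test for an
"accent" is an assumption: it uses a nonzero Unicode canonical combining
class as a stand-in for the character list in charmod, which may not
match that list exactly.

```python
import unicodedata


def fully_normalize(text: str) -> str:
    """Sketch of 'fully-normalized': NFC, then guard a leading accent."""
    # Step 1: normalize according to NFC.
    nfc = unicodedata.normalize("NFC", text)

    # Step 2: look at the first character of the result.
    # Assumption: a nonzero canonical combining class approximates
    # "one of the accents listed in charmod".
    if nfc and unicodedata.combining(nfc[0]) != 0:
        # Add a space at the start so the accent has a base character.
        return " " + nfc
    return nfc
```

For example, fully_normalize("e\u0301") composes the pair into the single
character U+00E9, while fully_normalize("\u0301a") keeps the characters
as they are and only adds the leading space.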

We are continuing discussions with the I18N folks because we cannot
refer to the charmod spec until it becomes a Recommendation.

All the best, Ashok

-----Original Message-----
From: public-qt-comments-request@w3.org
[mailto:public-qt-comments-request@w3.org] On Behalf Of David Carlisle
Sent: Tuesday, November 18, 2003 2:10 AM
To: public-qt-comments@w3.org
Subject: [F&O] 7.4.6 fn:normalize-unicode



fn:normalize-unicode contains a normalization form of
"fully-normalized", but it is not at all clear what this means.

charmod at the point that it defines this term says:

  For plain text (no includes, no constructs, no character escapes) in a
  Unicode encoding form, full-normalization and Unicode-normalization
  are equivalent.

The argument of fn:normalize-unicode is a _string_, so it is plain text
in the sense of the above, and the requirements of fully-normalized over
NFC do not apply.

Also the definition of fully-normalized in charmod says

   the text is in ___a____ Unicode encoding form, 

Note that it does not specify which form should be used (probably NFC
makes most sense here, so it should probably be specified). But then, as
noted above, there probably wouldn't be any difference between NFC and
fully-normalized for a string argument, except possibly that the latter
would add a space at the start of the string (which is a possible
difference, despite the charmod quote that says these are equivalent...)


David

Received on Friday, 28 November 2003 14:54:11 GMT