
Re: Conformance to Unicode 5.0: Troubles on the horizon?

From: Martin Duerst <duerst@it.aoyama.ac.jp>
Date: Mon, 04 Sep 2006 20:55:16 +0900
Message-Id: <6.0.0.20.2.20060904205113.07c9abe0@localhost>
To: Karl Dubost <karl@w3.org>, www-i18n-comments@w3.org, public-i18n-core@w3.org
Cc: Richard Ishida <ishida@w3.org>, Felix Sasaki <fsasaki@w3.org>

Hello Karl,

There is no problem at all for HTML and similar formats. What the
changes in Unicode 5.0 mean is that higher level protocols are discouraged
*for mirrored glyphs and characters*. This is different from
higher level protocols for the bidi algorithm as such. These
(http://www.unicode.org/reports/tr9/#Higher-Level_Protocols)
have always been allowed, because they make sense and are
widely implemented. The Unicode Consortium wouldn't change
that easily, and certainly not without consulting W3C in detail.
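
To make the distinction concrete, here is a minimal sketch, assuming
Python's standard unicodedata module (the helper name `describe` is
made up for illustration). It prints two separate character properties:
the bidi class, which feeds the bidi algorithm whose base direction a
higher level protocol such as HTML's dir attribute may set, and the
mirrored flag, which marks the characters that the tightened mirroring
rules are about.

```python
import unicodedata


def describe(ch):
    """Return (bidi class, mirrored flag) for a single character.

    Hypothetical helper for illustration only; the values come from the
    Unicode Character Database version bundled with the Python build.
    """
    return unicodedata.bidirectional(ch), bool(unicodedata.mirrored(ch))


if __name__ == "__main__":
    samples = ["A", "\u05d0", "\u0627", "(", "<"]
    for ch in samples:
        bidi_class, mirrored = describe(ch)
        # The bidi class is input to the bidi algorithm; the mirrored
        # flag only says whether the glyph flips in right-to-left runs.
        print(f"U+{ord(ch):04X} bidi={bidi_class} mirrored={mirrored}")
```

Letters such as "A" or Hebrew alef carry a strong direction but are not
mirrored, while neutral characters such as "(" are the ones with the
mirrored property set.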

So in short, no problem at all.

Regards,    Martin.

At 11:12 06/09/04, Karl Dubost wrote:
>
>Hi,
>
>Unicode 5.0 has been published but it raises some questions.
>
>[[[
>In UAX #9, "Bidirectional Algorithm," for better interoperability,  
>the algorithm was modified to tighten up the conformance requirements  
>for using mirrored glyphs for characters. Higher level protocols are  
>discouraged, due to interoperability and security considerations. The  
>definition of directional run was changed to be the same as level  
>run, and the use of soft-hyphen with bidi text was clarified.
>]]] -- Unicode 5.0.0
>http://www.unicode.org/versions/Unicode5.0.0/
>Tue, 29 Aug 2006 17:33:36 GMT
>
>There are quite a few specifications at W3C which reference Unicode  
>normatively. For example HTML 4.01,
>
>[[[
>dir = LTR | RTL [CI]
>     This attribute specifies the base direction of directionally  
>neutral text (i.e., text that doesn't have inherent directionality as  
>defined in [UNICODE]) in an element's content and attribute values.  
>It also specifies the directionality of tables. Possible values:
>
>         * LTR: Left-to-right text or table.
>         * RTL: Right-to-left text or table.
>
>In addition to specifying the language of a document with the lang  
>attribute, authors may need to specify the base directionality  
>(left-to-right or right-to-left) of portions of a document's text, of table  
>structure, etc. This is done with the dir attribute.
>]]] -- Language information and text direction
>http://www.w3.org/TR/html401/struct/dirlang.html#adef-dir
>Fri, 24 Dec 1999 23:25:42 GMT
>
>(PS: Though HTML 4.01 refers to Unicode 3.0, I wonder if using  
>characters from Unicode 4.0 and 5.0 makes the document non-conformant.)
>
>
>It seems, by the new conformance rules of Unicode, that a markup language  
>should not have directionality information at the markup level. But  
>the HTML 4.01 specification mandates the opposite,
>
>[[[
>The [UNICODE] specification assigns directionality to characters and  
>defines a (complex) algorithm for determining the proper  
>directionality of text. If a document does not contain a displayable  
>right-to-left character, a conforming user agent is not required to  
>apply the [UNICODE] bidirectional algorithm. If a document contains  
>right-to-left characters, and if the user agent displays these  
>characters, the user agent must use the bidirectional algorithm.
>]]] -- Language information and text direction
>http://www.w3.org/TR/html401/struct/dirlang.html#adef-dir
>Fri, 24 Dec 1999 23:25:42 GMT
>
>
>My question is related to the Good Practice 8 of "QA Framework:  
>Specification Guidelines".
>
>[[[
>When imposing requirements by normative references, address  
>conformance dependencies.
>]]] - http://www.w3.org/TR/qaframe-spec/#ref-define-practice
>
>
>And we give as an example the Charmod 1.0 specification, which gives  
>recommendations for forward normative references.
>
>
>[[[
>C063   [S]  A generic reference to the Unicode Standard MUST be made  
>if it is desired that characters allocated after a specification is  
>published are usable with that specification. A specific reference to  
>the Unicode Standard MAY be included to ensure that functionality  
>depending on a particular version is available and will not change  
>over time.
>
>C064   [S]  All generic  references to the Unicode Standard  
>[Unicode]  MUST refer to the latest version of the Unicode Standard  
>available at the date of publication of the containing specification.
>
>C065  [S]  All generic references to ISO/IEC 10646 [ISO/IEC 10646]  
>MUST refer to the latest version of ISO/IEC 10646 available at the  
>date of publication of the containing specification.
>]]]
>
>-- Character Model for the World Wide Web 1.0: Fundamentals
>http://www.w3.org/TR/2005/REC-charmod-20050215/#sec-RefUnicode
>Tue, 15 Feb 2005 14:24:00 GMT
>
>
>-- 
>Karl Dubost - http://www.w3.org/People/karl/
>W3C Conformance Manager, QA Activity Lead
>   QA Weblog - http://www.w3.org/QA/
>      *** Be Strict To Be Cool ***
>
>
>


#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp     
Received on Monday, 4 September 2006 11:56:18 GMT
