
Conformance to Unicode 5.0: Troubles on the horizon?

From: Karl Dubost <karl@w3.org>
Date: Mon, 4 Sep 2006 11:12:30 +0900
Message-Id: <42C8ACE5-54FD-4855-A99F-CF093F2235DE@w3.org>
Cc: Richard Ishida <ishida@w3.org>, Felix Sasaki <fsasaki@w3.org>
To: www-i18n-comments@w3.org, public-i18n-core@w3.org

Hi,

Unicode 5.0 has been published but it raises some questions.

[[[
In UAX #9, "Bidirectional Algorithm," for better interoperability,  
the algorithm was modified to tighten up the conformance requirements  
for using mirrored glyphs for characters. Higher level protocols are  
discouraged, due to interoperability and security considerations. The  
definition of directional run was changed to be the same as level  
run, and the use of soft-hyphen with bidi text was clarified.
]]] -- Unicode 5.0.0
http://www.unicode.org/versions/Unicode5.0.0/
Tue, 29 Aug 2006 17:33:36 GMT

There are quite a few specifications at W3C which reference Unicode
normatively, for example HTML 4.01:

[[[
dir = LTR | RTL [CI]
     This attribute specifies the base direction of directionally  
neutral text (i.e., text that doesn't have inherent directionality as  
defined in [UNICODE]) in an element's content and attribute values.  
It also specifies the directionality of tables. Possible values:

         * LTR: Left-to-right text or table.
         * RTL: Right-to-left text or table.

In addition to specifying the language of a document with the lang  
attribute, authors may need to specify the base directionality (left- 
to-right or right-to-left) of portions of a document's text, of table  
structure, etc. This is done with the dir attribute.
]]] -- Language information and text direction
http://www.w3.org/TR/html401/struct/dirlang.html#adef-dir
Fri, 24 Dec 1999 23:25:42 GMT

(PS: HTML 4.01 refers to Unicode 3.0, so I wonder whether using
characters from Unicode 4.0 or 5.0 makes a document non-conformant.)
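
To make the versioning question concrete, here is a rough sketch in
Python. It assumes the frozen Unicode 3.2.0 database that Python ships
as unicodedata.ucd_3_2_0 is an acceptable stand-in for Unicode 3.0
(it is not the same version), and the helper name is_assigned is made
up for this illustration. It only shows that a character can be
unassigned in an older version and assigned in a newer one; it says
nothing about what HTML 4.01 conformance actually requires.

```python
import unicodedata

# Python bundles a frozen copy of the Unicode 3.2.0 character database
# next to the current one. Assumption: 3.2.0 stands in for the
# Unicode 3.0 referenced by HTML 4.01, which is only an approximation.
old_ucd = unicodedata.ucd_3_2_0


def is_assigned(ch, ucd=unicodedata):
    """Return True if `ch` is an assigned character in the given database.

    Hypothetical helper: general category "Cn" means "not assigned".
    """
    return ucd.category(ch) != "Cn"


# NKO LETTER A, from the N'Ko block that was added in Unicode 5.0.
nko_letter_a = "\u07CA"

print("old database version:", old_ucd.unidata_version)
print("'A' assigned in old database:", is_assigned("A", old_ucd))
print("U+07CA assigned in old database:", is_assigned(nko_letter_a, old_ucd))
print("U+07CA assigned in current database:", is_assigned(nko_letter_a))
```

A specification with a version-specific reference would, read
strictly, see U+07CA as an unassigned code point, while one with a
generic reference would see a letter.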


Under the new conformance rules of Unicode, it seems that a markup
language should not carry directionality information at the markup
level. But the HTML 4.01 specification mandates the opposite:

[[[
The [UNICODE] specification assigns directionality to characters and  
defines a (complex) algorithm for determining the proper  
directionality of text. If a document does not contain a displayable  
right-to-left character, a conforming user agent is not required to  
apply the [UNICODE] bidirectional algorithm. If a document contains  
right-to-left characters, and if the user agent displays these  
characters, the user agent must use the bidirectional algorithm.
]]] -- Language information and text direction
http://www.w3.org/TR/html401/struct/dirlang.html#adef-dir
Fri, 24 Dec 1999 23:25:42 GMT
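
The condition in that passage ("if a document contains right-to-left
characters") could be approximated as below. This is only a sketch:
the function name contains_rtl is invented here, it treats the strong
bidi classes R and AL as "right-to-left characters", and it ignores
the "displayable" part of the HTML 4.01 wording as well as explicit
embedding and override controls.

```python
import unicodedata

# Strong right-to-left bidirectional classes in the Unicode Character
# Database: R (e.g. Hebrew letters) and AL (Arabic letters).
RTL_CLASSES = {"R", "AL"}


def contains_rtl(text):
    """Return True if any character in `text` has a strong RTL bidi class.

    Hypothetical helper: a rough reading of the HTML 4.01 condition
    under which a user agent must apply the bidirectional algorithm.
    """
    return any(unicodedata.bidirectional(ch) in RTL_CLASSES for ch in text)


print(contains_rtl("plain ASCII text"))
print(contains_rtl("abc \u05D0"))  # ends with HEBREW LETTER ALEF
print(contains_rtl("\u0627"))      # ARABIC LETTER ALEF
```

Note that the answer depends on the version of the character database
in use, which is exactly where the reference to [UNICODE] matters.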


My question relates to Good Practice 8 of "QA Framework:
Specification Guidelines":

[[[
When imposing requirements by normative references, address  
conformance dependencies.
]]] -- http://www.w3.org/TR/qaframe-spec/#ref-define-practice


There we give the Charmod 1.0 specification as an example; it makes
recommendations for forward normative references:


[[[
C063   [S]  A generic reference to the Unicode Standard MUST be made  
if it is desired that characters allocated after a specification is  
published are usable with that specification. A specific reference to  
the Unicode Standard MAY be included to ensure that functionality  
depending on a particular version is available and will not change  
over time.

C064   [S]  All generic  references to the Unicode Standard  
[Unicode]  MUST refer to the latest version of the Unicode Standard  
available at the date of publication of the containing specification.

C065  [S]  All generic references to ISO/IEC 10646 [ISO/IEC 10646]  
MUST refer to the latest version of ISO/IEC 10646 available at the  
date of publication of the containing specification.
]]]

-- Character Model for the World Wide Web 1.0: Fundamentals
http://www.w3.org/TR/2005/REC-charmod-20050215/#sec-RefUnicode
Tue, 15 Feb 2005 14:24:00 GMT


-- 
Karl Dubost - http://www.w3.org/People/karl/
W3C Conformance Manager, QA Activity Lead
   QA Weblog - http://www.w3.org/QA/
      *** Be Strict To Be Cool ***
Received on Monday, 4 September 2006 02:13:06 GMT
