
Review of tracker issues for best practices (part II)

From: Phillips, Addison <addison@lab126.com>
Date: Sat, 28 Mar 2015 20:29:28 +0000
To: "public-i18n-core@w3.org" <public-i18n-core@w3.org>
Message-ID: <7C0AF84C6D560544A17DDDEB68A9DFB52ECE7DF3@ex10-mbx-9007.ant.amazon.com>
Picking up where I left off...


BP: formats that allow insertion of text or elements inline should supply bidi isolation
// probably we need a whole series of BPs about isolation
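
As a rough illustration of what isolation means for a plain-text format, here is a minimal sketch that wraps inserted text in the Unicode isolate controls FSI (U+2068) and PDI (U+2069). The function names isolate() and greet() are made up for this example; markup formats would more likely use something like HTML's <bdi> element or dir="auto" instead.

```python
FSI = "\u2068"  # FIRST STRONG ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE


def isolate(text: str) -> str:
    """Wrap inserted text so its direction cannot affect the surrounding text."""
    return f"{FSI}{text}{PDI}"


def greet(name: str) -> str:
    # Hypothetical message template: the inserted name is isolated,
    # the template text around it is left untouched.
    return f"Hello, {isolate(name)}!"
```

With this, a right-to-left name dropped into the left-to-right template is laid out as a unit and the trailing "!" stays with the template.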


BP: Specifications should require user agents to implement the Unicode Standard's rules for Default Ignorable Code Points (Unicode Standard version 5.2, Chapter 5 (http://unicode.org/versions/Unicode5.2.0/ch05.pdf), section 5.21). This includes never displaying the directional formatting characters (LRM, RLM, LRE, RLE, LRO, RLO, and PDF) inappropriately (e.g. as empty boxes or advance widths), even if the underlying platform does not handle them properly. In particular, this must be the case for script dialog text, page titles, and tooltips.


BP: elements (such as title) intended for display outside the context of the document as a whole must provide a direction attribute (and a language attribute) to help ensure proper display


BP: attributes that are intended for natural language text should provide a means of indicating the direction of the text. For example, provide "altdir" for "alt"


BP: elements that might be rendered by native controls (such as combo box items) should have a health warning indicating the need to apply or pass language and direction attributes to the native renderer.
BP: elements that might be rendered by native controls should be rendered according to the direction (set or inherited) for that element


BP: On an OS that has a widespread UI convention for setting direction, user agents should support it on input and textarea elements
// this one should be surveyed further


BP: When an input value is remembered, its direction should be remembered too


Positioning of visual elements such as scroll bars based on top-level direction
// needs more investigation


BP: Specifications for features that display text out of context and do not pass direction along should use "auto" directionality. The example given is JavaScript's alert()
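
For reference, "auto" here roughly means the first-strong heuristic: scan for the first character with a strong bidi class and take the direction from it. A minimal sketch follows, assuming a fallback of "ltr" when no strong character is found. The function name is invented for this example, and the sketch ignores details such as skipping isolated or embedded runs.

```python
import unicodedata


def first_strong_direction(text: str, default: str = "ltr") -> str:
    """Guess base direction from the first strongly directional character."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    # No strong character (e.g. digits or punctuation only): use the fallback.
    return default
```

Digits and spaces are not strong, so a string that starts with a number followed by Arabic text still resolves to "rtl".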


BP: When an XML grammar allows UTF-16 as an encoding, the BOM is required (per XMLSpec http://www.w3.org/TR/REC-xml/#charencoding)
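
A quick sketch of the check a processor or validator might make, using the BOM constants from Python's codecs module. The function name is made up for this example. Note that the little-endian UTF-16 BOM (FF FE) is also the start of the UTF-32LE BOM, so a real detector would need to look further than two bytes.

```python
import codecs


def has_utf16_bom(data: bytes) -> bool:
    """Return True if the byte stream starts with a UTF-16 byte order mark."""
    return data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))
```

Python's "utf-16" codec writes a BOM when encoding, while "utf-16-le" and "utf-16-be" do not, which makes it easy to produce both cases for testing.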


BP: In-document character encoding declarations are always useful, even when non-functional
// this is to some degree a repeat from issue-22's BP


// the UTF-8 BOM: encouraged or discouraged?


BP: define whether case sensitivity applies in search/find mechanisms
// case insensitivity only for ASCII values
// best practices here should be taken from Charmod-Norm's section on ACI/UCI/ACS
// http://www.w3.org/Bugs/Public/show_bug.cgi?id=10153
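
To make "case insensitivity only for ASCII values" concrete, here is a minimal sketch of an ASCII-only case fold: only A-Z are mapped, and everything else is compared as-is. The function names are invented for this example and are not taken from Charmod-Norm.

```python
def ascii_lower(s: str) -> str:
    """Lowercase only the ASCII letters A-Z; leave all other characters alone."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def ascii_case_insensitive_equal(a: str, b: str) -> bool:
    return ascii_lower(a) == ascii_lower(b)
```

The difference from a full Unicode lowercase shows up with characters such as U+212A KELVIN SIGN, which str.lower() maps to "k" but which an ASCII-only fold leaves unchanged.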


BP: use hexadecimal entities in preference to decimal ones in examples
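
One reason for the preference is that hexadecimal references line up with the usual U+XXXX notation for code points. A tiny sketch, with a made-up helper name, that produces the hexadecimal form:

```python
def hex_ncr(ch: str) -> str:
    """Format a single character as a hexadecimal numeric character reference."""
    return f"&#x{ord(ch):X};"
```

For example, U+00E9 becomes "&#xE9;" here, whereas the decimal form "&#233;" has no visible relationship to the code point.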


// requirements for a "find" type operation should be taken from (and/or incorporated into) Charmod-Norm's section on searching
// this item is about normalization sensitivity of a find operation
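
To illustrate what normalization sensitivity means for a find operation, here is a minimal sketch that normalizes both strings before searching. The choice of NFC and the function name are assumptions for this example, not a statement of what Charmod-Norm requires. The returned index refers to the normalized haystack, not the original.

```python
import unicodedata


def find_normalized(haystack: str, needle: str, form: str = "NFC") -> int:
    """Find needle in haystack after normalizing both to the same form."""
    normalized_haystack = unicodedata.normalize(form, haystack)
    normalized_needle = unicodedata.normalize(form, needle)
    return normalized_haystack.find(normalized_needle)
```

A plain find() for precomposed "caf\u00e9" fails against text that spells the same word with "e" plus U+0301 COMBINING ACUTE ACCENT; the normalized version matches.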


BP: data structures that support sorting of natural language text for user presentation should provide a means to store a separate "pronunciation" field, since this is necessary for languages like Chinese and Japanese and cannot be computed with high accuracy.
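
A minimal sketch of such a data structure, with a hypothetical "reading" field holding the pronunciation. Sorting here is by raw code point of the reading, which is a simplification; a real implementation would use a locale-aware collator on that field.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    display: str  # text shown to the user
    reading: str  # hypothetical pronunciation field used as the sort key


entries = [
    Entry("東京", "とうきょう"),
    Entry("大阪", "おおさか"),
    Entry("京都", "きょうと"),
]

# Sort by the stored reading rather than by the displayed kanji.
by_reading = sorted(entries, key=lambda e: e.reading)
```

The display strings alone would sort by the code points of the kanji, which says nothing about how a Japanese reader expects the list to be ordered.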


BP: data structures for personal names must be able to support different numbers of name tokens, different ordering of name tokens, and different uses.
// issue 70 is multiple family names, 71 is culturally linked name positioning


BP: do not make lang case-sensitive; do not make language tag matching case sensitive
// note that language tag matching/processing is supposed to be case insensitive per BCP 47
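
A minimal sketch of case-insensitive matching, following the shape of basic filtering in RFC 4647 (part of BCP 47): a range matches a tag if they are equal or if the range is a prefix of the tag followed by "-". The function name is made up for this example, and str.lower() is used on the assumption that the inputs are well-formed (ASCII-only) language tags.

```python
def lang_matches(tag: str, language_range: str) -> bool:
    """Case-insensitive basic filtering of a language tag against a range."""
    tag = tag.lower()
    language_range = language_range.lower()
    if language_range == "*":
        return True
    # Equal, or the range is a prefix that ends on a subtag boundary.
    return tag == language_range or tag.startswith(language_range + "-")
```

The subtag boundary check is what keeps a range of "en" from matching a tag such as "eng".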


// preferred names for encodings: probably we should reference Encoding


BP: when creating a "spell check" feature (or other natural language processing feature) for user input text, it should be possible for the content author to indicate that a field should not be checked. 
// side thought: should we also BP the use of ITS for things like indicating DNT?


BP: when defining automatic quote generation, the enclosing text's language should be used to form/shape the quote marks (?)
// this is the infamous <q> issue


// this concerns whether, when stripping CR and LF line terminators, other terminating characters should also be stripped. We decided not to. Should we document this somewhere?


BP: when a namespace uses ASCII case insensitivity, values in that namespace should be limited to ASCII
// counter examples exist (CSS: I'm looking at you!) and they suck


BP: when the Gregorian calendar is specifically used, a health warning should be included reminding of the existence of other calendars
// note Ian's health warning in HTML5


Addison Phillips
Globalization Architect (Amazon Lab126)
Chair (W3C I18N WG)

Internationalization is not a feature.
It is an architecture.

Received on Saturday, 28 March 2015 20:29:54 UTC
