RE: ANEC's comments on W3C mobileOK Basic Tests 1.0, W3C Working Draft 30 January 2007

Hi Chaals,

Sean's already "...added these comments to the ever-growing list of comments on the last call document that we'll process."

However, if I can make you happy with "...a plain version of this document to facilitate reading" (a usability contradiction, valid only in the case of long, continuous texts), I will; I've pasted it at the end of this message.

Re. UTF-8: I am not an expert on character implementation and handling (there are few in the world) but I've seen trouble caused by it; even the current SMS standard suffers.
The thing is, there are differences and incompatibilities between what mobile systems (of different generations and technologies), mobile terminals and applications support, and how they support it. There are also incompatibilities between the IT and telecom worlds.

I have some reference material available, developed during the update of the character standard (also used for text messaging); an extract follows:

Part of the mobile (SMS) standard was originally taken over from paging, with only a rudimentary set of characters defined: the "ASCII set" complemented by a few specifically European-language letters, amongst them the ten Greek capital letters not having a corresponding visual representation in the Latin alphabet. The Cyrillic alphabet was not covered at all.
The GSM 03.38 standard [10] applies only to what is transmitted between a mobile phone and the "Mobile Switching Centre" (MSC), not to how character generation is handled inside the phone. Naturally, however, it is not very meaningful to generate text with characters that cannot then be transmitted, so the character limitations of the standard also limit what needs to be generated. With the original - "default" - (SMS) character set, multi-linguality is therefore completely unsatisfactory.

In GSM Phase 2, an alternative to the original character set was introduced, in principle permitting about double the number of characters of the original SMS scheme. This alternative was designated "user-defined", i.e. no scheme was specified in the standard. It appears that no user - i.e. Operator - has utilized this possibility.

With GSM Phase 2+, another alternative was introduced, namely the coding scheme of ISO/IEC 10646-1. With this scheme there is, in principle, no longer any limitation on the repertoire of characters that can be represented in texts or SMS messages. European multi-linguality is therefore enabled, as far as representation of characters is concerned.

When the original standard for GSM messaging was developed, it was focused on languages of Western Europe. For those, a comparatively small set of letters was sufficient to cover the alphabetic needs. The developed standard therefore made use of the same 7-bit (i.e. containing 128 characters) ETSI-specific coding scheme as for the paging system ERMES (European Radio Message System), with some minor changes.

With the spread of GSM, that scheme became totally insufficient. The messaging standard was therefore modified, starting with its version 5.1.0 (March 1996), to also permit an alternative coding scheme, namely ISO/IEC 10646 in its two-octet (two-byte) coding variant UCS-2 (coding-wise identical to “Unicode”). This scheme, however, reduced the number of characters that can be contained within a message package, as described below.

Also, in the latest standard version an extension mechanism for the default alphabet has been introduced. It does not expand the letter repertoire, however.

The increasing emphasis on supporting multi-linguality in telecommunication – as well as in data processing – in today’s global society makes the limitations of the present SMS standard’s coding schemes unacceptable to users. In this connection the ETSI standard ES 202 130 “Character repertoires, ordering rules and assignments to the 12-key telephone keypad” (2003-10) is especially relevant.

In the case of sending SMS text with the default 7-bit alphabet, the characters are packed into 140-octet (bytes) packages, permitting a message length of up to 160 characters. With the 10646 UCS-2 alphabet, where each character is represented by two octets, the maximum message length is 70 characters.
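
Both limits follow directly from the 140-octet package size. A minimal Python sketch of the arithmetic (the constant and function names are illustrative and not taken from the standard):

```python
# Capacity of one SMS package for the two coding schemes described above.
PAYLOAD_OCTETS = 140  # package size stated in the text


def max_chars(bits_per_char, payload_octets=PAYLOAD_OCTETS):
    """Number of whole characters that fit into one package."""
    return (payload_octets * 8) // bits_per_char


default_7bit = max_chars(7)   # packed 7-bit default alphabet
ucs2 = max_chars(16)          # two octets per character

print(default_7bit, ucs2)     # 160 70
```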

Although different implementations of the standard may be possible, it appears that all mobile-phone and system manufacturers have taken an obvious, straightforward approach. When a user inputs a message containing only the letters of the default 7-bit coding scheme, a message of maximum 160 characters is generated.

In early phones, only the default letter repertoire was available. Present-day phones, however, permit input of a much larger repertoire. If only default-scheme (7-bit) letters are input, a packed 7-bit message is generated. As soon as a letter outside that scheme is input, however, the generation switches to two-octet coding, permitting a maximum length of only 70 characters.

Understandably, users find this mobile-phone behavior highly confusing. It also means a waste of bandwidth, since even a single character outside of the default scheme may result in two or even three consecutive SMS being sent instead of a single one.
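
The switching behavior and its cost can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `DEFAULT_ALPHABET_SUBSET` is a hypothetical, heavily reduced stand-in for the real 128-entry default alphabet, and the count ignores the header overhead that real concatenated messages carry (which lowers the per-message capacity further).

```python
import math

# Hypothetical, simplified stand-in for the default 7-bit alphabet.
# The real table has 128 entries; this subset is for illustration only.
DEFAULT_ALPHABET_SUBSET = set(
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789 .,!?"
)

MAX_CHARS_7BIT = 160  # from the text
MAX_CHARS_UCS2 = 70   # from the text


def messages_needed(text):
    """Count the SMS needed, switching to UCS-2 on the first non-default character.

    Simplification: no concatenation header is accounted for.
    """
    fits_default = all(ch in DEFAULT_ALPHABET_SUBSET for ch in text)
    limit = MAX_CHARS_7BIT if fits_default else MAX_CHARS_UCS2
    return max(1, math.ceil(len(text) / limit))


plain = "a" * 150
with_one_cyrillic = "a" * 149 + "\u042F"  # 149 default letters plus one Cyrillic letter

print(messages_needed(plain))              # 1
print(messages_needed(with_one_cyrillic))  # 3
```

Both texts are 150 characters long; the single letter outside the default scheme is what turns one message into three.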

Also, it means that a user of one of the languages not covered by the default alphabet will in general have to pay more to transfer a message than a user of a language that is covered, i.e. discrimination against several European languages!


UCS-2 transformation according to UTF-8, i.e. ISO/IEC 10646 “UCS Transformation Format 8 (UTF-8)”, is sometimes regarded as a data compression method and considered as an additional coding scheme.
The purpose of that transformation is not really compression, but avoiding complications in data transmission. Since every 10646 UCS-2 character is represented by two octets, a character data stream may contain single-octet values in the ranges used for control characters in 7- and 8-bit schemes (hex 00-1F and 7F). This may cause problems in “transparent” transmissions. The UTF-8 transformation ensures that such problems do not occur.
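
A short Python sketch of this point, under two assumptions: Python's built-in `utf-16-be` codec is used as a stand-in for two-octet UCS-2 (the two coincide for the characters concerned here), and the example character U+010A is an arbitrary choice.

```python
# Octet values that 7- and 8-bit schemes treat as control characters.
CONTROL_OCTETS = set(range(0x00, 0x20)) | {0x7F}


def control_octets_in(data):
    """Return the octets of `data` that fall in the control ranges."""
    return [b for b in data if b in CONTROL_OCTETS]


ch = "\u010A"  # LATIN CAPITAL LETTER C WITH DOT ABOVE, not a control character

ucs2 = ch.encode("utf-16-be")  # octets 01 0A
utf8 = ch.encode("utf-8")      # octets C4 8A

print(control_octets_in(ucs2))  # [1, 10]
print(control_octets_in(utf8))  # []
```

In the two-octet form, the second octet (hex 0A) is indistinguishable from a line feed on a byte-oriented link; in the UTF-8 form, no octet of a non-control character falls in the control ranges.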

As a “side effect”, all Basic Latin characters (i.e. same as ASCII, hex 20-7E) will become represented by a single octet. Since that includes the letters a/A-z/Z, which in all Latin-script languages are the most common in text, considerable data compression will in practice occur.

UTF-8 transformation may however also place too great demands on processors. Furthermore, it should be noted that for characters with UCS-2 code values above hex 07FF, the transformation will actually produce three octets, making SMS even less efficient. This is the case for e.g. all Indian-language letters.
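
The octet counts per character can be checked with a small Python sketch. The sample characters are arbitrary choices, and `utf-16-be` again stands in for two-octet UCS-2:

```python
def octets(ch, codec):
    """Number of octets `ch` occupies in the given codec."""
    return len(ch.encode(codec))


samples = {
    "Basic Latin A (U+0041)": "\u0041",
    "Cyrillic YA (U+042F)": "\u042F",
    "Devanagari A (U+0905)": "\u0905",
}

for label, ch in samples.items():
    print(label, "UCS-2:", octets(ch, "utf-16-be"), "UTF-8:", octets(ch, "utf-8"))

# Basic Latin A (U+0041) UCS-2: 2 UTF-8: 1
# Cyrillic YA (U+042F) UCS-2: 2 UTF-8: 2
# Devanagari A (U+0905) UCS-2: 2 UTF-8: 3
```

So UTF-8 saves an octet only for Basic Latin, is neutral for code values up to hex 07FF, and costs an extra octet above that.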

*************************************************************
The ANEC comments:

1. Introduction and scope
As a general standpoint, ANEC considers Web accessibility and usability of very high importance to consumers. Given the increasing number of consumers accessing the Web through a mobile device, we appreciate W3C’s efforts to improve the mobile Web and turn it into a better consumer experience.
Our comments on this draft test specification are provided below with good intentions and in a positive spirit, and should be considered our contribution to improving the current Last Call draft.
The comments are intended to provide consumer-centric input and guidance on how to further improve and extend the coverage and usefulness of the present draft and/or future Recommendations within this area.
The comments reflect issues relevant to consumers, discussed and agreed in the ANEC ICT Working Group.

2. Comments
Section “Abstract”
It is understood that “this document defines the tests that provide the basis for making a claim of W3C® mobileOK Basic™ conformance, based on W3C Mobile Web Best Practices (http://www.w3.org/TR/mobileOK-basic10-tests/#BestPractices#BestPractices )”.
Furthermore, it is understood that “content passing these tests has taken some steps to provide a functional user experience for users of basic mobile devices whose capabilities at least match those of the Default Delivery Context (DDC).
The concepts introduced, mobileOK Basic (the lesser of two levels of claim; the greater level being mobileOK) and mobileOK, do not assess interoperability, usability, accessibility, nor the accessed content”.
Comment #1: In light of the above, we believe that this may be understood as (strongly) misleading to consumers, who will have other, natural assumptions about the meaning of the trust mark. Presumably, if a mobile Web site is declared to be “mobileOK”, consumers will assume the trust mark to be some kind of guarantee for aspects that mean OK to them. In other words, it may well be taken as a guarantee of reliable content, safe access, and trustworthy connections with fair usability and some minimum level of accessibility. Furthermore, depending on the consumer’s age, assumptions may even be made about the appropriateness of the content when accessed by young children.
An analogy to the above is TV sets marketed as “HD ready”. Even if this is only a declaration of one of the TV set’s capabilities, consumers (typically uninterested in the details of this and other technologies) will naturally assume it to be a declaration of compatibility and of the capability to receive and display high definition TV broadcasts without any further need to buy additional products (such as a set-top box) and, most probably, subscriptions (which also imply a considerable monthly fee). Consumers are often not aware that HD displays will only display an HD picture when connected to an HD receiver (set-top box).
This will lead to consumer disappointment and the product may even be handed back. To continue with the analogy, “Real HD ready” TV sets are now marketed and the situation is becoming very confusing…what was “HD ready”? And what may be next? False marketing does not aid the successful uptake of new consumer technologies.
Therefore, we suggest the re-branding of the mobileOK™ and mobileOK Basic™ trust marks in some way that reflects their true and proper meanings. Due to the complexity of the required branding, this may be a challenging task, but one worth the effort. It is neither our task nor our area of competence to propose alternative names that would work properly on a global market, but wording that consumers would understand may include:
• Ready for mobile use;
• Mobile device adapted site;
• This content displays OK on mobile devices.
We believe that third-party provisioning (or certification) is the only way to provide reliable trust mark information to consumers, since products often do not match the qualities declared by manufacturers, entailing a loss of consumer confidence. ANEC therefore encourages third-party certification.
Section 1.1.4
“The best practices, and hence the tests, are not promoted as guidance for achieving the optimal user experience…It will often be possible, and generally desirable, to provide an experience designed to take advantage of the extra capabilities.
Content providers should provide an experience that is mobileOK conformant to ensure a basic level of interoperability.”
Comment #2: The above is a valuable statement to developers but, again, misleading to consumers, who should understand the trust mark in the right way.
Section 2.3
“Creators of implementations of the tests described in this document are encouraged to provide as much information as possible to users of their implementations. Where possible they should not stop on FAIL and specifically they should:
• Provide information about the cause of failure
• Continue individual tests as far as is possible
• Carry out as many tests as is reasonable”
Comment #3: In addition to the information recommended above, the consumer should be informed about the reason for failure in an understandable way. Additional information relating to other functionalities should also be provided. Furthermore, a technical reason or code may be provided to help the operator or creator of the implementation identify the source of the error.
Last but not least, consumers should be able to contact customer services through a single point of access per modality (e.g. by calling the “usual” number for all issues).

Section 2.3.4
Some tests refer to “CSS Style” information.

Comment #4: We would like the WG to confirm whether the consequences of applying and using CSS have been examined with regard to mobile Web accessibility (possibly in collaboration with the WAI/WCAG Activity).
Section 3.3
The current requirements are to support only UTF-8 encoding.
Comment #5: We believe that the UTF-8 coding support should be studied in more detail, as it may have implications on the displayable text and the data transmission.

Regards,
Bruno

-----Original Message-----
From: Charles McCathieNevile [mailto:chaals@opera.com] 
Sent: den 7 mars 2007 22:56
To: Bruno von Niman; public-bpwg-comments@w3.org
Cc: 'Chiara Giovannini'; 'Christina Everett'; 'Bruno von Niman'
Subject: Re: ANEC's comments on W3C mobileOK Basic Tests 1.0, W3C Working Draft 30 January 2007

On Thu, 08 Mar 2007 04:21:45 +1100, Bruno von Niman <ANEC_W3CRep_Bruno@vonniman.com> wrote:

> Dear W3C Mobile Web Best Practices WG,

> See attached a document with comments from ANEC (www.anec.org <http://www.anec.org/> ) on the W3C mobileOK Basic Tests 1.0, W3C Working Draft 30 January 2007.

Oh for a plain version of this document to facilitate reading. Say, pasting the content into email...

In your comment 5 you ask us to study utf-8 more as you believe there may be some issues with displayability.

Could you be more precise? I looked this morning at utf-8 (which means I studied it more). What are the issues or potential issues that concern you?

thanks for the comments

cheers

Chaals

-- 
  Charles McCathieNevile, Opera Software: Standards Group
  hablo español  -  je parle français  -  jeg lærer norsk
chaals@opera.com          Try Opera 9.1     http://opera.com

Received on Friday, 9 March 2007 09:28:39 UTC