AW: I18N-ISSUE-246: Clarify character encoding behavior when calculating storage size [ITS-20] from Stephan Walter on 2013-02-28 (www-international@w3.org from January to March 2013)

From: Stephan Walter <stephan.walter@cocomore.com>
Date: Thu, 28 Feb 2013 13:32:27 +0000
To: Anne van Kesteren <annevk@annevk.nl>, Felix Sasaki <fsasaki@w3.org>
CC: Yves Savourel <ysavourel@enlaso.com>, Norbert Lindenberg <w3@norbertlindenberg.com>, "public-multilingualweb-lt-comments@w3.org" <public-multilingualweb-lt-comments@w3.org>, www-international <www-international@w3.org>
Message-ID: <9F17297605BABC4F8C79050627BCF0F20F8A9CE6@AMSPRD0511MB548.eurprd05.prod.outlook.>

Hello,

There is one point that I still don't fully understand here (admittedly I'm not experienced with W3C specifications and what they pertain to, so maybe I'm just missing something):

As far as I can see, we do not require (in the sense of a capital MUST) any specific action or processing to be done based on the 'storage size' information. Using the distinction between processor and consumer that Yves explained: a processor does not actually have to do anything with the storage size information that would require it to encode the marked-up data. We just suggest what a consumer could do with it (actually checking the constraint), and for that purpose the data would need to be encoded in the given encoding. Of course this way of using 'storage size' information seems to be the most reasonable one, but it is not a MUST. There might be other uses that we didn't think of and that don't require the consumer to actually encode the data. So, since the intended use of the 'storage size' information is not a MUST, can we really make something a MUST just because it is a prerequisite for this use?
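To make the consumer-side check concrete, here is a minimal sketch in Python of what "actually checking the constraint" could look like. The function name and parameters are made up for illustration and are not taken from the ITS 2.0 draft; the only point is that the text has to be encoded in the storage encoding before its byte length can be compared with the limit.

```python
def fits_storage_size(text: str, storage_size: int,
                      storage_encoding: str = "UTF-8") -> bool:
    """Return True if `text` needs at most `storage_size` bytes
    when encoded in `storage_encoding`.

    Hypothetical helper: name and signature are assumptions,
    not part of ITS 2.0.
    """
    try:
        # The check is only possible if this consumer supports the encoding.
        encoded = text.encode(storage_encoding)
    except LookupError:
        # Python does not know the requested encoding at all.
        raise ValueError(f"unsupported storage encoding: {storage_encoding}")
    return len(encoded) <= storage_size
```

For example, "Grüße" is five characters but seven bytes in UTF-8, so it fits a limit of 7 bytes in UTF-8 and a limit of 5 bytes in ISO-8859-1, but not a limit of 6 bytes in UTF-8.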

Also, what about a tool that produces data with ITS markup, say from a file of string literals from a UI or web page that are Latin-1-encoded, plus the information that the translations of the strings will be stored in a target data store as UTF-8 and may not be longer than a certain number of bytes? I think that such an application would want to use 'storage size' constraints with UTF-8 as the value of storageEncoding, without actually having to support UTF-8 (except, maybe, for generating the actual ITS-tagged output file in UTF-8, but that's another topic). Should it not be possible to call such an application ITS 2.0-compliant although it does not support UTF-8? And, if the target encoding were UTF-16, should the application have to produce an error if it does not support UTF-16, even though it would be able to produce perfectly valid and useful output in any case?
Of course one might not regard this as an 'application applying the information' in the sense of the suggested clause, but we should make sure that this is clear.
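A rough Python sketch of such a producing tool, just to illustrate the scenario: it decodes the Latin-1 source string and writes the constraint into the markup, but never encodes anything in the target storage encoding. The function name is hypothetical, and the attribute names `its-storage-size` and `its-storage-encoding` reflect my reading of the HTML serialization in the ITS 2.0 draft, so please check them against the specification.

```python
from html import escape


def tag_with_storage_size(raw: bytes, storage_size: int,
                          storage_encoding: str = "UTF-8") -> str:
    """Wrap a Latin-1 encoded UI string in a span carrying the constraint.

    Hypothetical helper; attribute names are assumed from the ITS 2.0
    draft's HTML serialization.
    """
    # Only the *source* encoding is needed to read the string.
    text = raw.decode("iso-8859-1")
    # The storage encoding is passed through as a label; it is never
    # used to encode `text` here.
    return (
        f'<span its-storage-size="{storage_size}" '
        f'its-storage-encoding="{escape(storage_encoding)}">'
        f"{escape(text)}</span>"
    )
```

Called with the storage encoding "UTF-16", this still produces usable output, although the tool itself never touches a UTF-16 encoder.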

Best
Stephan

-----Original Message-----
From: annevankesteren@gmail.com [mailto:annevankesteren@gmail.com] On Behalf Of Anne van Kesteren
Sent: Thursday, 28 February 2013 12:52
To: Felix Sasaki
Cc: Yves Savourel; Norbert Lindenberg; public-multilingualweb-lt-comments@w3.org; www-international
Subject: Re: I18N-ISSUE-246: Clarify character encoding behavior when calculating storage size [ITS-20]

On Wed, Feb 27, 2013 at 10:55 PM, Felix Sasaki <fsasaki@w3.org> wrote:
> One important aspect of above sentence is that - as Yves pointed out - 
> the "must" would be a lower case "must". That is, this will be no 
> testable assertion of the ITS 2.0 specification, even if the spec 
> says "the consumer must support UTF-8". In that sense, we might even 
> put that requirement into a note, to make clear that from the ITS 2.0 
> point of view this is rather guidance than a normative statement. 
> Would that work for you too, Norbert?

1) Don't use RFC 2119 terms in any capitalization anywhere if they are not meant to be normative. That's confusing and wrong.

2) Isn't ITS 2.0 in part markup? All markup languages require utf-8.

3) Requiring utf-8 support also does not mean you have to test it, as not all conformance criteria apply to the same set of people. E.g. conformance criteria on authors in HTML can often not be tested, but they are still rules authors of HTML are expected to follow.

--
http://annevankesteren.nl/

Received on Thursday, 28 February 2013 13:33:12 UTC