- From: Phillips, Addison <addison@lab126.com>
- Date: Tue, 19 Mar 2013 15:47:51 +0000
- To: Arle Lommel <arle.lommel@dfki.de>, Stephan Walter <stephan.walter@cocomore.com>
- CC: Norbert Lindenberg <w3@norbertlindenberg.com>, Felix Sasaki <fsasaki@w3.org>, Yves Savourel <ysavourel@enlaso.com>, "public-multilingualweb-lt-comments@w3.org" <public-multilingualweb-lt-comments@w3.org>, "'www-international'" <www-international@w3.org>
It's not a new normative statement if normative language isn't intended. I would avoid putting a normative statement into a "note".

Generally, when I'm spec writing, I avoid the Magic Normative Words unless I mean them normatively. So in this case I read the proposed note text as meaning:

> In order to be able to evaluate a Storage Size constraint an application
> has to be able to encode...

Which is an example of "anti-normative" writing:

MAY -> can
SHOULD -> ought
SHOULD NOT -> ought not, avoid
MUST -> has to
MUST NOT -> can't, don't
RECOMMENDED -> really good idea

Addison

> -----Original Message-----
> From: Arle Lommel [mailto:arle.lommel@dfki.de]
> Sent: Tuesday, March 19, 2013 8:22 AM
> To: Stephan Walter
> Cc: Norbert Lindenberg; Felix Sasaki; Yves Savourel; public-multilingualweb-lt-comments@w3.org; 'www-international'
> Subject: Re: I18N-ISSUE-246: Clarify character encoding behavior when calculating storage size [ITS-20]
>
> Stephan,
>
> I think this sounds good. However, as it adds a MUST statement, will it impact us because it could be seen as a new normative statement? (I think it is rather a clarification of intent, but just want to check on it.)
>
> -Arle
>
> On 2013 Mar 19, at 06:41 , Stephan Walter <stephan.walter@cocomore.com> wrote:
>
> > Hello,
> >
> > coming back to the issue of error handling when storage size is processed. Would you agree to adding the following note to the definition of the data category as a resolution:
> >
> > NOTE: In order to be able to evaluate a Storage Size constraint an application must be able to encode the content of the selected nodes in the specified character encoding. An application that evaluates Storage Size but does not support the specified character encoding must report this as an error. If the selected nodes contain characters that the specified character encoding cannot represent, the processor must also report this as an error.
> > The application evaluating the Storage Size constraint is not necessarily the ITS processor itself. The constraint may rather be evaluated by applications consuming the ITS encoded data in later steps. In such cases the above requirement pertains to those ITS consuming applications.
> >
> > Best regards
> > Stephan
> >
> > -----Original Message-----
> > From: Norbert Lindenberg [mailto:w3@norbertlindenberg.com]
> > Sent: Thursday, 28 February 2013 08:26
> > To: Felix Sasaki
> > Cc: Norbert Lindenberg; Yves Savourel; public-multilingualweb-lt-comments@w3.org; 'www-international'
> > Subject: Re: I18N-ISSUE-246: Clarify character encoding behavior when calculating storage size [ITS-20]
> >
> >
> > On Feb 27, 2013, at 14:55 , Felix Sasaki wrote:
> >
> >> Hi Yves, Norbert, all,
> >>
> >> On 27.02.13 13:50, Yves Savourel wrote:
> >>> Hi Norbert,
> >>>
> >>>>> Note also that we have no way to check conformance of the applications using the ITS data for such mandatory support: ITS processors just pass the data along, they don't act on them (in the case of this data category).
> >>>> So who does actually act if a string is too long to fit into the specified storage?
> >>> There is certainly the case of applications that do process ITS markup and apply it to the content directly: for example, JavaScript in an HTML5 page. But there are also applications that use an ITS processor to feed the content and the ITS information to a distinct system where the information is then applied. They correspond, for example, to the "Localization Workflow Managers" described in the "potential users of ITS"[1].
> >>>
> >>> So I think it's important to make the distinction between the 'ITS processor', which acts on the markup, and the 'consumer of ITS information' (for lack of a better name) that applies the ITS information. Both can be the same application, but they may also be separate ones.
> >>>
> >>> This means a storage-size constraint can be applied completely outside the original XML/HTML5 document with tools that have no relation to the ITS processor itself, or to XML/HTML5 for that matter. Examples of such applications are localization quality checking tools (like CheckMate, XBench, QA-Distiller, etc.)
> >>>
> >>> This is why, from my viewpoint, requiring the 'consumer of ITS information' to support UTF-8 is not important. And I was looking at the case for consumers that don't have a need for UTF-8, and whether we should really foist on them such a requirement.
> >>>
> >>> To answer your question "Do you really want to let systems that can represent less than 1% of Unicode advertise themselves as ITS 2.0 conformant?": Why not? If the context where they are utilized is using only 1% of Unicode, why should they be forced to support more? I see many customers that never work outside of Latin-1.
> >>>
> >>> This said, supporting UTF-8 is very easy nowadays and promoting its support is a good thing too. So in the interest of moving forward and of promoting better internationalization, I see no problem requiring the consumer of storage-size to support UTF-8.
> >>>
> >>> The only thing that bothers me a little is that such conformance, as well as the parts about handling errors, applies to the consumer of the ITS information, not really the ITS processor, and I'm not sure the scope of our tests can cover that.
> >>>
> >>>
> >>> With regards to the error handling:
> >>>
> >>>> It could be as simple as "If an ITS processor doesn't support the specified character encoding, it must report this as an error and terminate processing. If the selected nodes contain characters that the specified character encoding cannot represent, the processor must report this as an error and terminate processing."
> >>>> Or you could try and be nice in the second case and specify a fallback strategy, e.g., by saying that the first replacement character among U+FFFD, U+003F, U+FF1F that can be represented in the specified character encoding must be used instead of any character that can't.
> >>> I would favor a more practical behavior:
> >>>
> >>> "If the application applying the information doesn't support the specified character encoding, it must report this as an error.
> >>
> >> One important aspect of the above sentence is that - as Yves pointed out - the "must" would be a lower case "must". That is, this will be no testable assertion of the ITS 2.0 specification, even if the spec says "the consumer must support UTF-8". In that sense, we might even put that requirement into a note, to make clear that from the ITS 2.0 point of view this is rather guidance than a normative statement. Would that work for you too, Norbert?
> >
> > While it's much better if assertions can be and are tested, keep in mind that a test suite generally can't prove that a system fully conforms to a spec - it can only show in some cases that it doesn't. And even if this requirement isn't testable by software, there's still the test of looking into the developer's eyes and asking "does your system support UTF-8?". Notes are not requirements, so turning this into a note would remove the basis for asking the question.
> >
> > Norbert
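
For illustration only, the behavior discussed in this thread could be sketched roughly as follows in Python. This is not text from the ITS 2.0 specification or from any ITS processor: the names `StorageSizeError`, `check_storage_size` and `pick_replacement` are hypothetical, and defaulting to UTF-8 is an assumption of the sketch. The first function follows the proposed note (report an error if the encoding is unsupported or cannot represent the content); the second follows the fallback idea of taking the first of U+FFFD, U+003F, U+FF1F that the encoding can represent.

```python
class StorageSizeError(Exception):
    """Hypothetical error type: the constraint could not be evaluated."""


def check_storage_size(text, max_bytes, encoding="utf-8"):
    """Return True if `text` fits in `max_bytes` when encoded in `encoding`.

    Raises StorageSizeError if the encoding is not supported, or if it
    cannot represent every character in `text`.
    """
    try:
        # str.encode() is strict by default: it raises on any character
        # the target encoding cannot represent.
        encoded = text.encode(encoding)
    except LookupError as exc:
        raise StorageSizeError(
            "unsupported character encoding: %s" % encoding
        ) from exc
    except UnicodeEncodeError as exc:
        raise StorageSizeError(
            "content cannot be represented in %s" % encoding
        ) from exc
    return len(encoded) <= max_bytes


def pick_replacement(encoding):
    """Return the first of U+FFFD, U+003F, U+FF1F that `encoding` can represent."""
    for candidate in ("\ufffd", "\u003f", "\uff1f"):
        try:
            candidate.encode(encoding)
        except UnicodeEncodeError:
            continue  # not representable, try the next candidate
        return candidate
    raise StorageSizeError("no replacement character fits %s" % encoding)
```

With this sketch, `check_storage_size("é", 1, "iso-8859-1")` fits while the same string in UTF-8 does not, since it needs two bytes there. Whether an unrepresentable character should stop evaluation or trigger the fallback is exactly the open question in the thread, so the two functions are kept separate.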
Received on Tuesday, 19 March 2013 15:49:37 UTC