Re: Responses to two Character Model last call issues

On Wednesday 29 August 2001 05:10, Martin Duerst wrote:
> I have been actioned (see e.g.
> http://lists.w3.org/Archives/Member/w3c-i18n-ig/2001Aug/0110.html)
> by the I18N WG to have a look at part of
> your last call comments to the Character Model, identified
> on our part as Issues 94 and 95.

Thank you for your response.

>  > If someone did use this to describe the MIME type, the dsig spec does
>  > not address how to resolve conflicting information and leaves it to
>  > the application.
>
> It is not clear to us whether you want us to change the relevant text
> in the Character Model, and if yes, how, or whether you are asking us
> whether we think the Character Model applies to the DSig spec in this
> case or not, or something else.

We were not asking for a change. I was just stating some our choices in the 
context of your own so as to make explicit any hidden assumptions/conflicts.

> Now to the next issue, our LCI-95 at

I agree with your response, but as I think this was Don's comment I will 
defer to him further discussion.

> http://www.w3.org/International/Group/charmod-lc/#LCI-95. You write:
>  > 3.6.2 Private Use Code Points
>  >
>  > The recommendation that private-use code points be allowed but
>  > prohibition against any mechanism to facilitate private agreements
>  > concerning these code points in any protocol seems bizarre.  Why not
>  > leave it up to protocol designers to determine if they will include a
>  > mechanism for private extensions or negotiation of privately defined
>  > options?
>
> We have to disagree with you on this point. I will try to explain why.
>
> While some mechanisms for extensibility, and options, and private
> agreements can be part of most well-designed specifications, it is
> important that these mechanisms are themselves well-defined.
> However, there are many problems with using the private-use area,
> and these problems get more and more severe in the context of
> the WWW with its worldwide interaction.
>
> One main problem is the interaction with general XML processing
> tools. These tools would not know how to correctly conserve
> (or modify) the additional information in the protocol, and
> so the protocol and the codepoints could easily get out of sync.
> You may think that this is not much different from other kind
> of markup, but typing/cutting/pasting characters is something
> that will work for everything except the private use zone.
>
> A related problem is that we try to create an architecture where
> e.g. various XML namespaces can be combined in innovative ways.
> In a document with two different private use area extensions,
> this will lead to all kinds of problems and inconsistencies.
> If the interpretation of private use characters depends on
> the namespace of the element they are contained in, this
> breaks the fundamental assumption of formats such as HTML
> and XML that each entity has only one character encoding.
> It also brings us back into ISO 2022-like hodgepodges, except
> that there is much more of a layer violation between markup
> and character codes with 2022 character escapes.
>
> Another issue is that it might look easy to misuse the private use
> area for all kinds of pseudo-clever tagging and compression schemes.
>
> In addition, there is the danger that some assignments in the
> private use zone lead to pressure to include these codes in
> Unicode. The W3C is not the place to create new encoded characters,
> this is much better left to the expert organizations in this field.
>
> In some sense, the private use area is very similar to the x-
> convention for languages, charsets,... in the IETF. It is there for
> experimental use, but not for use as part of a specification.
>
> I hope this explains why we prohibit private-use agreement mechanisms
> for specifications that want to conform to the character model, and
> why we therefore reject your comment.

Received on Wednesday, 5 September 2001 14:59:37 UTC