
RE: [CSS21][css3-namespace][css3-page][css3-selectors][css3-content] Unicode Normalization

From: Phillips, Addison <addison@amazon.com>
Date: Mon, 2 Feb 2009 13:48:56 -0800
To: Boris Zbarsky <bzbarsky@MIT.EDU>
CC: Mark Davis <mark.davis@icu-project.org>, "public-i18n-core@w3.org" <public-i18n-core@w3.org>, "www-style@w3.org" <www-style@w3.org>
Message-ID: <4D25F22093241741BC1D0EEBC2DBB1DA017DA5F881@EX-SEA5-D.ant.amazon.com>
Boris wrote:

> 
> I couldn't care less about non-normalizability of XML names per se,

That is the restriction on XML, though.

> but
> you indicated that any data that will be communicated to the server
> can't be normalized. 

I don't think I said that it can't be normalized under any circumstances. It is possible that any data that will be communicated to the server might not be normalized. And the questions basically are:

- is it okay to send non-normalized data? I think the answer to this is emphatically yes. There is actually no way to prevent it.

- is it okay to normalize non-normalized data at the server (or elsewhere) for some process? I think the answer to this is emphatically yes, although whether one wants to or not depends on the context of what "some process" is.

> You said this in the context of form
> textfields,
> but there's nothing particularly special about those that I can
> see...

No, there isn't anything particularly special about them. Data is data.

> Any data exposed via a DOM API, whether it be form input values or
> XML/HTML tag localNames can be sent to the server, right?

Yes. The question is, when you then *select* that data, what comes back?

Consider some data:

    &#xe9;

If I want to select that item, can I ask for either of these?

  - U+0065 U+0301
  - U+00E9

Both are canonically equivalent and normalize (under NFC) to U+00E9. I can send either to the server in my request and get the appropriate (normalized) value in return. Conversely, I should be able to select:

  <p>&#x65;&#x301;</p>

... using either form. I might be returned the original (non-normalized) sequence in the result. The point is that processes that are normalization sensitive must behave as if the data were normalized. Why is that a contradiction?
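
To make that concrete, here is a rough sketch of a normalization-sensitive selection. It assumes Python's standard unicodedata module, NFC as the comparison form, and a hypothetical select() helper; none of these come from the thread. It uses U+0065 U+0301, the canonical decomposition of U+00E9. The stored data is never rewritten, only compared as if it were normalized:

```python
import unicodedata


def nfc(text: str) -> str:
    """Return the NFC (canonical composed) form of text."""
    return unicodedata.normalize("NFC", text)


def select(items, query):
    """Hypothetical selector: match items against query as if both were
    normalized, but return the stored items exactly as they were stored."""
    key = nfc(query)
    return [item for item in items if nfc(item) == key]


# Stored data: decomposed e-acute, precomposed e-acute, decomposed e-grave.
stored = ["e\u0301", "\u00e9", "e\u0300"]

# Either spelling of the query selects the same items.
by_composed = select(stored, "\u00e9")
by_decomposed = select(stored, "e\u0301")
```

Both queries pick out the first two stored items and skip the third (e-grave is a different character), and the decomposed item comes back in its original, non-normalized form.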

Addison
Received on Monday, 2 February 2009 21:52:41 GMT
