Re: SVG12: IRI Processing rules and xlink:href

Martin Duerst wrote:
>>
>> The problem with this edit is that step 1b. is now doing two
>> things where formerly it did only one thing. Previously, it specified
>> that the IRI be converted to a normalized Unicode character sequence
>> without specifying how that took place.
> 
> Well, the split was necessary to change from MUST to SHOULD.

Yes, and the question is why SHOULD is a good thing (which you address a 
bit later, though not to my immediate satisfaction)...

> 
>> Now it specifies converting from the legacy encoding and *then*
>> (perhaps) normalize. It reduces the requirement for NFC from an inherent
>> MUST to an explicit SHOULD.
> 
> Yes. The way I understand it, this was to address concerns brought
> up by the CSS WG that in some cases, they won't even know what
> the original encoding was, because a whole CSS file has been
> transcoded to Unicode, and the original encoding thrown away,
> and it would be a bad hack to somehow keep the encoding around
> and do something encoding-dependent long after everything is in
> Unicode.
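
To make that scenario concrete, here is a minimal Python sketch; the 
byte string and the choice of Shift_JIS are hypothetical stand-ins for 
"some legacy encoding":

    raw = b"\x82\xa0"                # Shift_JIS bytes for HIRAGANA LETTER A
    text = raw.decode("shift_jis")   # transcode the whole file to Unicode
    # 'text' is now a plain str; nothing in it records that the source
    # was Shift_JIS, so any later encoding-dependent IRI step has
    # nothing to go on.
    print(text)                      # あ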

And CSS doesn't deal with the normalization problem because....???

The CSS working group should already realize that they have exactly the 
same normalization problem. Although CharMod C014 doesn't strictly 
require it (because it's supposed to be in a separate document that 
::sigh:: remains unadvanced), a technology that relies on matching of 
strings between the stylesheet and the styled document ignores 
CharMod-Norm at its peril! If the stylesheet isn't normalized, how do 
selectors work? Do they rely on the styled document being equivalently 
un-normalized? Somehow I don't think that's what's intended.
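
A minimal sketch of that failure mode, in Python (the class name is 
hypothetical): a selector written with a precomposed character won't 
match an attribute value written with the canonically equivalent 
decomposed sequence unless both sides are normalized first.

    import unicodedata

    selector_class = "r\u00e9sum\u00e9"      # "résumé", precomposed (NFC)
    document_class = "re\u0301sume\u0301"    # "résumé", decomposed (NFD)

    print(selector_class == document_class)  # False: code points differ

    def nfc(s):
        return unicodedata.normalize("NFC", s)

    print(nfc(selector_class) == nfc(document_class))  # True under NFC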

In other words, a better response, in my opinion, might be for CSS to 
require normalization rather than having the IRI spec add further 
waffle about it.

> 
>> Now I understand that encoding converters may or may not produce a
>> sequence that is NFC. For example, mapping a sequence containing the
>> combining flavors of Japanese dakuten or handakuten characters
>> (i.e. U+3099, U+309A) to Unicode from a Japanese encoding will
>> result in a combining sequence in several converters I have handy.
>> I think it acceptable and even smart not to require the transcoding
>> process to be normalizing. However, that wasn't the requirement in
>> 1b. Normalization could be applied outside the transcoding process
>> and still be conformant with the old text.
> 
> Yes, but as far as I understand, that wasn't the original issue.
> Also, given the above fact, my guess is that apart from the above
> issue, implementation conformance to the normalization requirement
> in 1.b. is spotty at best.

Yes, because we haven't highlighted it sufficiently. Normalization 
remains an issue (if a minor one) that requires some attention.
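
For the record, the dakuten case is a one-line check in Python: NFC 
composes the base letter plus combining U+3099 into the precomposed 
form, which is exactly the step a non-normalizing transcoder leaves 
undone.

    import unicodedata

    decomposed = "\u304b\u3099"   # KA + combining dakuten, as a
                                  # non-normalizing converter might emit
    composed = unicodedata.normalize("NFC", decomposed)

    print(composed == "\u304c")            # True: composes to GA (U+304C)
    print(len(decomposed), len(composed))  # 2 1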

In any case, I look forward to your participating in our call in just a 
few hours :-).

Best Regards,

Addison


-- 
Addison Phillips
Globalization Architect -- Yahoo! Inc.
Chair -- W3C Internationalization Core WG

Internationalization is an architecture.
It is not a feature.
