RE: HTML5 and Unicode Normalization Form C

From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
Date: Mon, 30 May 2011 03:00:03 +0200
To: www-validator@w3.org, www-international@w3.org
Message-ID: <20110530030003701233.f859ec9d@xn--mlform-iua.no>
Leif Halvard Silli, Mon, 30 May 2011 02:06:19 +0200:

> Clearly, HTML5 and the HTML5 validator should help authors avoid 
> gotchas. But, when thinking through some scenarios, it seems to be 
> difficult to give the right kind of warning/advice in a validator.
> 
> Example: 
> 
> * For the Apache2 version that comes with Mac OS X, one might in 
> principle use composed as well as decomposed links even if the file 
> names are decomposed. In Apache on Mac OS X, there is, however, a 
> single problem: cool, composed IRIs. E.g. 
> 	<http://example.com/%C3%A5.html> works, while 
> 	<http://example.com/%C3%A5> does not work. Maybe this is an Apache 
>   bug.
> * In order to fix the above problem, which also led customers to react 
> when files were placed online, I started to use decomposed links:
>     <http://example.com/a%CC%8A>
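
To make the two link forms above concrete, here is a minimal Python sketch that percent-encodes the letter "å" in composed (NFC) and decomposed (NFD) form. It is only an illustration: the helper name percent_encode is hypothetical, and it assumes UTF-8 percent-encoding, as in the example URLs.

```python
import unicodedata
from urllib.parse import quote


def percent_encode(text, form):
    """Normalize text to the given Unicode form, then percent-encode its UTF-8 bytes."""
    return quote(unicodedata.normalize(form, text), safe="")


# Composed: the single code point U+00E5.
composed = percent_encode("\u00e5", "NFC")
# Decomposed: "a" followed by U+030A COMBINING RING ABOVE.
decomposed = percent_encode("\u00e5", "NFD")

print(composed)    # %C3%A5
print(decomposed)  # a%CC%8A
```

The two strings render identically as text but are different byte sequences, which is why a server or browser that compares them byte for byte treats them as different links.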

I just discovered, though, that Safari on Windows (but not on Mac) 
handles decomposed values in a unique way:

* In the case of a decomposed fragment link, Safari on Windows will 
target the composed identifier. If there is no composed identifier, 
it will target nothing. Chrome, Safari on Mac and "all other" 
browsers treat composed and decomposed identifiers as distinct.

* In the case of a decomposed cool IRI, it will not work at all. This is 
probably because Safari on Windows normalizes the cool IRI first: as 
already mentioned, cool IRIs do not seem to work (whenever they contain 
decomposed letters).
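
The fragment behaviour described in the first point can be modelled with a small Python sketch. This is a model of the observed behaviour only, not Safari's actual implementation; the function names and the set of ids are hypothetical.

```python
import unicodedata


def find_target_exact(ids, fragment):
    """Match the fragment against element ids code point for code point."""
    return fragment if fragment in ids else None


def find_target_nfc_first(ids, fragment):
    """Normalize the fragment to NFC before matching (the behaviour observed in Safari on Windows)."""
    normalized = unicodedata.normalize("NFC", fragment)
    return normalized if normalized in ids else None


decomposed_fragment = "a\u030a"  # "a" + U+030A COMBINING RING ABOVE

# A page whose only id is decomposed: exact matching finds it,
# NFC-first matching looks for the composed id and finds nothing.
page_with_decomposed_id = {"a\u030a"}
exact_hit = find_target_exact(page_with_decomposed_id, decomposed_fragment)
nfc_miss = find_target_nfc_first(page_with_decomposed_id, decomposed_fragment)

# A page whose only id is composed: the outcome is reversed.
page_with_composed_id = {"\u00e5"}
exact_miss = find_target_exact(page_with_composed_id, decomposed_fragment)
nfc_hit = find_target_nfc_first(page_with_composed_id, decomposed_fragment)
```

Under these assumptions, normalizing only one side of the comparison (the link, but not the identifiers in the page) is what makes a decomposed identifier unreachable.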

So, maybe Safari on Windows shows why normalization must be handled 
with care ...
-- 
Leif Halvard Silli
Received on Monday, 30 May 2011 01:01:35 GMT
