Re: How browsers display URIs with %-encoding (Opera/Firefox FAIL)

Leif Halvard Silli, Thu, 21 Jul 2011 21:05:16 +0200:

> For testing of page internal (that is #fragment links), you could 
> create an ISO-8859-1 encoded page which contains links to directly 
> typed fragments whose first letter begins with a non-ASCII letter from 
> the ISO-8859-1 charset. And then you can test how that same page works 
> if served/interpreted as another legacy, 8-bit encoding, such as KOI8-R 
> etc. This test should compare whether, for instance, in an ISO-8859-1 
> page,  href="#Dürst" would hit both id="Dürst" and id="Dürst".

Made some unpolished tests (with far too many links inside ...), based 
on some old tests I had lying around: 
http://malform.no/testing/html5/urls/ 
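
For illustration, the three kinds of fragment link that such a test 
compares can be generated along these lines. This is only a sketch and 
not the script behind the test pages; the helper name fragment_variants 
and the dictionary keys are made up for this example.

```python
from urllib.parse import quote

def fragment_variants(name, legacy_encoding="iso-8859-1"):
    """Build the href values for one id (hypothetical helper).

    Returns the directly typed form plus the percent-encoded forms based
    on UTF-8 and on a legacy 8-bit encoding.
    """
    return {
        "direct": "#" + name,
        "utf8_percent": "#" + quote(name, safe="", encoding="utf-8"),
        "legacy_percent": "#" + quote(name, safe="", encoding=legacy_encoding),
    }

variants = fragment_variants("Dürst")
for kind, href in variants.items():
    print(kind, href)
```

The same id thus yields "#D%C3%BCrst" when the escapes are based on 
UTF-8, but "#D%FCrst" when they are based on ISO-8859-1.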

Results: 

# With regard to hover display and display in the URL bar, those 
tests show, for fragment URIs:
* that directly typed characters in a URL get semantic display  
  in all the browsers tested (Firefox, Opera, Chrome/Safari/iCab, IE8)
* that UTF-8-based percent-encoded URLs get semantic display 
  # in Firefox, Safari/iCab and Opera
  # but not in IE8 or Chrome.
* that non-UTF-8-based percent-encoded URLs get semantic display
  # in Firefox
  # but not in any other browser. Caveat: Opera. See note.

NOTE: For legacy-encoded pages, Opera distinguishes between 
href="#D%FCrst" and href="D%FCrst": the fragment variant does not get 
"semantic display" in Opera, whereas the non-fragment URL does. 
      But there is a catch to what Opera (and Firefox too) do for 
externally linking URLs: in the Windows-1251 encoded test page, the 
'%FC' is turned into the Cyrillic soft sign letter (ь), so the escape 
is apparently interpreted according to the page encoding. 
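
The soft-sign effect can be reproduced outside a browser by decoding 
the same escape under different encodings. A minimal sketch, where 
display_form is a made-up helper name and "cp1251" is Python's codec 
name for Windows-1251:

```python
from urllib.parse import unquote

def display_form(percent_encoded, encoding):
    """Decode percent-escapes as bytes of the given encoding.

    Bytes that are invalid in that encoding become U+FFFD instead of
    raising an error.
    """
    return unquote(percent_encoded, encoding=encoding, errors="replace")

print(display_form("D%FCrst", "iso-8859-1"))  # byte 0xFC read as Latin-1
print(display_form("D%FCrst", "cp1251"))      # byte 0xFC read as Windows-1251
print(display_form("D%FCrst", "utf-8"))       # 0xFC is not valid UTF-8
print(display_form("D%C3%BCrst", "utf-8"))    # UTF-8-based escapes
```

The byte 0xFC is 'ü' in ISO-8859-1 but 'ь' in Windows-1251, and it is 
not valid UTF-8 at all, which is why the displayed form depends on the 
encoding that gets assumed.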


# With regard to whether the URL works when activated, those tests 
show, for fragment URLs:
* that directly typed letters always work
* that UTF-8-based percent-encoded URLs
  # never work, regardless of page encoding, in Opera and IE8
  # should always work in Firefox and Safari/iCab/Chrome, 
    regardless of character or page encoding.
* that non-UTF-8-based percent-encoded URLs are interpreted 
  "semantically"
  # in IE8: never, in any encoding
  # in Firefox, Safari and iCab: with non-UTF-8 page encodings only
  # in Chrome and Opera: in any encoding, but seemingly only as long
    as the character belongs to the Latin-1 character set.
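
As a rough model of the Firefox/Safari/iCab row above, and not a 
description of any browser's actual algorithm, one can imagine a 
matcher that tries the fragment as written, then its percent-decoded 
form read as UTF-8, and only if that is invalid UTF-8 falls back to the 
page encoding. The function candidate_ids is hypothetical:

```python
from urllib.parse import unquote_to_bytes

def candidate_ids(fragment, page_encoding):
    """Return the id values a toy matcher would look for, in order.

    Assumption for this sketch: UTF-8 is tried first and the page
    encoding is used only when the bytes are not valid UTF-8.
    """
    candidates = [fragment]          # the fragment exactly as written
    if "%" not in fragment:
        return candidates            # nothing to decode
    raw = unquote_to_bytes(fragment)
    for encoding in ("utf-8", page_encoding):
        try:
            candidates.append(raw.decode(encoding))
            break                    # stop at the first valid decoding
        except UnicodeDecodeError:
            continue
    return candidates

print(candidate_ids("D%C3%BCrst", "iso-8859-1"))
print(candidate_ids("D%FCrst", "iso-8859-1"))
print(candidate_ids("D%FCrst", "utf-8"))
```

Under this model "D%FCrst" reaches id="Dürst" in an ISO-8859-1 page but 
not in a UTF-8 page, which matches the "with non-UTF-8 encodings only" 
observation. The Chrome/Opera Latin-1 behaviour is not modelled here.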
-- 
Leif H Silli

Received on Friday, 22 July 2011 00:58:39 UTC