Re: Parsing HREFs?

From: Jukka K. Korpela <jkorpela@cs.tut.fi>
Date: Mon, 20 Feb 2006 08:45:43 +0200 (EET)
To: www-validator@w3.org
Message-ID: <Pine.GSO.4.63.0602200837040.17928@korppi.cs.tut.fi>

On Mon, 20 Feb 2006, Lachlan Hunt wrote:

>>    ... Since &lang; is the HTML entity for the
>>    left-pointing angle bracket, some browsers also convert &lang=en to
>>    〈=en ...
>
> Although it technically should do that, I couldn't find any browser that 
> actually does.  My tests [1] show that within href attributes generally only 
> entity references from the ISO-8859-1 category, &quot, &amp, &lt and &gt from 
> the Markup Significant category, and &apos (where supported) are recognised 
> without the REFC.
>
> [1] http://lachy.id.au/dev/markup/tests/html401/charref/syntax

As far as I can see, Firefox 1.5 gets all of them right. (There might be 
problems in _displaying_ some of the characters, due to font problems, but 
that's a different issue.)

On the other hand, IE (even IE 7 beta preview) gets many of them wrong:
it fails to recognize entity references for characters outside ISO Latin 1
without REFC (the terminating semicolon), it fails to recognize &apos; at
all, and it fails to recognize hexadecimal character references without REFC.

But these errors in IE are _not_ limited to processing values of 
attributes. They are general flaws in its parser.
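
For readers who want to try references with and without REFC, here is a
minimal sketch using html.unescape from Python's standard library. This is
an assumption-laden illustration, not a model of any browser discussed
above: html.unescape follows the later HTML5 rules, which accept a missing
semicolon only for a fixed legacy list of names (roughly the ISO Latin 1
entities plus amp, lt, gt and quot), and it decodes plain text, so it applies
no attribute-specific handling. The sample strings are made up for the
illustration.

```python
import html

# Hypothetical sample inputs; none of them come from the test page cited above.
samples = [
    "&lang;=en",  # named reference terminated by REFC (";")
    "&lang=en",   # same name without REFC
    "&amp=1",     # markup-significant name without REFC
    "&eacute",    # ISO Latin 1 name without REFC
    "&#x41",      # hexadecimal character reference without REFC
]

for s in samples:
    # html.unescape applies HTML5 decoding rules to a plain string.
    print(ascii(s), "->", ascii(html.unescape(s)))
```

Under these rules &lang=en is left untouched, while &amp=1, &eacute and
&#x41 are decoded even though the semicolon is missing.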

-- 
Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/
Received on Monday, 20 February 2006 06:45:51 GMT
