Re: several messages

On Mon, 01 Jun 2009 21:33:33 +0200, Ian Hickson <ian@hixie.ch> wrote:

> The reason to do this change is that authors make this mistake all the
> time and yet it is not harmful. By making this change the only practical
> effect is that authors will get fewer useless annoying errors out of
> conformance checkers.
>
>
>> > Supporting both '&' and ';' seems like a exercise in bug creation.
>> > Parsing URIs is hard enough to do right as it is without making things
>> > even more complicated and adding even more edge cases.
>>
>> But that's exactly what you are doing, except here it applies to parsing
>> href attributes, not URIs.
>
> No, no change to the parsing rules was involved here.

The "Writing HTML documents" section seems to make this valid:

    <a href="&copy=">

and claims that the attribute value contains just text and no character  
references (since character references end with ";").

Yet the "Parsing HTML documents" section interprets the above the same as
<a href="©=">, as far as I can tell.

Now, I guess there are several possible ways to fix this mismatch.

   1. Revert the change.
   2. Tweak the writing rules so that the ampersand above would be
      ambiguous.
   3. Tweak the parsing rules so that "=" is treated the same as 0-9a-zA-Z.
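
To make the mismatch and option 3 concrete, here is a rough Python sketch.
It is not the spec's algorithm: the four-entry name table, the function
names and the "blocking" parameter are all made up for illustration, and
it assumes that in attribute values a reference without ";" is left alone
when the next character is 0-9a-zA-Z.

```python
import re
import string

# Hypothetical, tiny subset of the names a parser accepts without ";".
LEGACY_NAMES = {"copy": "\u00a9", "amp": "&", "lt": "<", "gt": ">"}
ALNUM = frozenset(string.ascii_letters + string.digits)

# Longest names first so the alternation prefers the longest match.
_REF = re.compile(
    r"&(" + "|".join(sorted(LEGACY_NAMES, key=len, reverse=True)) + r")(;?)"
)


def writer_sees_reference(value):
    """Writing-rule view: a character reference must end with ';'."""
    return re.search(r"&#?[0-9A-Za-z]+;", value) is not None


def parse_attribute(value, blocking=ALNUM):
    """Parsing-rule view (simplified): expand known names even without ';',
    unless the character right after the name is in `blocking`."""
    def expand(match):
        name, semicolon = match.group(1), match.group(2)
        following = value[match.end():match.end() + 1]
        if not semicolon and following in blocking:
            return match.group(0)  # leave the text untouched
        return LEGACY_NAMES[name]
    return _REF.sub(expand, value)


if __name__ == "__main__":
    href = "&copy="
    print(writer_sees_reference(href))            # writing rules: no reference
    print(parse_attribute(href))                  # parsing rules: expanded
    print(parse_attribute(href, ALNUM | {"="}))   # option 3: left as written
```

With the default set the parser view expands "&copy=" even though the
writer view sees no reference there; adding "=" to the blocking set, as in
option 3, makes the two views agree for this input.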

-- 
Simon Pieters
Opera Software

Received on Tuesday, 2 June 2009 10:23:39 UTC