- From: <bugzilla@wiggum.w3.org>
- Date: Sat, 27 Mar 2010 22:41:36 +0000
- To: public-html-bugzilla@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=9352

           Summary: Make unescaped & conforming in attribute values in some cases
           Product: HTML WG
           Version: unspecified
          Platform: PC
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: HTML5 spec bugs
        AssignedTo: dave.null@w3.org
        ReportedBy: mjs@apple.com
         QAContact: public-html-bugzilla@w3.org
                CC: ian@hixie.ch, mike@w3.org, public-html@w3.org

HTML syntax and URL syntax have an unfortunate conflict. HTML interprets & as the start of an entity reference, while in URLs it has special meaning as a separator in the query portion of a URL.

HTML5 disallows the & character in attribute values unless it is actually the start of an entity reference. That means markup like this is nonconforming:

<a href="http://images.google.com/imghp?hl=en&tab=wi">

In this specific case, there is no chance that &tab= could be mistaken for an entity reference, and parsing will proceed exactly as the author expects.

The spec explains that the reason for this syntax error is markup fragility:

"For example, the parsing of certain named character references in attributes happens even with the closing semicolon being omitted. It is safe to include an ampersand followed by letters that do not form a named character reference, but if the letters are changed to a string that does form a named character reference, they will be interpreted as that character instead."

http://dev.w3.org/html5/spec/Overview.html#conformance-requirements-for-authors

However, for an author to be aware of this kind of error, they must be regularly using a conformance checker (or equivalently, a tool that ensures conformance at the output stage). In that case, the conformance checker can tell them when they have used a construct that actually will be interpreted as an entity reference, rather than merely one that might be, if edited.

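To make the distinction concrete, here is a minimal Python sketch of the narrower check a conformance checker could apply. It is an illustration only, not the spec's algorithm or any existing checker's behaviour. The function name risky_ampersands is hypothetical. The sketch assumes that "forms a named character reference" can be approximated by looking the name up in Python's html.entities.html5 table, with or without the trailing semicolon. It does not model the parser's prefix matching or its attribute-specific rules.

```python
import re
from typing import List
from html.entities import html5  # keys such as "amp;" plus legacy forms such as "amp"

# An ampersand followed by a letter-led alphanumeric name, and an optional semicolon.
_AMP_NAME = re.compile(r"&([A-Za-z][A-Za-z0-9]*)(;?)")


def risky_ampersands(attr_value: str) -> List[str]:
    """Return names after a bare '&' that match a named character reference.

    Hypothetical helper: a loose reading of the "only an error when foo is an
    entity name" rule, not the HTML parsing algorithm itself.
    """
    risky = []
    for match in _AMP_NAME.finditer(attr_value):
        name, semicolon = match.groups()
        if semicolon:
            # Written as a complete reference such as "&amp;", so not a bare "&".
            continue
        # Assumption: count the name whether or not the table requires ";".
        if name in html5 or name + ";" in html5:
            risky.append(name)
    return risky
```

Under these assumptions, risky_ampersands("imghp?hl=en&copy=1") returns ["copy"], while a query string whose parameter names match no entry in the table returns an empty list.
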
As a result of getting the error, authors who want the full benefits of conformance checking must write in a more awkward style, and must bloat their markup by replacing instances of "&" with "&amp;".

7 of the Alexa top 15 sites have this error:

http://www.w3.org/html/wg/wiki/index.php?title=HTML5_Authoring_Conformance_Study

In many cases it appears an inordinate number of times, close to 100, and is the single most frequent error on the site. It seems that many authors, even on prominent sites, have not found the markup bloat and awkward syntax of consistently using &amp; to be a cost worth paying for the benefit of speculatively avoiding future errors.

Thus, I think HTML5 should reconsider and only make href="&foo=" an error in the case where foo is an entity name, since that is the only case where author expectations will actually be defeated.

-- 
Configure bugmail: http://www.w3.org/Bugs/Public/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.
Received on Saturday, 27 March 2010 22:41:38 UTC