[Bug 13709] [html5] Attribute value normalization is not backwards compatible

From: <bugzilla@jessica.w3.org>
Date: Tue, 09 Aug 2011 07:27:42 +0000
To: public-html-bugzilla@w3.org
Message-Id: <E1Qqgj0-0000f7-38@jessica.w3.org>

Michael[tm] Smith <mike@w3.org> changed:

           What    |Removed                     |Added
                URL|http://www.w3.org/mid/20110 |http://www.w3.org/mid/20110
                   |8082059.25340.bert@w3.org   |8082059.25340.bert@w3.org

--- Comment #1 from Michael[tm] Smith <mike@w3.org> 2011-08-09 07:27:41 UTC ---

A personal comment on http://www.w3.org/TR/2011/WD-

(That section is actually only an example, but I didn't immediately see 
where the parsing of attributes is formally defined. Sorry.)

The way string-valued attributes are processed in HTML5 is not backwards 
compatible with the way they are processed in HTML4. In HTML4, newlines 
in the source become spaces in the attribute value, but in HTML5 they 
are kept as line feeds and/or carriage returns.
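
The difference described above can be illustrated with a minimal sketch. It is not taken from either specification; it only assumes the behaviour as stated in this comment (line breaks turned into spaces in the HTML4/SGML case, line breaks kept in the HTML5 case). The function names `normalize_sgml_style` and `normalize_html5_style` are hypothetical.

```python
import re


def normalize_sgml_style(raw: str) -> str:
    """Assumed HTML4/SGML-style handling: each line break becomes one space."""
    return re.sub(r"\r\n|\r|\n", " ", raw)


def normalize_html5_style(raw: str) -> str:
    """Assumed HTML5-style handling (simplified): line breaks are left in place."""
    return raw


# The same title written on one line and wrapped across two source lines.
one_line = "Some long title here"
wrapped = "Some\nlong title here"

# Under the SGML-style rule the two values compare equal ...
print(normalize_sgml_style(wrapped) == normalize_sgml_style(one_line))

# ... while under the HTML5-style rule the wrapped value still holds a line feed.
print(normalize_html5_style(wrapped) == normalize_html5_style(one_line))
```

Under these assumptions, a value that was wrapped purely for source formatting stops comparing equal to its single-line form once line breaks are preserved.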

That section shows an example: although the markup contains no 
"&#10;" character reference, the attribute value still contains a line feed.

The handling of line ends isn't specific to HTML4; it is a property of 
SGML (and thus also of XML), so it risks being difficult to change in 
existing software. In my own software, for example, it is handled at a 
very low level in the tokenizer.

The proposed new way is also inconvenient: In HTML4, you can format the 
source code to avoid long lines: 

    ... <span title="Some long title here">...</span> <span title="Some
    long title here">...</span>...

and the two attributes will be equal to one another, but not so in 

Received on Tuesday, 9 August 2011 07:27:46 UTC
