- From: Elliotte Rusty Harold <elharo@ibiblio.org>
- Date: Fri, 24 Jul 2009 08:23:35 -0700
A technical point that may perhaps have already been considered. Section 3.3.3.2 states "If the title attribute's value contains U+000A LINE FEED (LF) characters, the content is split into multiple lines. Each U+000A LINE FEED (LF) character represents a line break."

However, this is incompatible with XML and the XHTML serialization. In XML, as specified in http://www.w3.org/TR/REC-xml/#AVNormalize:

    Before the value of an attribute is passed to the application or checked for validity, the XML processor must normalize the attribute value by applying the algorithm below, or by using some other method such that the value passed to the application is the same as that produced by the algorithm.

    1. All line breaks must have been normalized on input to #xA as described in 2.11 End-of-Line Handling, so the rest of this algorithm operates on text normalized in this way.

    2. Begin with a normalized value consisting of the empty string.

    3. For each character, entity reference, or character reference in the unnormalized attribute value, beginning with the first and continuing to the last, do the following:

       * For a character reference, append the referenced character to the normalized value.

       * For an entity reference, recursively apply step 3 of this algorithm to the replacement text of the entity.

       * For a white space character (#x20, #xD, #xA, #x9), append a space character (#x20) to the normalized value.

       * For another character, append the character to the normalized value.

Thus, absent some fancy tricks with character references, linefeeds are not allowed in attribute values. Raw linefeeds are converted to spaces.

I'm not sure what should be done about this. This is one of the weirder and more error-prone parts of XML. However, since HTML 5 is suspicious of linefeeds in title attributes anyway, we could either forbid them or adopt the XML interpretation.

I first noticed this in the description of the title attribute, but the issue could be deeper.
In particular, in the HTML 5 requirement that "If a reflecting DOM attribute is a DOMString but doesn't fall into any of the above categories, then the getting and setting must be done in a transparent, case-preserving manner." it's not clear what "transparent" really means here, and whether it's compatible with XML's attribute value normalization.

Apologies if this has been discussed before, but I couldn't find anything on point in the archives.

-- 
Elliotte Rusty Harold
elharo at ibiblio.org
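The attribute value normalization quoted above can be observed with any conforming XML parser. The following is only a minimal sketch, assuming Python's standard xml.etree.ElementTree module as a stand-in for the XML processor; the helper name parsed_title and the <p> test markup are hypothetical and chosen purely for illustration. It compares a raw linefeed in a title attribute with the same linefeed written as a character reference.

```python
import xml.etree.ElementTree as ET


def parsed_title(markup: str) -> str:
    """Return the title attribute value as the XML parser reports it.

    Hypothetical helper: it parses a single element and reads back the
    attribute, i.e. the value the application sees after normalization.
    """
    return ET.fromstring(markup).get("title")


# A raw U+000A inside the attribute value: normalized to a space (#x20).
raw = parsed_title('<p title="line one\nline two"/>')

# The same character written as a character reference: appended as-is,
# so the linefeed survives normalization.
ref = parsed_title('<p title="line one&#10;line two"/>')

print(repr(raw))
print(repr(ref))
```

Under these assumptions, only the character-reference form delivers a linefeed to the application, which is the "fancy trick" mentioned above; a title written with literal line breaks in an XHTML document would reach the DOM with spaces instead.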
Received on Friday, 24 July 2009 08:23:35 UTC