- From: Stef Busking <notifications@github.com>
- Date: Tue, 07 Jan 2020 06:38:25 -0800
- To: w3c/DOM-Parsing <DOM-Parsing@noreply.github.com>
- Cc: Subscribed <subscribed@noreply.github.com>
- Message-ID: <w3c/DOM-Parsing/issues/59@github.com>
The steps for "to serialize an attribute value" only escape the characters `"`, `&`, `<` and `>` in the attribute value. White space characters are passed through to the serialization as-is. However, XML processors will replace each space, tab, carriage return or line feed character with a space according to https://www.w3.org/TR/xml11/#AVNormalize unless the character was present as a character reference.

It seems therefore that the attribute value serialization algorithm should include a step mapping tab to `&#9;`, carriage return to `&#13;` and line feed to `&#10;`. Testing this in various browsers shows that these already apply a similar substitution:

```
new XMLSerializer().serializeToString(
  new DOMParser().parseFromString('<root attr=" &#9;&#13;&#10;"/>', 'text/xml')
)
// <root attr=" &#9;&#xD;&#xA;"/> in Firefox
// <root attr=" &#9;&#13;&#10;"/> in Edge / Chrome
```

The algorithm as described in this specification would generate `<root attr=" \t\r\n"/>` (where `\t`, `\r` and `\n` represent tab, carriage return and line feed respectively). Only Safari seems to follow the specification here. Unfortunately, this serialization does not survive a round-trip, as it is normalized to four spaces by processors such as the DOMParser:

```
new XMLSerializer().serializeToString(
  new DOMParser().parseFromString('<root attr=" \t\r\n"/>', 'text/xml')
)
// <root attr="    "/>
```

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/w3c/DOM-Parsing/issues/59
Received on Tuesday, 7 January 2020 14:38:27 UTC
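
For illustration, the substitution proposed in this message could be sketched roughly as follows. This is a minimal sketch, not the specification's algorithm text: the names `ATTR_ESCAPES` and `serializeAttributeValue` are hypothetical, and the use of decimal character references (rather than hexadecimal ones) is an assumption, since either form should survive attribute value normalization.

```javascript
// Hypothetical lookup table: the four characters the current algorithm
// escapes, plus the three white space characters this issue proposes to add.
const ATTR_ESCAPES = {
  '&': '&amp;',
  '"': '&quot;',
  '<': '&lt;',
  '>': '&gt;',
  '\t': '&#9;',  // tab
  '\n': '&#10;', // line feed
  '\r': '&#13;', // carriage return
};

// Hypothetical helper sketching "serialize an attribute value" with the
// additional white space mapping. A null (or undefined) value is treated as
// the empty string.
function serializeAttributeValue(value) {
  if (value === null || value === undefined) {
    return '';
  }
  // A single pass over the string, so the '&' introduced by one replacement
  // is never escaped a second time.
  return String(value).replace(/[&"<>\t\n\r]/g, (ch) => ATTR_ESCAPES[ch]);
}

console.log(serializeAttributeValue(' \t\r\n')); // prints " &#9;&#13;&#10;"
```

Because the tab, carriage return and line feed are written as character references, an XML processor should not replace them with spaces when the serialized attribute is parsed again, which is the round-trip property the issue asks for.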