- From: Stef Busking <notifications@github.com>
- Date: Tue, 07 Jan 2020 06:38:25 -0800
- To: w3c/DOM-Parsing <DOM-Parsing@noreply.github.com>
- Cc: Subscribed <subscribed@noreply.github.com>
- Message-ID: <w3c/DOM-Parsing/issues/59@github.com>
The steps for "to serialize an attribute value" only escape the characters `"`, `&`, `<` and `>` in the attribute value. White space characters are passed through to the serialization as-is. However, XML processors will replace each space, tab, carriage return or line feed character with a space according to https://www.w3.org/TR/xml11/#AVNormalize unless the character was present as a character reference. It seems therefore that the attribute value serialization algorithm should include a step mapping tab to `&#9;`, carriage return to `&#13;` and line feed to `&#10;`.
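A minimal sketch of what the attribute value serialization could look like with the extra step added. The function name `serializeAttributeValue` is hypothetical and not taken from the specification or any browser; the decimal form of the character references is also just one possible choice.

```javascript
// Hypothetical helper illustrating the proposed step; not a spec or browser API.
function serializeAttributeValue(value) {
  return value
    // Escape "&" first so the references produced below are not escaped again.
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    // Proposed addition: emit white space as character references so that
    // attribute-value normalization leaves it intact when the output is re-parsed.
    .replace(/\t/g, '&#9;')
    .replace(/\n/g, '&#10;')
    .replace(/\r/g, '&#13;');
}

console.log(serializeAttributeValue(' \t\r\n'));
```

Plain spaces are left alone, since normalizing a space to a space does not change the value.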
Testing this in various browsers shows that these already apply a similar substitution:
```
new XMLSerializer().serializeToString(
new DOMParser().parseFromString('<root attr=" &#9;&#13;&#10;"/>', 'text/xml')
)
// <root attr=" &#9;&#13;&#10;"/> in Firefox
// <root attr=" &#9;&#xD;&#xA;"/> in Edge / Chrome
```
The algorithm as described in this specification would generate `<root attr=" \t\r\n"/>` (where `\t` `\r` and `\n` represent tab, carriage return and line feed respectively). Only Safari seems to follow the specification here. Unfortunately, this serialization does not survive a round-trip, as it is normalized to four spaces by processors such as the DOMParser:
```
new XMLSerializer().serializeToString(
new DOMParser().parseFromString('<root attr=" \t\r\n"/>', 'text/xml')
)
// <root attr="    "/>
```
--
Reply to this email directly or view it on GitHub:
https://github.com/w3c/DOM-Parsing/issues/59
Received on Tuesday, 7 January 2020 14:38:27 UTC