[Bug 29217] Serialization of newlines


--- Comment #5 from Michael Kay <mike@saxonica.com> ---
1. What is the default for output methods other than XML or text?

For HTML you have freedom to replace any sequence of whitespace characters with
a different sequence that has the same rendition in browsers.

For XHTML I should assume the XML rules apply, though I don't know if that's
explicitly stated.

2. Do newline characters need to be normalized (see my initial comment)?

Not quite sure what you mean by the question. If you mean XML end-of-line
normalization, then the answer is no: this is done by the XML parser on input,
it does not need to be done on serialization. In fact, the opposite is true: if
there is an x0D character in a text or attribute node then (with the XML output
method) it is serialized as a character reference to ensure that it survives
end-of-line normalization when the XML is reparsed. (That's on the theory that
getting an x0D into a text or attribute node requires considerable effort, so
it must be there deliberately. This theory is a bit harder to defend now that
we accept input from unparsed-text() and parse-json().)
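
To illustrate the round trip, here is a minimal Python sketch. It is not taken
from any serializer implementation: serialize_text is a hypothetical helper
that escapes only the characters needed for this demonstration, and the
reparse uses the expat-based parser in Python's standard library.

```python
import xml.etree.ElementTree as ET


def serialize_text(text: str) -> str:
    """Hypothetical helper: escape a text node's string value for XML output.

    A carriage return is written as a character reference so that the
    parser's end-of-line normalization does not remove it on reparse.
    """
    return (
        text.replace("&", "&amp;")   # must come first, before other escapes
            .replace("<", "&lt;")
            .replace(">", "&gt;")
            .replace("\r", "&#xD;")
    )


original = "line one\r\nline two"

# Written literally, the CR LF pair is normalized to a single LF on input.
naive_doc = "<doc>" + original + "</doc>"
reparsed_naive = ET.fromstring(naive_doc).text

# Written as a character reference, the CR survives the reparse.
escaped_doc = "<doc>" + serialize_text(original) + "</doc>"
reparsed_escaped = ET.fromstring(escaped_doc).text
```

In this sketch reparsed_naive has lost the carriage return, while
reparsed_escaped is identical to the original string.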

3. Does "newline" always refer to "&#xa;" sequences in the input, or does it
also refer to "&#xd;&#xa;"?

It refers to x0A.

4. Would it make sense to specify newline handling globally for all rules in
the spec?

Quite possibly, but there will be differences between output methods.


Received on Wednesday, 28 October 2015 09:54:26 UTC