[Bug 29217] Serialization of newlines

https://www.w3.org/Bugs/Public/show_bug.cgi?id=29217

--- Comment #2 from Christian Gruen <christian.gruen@gmail.com> ---
Thanks for the prompt discussion of this issue.

An additional serialisation parameter might be useful (users of BaseX have
asked for such a parameter in the past; we called it "newline"). However, my
original intention with this bug report was to get some clarification on the
"current rules". This is what I find in the spec:

  5.1.3 XML Output Method: the encoding Parameter

  When outputting a newline character in the instance of the data model, the
  serializer is free to represent it using any character sequence that will be
  normalized to a newline character by an XML parser, unless a specific mapping
  for the newline character is provided in a character map (see 11 Character
  Maps).

  8 Text Output Method

  The Text output method serializes the instance of the data model by
  outputting the string value of the document node created by the markup
  generation step of the phases of serialization without any escaping.

  A newline character in the instance of the data model MAY be output using
  any character sequence that is conventionally used to represent a line
  ending in the chosen system environment.
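
As a rough illustration of the text method rule quoted above (a sketch only,
in Python; the function `serialize_text` and its `newline` argument are
hypothetical names, not taken from the spec or from any implementation), a
serializer that maps each U+000A to the platform line ending could look like
this:

```python
import os


def serialize_text(value: str, newline: str = os.linesep) -> str:
    """Write out a string value, replacing each U+000A with `newline`.

    Assumption: only U+000A counts as a "newline character". A U+000D that is
    already present in the input is passed through untouched.
    """
    return value.replace("\n", newline)


# Explicit target sequences, so the result does not depend on the platform.
unix_style = serialize_text("a\nb", newline="\n")
windows_style = serialize_text("a\nb", newline="\r\n")

# Input that already contains CR LF: the naive mapping doubles the CR.
doubled_cr = serialize_text("a\r\nb", newline="\r\n")
```

With this naive mapping, an input that already contains "\r\n" comes out as
"\r\r\n" when the target sequence is "\r\n", which is one reason the handling
of carriage returns in the input matters.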

These are some of the questions that I believe may need to be answered in the
spec:

1. What is the default for output methods other than XML or text?
2. Do newline characters need to be normalized (see my initial comment)?
3. Does "newline" always refer to "&#xa;" sequences in the input, or does it
also refer to "&#xd;&#xa;"?
4. Would it make sense to specify newline handling globally for all rules in
the spec?
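
For context on questions 2 and 3, the following sketch (Python standard
library only; it illustrates XML end-of-line handling on the parsing side and
says nothing about how any particular serializer behaves) shows what
"normalized to a newline character by an XML parser" amounts to: literal CR LF
and a lone CR are turned into LF, whereas a carriage return written as a
character reference survives parsing. The helper name `parsed_text` is
hypothetical.

```python
import xml.etree.ElementTree as ET


def parsed_text(xml: str) -> str:
    """Return the text content of the root element after parsing."""
    return ET.fromstring(xml).text


# Literal line endings in the serialized document.
from_lf = parsed_text("<a>x\ny</a>")
from_crlf = parsed_text("<a>x\r\ny</a>")
from_cr = parsed_text("<a>x\ry</a>")

# The same characters written as character references.
from_refs = parsed_text("<a>x&#xd;&#xa;y</a>")
```

Here `from_lf`, `from_crlf` and `from_cr` all hold "x\ny", while `from_refs`
holds "x\r\ny". A serializer that writes a data model newline as literal CR LF
therefore round-trips to a single newline, but a carriage return that is part
of the data model only round-trips if it is escaped.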

Received on Wednesday, 28 October 2015 07:04:27 UTC