Re: Serialisation bugs

On Tuesday 09 May 2023 15:31:06 (+02:00), John Lumley wrote:


On 09/05/2023 11:43, Steven Pemberton wrote:

I ran into a bug while giving the tutorial (and another bug while writing 
this mail), so we probably need tests:

Test 1

ixml:

     data: a, " ", a, " ", a.
     -a: -" ".

input:
     a a a


On my processor (https://johnlumley.github.io/jwiXML.xhtml) the input 'a a 
a' gives an error, as it should, since 'a' never appears as a terminal.



My mistake. It should be a: -"a".


Steven


With five spaces ('     ') it produces <data>  </data> (two spaces) if 
indentation is suppressed (not available yet on the public workbench), 
and <data/> if indentation is enabled.
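
The two-space result can be checked by hand. Below is a minimal Python sketch, not an ixml processor: it hard-codes Test 1's grammar as originally posted (data: a, " ", a, " ", a. with -a: -" ".), where each a matches a space that is deleted and each quoted " " is kept. The name parse_test1 is made up for illustration.

```python
from xml.sax.saxutils import escape

def parse_test1(text):
    """Hand-coded recogniser for Test 1 as originally posted (illustrative only)."""
    # Each entry is (literal to match, keep in output?).
    # a -> -" " is a deleted space; the quoted " " terminals are kept.
    pattern = [(" ", False), (" ", True), (" ", False), (" ", True), (" ", False)]
    pos = 0
    kept = []
    for literal, keep in pattern:
        if not text.startswith(literal, pos):
            raise ValueError(f"no parse at offset {pos}")
        if keep:
            kept.append(literal)
        pos += len(literal)
    if pos != len(text):
        raise ValueError(f"unexpected input at offset {pos}")
    # -a hides the element, so only the kept spaces end up inside <data>.
    return "<data>" + escape("".join(kept)) + "</data>"

print(parse_test1("     "))   # <data>  </data>
```

With this grammar the input 'a a a' raises ValueError at offset 0, matching the error reported above.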




The result on my processor is displayed in a <pre> whose text is 
fn:serialize($result,map{'indent':true()}), and reading the XQuery and XSLT 
Serialization Recommendation suggests that, with indentation enabled in XML 
output mode, it is an implementation decision whether whitespace-only text 
is preserved:

Whitespace characters MAY be added adjacent to a text node only if the text 
node contains only whitespace characters. Whitespace characters in such a 
text node MAY also be elided or replaced. For example, a tab MAY be 
inserted as a replacement for existing spaces.

I can't find a definition of 'elided' here (does it encompass 'delete 
completely' or is it purely 'shortened'?) but with indentation disabled, 
the whitespace characters do appear, along with the closing tag. However, I 
think that most users would prefer to look at a 1000-part XML tree with 
indentation, so I'm adding an option to disable the indentation if needed.
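
To make the difference concrete, here is a minimal Python sketch of the two behaviours. It is not how fn:serialize or any particular processor is implemented: serialize_sketch is a made-up name, and its indent flag only models the permitted elision of whitespace-only text nodes, without adding any actual indentation.

```python
import xml.etree.ElementTree as ET

def serialize_sketch(root, indent=False):
    """Illustrative only: model eliding whitespace-only text when indenting."""
    # Work on a copy so the caller's tree is not modified.
    tree = ET.fromstring(ET.tostring(root))
    if indent:
        for node in tree.iter():
            # A text node holding only whitespace may be elided under indent.
            if node.text is not None and node.text.strip() == "":
                node.text = None
    return ET.tostring(tree, encoding="unicode")

data = ET.Element("data")
data.text = "  "  # two spaces, as in the expected output of both tests

print(serialize_sketch(data))               # <data>  </data>
print(serialize_sketch(data, indent=True))  # <data />  (ElementTree's spelling of <data/>)
```

A test harness that compares serialized strings would therefore need indentation switched off, or would have to compare the parsed trees instead.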




The same remarks apply to the second test.



expected output:


     <data>  </data>


(The bug displayed this as <data/>)

Test 2:

     data: +"  ".

No input

Expected output


     <data>  </data>

Steven

Received on Tuesday, 9 May 2023 13:42:08 UTC