RE: [FS] [XQuery] empty text nodes

We made a decision a couple of weeks ago to allow text nodes to be
zero-length so long as they have no parent. I think this is essentially what
you are recommending.

Michael Kay 

> -----Original Message-----
> From: public-qt-comments-request@w3.org 
> [mailto:public-qt-comments-request@w3.org] On Behalf Of Per Bothner
> Sent: 19 May 2004 23:40
> To: public-qt-comments@w3.org
> Subject: [FS] [XQuery] empty text nodes
> 
> 
> Both the Formal Semantics and the internal XQuery 
> specification seem to
> allow empty text nodes, but this conflicts with the Data Model:
> 
>    6.7 Text Nodes
>    6.7.1 Overview
>    Text nodes must satisfy the following constraint:
>      1. A text node must not contain the empty string as its content.
> 
> The formal semantics uses empty text nodes so it can suppress spaces
> between adjacent enclosed expressions:
> 
>    4.7.1 Direct Element Constructors
>    Normalization
>    [ElementContent1 ..., ElementContentn]ElementContent-unit, n > 1
>    ==
>    fs:item-sequence-to-node-sequence([ ElementContent1 
> ]ElementContent ,
>      text { "" }, ..., text { "" }, [ ElementContentn]ElementContent
> 
>    4.7.1.1 Attributes
>    Normalization
>    [ AttributeValueContent1 ...,
>      AttributeValueContentn]AttributeContent-unit, n > 1
>    ==
>    fs:item-sequence-to-untypedAtomic(
>     [AttributeValueContent1]]AttributeContent , text { "" }, ...,
>       text {""}, [ AttributeValueContentn]AttributeContent
> 
> Note also that the rule for Dynamic Evaluation of Text Node
> Constructors does not check for an empty string.  Only the
> static typing consider this by returning 'text?'.
> 
> Also, the informal XQuery specification has a problem here:
> 
>    3.7.3.4 Text Node Constructors
>    2. If the result of atomization is an empty sequence, no text node
>    is constructed. ...
> 
> However, it says nothing about a content expression that consists of
> a single empty string.  I suggest removing the above sentence, and
> modifying the 3rd rule:
>    3. The individual strings resulting from the previous step 
> are merged
>    into a single string by concatenating them with a single space
>    character between each pair.  If the resulting string is the empty
>    string, no text node is constructed (and the expressions returns an
>    empty sequence).  Otherwise, the resulting string becomes 
> the content
>    of the constructed text node.
> 
> Alternatively, change the data model to allow empty text 
> nodes.  I don't
> see any particular reason to prohibit them, even if they're not in the
> XML infoset.  Just replace the sentence "In addition, document and
> element nodes impose the constraint that two consecutive text 
> nodes can
> never occur as adjacent siblings" by adding "nor can any of their text
> node children have the empty string as the content".
> 
> My recommendation:
> 1. Change the Data Model to allow empty text nodes.
> 2. Change the Text Node Constructor to allow empty string content.
> 3. Change element and ocument constructor normalization so that they
> remove empty text nodes *and* combine adjacent text nodes.
> -- 
> 	--Per Bothner
> per@bothner.com   http://per.bothner.com/
> 

Received on Thursday, 20 May 2004 04:46:48 UTC