
Re: SV: SV: empty elements and xsd:string

From: Henry S. Thompson <ht@inf.ed.ac.uk>
Date: Mon, 21 Feb 2005 17:14:43 +0000
To: Bryan Rasmussen <brs@itst.dk>
Cc: "'George Cristian Bina'" <george@oxygenxml.com>, xmlschema-dev@w3.org
Message-ID: <f5bis4lsum4.fsf@erasmus.inf.ed.ac.uk>

I'm not sure I follow.

Text nodes and element values are not the same thing: you're mixing
the XPath data model and the XML Infoset (and how W3C XML Schema uses
the Infoset).

Consider your example

 <tag><hi>  </hi><hi></hi><hi/></tag>

The starting point for validation for all simple types is the *initial
value* of the element or attribute information item in question [1]:

  [T]he *initial value* of an element information item is the string
  composed of, in order, the [character code] of each character
  information item in the [children] of that element information item.

So in your example we get one string with two characters (both spaces)
and two strings with no characters (the empty string).  The second and third
'hi' elements in your example are treated identically, _because they
are identical at the Infoset level_.
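
To make that concrete, here is a rough sketch in Python that computes
something like the initial value for each 'hi' element in the example.
It is only an approximation under stated assumptions: it uses the
standard library's xml.etree.ElementTree rather than a real Infoset or
schema processor, and initial_value is a hypothetical helper name of
my own, not anything defined by the REC.

```python
import xml.etree.ElementTree as ET


def initial_value(elem):
    """Approximate the schema 'initial value' of an element.

    Hypothetical helper: concatenates, in document order, the character
    data that sits directly inside elem. ElementTree keeps that data in
    elem.text (before the first child) and in each child's .tail.
    """
    parts = [elem.text or ""]                         # None means no characters
    parts.extend(child.tail or "" for child in elem)  # text between/after children
    return "".join(parts)


doc = ET.fromstring("<tag><hi>  </hi><hi></hi><hi/></tag>")
values = [initial_value(hi) for hi in doc.findall("hi")]

for v in values:
    # Show each string and its length in characters.
    print(repr(v), len(v))
```

Run against the example, the first element yields a string of two
spaces and the other two yield the empty string, so nothing in the
result distinguishes <hi></hi> from <hi/>.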

Are you suggesting that the empty string is not a string, or rather
that it should not have been a member of xs:string?  I think that
would violate many more expectations than including it, as the REC
does.

Or are you suggesting that the Infoset is wrong in failing to
distinguish the second and third 'hi' elements above?

ht

[1] http://www.w3.org/TR/xmlschema-1/#key-iv
-- 
 Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
                     Half-time member of W3C Team
    2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
            Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
                   URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]
Received on Monday, 21 February 2005 17:14:47 GMT
