
Re: null vs empty in XML Schemas

From: David Beech <dbeech@us.oracle.com>
Date: Tue, 04 Jan 2000 16:04:31 -0800
Message-ID: <38728A8E.6B6326C0@us.oracle.com>
To: costello@mitre.org
CC: www-xml-schema-comments@w3c.org, xml-dev@ic.ac.uk
> Message-ID: <386F4B76.106883C7@mitre.org>
> Date: Sun, 02 Jan 2000 07:58:30 -0500
> From: Roger Costello <costello@mitre.org>
> To: xml-dev@ic.ac.uk
> CC: www-xml-schema-comments@w3c.org, "Schneider,John C." <jcs@mitre.org>, "Cokus,Michael S." <msc@mitre.org>, costello@mitre.org
> Subject: null vs empty in XML Schemas
>
> Hi Folks,
>
> I have two questions:
> 1. What is the difference between null and empty?
> 2. How is nullable/null used?
>
> My understanding of the difference between null and empty is:
>
> - If an element is declared to be of type empty then in the XML instance
>   document the content of that element is a string of length zero.

That's close, but strictly speaking there are a couple of small
points to note:

"empty" is not a type per se, but a type can be defined with
empty content, e.g.

  <type name="whatever" content="empty"/>

and possibly with attribute declarations too.

An element with such a type is "constrained to have no content"
(3.4.5), which is not quite the same as having
a datatype that is string with length zero, although of
course they could look the same in an XML instance document.
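
As a rough illustration only (it is not taken from the draft), the
sketch below uses Python's standard xml.etree.ElementTree to show that
a generic parser sees nothing in the instance that separates the two
cases. The element name "whatever" and the helper has_no_content are
placeholders invented for this example.

```python
import xml.etree.ElementTree as ET


def has_no_content(elem):
    """Return True if the element has no child elements and no character data."""
    return len(elem) == 0 and not (elem.text or "")


# Both spellings of an element without content parse to the same thing.
empty_tag = ET.fromstring("<whatever/>")
start_end = ET.fromstring("<whatever></whatever>")
with_text = ET.fromstring("<whatever>x</whatever>")

for elem in (empty_tag, start_end, with_text):
    print(ET.tostring(elem, encoding="unicode"), has_no_content(elem))
```

Whether such an element was declared with content="empty" or with a
string datatype of length zero can only be learned from the schema.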

> - If an element is declared with nullable="true" then in the XML
>   instance document if the element has xsi:null="true" then this
>   element has undefined content.

We need to do a better job of explaining the exact significance
of nulls within XML Schema - this is awaiting the complete update
of the Conformance chapter, and more expository material.

The xsi:null='true' just serves as a marker, and within the
world of XML Schema the content is well-defined: there must not
be any (3.4.9).  How the schema designer intends the marker to be
interpreted, e.g. as corresponding to a database NULL, is outside
the province of XML Schema conformance, but might be conveyed in
an annotation.

> Is this a correct understanding of the difference between empty and
> null?

In the schema, the difference is between a type that has content='empty'
and a type that (usually, although not necessarily) has non-empty content.

In the instance, null will have the null marker attribute, and both will
have empty content.
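
A minimal sketch of that instance-level view, again using Python's
standard xml.etree.ElementTree. The namespace URI bound to "xsi" is an
assumption based on the 1999 drafts, and the function describe and its
labels are hypothetical names for this example, not anything defined
by XML Schema.

```python
import xml.etree.ElementTree as ET

# Assumption: instance namespace URI as used by the 1999 drafts; adjust as needed.
XSI = "http://www.w3.org/1999/XMLSchema-instance"


def describe(elem):
    """Classify an element by its null marker and content (instance view only)."""
    is_null = elem.get("{%s}null" % XSI) == "true"
    has_content = len(elem) > 0 or bool(elem.text)
    if is_null and has_content:
        # The marker requires that there be no content at all.
        return "invalid: null marker with content"
    if is_null:
        return "null"
    return "has content" if has_content else "empty"


doc = ET.fromstring(
    '<PersonName xmlns:xsi="%s">'
    "<forename>John</forename>"
    '<middle xsi:null="true"/>'
    "<surname>Doe</surname>"
    "</PersonName>" % XSI
)

# Map each child element name to its classification.
labels = {child.tag: describe(child) for child in doc}
print(labels)
```

The sketch looks only at the instance; it does not check that the
element was declared with nullable="true" in the schema.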

> Below is my understanding of how nullable/null is to be used:
>
> Here is an example XML Schema snippet declaring an element (middle) with
> nullable="true":
>
> <element name="PersonName">
>       <type>
>             <element name="forename" type="NMTOKEN"/>
>             <element name="middle" type="NMTOKEN" nullable="true"/>
>             <element name="surname" type="NMTOKEN"/>
>       </type>
> </element>
>
> Here is an example XML instance document conforming to the above schema
> snippet, where the middle element has been set to a null value:
>
> <PersonName>
>       <forename>John</forename>
>       <middle xsi:null="true"/>
>       <surname>Doe</surname>
> </PersonName>
>
> Thus, middle may contain a NMTOKEN value, or it may indicate that there
> is no defined value.  Is this the correct usage of nullable/null?

Yes, subject to resolution of the issue below.

> My reason for asking is that in the XML Schema spec, after nullable/null
> is discussed, there is an Issue section stating:
>
> "Issue (nullRequiresEmpty): Is it a precondition for being nullable that
> the element's contentType allow no content? If not, then more needs to
> be said above, if so, this needs to be spelled out."
>
> This isn't consistent with the above example.  I thought that an element
> declared with nullable="true" can have a value in the instance document
> when a value is available.  When no value is available then we can
> indicate this in the instance document by setting xsi:null="true".

Your understanding is correct.

> This Issue seems to say that elements declared with nullable="true" can
> never have a value in the instance document.  Thus, the middle element
> can never have a value.  Wherein lies the truth?  /Roger

There's an ambiguity in the wording of the issue - where it says "allow no
content", it should be read as "allow the possibility of no content", and
not as "never allow any content" (cf "take no prisoners"!).

Thanks for helping to clarify this,

  David
Received on Tuesday, 4 January 2000 19:07:11 GMT
