possible bug in XML Schema 1.1 Part 2: Datatypes (was Re: question about lexical and value spaces)

From: Peter F. Patel-Schneider <pfps@research.bell-labs.com>
Date: Fri, 18 Jan 2008 04:25:46 -0500 (EST)
Message-Id: <20080118.042546.116614490.pfps@research.bell-labs.com>
To: noah_mendelsohn@us.ibm.com
Cc: mike@saxonica.com, www-xml-schema-comments@w3.org

This possible bug shows up in the
http://www.w3.org/TR/2006/WD-xmlschema11-2-20060217/ 
version of the document.

From: noah_mendelsohn@us.ibm.com
Subject: RE: question about lexical and value spaces
Date: Thu, 17 Jan 2008 20:24:39 -0500

> Michael Kay writes:
> 
> > There is intense debate about whether "ineffable values" (values with no
> > lexical representation) should be considered as being within the value 
> > space or not. 
> 
> Really?  I thought we were always clear that if there was no lexical form, 
> there was no value.  For example, I thought it was pretty clear that if 
> you used a pattern facet to restrict away all the lexical forms ending in 
> the digit 4 in a type derived from xs:integer, then the numbers 4, 14 and 
> so on were in fact not in the value space of the type.  

My understanding is that this is because facets really work in the value
space.

From http://www.w3.org/TR/xmlschema-2/#facets

	[Definition:] A facet is a single defining aspect of a value
	space. Generally speaking, each facet characterizes a value
	space along independent axes or dimensions.

	[Definition:] A constraining facet is an optional property that
	can be applied to a datatype to constrain its value space.

	Constraining the value space consequently constrains the
	lexical space. Adding constraining facets to a base
	type is described in Derivation by restriction (4.1.2.1).

Of course there is a wrinkle in this wrt the pattern facet:

	[Definition:] pattern is a constraint on the value space of
	a datatype which is achieved by constraining the lexical
	space to literals which match a specific pattern. The value of
	pattern must be a regular expression.

This works only because in 1.0 values must have lexical forms.


In http://www.w3.org/TR/xmlschema11-2/#rf-pattern there is

	4.3.4 pattern

	[Definition:] pattern is a constraint on the value space of
	a datatype which is achieved by constraining the lexical
	space to literals which match each member of a set of
	patterns.  The value of pattern must be a set of regular
	expressions.

which seems to depend on the 1.0 notion that all values must have
lexical forms.  Immediately after, there is

	pattern provides for:

 	* Constraining a value space to values that are denoted by
 	literals which match each of a set of regular expressions.

This would make a better definition in 1.1, I think.
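
As a rough illustration of that reading (a sketch only, not text from either
spec), the restricted value space can be computed as the set of values denoted
by literals that match every pattern. The function name denoted_values and the
sample literals are made up for this example, and Python's re.fullmatch is
assumed to be a close enough stand-in for XSD's implicitly anchored regular
expressions for these simple patterns.

```python
import re

# Assumed lexical space of xs:integer: optional sign followed by digits.
INTEGER_LITERAL = re.compile(r"[+-]?[0-9]+")

def denoted_values(literals, patterns):
    """Return the integers denoted by literals that match every pattern."""
    compiled = [re.compile(p) for p in patterns]
    values = set()
    for lit in literals:
        if not INTEGER_LITERAL.fullmatch(lit):
            continue  # not a literal of the base type at all
        if all(rx.fullmatch(lit) for rx in compiled):
            values.add(int(lit))  # the value this literal denotes
    return values

literals = ["3", "4", "04", "14", "15", "+15", "-24"]

# One pattern: no lexical form may end in the digit 4.
no_trailing_4 = denoted_values(literals, [r".*[^4]"])

# A set of patterns: a literal must match each member of the set.
signed_no_trailing_4 = denoted_values(literals, [r".*[^4]", r"[+-].*"])
```

Under these assumptions no_trailing_4 contains 3 and 15 but not 4 or 14, which
matches the xs:integer example quoted earlier: the values are excluded because
no surviving literal denotes them.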

So, I now think that there is a bug in
	http://www.w3.org/TR/xmlschema11-2/
with respect to the behaviour of the pattern facet.
I would hope that the bug is resolved in the way I mention above.

> Paul Biron and I 
> tend to recall often the discussion we had many years ago in line waiting 
> for dinner at a restaurant near the first New Orleans meeting at which we 
> pointed out how impractically hard it would be to enforce such things in 
> systems that in fact allow the values to be manipulated directly.  If you 
> have an API that purports to establish some new value of a datatype, it 
> can be very difficult to test whether there does or doesn't exist at least 
> one lexical form for it in the face of complex patterns.  Still, the 
> datatypes were focussed mainly on validation, and there is something very 
> appealing about being able to say that every value has at least one 
> serialization.  I was not aware that there was any serious consideration 
> of changing this. 
> 
> Suggestion:  can we take this discussion to the schemas IG list where more 
> WG members will see it?  As far as I know the comments list is tracked 
> very carefully for picking up new issues and bug reports, but it is not 
> necessarily subscribed by all members of the working group.
> 
> Noah

Umm, sure, you guys can hash things over as much as you want.  I was,
however, interested in getting an answer to something that was not clear
in the spec (or at least not documented as a change from 1.0), which is
why I sent the original message in to comments.  Now I'm reporting a
possible bug in the document, which is why I'm continuing to send to this
mailing list.
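
On the practical difficulty quoted above (testing, from an API, whether a value
has at least one lexical form), here is a small sketch of why checking the
canonical form alone is not enough. Everything in it is an assumption made for
illustration: the function names, the restriction to integers, and above all
the bound on leading zeros, which makes the search finite but also means a
False result is not a proof that no lexical form exists.

```python
import re

def candidate_literals(value, max_leading_zeros=3):
    """Yield some xs:integer-style literals that denote the given integer."""
    digits = str(abs(value))
    if value < 0:
        signs = ["-"]
    elif value == 0:
        signs = ["", "+", "-"]  # "-0" also denotes zero
    else:
        signs = ["", "+"]
    for sign in signs:
        for zeros in range(max_leading_zeros + 1):
            yield sign + "0" * zeros + digits

def has_lexical_form(value, patterns, max_leading_zeros=3):
    """Bounded search: True if some candidate literal matches every pattern."""
    compiled = [re.compile(p) for p in patterns]
    return any(
        all(rx.fullmatch(lit) for rx in compiled)
        for lit in candidate_literals(value, max_leading_zeros)
    )

# "4" fails the pattern, but "04" denotes the same value and matches.
leading_zero_ok = has_lexical_form(4, [r"0[0-9]+"])

# Every literal for 4 ends in the digit 4, so none can match.
no_form = has_lexical_form(4, [r".*[^4]"])

# "000004" would match, but it lies outside the default search bound.
missed = has_lexical_form(4, [r"0{5}[0-9]+"])
found = has_lexical_form(4, [r"0{5}[0-9]+"], max_leading_zeros=5)
```

The last two calls show the weakness of any bounded search: the answer depends
on how far one is willing to enumerate, which is one way of seeing why such a
test is hard in the face of complex patterns.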

Peter F. Patel-Schneider
Bell Labs Research
Received on Friday, 18 January 2008 09:53:51 GMT