RE: XForms Basic and Schema Validation

Hi John,

1. Simple types and simple content are two different things.
2. Datatypes *are* simple types!
 
Regards,
 
Mark


Mark Birbeck
CEO
x-port.net Ltd.

e: Mark.Birbeck@x-port.net
t: +44 (0) 20 7689 9232
b: http://internet-apps.blogspot.com/
w: http://www.formsPlayer.com/

Download our XForms processor from
http://www.formsPlayer.com/

 

________________________________

From: John Boyer [mailto:boyerj@ca.ibm.com] 
Sent: 08 May 2006 22:32
To: Mark Birbeck
Cc: www-forms@w3.org; www-forms-request@w3.org
Subject: RE: XForms Basic and Schema Validation



	
	Hi Mark, 
	
	The notion of datatype is orthogonal to simple vs. complex type. 
	
	Section 2.2.1.3 of Schema Part 1 clearly states that you can have a
	complex type with simple content. And you can datatype-validate the
	simple content of a complex type.
	
	>I agree if you were using the term datatype in its proper sense. But
	>datatypes are simple types, not complex ones, so I disagree, since it
	>sounds like you are using it to cover complex types.
	
	I am using datatype in its proper sense, which is also what the spec
	is doing, I believe. Datatypes are not simple types. They are
	descriptions of string validations, which can be used to validate
	content of both simple and complex types.
	
	>Well...firstly it actually says "XML Schema datatypes" which to me
	>means 'the datatypes from XML Schema Part 2'. In other words, it
	>doesn't deal with other types defined by an author.
	
	Sorry, but you are misreading "XML Schema datatypes" as "XML Schema
	built-in datatypes". If XForms Basic had intended to refer to the
	built-in datatypes, it should have used that term. But XML Schema
	Part 2 is about providing the machinery for defining one's own
	datatypes. It then uses that machinery to create a number of built-in
	datatypes. Note that the built-in datatypes can be used in complex
	types that define simple content.
	
	So, we are left with the fact that XForms 1.0 was designed to address
	the *main* use case for validation, which is user input validation.
	That's why the spec contains language associating the type MIP with
	schema datatype. If anything more than that works for an
	implementation, it seems to me to be a bit of a bonus for that
	implementation.
	
	John M. Boyer, Ph.D.
	Senior Product Architect/Research Scientist
	Co-Chair, W3C XForms Working Group
	Workplace, Portal and Collaboration Software
	IBM Victoria Software Lab
	E-Mail: boyerj@ca.ibm.com  http://www.ibm.com/software/
	
	Blog: http://www.ibm.com/developerworks/blogs/page/JohnBoyer
	
	
	
	
	
	"Mark Birbeck" <mark.birbeck@x-port.net> 
	Sent by: www-forms-request@w3.org
	Date: 05/08/2006 05:33 AM
	To: <www-forms@w3.org>
	Subject: RE: XForms Basic and Schema Validation
	
	Hi John,
	
	Here's an 'executive summary' of the points that I'll provide
	explanation for, inline below:
	
	 One problem with XForms Basic as defined is that it doesn't
	 explain how the 'downgrading' of a complex type should take
	 place. The second bullet (in XForms Basic) provides for the
	 *possibility* of this downgrading by saying that a processor
	 "may" choose to only support simple types, but nowhere is it
	 explained what it would mean in practice. (And in reply to
	 your and Raman's view that the third bullet deals with this,
	 I'm afraid it doesn't--it deals with *datatypes*, which are
	 simple types.)
	
	
	> The sentence says that all Schema datatypes other than the ones
	> listed are to be treated as string, not all built-in schema
	> datatypes.
	
	Well...firstly it actually says "XML Schema datatypes" which to me
	means 'the datatypes from XML Schema Part 2'. In other words, it
	doesn't deal with other types defined by an author.
	
	But even if you ignore the "XML Schema" bit, the term used is
	'datatype', which has a very specific meaning; to infer that this
	sentence suggests that any *complex* type that the author has defined
	should also be converted to xs:string would require you to include
	'complex types' within the definition of 'datatypes' which--as you
	rightly say in a discussion with Allan on that very subject--is
	incorrect. :)
	
	I suppose you could say that using the word 'datatype' was a mistake,
	and what was actually intended was the more general term 'type
	definition'; but that makes things worse, since this term includes
	both simple and complex types, so the third bullet would actually be
	saying that any type definition other than those listed would be
	xs:strings--obviously not what is intended.
	
	So my suggestion is for the WG to stop trying to rush this out, and
	properly resolve the issue of how complex types behave. (The spec
	hasn't moved for about 2 1/2 years; I think another week isn't going
	to hurt.)
	
	As it happens, I don't really see anything wrong with the third
	bullet in relation to its stated subject matter, which is datatypes.
	All it says is that for some datatypes you don't need to provide any
	special regular expressions if you are doing a 'subset processor'.
	
	
	However, the big thing that *is* glaringly missing is the bridge
	between the goal that has been described (of not requiring an XForms
	Basic processor to have a full XML Schema implementation) and the
	reality of the prose; we need something very clear that explains how
	a Basic processor should proceed if it is going to 'downgrade'
	complex types.
	
	In working through some kind of proposal for this, it seems to me
	that mapping to xs:string may not actually be the best solution. I'll
	try to explain, and people can say what they think.
	
	
	Looking at the entirety of XML Schema, I would say that what we're
	after is the following behaviour for a 'subset' schema processor:
	
	 * a reference to any undefined type is an error;
	
	 * any *datatype* that is not in the list in
	   bullet 3 has a regular expression that is
	   equivalent to xs:string;
	
	 * any *simple* type is processed as normal (i.e.,
	   as it would be in Full);
	
	 * any *complex* type is processed as if it were
	   a simple type, with all element and attribute
	   definitions ignored.
	
	The first point may or may not be implicit in our schema processing
	anyway, but I think it needs some clarification. However, we can
	ignore it for this discussion since it should really be defined in
	XForms Full anyway.
	
	The second point, on the behaviour of datatypes, is already given by
	bullet 3 in the spec, so we need do nothing here either.
	
	Similarly, on the third point, the behaviour of simple types is
	already given by bullet 2 in the spec, and although it might benefit
	from clarification, it's at least there in part.
	
	So all we need is an extra bullet that clarifies how complex types
	are converted, and here I'm proposing *not* that they are
	automatically converted to strings--which is the current
	proposal--but that the *structural* features are ignored.
	
	
	The following example is given in the XML Schema specification of how
	an element 'width' can have a value which is a non-negative integer,
	as well as an attribute which indicates the unit of that non-negative
	integer:
	
	 <xs:complexType name="length1">
	   <xs:simpleContent>
	     <xs:extension base="xs:nonNegativeInteger">
	       <xs:attribute name="unit" type="xs:NMTOKEN"/>
	     </xs:extension>
	   </xs:simpleContent>
	 </xs:complexType>
	
	 <xs:element name="width" type="length1" />
	
	 <width unit="cm">25</width>
	
	As far as I can see, a processor that can handle simple types (which
	all Basic processors will do) can process the example I just gave as
	easily as it can process the following:
	
	 <xs:simpleType name="length1">
	   <xs:restriction base="xs:nonNegativeInteger" />
	 </xs:simpleType>
	
	By doing this, at very little cost we reduce the gap between XForms
	Basic and XForms Full. (From the XML Schema terminology point of
	view, what I'm saying is that since:
	
	 simple content = simple type + attributes
	
	we can 'remove' the attributes and still make use of the simple type
	definition, rather than just saying 'string'.)
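
The 'remove the attributes' idea above can be sketched roughly as follows. This is only an illustration, not anything defined by XForms Basic: the function name `downgrade` is hypothetical, and the sketch assumes the complex type uses xs:simpleContent with an xs:extension, as in the 'length1' example. What should happen for other complex types is left open here.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# The 'length1' complex type from the XML Schema example quoted above.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="length1">
    <xs:simpleContent>
      <xs:extension base="xs:nonNegativeInteger">
        <xs:attribute name="unit" type="xs:NMTOKEN"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:schema>
"""


def downgrade(complex_type):
    """Hypothetical helper: return (name, base) for a complexType with
    simpleContent, ignoring its attribute declarations.

    Returns None when there is no simpleContent/extension child, because
    this sketch makes no assumption about that case.
    """
    derivation = complex_type.find(f"{{{XS}}}simpleContent/{{{XS}}}extension")
    if derivation is None:
        return None
    # The xs:attribute children are simply not consulted.
    return complex_type.get("name"), derivation.get("base")


root = ET.fromstring(SCHEMA)
result = downgrade(root.find(f"{{{XS}}}complexType"))
print(result)  # prints ('length1', 'xs:nonNegativeInteger')
```

The value 25 in <width unit="cm">25</width> would then be checked against xs:nonNegativeInteger rather than against xs:string.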
	
	
	> "In my opinion" This is why any attempt to assign a datatype other
	> than the ones listed should be regarded as string.
	
	I agree if you were using the term datatype in its proper sense. But
	datatypes are simple types, not complex ones, so I disagree, since it
	sounds like you are using it to cover complex types.
	
	
	> At a higher level, the purpose of basic was exactly so that basic
	> processors did not have to do a very smart schema engine. This goal
	> does not seem to be achieved if basic processors have to be smart
	> enough to read the schema definitions to figure out that the
	> datatype is undefined.
	
	That's a slightly different issue. The processor has to process the
	XML Schema mark-up anyway, in order to find the simple types.
	Spotting undefined types should be easy.
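
As a rough sketch of how little is needed to spot an undefined type: the function name `undefined_types` is hypothetical, the 'height' element and 'length2' type are made up for the example, and the sketch assumes a schema with no target namespace, so that unprefixed type references are local and prefixed ones (such as xs:string) can be treated as built-ins.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# 'length2' is deliberately never defined.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="length1">
    <xs:restriction base="xs:nonNegativeInteger"/>
  </xs:simpleType>
  <xs:element name="width" type="length1"/>
  <xs:element name="height" type="length2"/>
</xs:schema>
"""


def undefined_types(schema_root):
    """Hypothetical helper: list unprefixed type references that match
    no named simpleType or complexType in the same schema document."""
    defined = {
        node.get("name")
        for tag in ("simpleType", "complexType")
        for node in schema_root.iter(f"{{{XS}}}{tag}")
        if node.get("name")
    }
    referenced = [
        node.get(attr)
        for node in schema_root.iter()
        for attr in ("type", "base")
        if node.get(attr)
    ]
    # Assumption: prefixed names refer to built-ins and are skipped.
    return sorted({ref for ref in referenced if ":" not in ref and ref not in defined})


missing = undefined_types(ET.fromstring(SCHEMA))
print(missing)  # prints ['length2']
```

The same pass over the mark-up that collects the simple types also yields the set of defined names, so the check adds very little work.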
	
	
	> "In my opinion" An implementation should be able to write lexical
	> analyzers for just those 26 given datatypes, and apply the right
	> analyzer for the given datatype and otherwise it should be able to
	> pretend that the type assignment refers to string.
	
	I understand the goal, and explained that in my post...but I still
	don't like the fact that you can't predict what the platform you are
	running on will do. I really don't think it's a good idea to allow
	Basic to 'maybe' do this or 'maybe' do that. I would prefer to see
	the behaviour defined clearly and then for us to say that this is how
	a Basic processor will behave. However, I'm happy to leave that issue
	to one side whilst we actually sort out the lack of clarity on the
	behaviour.
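
For reference, the 'lexical analyzer with a string fallback' behaviour being discussed might look something like the sketch below. It is an illustration only: the three datatypes are picked for the example rather than taken from the list in the spec, the patterns are simplified approximations of their lexical forms, and `is_valid` and `my:anything` are hypothetical names.

```python
import re

# Simplified patterns for three XML Schema built-in datatypes,
# chosen purely for illustration.
LEXICAL_PATTERNS = {
    "xs:boolean": re.compile(r"true|false|1|0"),
    "xs:integer": re.compile(r"[+-]?[0-9]+"),
    "xs:nonNegativeInteger": re.compile(r"[+]?[0-9]+|-0+"),
}


def is_valid(type_name, value):
    """Hypothetical helper: check a value against the analyzer for its
    datatype, falling back to 'string' for anything without one."""
    pattern = LEXICAL_PATTERNS.get(type_name)
    if pattern is None:
        # No analyzer for this type: pretend it is a string, so any
        # text at all is accepted.
        return True
    # Surrounding whitespace is ignored (an approximation of collapse).
    return pattern.fullmatch(value.strip()) is not None


print(is_valid("xs:nonNegativeInteger", "25"))  # prints True
print(is_valid("xs:nonNegativeInteger", "-5"))  # prints False
print(is_valid("my:anything", "-5"))            # prints True
```

The last call shows the unpredictability at issue: the same value is rejected or accepted depending on whether the processor happens to have an analyzer for the type.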
	
	Regards,
	
	Mark
	
	
	Mark Birbeck
	CEO
	x-port.net Ltd.
	
	e: Mark.Birbeck@x-port.net
	t: +44 (0) 20 7689 9232
	b: http://internet-apps.blogspot.com/
	w: http://www.formsPlayer.com/
	
	Download our XForms processor from
	http://www.formsPlayer.com/
	
	
	
	
	

Received on Monday, 8 May 2006 22:42:10 UTC