My user experience of the user experience workshop from Steven Ericsson-Zenith on 2005-06-23 (xmlschema-dev@w3.org from June 2005)

From: Steven Ericsson-Zenith <steven@semeiosis.com>
Date: Thu, 23 Jun 2005 14:40:26 -0700
To: xmlschema-dev@w3.org
Cc:
Message-Id: <1119562826.20500@memeiosys.com>
Here is an informal "amicus curiae" contribution to the XMLSchema 
committee - notes from the workshop of the past couple of days.

It was hard not to come across as a formalist at this workshop - 
I do understand the pragmatics of formal language development 
and I want to clarify some of my concerns. 

There are two ways to view these pragmatics - one is the long 
term pragmatics of refinement and the second is the short term 
pragmatics of necessity.

An example of a short term pragmatic is the necessity to produce
a result - in the case of a working group, the production of a
specification. 

Typing questions, IMHO, is another example of a short term 
pragmatic - who can doubt the necessity that precision decimal 
should be supported or that date and time stamps should cover
the scope of those specified in SQL?  In the absence of a 
mechanism to specify the base types of a schema it would be an 
error not to support these types IMHO. So this could be solved 
immediately with an errata that extends the base types.

[As an aside the base type mapping problem is identical
conceptually to the problem of binding schema types to 
application types. So a single solution that solves both 
problems should be proposed.]

If your horizon is long term - as I said at the workshop, 
I want schemas (or "schemata") that I write to be valid 40 
or 100 years for now - then base type specification and a 
formal method for binding types between application 
implementations seems essential.

An example of a long term pragmatic is the necessity of 
refinement and this was expressed by everyone at the
workshop under the term "versioning" and manifests for
the committee in the need to release subsequent 
specifications of the XMLSchema standard.

In what many considered another example of a short
term pragmatic, is what many saw as the need to specify
"profiles" - subsets of the specification that could be
guaranteed in specific use-cases. I will argue that
this is a long term consideration also.

I spoke briefly in my presentation about my interest 
in concept distinction, and this is another case in point. 
The concepts of versioning and profiling are essentially 
the same - and can thus be addressed by the same solution.  

Therefore, I would strongly urge the committee not 
invent specialized solutions for what appear initially 
to be distinct concepts.

I also want to clarify what I mean by formal specification 
and what others may mean. When engineers ask for a formal 
specification they do not necessarily need a Zed 
specification. While the computer science formal methods 
community has gone down the road of building new algebras 
it is by no means a necessity that a formal specification 
be entirely written in what has essentially become a 
private language.

John von Neumann and David Hilbert used informal language
too - the tendency toward strict private langauges is
a relatively recent phenomenon - one that manifestly has
not served computer engineering well since it has built
an unneccessary divide between formalists and engineers. 

It is perfectly possible - and pragmatically necessary - 
to write a formal specification that engineers building 
tools can use.  The specification does need a mathematical 
foundation but it is not always necessary that users of the 
specification appreciate that foundation.  We have known
how to do this since the Algol 60 report led the way,
written almost 50 years ago.

As I read the existing specification it is apparent that
the authors did intend to write a formal specification 
of the type I describe but, it seems to me, that the 
mathematical foundations of the specification are unclear
- perhaps absent. Which is what I meant when I expressed 
at the workshop that the specification was from my POV
"insufficiently formal."

XMLSchema is not an imperative programming language so
the Algol 60 report does not help us much - but it does
seem possible to build the spec of a constrained data
desciption language on mathematical foundations
none-the-less.

I could sense the frustration in the Committee whenever 
I pushed for more formalization and it should be clear
that whatever the committee's experience is with formal 
methods, it would be a fundamental mistake to dismiss 
these methologies because of this experience. 

The issue is not whether the language should be specified 
formally - but rather how to specify it formally.  If past 
attempts have failed then it is not a failure of the 
method but a simple failure of communication between 
individuals. Three attempts to write the spec formally 
is no reason not to write a fourth - and perhaps 1.1 can
be that specification.

So, on review, here is a summary of my recommendations:

1. In version 1.1 specify a binding mechanism that permits
base types to be specified and use this mechanism to specify
the 1.0 base types as base types in 1.1. This mechanism would
also enable general binding of types between schemas and
applications.

2. In version 1.1 specify a profiling mechanism that permits
a guarantee in a schema - the guarantee is that the semantics
of the specified subset of the named schema will never change.
This could, perhaps, be implemented by an attribute on types 
that says the type cannot be redefined.

3. Specify using the mechanism of (2) a profile of 1.1 that is
the 1.0 specification and any other 1.0 profile - such as that
proposed for WSDL etc..

4. I support the call for constraints in the langauge 
(commonly called "co-constraints").  In my presentation 
I pointed out that there are two types of constraints.
Essentially, they are those that apply to the generator
and specify whether an instance is valid, and those that
apply to the data in valid instances used by a consumer
and essentially specify rules that apply to data.

An example of the first case: to ensure that a value is 
in a range - a value out of the range produces an invalid 
schema. Similarly, I want to ensure that the timestamp in 
a given field is earlier than the timestamp in other 
fields - otherwise the schema is invalid. I also asked
for a strong type inference that timestamps are state not 
declaration. This type of inference also applies to all 
calculations for example, summation.

An example of the second case - to use the example given 
by BT - is in a valid instance the value of a purchase
order field requires the sign off appears in an associated 
field. In this case the instance is valid and the consumer
needs to see the rule.

5. Finally, I pointed out that there is an error in the 
specification of recursive types. The tail of the recursion 
must read maxOccurs=0, otherwise you can specify infinite 
recursion in a schema. This is clearly an error.

With respect,
Steven

--
Dr. Steven Ericsson Zenith
http://www.semeiosis.com
Received on Thursday, 23 June 2005 21:40:33 UTC