Primitive Types DISGUISED as Derived Types

Issue: some of the built-in derived types are not really DERIVED types, but
primitive types.

This issue matters a great deal if you are implementing a light schema
processor. A light schema processor implements a minimal set of features and
covers 80% of the business use cases. One such approach is to implement only
the primitive built-in types natively, and to derive the built-in derived
types softly, using the schema for built-in datatypes (reference -
http://www.w3.org/TR/2000/CR-xmlschema-2-20001024/#schema). Is this
possible? The answer is NO. Here is the reasoning.

{NOTATION, timePeriod, time, date, month, year, century, recurringDate,
recurringDay} is the chosen subset for this discussion. By definition, these
built-in types have additional constraints that CANNOT be expressed using
the surface syntax described by the CR drafts. Nor can these additional
constraints be expressed using the schema component for datatype definitions,
the meta language used by Part 2 (reference -
http://www.w3.org/TR/2000/CR-xmlschema-2-20001024/#datatype-components). Let
us dissect the details, case by case.

[1] NOTATION (reference -
http://www.w3.org/TR/2000/CR-xmlschema-2-20001024/#NOTATION) - has two
such additional constraints (classified as schema constraints): (a) an
enumeration facet value is required for NOTATION; (b) the values of the
enumeration facet must be the names of declared notations. Currently,
NOTATION is a derived type of QName. In addition to participating in
validation, this derived type also triggers a SCHEMA INFOSET CONTRIBUTION:
[notation] or [notation system], [notation public] (reference -
http://www.w3.org/TR/2000/CR-xmlschema-1-20001024/#Notation_Declaration_details).
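
To make constraints (a) and (b) concrete, here is a minimal Python sketch of
the extra check a processor would have to hard-code for NOTATION. The
function name, the dictionary layout used for facets, and the sample
notation names are all hypothetical; none of them comes from the CR drafts.

```python
# Hypothetical sketch: the two NOTATION schema constraints written as a
# hand-coded check. Facets are modelled as a plain dict (an assumption).

def check_notation_derivation(facets, declared_notations):
    """Return a list of violations for a type derived from NOTATION."""
    errors = []
    values = facets.get("enumeration")
    # (a) an enumeration facet value is required
    if not values:
        errors.append("enumeration facet is required")
        return errors
    # (b) every enumeration value must name a declared notation
    for name in values:
        if name not in declared_notations:
            errors.append("undeclared notation: " + name)
    return errors


declared = {"jpeg", "png"}  # made-up notation declarations
print(check_notation_derivation({}, declared))
print(check_notation_derivation({"enumeration": ["jpeg", "gif"]}, declared))
print(check_notation_derivation({"enumeration": ["jpeg", "png"]}, declared))
```

Check (b) needs the set of notation declarations in the schema, which is
information that an ordinary facet on QName has no way to refer to.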

[2] timePeriod (reference -
http://www.w3.org/TR/2000/CR-xmlschema-2-20001024/#timePeriod) - has one
such additional constraint (classified as a schema constraint): a period
facet value is required for timePeriod.

[3] time (reference -
http://www.w3.org/TR/2000/CR-xmlschema-2-20001024/#time) - value space of
time is a subset of the value space of its base type, recurringDuration.
But, the lexical space of time is not a subset of the lexical space of its
base type, recurringDuration. The lexical space of time is the left
truncated lexical representation for timeInstant. This left truncation
cannot be expressed using the surface syntax.
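
The left-truncation problem can be illustrated with a small Python sketch.
The two regular expressions are simplified stand-ins, assumed for
illustration only, for the full date-time form (CCYY-MM-DDThh:mm:ss.sss) and
the truncated time form (hh:mm:ss.sss); they ignore time zones and other
details of the CR drafts.

```python
import re

# Simplified, assumed lexical forms (no time zone, no sign).
FULL_FORM = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?")
TIME_FORM = re.compile(r"\d{2}:\d{2}:\d{2}(\.\d+)?")


def in_lexical_space(pattern, literal):
    """True if the whole literal matches the given lexical form."""
    return pattern.fullmatch(literal) is not None


literal = "13:20:00"
print(in_lexical_space(TIME_FORM, literal))  # accepted as a time
print(in_lexical_space(FULL_FORM, literal))  # rejected by the full form
```

Under these assumptions, a literal that is a legal time is not a legal
literal of the full form. A restricting facet such as pattern can only narrow
the set of literals the base type accepts, so it cannot admit this one.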

[4] date (reference -
http://www.w3.org/TR/2000/CR-xmlschema-2-20001024/#date) - Here again, the
lexical space of date is the right truncated lexical representation for
timePeriod. This right truncation cannot be expressed using the surface
syntax.

[5] The SAME REASONING applies to month, year, century, recurringDate and
recurringDay.


Also, in theory (reference -
http://www.w3.org/TR/2000/CR-xmlschema-2-20001024/#built-in-vs-user-derived),
there should be no difference between the built-in derived datatypes in the
CR drafts and user-derived datatypes. In other words, end users should be
able to derive similar datatypes from the built-in primitive types. Can
end users derive any of the datatypes listed in the chosen subset? NO. This
is a LITMUS TEST, and XSDL FAILS.

This chosen subset of derived types has constraints that cannot be
expressed using XSDL or the schema component for datatype definitions. If the
constraints cannot be expressed, then these types cannot be derived from the
built-in primitive types.

All the best,

Asir S Vedamuthu
webMethods, Inc.
(Phone) 703-460-2513 (Fax) 703-460-2513 (E-mail) asirv@webmethods.com
URL: http://www.webmethods.com

Received on Friday, 10 November 2000 09:42:05 UTC