Comments from the XML Schema WG concerning the CC/PP Working Draft of 8 November, 2002

Dear Colleagues:

The XML Schema WG has created a task force to review the CC/PP 
working draft of November 8, 2002 [1].  Following are the results 
of our review.

First, we commend the CC/PP WG for excellent work, and congratulate 
you on bringing your REC to last call.  Second, we apologize for 
the fact that we are fairly late with our comments, and we sincerely 
hope that you find our comments useful anyway.  And third, we thank 
you for making the effort to describe your types in terms of XML 
Schema datatypes when possible.  

Most of our comments refer to references to simple types as defined
in XML Schema Part 2 [2], and are largely motivated by a desire to 
ease processor burden and to enhance interoperability.  Please note
that our use of the terms "type" and "simple type" is completely
and unapologetically XML Schema-centric.  We appreciate your
indulgence.

On behalf of the XML Schema WG, thanks very much for listening.
Kind regards,
XML Schema WG
(edited by David Ezell)

============================
1) (editorial) the table of contents appears to be missing the
   major heading for section 2.

============================
2) the following example contains a problematic simple type (figure
   2-4b)

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://www.example.com/schema#">
  <rdf:Description
      rdf:about="http://www.example.com/profile#HWDefault">
    <rdf:type
        rdf:resource="http://www.example.com/schema#HardwarePlatform" />
    <ex:displayWidth>320</ex:displayWidth>
    <ex:displayHeight>200</ex:displayHeight>
    <ex:memory>16Mb</ex:memory>
  </rdf:Description>
</rdf:RDF>

Describing memory as "16Mb" implies a "compound simple type" where the
unit of measure (Mb) co-resides with the value (16).  The Schema WG 
tried and failed to produce a workable method for producing such 
compound types, and the reasons for that failure were very technical 
and non-trivial.

[N.B. the editor of this note was once a strong proponent of
such types, but in the end understood the issues preventing their
inclusion in the REC, and did not object to their omission.]

Our experience is that "compound" values, such as "16Mb", are better 
expressed as separate simple values.

     <ex:memory-value>16</ex:memory-value>
     <ex:memory-unit>Mb</ex:memory-unit>

While the RDF syntax makes this expression slightly more cumbersome than
it might be otherwise, we believe that it helps clarify what kinds of 
simple values are actually definable.

We respectfully suggest that the examples be changed to represent the
state of what's possible with simple types.  Note that this use pattern
appears in several examples, and not just the one quoted.

[N.B. mapping this type to a single value-space (i.e. to create
a simple type) would involve describing canonical lexical representations
for all values.  Please consider such simple type construction as a second 
alternative, along the lines described below for "rational" in item 6.
Such a construction would require restricting the allowed "suffixes"
using a pattern facet (regular expression).]
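
As a rough illustration of the separation suggested above, the following
Python sketch splits a compound lexical value such as "16Mb" into a number
and a unit.  The function name split_compound and the set of allowed
suffixes are assumptions made for this example only; the CC/PP draft does
not define either.

```python
import re

# Hypothetical set of allowed unit suffixes; not taken from the CC/PP draft.
ALLOWED_UNITS = ("Kb", "Mb", "Gb")

# One or more ASCII digits followed by exactly one of the allowed suffixes.
_COMPOUND = re.compile(r"([0-9]+)(" + "|".join(ALLOWED_UNITS) + r")")

def split_compound(text):
    """Return (value, unit) for a compound value such as '16Mb'."""
    match = _COMPOUND.fullmatch(text)
    if match is None:
        raise ValueError("not a recognized compound value: %r" % text)
    return int(match.group(1)), match.group(2)
```

Once the two parts are separated, each can be carried in its own element
and typed independently (for example, an integer and an enumeration).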

============================
3) section 4.1.1.2, concerning "case-insensitive text"

Unfortunately, there is no direct way to designate a type based on such
a character set (for purposes of matching and ordering) as a simple type.
Further, while using such a type is slightly more convenient for hand
editing, it arguably adds little real expressive value, and in fact
creates problems for interoperability since it can't be designated as
a type.

Some of the reasons for this restriction on type creation follow.
Based on the work of our members and on comments from other experts, 
we know that case folding is dependent on both language and locale:  in 
Quebec (for example), the uppercase equivalent of 'é' is 'É', 
but in metropolitan France it is 'E'.  In most countries using the
Latin alphabet, the upper case equivalent of 'i' is 'I', while in Turkey 
it is uppercase-dotted-i (and the lower-case equivalent of 'I' is 
lowercase-i-without-a-dot).  
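
The Turkish case can be observed in any environment that applies only the
default Unicode case mappings.  The sketch below uses Python, whose
str.upper() and str.lower() are not locale-aware; the helper name
round_trips is ours, chosen for illustration.

```python
# str.upper() and str.lower() apply the default Unicode case mappings
# and ignore the user's locale, so they cannot follow Turkish rules.

dotless_i = "\u0131"     # LATIN SMALL LETTER DOTLESS I
dotted_cap_i = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE

def round_trips(ch):
    """True if upper-casing and then lower-casing gives back ch."""
    return ch.upper().lower() == ch

print(round_trips("i"))           # plain ASCII i survives the round trip
print(round_trips(dotless_i))     # dotless i comes back as plain 'i'
print(len(dotted_cap_i.lower()))  # lower-casing yields 'i' plus a combining dot
```

A matching rule built on such mappings would therefore treat Turkish text
differently from what a Turkish user expects, which is the
interoperability risk described above.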

Further, it can be argued that case sensitivity is less likely to
surprise users, since in XML case sensitivity is the rule.

While the introduction of case-insensitive text was probably intended 
to be a simplifying measure, it seems to us that defining it properly 
would introduce a great deal of complication for very little gain, at 
a large cost in interoperability.

We respectfully suggest that you eliminate this type.

============================
4) section 4.1.1.2, concerning "token" 

The same objection to case folding applies to this type, with the added question
of why such a type should be constrained to US-ASCII.  The type xsd:token
(in XML Schema Part 2 [2]) has proven to be a good base from which to 
restrict enumerations.  We understand that you may have some specific use
cases of which we are not aware.

To reduce confusion, we request that you use "token" in the same sense as
XML Schema or use a different term for your case-insensitive ASCII token.

============================    
5) section 4.1.1.3 Integer number

The XML Schema Part 2 REC defines an "xsd:int" type which seems nearly identical
to the one you define.  Please note that xsd:integer is unconstrained in
value whereas xsd:int is constrained *almost* exactly as you have defined
it, with the exception that the minInclusive facet of the datatype is set
to -2147483648 (as opposed to -2147483647).

We respectfully suggest that you clarify that xsd:int is the desired type, 
and modify the prose to be consistent with XML Schema Part 2 [2].
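
To make the boundary difference concrete, here is a minimal Python sketch
of a range check using the xsd:int bounds from XML Schema Part 2.  The
function name is_xsd_int is hypothetical, and the sketch does not perform
the whitespace collapsing that a schema processor would apply first.

```python
import re

XSD_INT_MIN = -2147483648  # minInclusive of xsd:int
XSD_INT_MAX = 2147483647   # maxInclusive of xsd:int

# Optional sign followed by one or more ASCII digits.
_INT_LEXICAL = re.compile(r"[+-]?[0-9]+")

def is_xsd_int(text):
    """True if text is an integer literal within the xsd:int range."""
    if _INT_LEXICAL.fullmatch(text) is None:
        return False
    return XSD_INT_MIN <= int(text) <= XSD_INT_MAX
```

Under this check "-2147483648" is accepted, whereas a type whose lower
bound is -2147483647 would reject it.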

============================
6) section 4.1.1.4 Rational number

We indicated in item 2 (above) that two-part values are often better expressed
as two separate values.  However, a rational number may be represented as a
user defined (or in this case, WG defined) simple type, as follows:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://www.w3.org/TR/2002/WD-CCPP-struct-vocab-20021108/">
  <xs:simpleType name="rational">
    <xs:annotation>
      <xs:documentation>
        The canonical lexical representation of any value
        will be the form of the value reduced to its lowest
        common denominator, and with '1' in the denominator
        if applicable.
      </xs:documentation>
    </xs:annotation>
    <xs:restriction base="xs:string">
      <xs:pattern value="[0-9]+(/[0-9]+)?"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>

Note that the "c14n" should probably be further elaborated:
Lexical value        Canonical Lexical Rep.
=============        ======================
    "3/6"       ==>        "1/2"
    "15"        ==>        "15/1"
etc.
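
The mapping in the table can be sketched in a few lines of Python using
the standard fractions module.  The function name canonical_rational is
hypothetical, and rejecting a zero denominator is our assumption: the
pattern facet alone would accept "1/0".

```python
import re
from fractions import Fraction

# Same pattern as the xs:pattern facet: digits, optionally "/" and digits.
_RATIONAL = re.compile(r"([0-9]+)(?:/([0-9]+))?")

def canonical_rational(text):
    """Return the canonical lexical form 'n/d' in lowest terms."""
    match = _RATIONAL.fullmatch(text)
    if match is None:
        raise ValueError("not a rational literal: %r" % text)
    numerator = int(match.group(1))
    # A missing denominator means an implicit denominator of 1.
    denominator = int(match.group(2)) if match.group(2) is not None else 1
    if denominator == 0:
        # Assumption: a zero denominator is treated as an error.
        raise ValueError("zero denominator: %r" % text)
    value = Fraction(numerator, denominator)  # Fraction reduces automatically
    return "%d/%d" % (value.numerator, value.denominator)
```

With this sketch, canonical_rational("3/6") gives "1/2" and
canonical_rational("15") gives "15/1", matching the table.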

We believe that people will find this definition useful.  

However, the above definition solves only half of the problem:  it describes
*only* the lexical representation.  Binding a lexical representation to a
value space (unfortunately) is not easy; it requires operator definition, and
must be carefully described since processors which understand simple types will
be expected to do the arithmetic.  Such expectations are (we believe) essential
for interoperability.
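
The gap between the lexical space and the value space shows up as soon as
values are compared or ordered.  The Python sketch below is illustrative
only; the helper to_value is hypothetical and assumes its input already
matches the pattern and has a non-zero denominator.

```python
from fractions import Fraction

def to_value(text):
    """Map a lexical form such as '10/3' or '15' to a Fraction."""
    numerator, _, denominator = text.partition("/")
    # An absent denominator is read as 1.
    return Fraction(int(numerator), int(denominator or "1"))

lexical = ["2/3", "10/3", "4/6"]

by_string = sorted(lexical)               # order a string-based type gives
by_value = sorted(lexical, key=to_value)  # order the value space requires

print(by_string)
print(by_value)
print(to_value("2/3") == to_value("4/6"))  # equal values, different strings
```

A type derived from xs:string gives only the first ordering and treats
"2/3" and "4/6" as distinct, which is why operator definitions are needed
before processors can agree on results.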

We invite you to raise the issue of the need for a rational number with the
XML Schema WG, since that is the only way this issue can really be resolved.
In the meantime, we suggest that you note in the REC that use of this rational 
type may be harmful to interoperability.

============================
7) section 2.2 (html editorial issue)

The anchor (hyperlink) for http://www.w3.org/2002/11/08-ccpp has the trailing
'#' in bold.  [ed. note:  I believe this variation is invisible in some
browsers.]

============================
8) section 4.1.2.2 (editorial blip)

The sentence "Compare the above attribute value, which is a sequence containing
one element, with the a simple value as shown in figure 4-5 above." has "the a"
after the second comma.


[1] http://www.w3.org/TR/2002/WD-CCPP-struct-vocab-20021108/
[2] http://www.w3.org/TR/xmlschema-2/

Received on Tuesday, 11 February 2003 08:37:18 UTC