Oracle Comments on XML Schema Last Call: from Jim Trezzo on 2000-05-12 (www-xml-schema-comments@w3.org from April to June 2000)

From: Jim Trezzo <jtrezzo@us.oracle.com>
Date: Fri, 12 May 2000 16:03:28 -0700
To: www-xml-schema-comments@w3.org
CC: dbeech@usmail07.us.oracle.com, jtrezzo@us.oracle.com
Message-ID: <391C8DC0.F99B1EA8@us.oracle.com>
Oracle Comments on XML Schema Last Call:

Here are the comments we have collected within Oracle.

   David Beech and Jim Trezzo


Part 1: Structures

1. Identity-constraint table
----------------------------

For reasons of performance, and avoidance of duplicate
implementation, we believe that a conforming schema
processor should always be prepared to pass the
results of its keyref checking in the PSV-infoset.
It could be very expensive for applications such as
query processors to have to redo this work, both in
run-time performance and in implementation effort.

Of course, a schema processor could have an option to
allow applications to say when they did not wish to
receive the information.  However, for interoperability,
applications should be able to rely on conforming
processors making this information available on request,
otherwise they would be forced to always include their
own implementation of what they needed, just in case.

A part of the problem may be that the Last Call draft of
Structures distributes the information across
identity-constraint tables attached to different
element information items.  A single table per document
would seem preferable, both conceptually and as a guide
to a practical realization.
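
As a rough illustration of the single-table idea, here is a
minimal Python sketch. It is not the mechanism defined in
Structures: the function name check_keyrefs, the path and
attribute arguments, and the sample document are all
hypothetical. It builds one table for the whole document and
returns it together with the outcome of each keyref lookup, so
that an application does not have to repeat the work.

```python
import xml.etree.ElementTree as ET

def check_keyrefs(root, key_path, key_attr, ref_path, ref_attr):
    """Build one document-wide key table and check every keyref against it."""
    # Single table for the whole document: key value -> declaring element.
    table = {}
    for elem in root.iterfind(key_path):
        table[elem.get(key_attr)] = elem
    # Pair each referencing value with the element it resolved to (or None).
    results = []
    for elem in root.iterfind(ref_path):
        value = elem.get(ref_attr)
        results.append((value, table.get(value)))
    return table, results

# Hypothetical instance: <item part="..."> refers to <part id="...">.
doc = ET.fromstring(
    "<order>"
    "<part id='p1'/><part id='p2'/>"
    "<item part='p1'/><item part='p9'/>"
    "</order>"
)
table, results = check_keyrefs(doc, ".//part", "id", ".//item", "part")
# References whose value matched no key in the table.
dangling = [value for value, target in results if target is None]
```

A query processor handed table and results could follow each
reference directly instead of re-deriving the key mapping.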


2. Default for Element Equivalence Classes
------------------------------------------

Since the flexibility introduced by element equivalence
classes adds complexity to schema design and to parsing,
we feel that the default should be to block it, rather
than, as at present, to allow it and thereby affect users
who are not even aware of the feature.

For example, the designer of schema A who declares
element <e> and does not block 'equivClass' thereby
allows the designer of any other schema B for a
different namespace to add elements to the equivalence
class for <e> so that they become valid substitutions
for <e> when it is being validated using schema A.

Not only may this be unintended by the designer of schema
A, but it will increase the parsing complexity and we fear
that it may even lead to ambiguity in the content models
for complex types that reference the declaration of <e>.
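
The effect of the default can be modelled with a small Python
sketch. This is a toy model under our own assumptions, not the
draft's component model: ElementDecl, declare,
valid_substitutions and the default_block switch are
hypothetical names, and the namespaces are placeholders.

```python
from dataclasses import dataclass, field

@dataclass
class ElementDecl:
    name: str
    namespace: str
    block_equiv_class: bool  # True: substitutions for this element are blocked
    # Declarations (possibly from other schemas) naming this one as their class.
    members: list = field(default_factory=list)

def declare(name, namespace, block_equiv_class=None, default_block=False):
    """Create a declaration; default_block stands in for the spec-wide default."""
    blocked = default_block if block_equiv_class is None else block_equiv_class
    return ElementDecl(name, namespace, blocked)

def valid_substitutions(decl):
    """Return the element names accepted where decl is referenced."""
    accepted = [decl.name]
    if not decl.block_equiv_class:
        accepted += [m.name for m in decl.members]
    return accepted

# Default as in the Last Call draft: schema B can extend <e> unnoticed.
e_open = declare("e", "urn:example:A")
e_open.members.append(declare("f", "urn:example:B"))

# Proposed default: the designer of schema A must opt in explicitly.
e_safe = declare("e", "urn:example:A", default_block=True)
e_safe.members.append(declare("f", "urn:example:B"))

open_names = valid_substitutions(e_open)
safe_names = valid_substitutions(e_safe)
```

With the proposed default, the designer of schema A who has
never heard of the feature gets the content model that was
actually written.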


3. SchemaLocation problems
--------------------------

The XML Schema spec seems to allow one to create XML schemas
which contain references to element declarations and
type definitions from other namespaces, without having to
provide the schemas that contain those definitions. In
effect, each instance document can then suggest the location
of the other schemas, or the schema processor must have
some other way of finding them.

This presents a problem when we are trying to validate
multiple documents against a schema (or create a repository
for documents conforming to a schema), since each document
might suggest changing the type structure for the elements
in this schema.

The upshot of this is that implementations are REQUIRED to
provide a way of locating schemas outside of the <import>
system if they want to avoid using the <schemaLocation> in
the instance documents (which is horrible).

We would like to REQUIRE specification of the schemaLocation
attribute in an <import> element in the schema, so that we
don't have to provide an alternative schema location mechanism
for schemas that come to us without <import schemaLocation=>.
The spec should still allow the <import> to be ignored like
<schemaLocation>, for systems that will find schemas by
other means.
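
To make the intended behaviour concrete, here is a minimal
Python sketch of the lookup we have in mind. It is only an
illustration under our own assumptions: the function
resolve_schema_location, its parameters, and the URIs are
hypothetical and are not taken from the spec.

```python
def resolve_schema_location(namespace, import_locations, instance_hints,
                            use_instance_hints=False):
    """Pick the schema document to load for a namespace.

    import_locations: namespace -> location given on <import> in the schema.
    instance_hints:   namespace -> location suggested by one instance document.
    """
    # A location stated in the schema itself is the same for every document.
    if namespace in import_locations:
        return import_locations[namespace]
    # Only fall back to the per-document hint if the system opts in.
    if use_instance_hints and namespace in instance_hints:
        return instance_hints[namespace]
    raise LookupError("no schema location known for " + namespace)

# Hypothetical data: two documents suggesting different schemas.
import_locations = {"urn:example:parts": "http://example.com/schemas/parts.xsd"}
doc1_hints = {"urn:example:parts": "http://example.org/other/parts-v1.xsd"}
doc2_hints = {"urn:example:parts": "http://example.net/parts-v2.xsd"}

loc1 = resolve_schema_location("urn:example:parts", import_locations, doc1_hints)
loc2 = resolve_schema_location("urn:example:parts", import_locations, doc2_hints)
```

Both documents are validated against the same type structure,
whatever they suggest. Without the location on <import>, the
lookup fails unless the system supplies its own mechanism.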


4. API for PSV-Infoset
----------------------

The information added to the PSV-Infoset will be very
valuable to applications if there is an efficient
practical realization of it, e.g. in a standard API.


5. Augmentation of PSV-Infoset for schema documents
---------------------------------------------------

Since a schema document is an XML document (and a
major argument in favor of using XML syntax to
represent XML schemas has been that generic XML
tools would then be applicable to them), the
validation of an XML schema against the Schema
for Schemas will produce a PSV-infoset.

Chapter 4 of Structures transforms what is
effectively this information into a component
and property model, adding information.
However, no generic tools exist for processing
this information, which would be useful if
available to applications such as repositories
or query processors.

Would it be possible to augment the PSV-infoset
with additional information when the instance
being validated is itself a schema document?


Part 2: Datatypes

6. recurringDuration
--------------------

There appears to be a typo in 3.2.7.  The second sentence:
"The order-relation on timeDuration ..." should read:
"The order-relation on recurringDuration ...".

This also brings out a technical problem.  Since recurring
duration has two facets (duration and period) which should
enter into determining the order-relation, the specified
rule (x<y iff y - x is positive) is not adequate.  We could
say that when either duration or period is fixed, the
variable facet would be used to determine order-relation.
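
One possible reading of this suggestion, sketched in Python
purely for discussion: the Recurring tuple, the compare
function, and the use of plain seconds for both facets are our
own simplifying assumptions, not anything defined in Datatypes.
Two values are ordered only when they agree on one facet, and
are otherwise treated as incomparable.

```python
from typing import NamedTuple, Optional

class Recurring(NamedTuple):
    duration: float  # seconds (simplification)
    period: float    # seconds (simplification)

def compare(x: Recurring, y: Recurring) -> Optional[int]:
    """Return -1, 0 or 1 when x and y share a facet value, else None."""
    if x == y:
        return 0
    if x.duration == y.duration:
        # Duration is fixed, so the period decides the order.
        return -1 if x.period < y.period else 1
    if x.period == y.period:
        # Period is fixed, so the duration decides the order.
        return -1 if x.duration < y.duration else 1
    # Both facets differ: this sketch defines no order.
    return None

hourly_daily = Recurring(duration=3600, period=86400)
two_hours_daily = Recurring(duration=7200, period=86400)
two_hours_weekly = Recurring(duration=7200, period=604800)

same_period = compare(hourly_daily, two_hours_daily)
same_duration = compare(two_hours_daily, two_hours_weekly)
neither = compare(hourly_daily, two_hours_weekly)
```

Under this reading the relation is only a partial order, which
the text of 3.2.7 would need to state explicitly.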

There also seems to be a conflict between what the text of
3.2.7 (paragraph 3) says and what the explanatory box says.
The text says: "... it can be used as a datatype on its own
...",  where the box says: "It is an error for
recurringDuration to be used directly in a schema".
Received on Friday, 12 May 2000 19:03:34 UTC