implementing redefinitions

Hi,

I'm about to implement redefinitions and tried to create some
initial tests for this purpose. I would be glad if you could
find time to comment/correct my observations and tests -
especially what the intended outcome of the tests is.

From reading the spec and the mail archives, I learned (and wonder
about) the following:

- Redefines are pervasive with regard to references. I.e.
  all references to components, wherever they may be
  will resolve to the redefined components only.

- Multiple redefinitions of the same component are disallowed, be it
  in different schemas, inside a single <redefine>, or with
  multiple <redefine> elements referring to the same document inside a
  single document. Only chains of redefinitions are allowed, i.e. with
  a chain of documents. This refers to
  http://lists.w3.org/Archives/Public/xmlschema-dev/2005Apr/0004.html
  However, the 2nd scenario presented in
  http://lists.w3.org/Archives/Public/xmlschema-dev/2005Apr/0011.html
  looks like a nice-to-have for redefining components in a compact way,
  i.e. without extensive use of chained schema documents. But I guess the
  reason for ruling this out is the unwanted dependency on the order
  of components. Hmm, although one could remove the dependency by
  introducing attributes (like "refer" for IDCs) to clearly define
  chains of redefinitions; after all, a document is more overhead than
  two additional attributes.

- Redefinition of a simple/complex type produces two components:
  the newly redefined and a copy of the base type with an "absent"
  name. The unnamed component: what is it needed for here exactly? I
  don't understand the spec here. Further, if the redefining schema is
  itself included into a parent schema, is this unnamed component added
  to the parent schema as well?

- When components of included/redefined/imported schemas are added
  to the parent schema, is the pervasive effect expected to have been
  already applied? An example for this would be in "final.xsd" below:
  here we include a "base" type, and after this we include a
  schema which redefines this "base" type. If we didn't hold
  back component inclusion until all redefinitions are known,
  we would violate sch-props-correct.2, since both the original
  "base" type and the derived type would be included.
  So it looks like the sequence here should be:
    1. parse the schemas and create the components
    2. find the top-most redefinitions, check if there are duplicate
       redefinitions of the same component, apply redefinitions
       globally
    3. add components of included/imported schemas to parent schemas


Tests
-----

It would _really_ be nice if you could spare the time to
comment on the results of the tests.

I used Xerces-J 2.7.1 and XSV 2.10-1 for the tests.

I get very different results for Xerces and XSV.
Often Xerces reports the name "base_fn3dktizrknc9pi" for the
base type "base", so I wonder where this comes from, and
how this is related to the errors. Does this reflect
the "unnamed" component?

base.xsd
--------
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
	<xs:simpleType name="base">
		<xs:restriction base="xs:integer">
			<xs:maxInclusive value="5"/>
		</xs:restriction>
	</xs:simpleType>
</xs:schema>

derived.xsd
-----------
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
	<xs:redefine schemaLocation="base.xsd">
		<xs:simpleType name="base">
			<xs:restriction base="base">
				<xs:maxInclusive value="4"/>
			</xs:restriction>
		</xs:simpleType>
	</xs:redefine>
</xs:schema>

derived2.xsd
------------
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
	<xs:redefine schemaLocation="base.xsd">
		<xs:simpleType name="base">
			<xs:restriction base="base">
				<xs:maxInclusive value="3"/>
			</xs:restriction>
		</xs:simpleType>
	</xs:redefine>
</xs:schema>

derived3.xsd (note that this is the same as derived.xsd)
------------
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
	<xs:redefine schemaLocation="base.xsd">
		<xs:simpleType name="base">
			<xs:restriction base="base">
				<xs:maxInclusive value="4"/>
			</xs:restriction>
		</xs:simpleType>
	</xs:redefine>
</xs:schema>

chain-dummy.xsd
---------------
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
	<xs:include schemaLocation="base.xsd"/>
</xs:schema>

final.xsd
---------
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
	<xs:include schemaLocation="base.xsd"/>
	<xs:include schemaLocation="derived.xsd"/>
	<xs:element name="foo" type="base"/>
</xs:schema>

final2.xsd
----------
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
	<xs:redefine schemaLocation="base.xsd">
		<xs:simpleType name="base">
			<xs:restriction base="base">
				<xs:maxInclusive value="4"/>
			</xs:restriction>
		</xs:simpleType>
	</xs:redefine>
	<xs:redefine schemaLocation="base.xsd">
		<xs:simpleType name="base">
			<xs:restriction base="base">
				<xs:maxInclusive value="3"/>
			</xs:restriction>
		</xs:simpleType>
	</xs:redefine>
	<xs:element name="foo" type="base"/>
</xs:schema>

final3.xsd
----------
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
	<xs:include schemaLocation="derived2.xsd"/>
	<xs:include schemaLocation="derived.xsd"/>
	<xs:element name="foo" type="base"/>
</xs:schema>

final4.xsd
----------
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
	<xs:include schemaLocation="derived3.xsd"/>
	<xs:include schemaLocation="derived.xsd"/>
	<xs:element name="foo" type="base"/>
</xs:schema>

final5.xsd
----------
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">		
	<xs:include schemaLocation="base.xsd"/>
	<xs:redefine schemaLocation="base.xsd">
		<xs:simpleType name="base">
			<xs:restriction base="base">
				<xs:maxInclusive value="4"/>
			</xs:restriction>
		</xs:simpleType>
	</xs:redefine>	
	<xs:element name="foo" type="base"/>
</xs:schema>

final6.xsd
----------
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">		
	<xs:include schemaLocation="base.xsd"/>
	<xs:redefine schemaLocation="chain-dummy.xsd">
		<xs:simpleType name="base">
			<xs:restriction base="base">
				<xs:maxInclusive value="4"/>
			</xs:restriction>
		</xs:simpleType>
	</xs:redefine>	
	<xs:element name="foo" type="base"/>
</xs:schema>

instance.xml
------------
<foo>  3  </foo>

Validation of instance.xml with final.xsd
-----------------------------------------
Xerces-J:
base.xsd:3,29: (Error) sch-props-correct.2: A schema cannot contain two
global components with the same name; this schema contains two
occurrences of ',base_fn3dktizrknc9pi'.

XSV: no errors

This would fall into the category: first apply redefinitions, then
include the components. Looks like this should not fail.
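
If redefinitions are applied before components are added, I would
expect final.xsd to behave roughly like the single document below.
This is only a sketch of my reading, not something taken from the
spec: the file name "final-flattened.xsd" is made up, and the anonymous
inner type merely stands in for the unnamed copy of the original "base".

```xml
<!-- final-flattened.xsd (hypothetical): approximates final.xsd after
     the redefinition from derived.xsd has been applied -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
	<xs:simpleType name="base">
		<xs:restriction>
			<!-- stands in for the unnamed copy of the original "base" -->
			<xs:simpleType>
				<xs:restriction base="xs:integer">
					<xs:maxInclusive value="5"/>
				</xs:restriction>
			</xs:simpleType>
			<xs:maxInclusive value="4"/>
		</xs:restriction>
	</xs:simpleType>
	<xs:element name="foo" type="base"/>
</xs:schema>
```

In this form only one component named "base" exists, so
sch-props-correct.2 would not be triggered.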

Validation of instance.xml with final2.xsd
------------------------------------------
Xerces-J:
final2.xsd:13,30: (Error) sch-props-correct.2: A schema cannot contain
two global components with the same name; this schema contains two
occurrences of ',base'.

XSV: no errors

Should this fail due to multiple redefinitions of the same
component?

Validation of instance.xml with final3.xsd
------------------------------------------
Xerces-J:
derived2.xsd:4,30: (Error) sch-props-correct.2: A schema cannot contain
two global components with the same name; this schema contains two
occurrences of ',base_fn3dktizrknc9pi'.

XSV: no errors

This is interesting, since the instance is still valid with XSV if
we change the integer simple content of <foo> to 4. So it seems
like the 2nd include of "derived.xsd" overwrote the restriction made
in "derived2.xsd". How does this work? Is this correct?
This is the sequence I would expect:
- "base" is included
- "base" is redefined in "derived2.xsd" to maxInclusive == 3
- redefines are pervasive with an "unbounded" scope, so
  all subsequent operations on "base" should be performed on the
  redefined "base"
- "derived.xsd" then tries to redefine the already redefined "base",
  which should fail since maxInclusive == 4 is not a valid
  restriction of maxInclusive == 3

I checked if XSV perhaps simply does not catch the restriction
error, and indeed, in a standalone test, XSV does not bark when
maxInclusive == 3 is restricted with maxInclusive == 4. Although
I tend to think that this is just a restriction bug, clarification
is needed on whether XSV perhaps redefined the original "base" in
"derived.xsd" a second time and overwrote previous redefinitions.

Anyway, shouldn't this test fail due to multiple redefinitions
of the same component?
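
For comparison, this is how I understand an allowed chain of documents:
the second redefinition targets the document that holds the first one,
rather than "base.xsd" again. A minimal sketch; the file name
"final3-chained.xsd" is hypothetical and not one of my tests.

```xml
<!-- final3-chained.xsd (hypothetical): redefines derived.xsd,
     which in turn redefines base.xsd -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
	<xs:redefine schemaLocation="derived.xsd">
		<xs:simpleType name="base">
			<xs:restriction base="base">
				<xs:maxInclusive value="3"/>
			</xs:restriction>
		</xs:simpleType>
	</xs:redefine>
	<xs:element name="foo" type="base"/>
</xs:schema>
```

Here maxInclusive == 3 restricts the already redefined
maxInclusive == 4, which is a valid restriction, and each document
redefines "base" only once.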

Validation of instance.xml with final4.xsd
------------------------------------------
Xerces-J:
derived3.xsd:4,30: (Error) sch-props-correct.2: A schema cannot contain
two global components with the same name; this schema contains two
occurrences of ',base_fn3dktizrknc9pi'.

XSV: no errors

I think Xerces should not bark here, since the components are identical.
I.e. the redefinitions of "base" in "derived.xsd" and "derived3.xsd"
produce identical types; component identity is still implementation
dependent, so they, like us, don't yet try to check for identity.
But maybe something different happens here as the report comes with a
name of "base_fn3dktizrknc9pi" for the type.
What's going on here exactly?

Validation of instance.xml with final5.xsd
------------------------------------------
Xerces-J:
base.xsd:3,29: (Error) sch-props-correct.2: A schema cannot contain two
global components with the same name; this schema contains two
occurrences of ',base_fn3dktizrknc9pi'.

XSV: no errors

We include and redefine the same schema document here. This is
comparable to "final.xsd", with the difference that the redefinition
is in-line here, rather than via another document.

Validation of instance.xml with final6.xsd
------------------------------------------
Xerces-J: no errors
XSV: no errors

Now, that's interesting. Xerces doesn't bark, although we just
used a dummy middle-man for the redefine. In what way is this so
different from "final5.xsd"? This is the reduced scenario reported
in http://lists.w3.org/Archives/Public/xmlschema-dev/2005Feb/0008.html


Regards,

Kasimier

Received on Thursday, 11 August 2005 13:11:21 UTC