RE: schema composition questions from Shlomo Yona on 2007-07-09 (xmlschema-dev@w3.org from July 2007)

From: Shlomo Yona <S.Yona@F5.com>
Date: Sun, 8 Jul 2007 21:32:37 -0700
To: "Paul Kiel" <paul@xmlhelpline.com>, <xmlschema-dev@w3.org>
Message-ID: <B546C312A37C12438A22154026CDC7E01615D059@exchfive.olympus.f5net.com>
Hello, Paul.

Thanks for the link. I actually read that article almost a year ago to get some perspective on how XML schemas are being used. It was a useful read indeed.

Thank you.

Shlomo.


-----Original Message-----
From: Paul Kiel [mailto:paul@xmlhelpline.com]
Sent: Mon 7/9/2007 2:08 AM
To: Shlomo Yona; xmlschema-dev@w3.org
Subject: RE: schema composition questions
 
A couple more comments inline.  You are right that I came at this from a
schema designer's perspective.

 

I know you are not necessarily interested in the "best practice"
perspective, but FWIW I published an article on XML Schema profiles on
xml.com last fall that discussed how people are using the spec.  In my
opinion, what they choose not to use reflects a certain expectation, if you
read between the lines.

 

http://www.xml.com/pub/a/2006/09/20/profiling-xml-schema.html

 

Paul

  _____  

From: Shlomo Yona [mailto:S.Yona@F5.com] 
Sent: Sunday, July 08, 2007 3:42 AM
To: Paul Kiel; xmlschema-dev@w3.org
Subject: RE: schema composition questions

 

 

Hello,


[This is related to my 1st question]
>>Paul: this is certainly legal and useful. It is called "namespace
>>coercion" because you are coercing the schema B into the namespace A.  But
>>if the nesting gets more complicated it sometimes is not exactly
>>interoperable across implementations.  We ran into trouble using this some
>>time ago.

I think that I understand what happens in a trivial case such as this:
A (tns "A") includes B (no tns)

All the top level names in B are now in the namespace that A declares as its
target namespace (tns "A").

What I don't understand is the case where
A (tns "A") includes B (no tns) which imports C (tns "C").
Is this a legal situation?
What is the fully qualified name of top level names from C in the composed
schema document?

>>Paul: Imports are easier as nothing changes.  The fully qualified name of
a component from schema C uses the namespace it belongs to in its own schema.
Importing it does not change this in any fashion.  You cannot "coerce" with
an import.


I would argue that the following possibilities should be considered (I am
not sure that I listed all possibilities and am not sure at all which is the
correct or desired behavior):
* This is illegal and the whole schema processing fails
* A include B import C fails (not all top level names in the composed B and
C can be coerced into the target namespace of A) and the result is only A
include B (ignoring the import of C into B)
* The composed schema is only A itself due to failure of coercing all the
names in the composed B and C schema documents
* The composed schema document includes top level names in A under the
target namespace "A" + top level names in B under the target namespace "A" +
top level names in C under the target namespace "C".

So... what is the correct or desired behavior?

>>Paul: From my perspective, use namespaces explicitly whether it is an
include or an import.  Don't coerce.  But more to your question.  Components
of A and B are both in the namespace of A.  The components of C are in the C
namespace.  There are no errors in doing what you said.  And there are no
combinations (coercions) of A and C namespaces.
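
The outcome described above (A and B components in namespace "A", C
components in namespace "C") can be sketched as a small Python model.  This
is only an illustration under stated assumptions, not a schema processor:
SchemaDoc and compose are hypothetical names, a document is reduced to its
target namespace plus a list of top level names, and an include of a no-tns
document is assumed to adopt the includer's namespace while an import keeps
the imported document's own namespace.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class SchemaDoc:
    """Hypothetical stand-in for a parsed schema document."""
    target_ns: Optional[str]                 # None means "no tns"
    names: List[str]                         # top level component names
    includes: List[str] = field(default_factory=list)
    imports: List[str] = field(default_factory=list)


def compose(docs: Dict[str, SchemaDoc], root: str) -> Set[Tuple[Optional[str], str]]:
    """Collect (namespace, local name) pairs reachable from the root document."""
    result: Set[Tuple[Optional[str], str]] = set()
    seen: Set[Tuple[str, Optional[str]]] = set()

    def visit(doc_id: str, effective_ns: Optional[str]) -> None:
        if (doc_id, effective_ns) in seen:
            return                           # already processed in this namespace
        seen.add((doc_id, effective_ns))
        doc = docs[doc_id]
        for name in doc.names:
            result.add((effective_ns, name))
        for inc in doc.includes:
            inc_ns = docs[inc].target_ns
            if inc_ns is None or inc_ns == effective_ns:
                # Same tns, or a no-tns document taking the includer's namespace.
                visit(inc, effective_ns)
            else:
                raise ValueError(f"{doc_id} cannot include {inc}: tns mismatch")
        for imp in doc.imports:
            # An import never changes the namespace of the imported document.
            visit(imp, docs[imp].target_ns)

    visit(root, docs[root].target_ns)
    return result


# A (tns "A") includes B (no tns) which imports C (tns "C").
docs = {
    "A": SchemaDoc("A", ["a"], includes=["B"]),
    "B": SchemaDoc(None, ["b"], imports=["C"]),
    "C": SchemaDoc("C", ["c"]),
}
print(sorted(compose(docs, "A")))  # [('A', 'a'), ('A', 'b'), ('C', 'c')]
```

This corresponds to the last of the four possibilities listed above: nothing
fails, and no name from C is moved into namespace "A".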



[this is related to my 2nd question]
>>Paul:  The ordering is not significant here.

Is it always the case that the order of processing xsd:import and
xsd:include in a schema document is insignificant? Or are there examples
where order matters?

>>Paul: the order of imports is never significant.  I don't believe the
order of includes is ever significant either.  The only weird thing I've
seen is when coercive includes are nested multiple times.  Such as:

A (ns="A") includes B (no ns) which includes C (no ns).  In this case all
schema components belong to namespace "A".  If there were any imports, they
would belong to their own ns and are not affected.

Or:

A (ns="A") imports B (ns="B") which includes C (no ns).  I've seen this
interpreted as C belonging to ns "B" because it is the direct include.  I've
also seen folks think components in C belong in "A" since it is the ultimate
includer/importer.  Of course in either case, A is in ns "A" and B is in ns
"B".  I did some research into this and I would need to do some digging to
find out where things came out.  I'll see if I can dig it up.
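
The two nested cases can be compared with a small Python sketch.  It encodes
one reading only, namely that a no-ns document takes the namespace of the
document that directly includes it; the other interpretation mentioned above
(the ultimate includer/importer wins) would give a different answer for the
second case.  The function name and the (kind, declared namespace) step
encoding are assumptions made for this illustration.

```python
from typing import Optional, Sequence, Tuple


def effective_namespace(root_ns: Optional[str],
                        steps: Sequence[Tuple[str, Optional[str]]]) -> Optional[str]:
    """Follow a chain of include/import steps starting at the root document.

    Each step is (kind, declared_ns): kind is "include" or "import" and
    declared_ns is the target namespace the referenced document declares
    (None for a no-ns document).
    """
    current = root_ns
    for kind, declared_ns in steps:
        if kind == "import":
            # The imported document keeps its own namespace.
            current = declared_ns
        elif kind == "include":
            if declared_ns is not None and declared_ns != current:
                raise ValueError("included document declares a different namespace")
            # Otherwise the included document ends up in the direct
            # includer's namespace, so `current` is unchanged.
        else:
            raise ValueError(f"unknown step kind: {kind}")
    return current


# A (ns="A") includes B (no ns) which includes C (no ns)
print(effective_namespace("A", [("include", None), ("include", None)]))  # A

# A (ns="A") imports B (ns="B") which includes C (no ns)
print(effective_namespace("A", [("import", "B"), ("include", None)]))    # B
```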


[This is related to my 3rd question]
>>Paul: This is where interoperability can be a problem.
>>I would suggest that you keep it to one no-ns schema being
>>included into one ns schema.
>>If you have multiple levels of includes and multiple levels
>>of "coercion" then tools can interpret that differently.
>>To be frank, we ran into problems with namespace coercion
>> and decided to abandon it altogether. 

This is exactly my question. I'm asking this from an XML processor
implementation point of view and not from a schema author "best practice"
point of view. I want to implement the "right thing" and the correct
behavior. My problem is that I cannot tell from the recommendation
exactly what the correct behavior is.

[This is related to my 4th question]
>>Paul: There is not a problem with circular includes per se.
>>Namespaces aside, the spec says that duplicative includes/imports
>>should be ignored.
>>So just being circular is not an error.
>>Now with namespaces, you are better off avoiding this kind of
>> behaviour because tools may interpret it differently.

Again, I want to implement the correct behavior into my XML processor. I'm
not asking this as a schema author. Can you point out the proper way to
handle such cases? Where in the recommendation is this issue being discussed
and how is it suggested to process a set of XML schema documents that have
circular dependencies in the general case?
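
One common way to make a loader terminate on circular references is to
remember which documents have already been visited and skip repeats.  The
Python sketch below shows only that idea; it is not taken from the
recommendation, and load_order and the references mapping are hypothetical
names.  It also ignores namespaces, so a no-tns document included from two
different namespaces would need extra handling.

```python
from collections import deque
from typing import Dict, List, Sequence


def load_order(references: Dict[str, Sequence[str]], root: str) -> List[str]:
    """Return each reachable document location once, in breadth-first order.

    `references` maps a document location to the locations it includes or
    imports.  A location that was already seen is skipped, so a cycle cannot
    cause endless reprocessing.
    """
    loaded: List[str] = []
    seen = {root}
    queue = deque([root])
    while queue:
        location = queue.popleft()
        loaded.append(location)
        for ref in references.get(location, ()):
            if ref not in seen:
                seen.add(ref)
                queue.append(ref)
    return loaded


# A references B, B references C, and C references A again.
refs = {"A.xsd": ["B.xsd"], "B.xsd": ["C.xsd"], "C.xsd": ["A.xsd"]}
print(load_order(refs, "A.xsd"))  # ['A.xsd', 'B.xsd', 'C.xsd']
```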

[This is related to my 5th question]
>>Paul: not sure I understand here.
>>If it is about coercion of the namespace of the any, then I refer
>>to earlier comments.
>>If it is about what other options are available for namespace
>>declaration of any, then there are options such as "##other"
>>for specifying a different namespace, "##any" for any namespace,
>>and there are some others too.  The spec lists them.

I was probably not clear. Let me try with an example:

A (tns "A") includes B (no tns) which includes C (no tns) which imports A.
C contains: <xsd:any namespace="A"/>.
To which top level names does this wildcard refer?
Only those listed in the A schema document?
Perhaps those listed in B too?
Perhaps those in A, B and C?

>>Paul: this is easier.  Since you have explicitly stated that the any is in
namespace "A", then that is what can be in it.  You cannot change ns due to
importing.  Even if C has no ns, you have been explicit that the any is in
the A ns.  End of story.  If you had not stated that the any is to be ns "A",
then it defaults to ##any, meaning components from any namespace can be in it.
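
The point that the wildcard is tested against the namespace of the element,
not against the document a declaration came from, can be sketched in Python.
This is a simplified model under stated assumptions: wildcard_allows is a
hypothetical helper, None stands for "no namespace", and the "##other" branch
follows my reading of XML Schema 1.0 (a namespace that is present and differs
from the target namespace).

```python
from typing import Optional


def wildcard_allows(namespace_attr: str,
                    effective_target_ns: Optional[str],
                    element_ns: Optional[str]) -> bool:
    """Decide whether an xsd:any with this namespace attribute admits an
    element whose namespace is element_ns.

    effective_target_ns is the target namespace of the schema document
    holding the wildcard, after any coercion by an including document.
    """
    tokens = namespace_attr.split()
    if tokens == ["##any"]:
        return True
    if tokens == ["##other"]:
        return element_ns is not None and element_ns != effective_target_ns
    for token in tokens:
        if token == "##targetNamespace":
            allowed = effective_target_ns
        elif token == "##local":
            allowed = None
        else:
            allowed = token               # an explicit namespace name
        if element_ns == allowed:
            return True
    return False


# C has no tns and is included (via B) into A; C contains <xsd:any namespace="A"/>.
print(wildcard_allows("A", "A", "A"))      # True
print(wildcard_allows("A", "A", "C"))      # False
# With the attribute omitted, the default ##any applies.
print(wildcard_allows("##any", "A", "C"))  # True
```

Under this model it makes no difference whether a name in namespace "A" was
written in A, B, or C; only the resulting namespace is tested.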



I can give other examples where this is complicated for me to understand,
for example:

A (tns "A") includes AA (tns "A") and also includes B (no tns) which
includes C (no tns).
C contains: <xsd:any namespace="A"/>.
But to which top level names does this wildcard refer?
What if AA contains <xsd:any namespace="A"/>?
To which top level names does this wildcard refer?

Of course, there are plenty more other examples to cook up which are, at
least for me, similarly unclear.



I hope that you and the other experts here can help me out.

Thanks.

Shlomo.
Received on Monday, 9 July 2007 04:35:08 UTC