RE: XML schema validation and namespaces from Sanjay Dahiya, Noida on 2002-09-06 (xmlschema-dev@w3.org from September 2002)

From: Sanjay Dahiya, Noida <sanjay@noida.hcltech.com>
Date: Fri, 6 Sep 2002 13:20:03 +0530
To: noah_mendelsohn@us.ibm.com
Cc: xmlschema-dev@w3.org
Message-ID: <E04CF3F88ACBD5119EFE00508BBB2121044BEDC4@exch-01.noida.hcltech.com>
thanks, that was a real help.
the reason i asked the question was that i am making a 
schema/document processor, which i initially (for simplicity) did
for no-namespace documents; this point was not clear while moving to multiple
namespaces

thanks again
sanjay

-----Original Message-----
From: noah_mendelsohn@us.ibm.com [mailto:noah_mendelsohn@us.ibm.com]
Sent: Friday, September 06, 2002 8:20 AM
To: Sanjay Dahiya, Noida
Cc: xmlschema-dev@w3.org
Subject: Re: XML schema validation and namespaces


Well, I can give you some general idea of how things work.  First of all, 
you're right, there are namespaces that you bump into in schema documents, 
namespaces that the instance might use, and a set of rules that have to 
keep straight how these all work together.  The schema design goes to 
great length to cover these, and it probably gives more flexibility than 
you'd notice at first.

A schema document can use the <import> construction to refer to other 
schema definitions for other namespaces.  Optionally, the import can 
supply a schemaLocation hint (and it's only a hint!) that the processor MAY 
choose to follow to look for the schema definitions for that other 
namespace.  Alternatively, the schema processor can use some other means 
to figure out what schema definitions to use for that other namespace 
(maybe it takes command line options, has an API, builds in definitions 
for certain namespaces, etc.)
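
To make the "it's only a hint" point concrete, here is a minimal Python sketch of how a processor might decide where to look for an imported namespace's definitions. The function name, the built-in table, the local file name, and above all the precedence order (caller-supplied mapping first, then built-ins, then the import's hint) are assumptions for illustration; the choice of strategy is left to the processor.

```python
from typing import Optional

# Hypothetical built-in table; a real processor may or may not ship anything like it.
BUILT_IN = {"http://www.w3.org/XML/1998/namespace": "builtin:xml.xsd"}


def locate_schema(namespace: str,
                  import_hint: Optional[str] = None,
                  overrides: Optional[dict] = None) -> Optional[str]:
    """Pick a source of definitions for `namespace`, or None if there is none.

    The order below is one possible policy, not something that is required.
    """
    overrides = overrides or {}
    if namespace in overrides:      # e.g. from command-line options or an API call
        return overrides[namespace]
    if namespace in BUILT_IN:       # definitions the processor builds in
        return BUILT_IN[namespace]
    return import_hint              # the schemaLocation hint, if one was given


# Only the import's hint is available, so this processor follows it.
print(locate_schema("uri2", import_hint="http://example.org/ns2.xsd"))

# The caller supplied its own location, so the hint is ignored.
print(locate_schema("uri2", import_hint="http://example.org/ns2.xsd",
                    overrides={"uri2": "file:local-ns2.xsd"}))

# No hint and no other source: nothing to load for this namespace.
print(locate_schema("uri2"))
```

A different processor could just as legitimately ignore the hint altogether or consult it first; the sketch only shows that the hint is one input among several.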

So, what happens if an instance document uses a namespace in some element 
in the middle of content:

        <ns1:outer xmlns:ns1="uri1">
                <ns2:inner xmlns:ns2="uri2"/>
        </ns1:outer>

What are the possibilities for where we get the definitions to validate 
ns2:inner (presuming we had a schema for ns1?)  Well, I'm too lazy to type 
all the schemas exactly correctly, but if the schema for ns1 says roughly

        <schema xmlns="http://www.w3.org/2001/XMLSchema"
                targetNamespace="uri1" xmlns:ns1="uri1" xmlns:ns2="uri2">
                <import namespace="uri2"/>

                <element name="outer">
                        <complexType>
                                <sequence>
                                        <element ref="ns2:inner"/>
                                </sequence>
                        </complexType>
                </element>
        </schema>

then the processor will go looking for some schema document (or other 
source of definitions) for ns2:inner.  Exactly how is, as described above, 
up to the processor.  With:

        <schema xmlns="http://www.w3.org/2001/XMLSchema"
                targetNamespace="uri1" xmlns:ns1="uri1" xmlns:ns2="uri2">
                <import namespace="uri2"
                        schemaLocation="http://example.org/ns2.xsd"/>

                <element name="outer">
                        <complexType>
                                <sequence>
                                        <element ref="ns2:inner"/>
                                </sequence>
                        </complexType>
                </element>
        </schema>

then the processor MAY choose to get those definitions from 
http://example.org/ns2.xsd.  Another way it might get the hint is from the 
instance:

        <ns1:outer xmlns:ns1="uri1"
                   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
                <ns2:inner xmlns:ns2="uri2"
                           xsi:schemaLocation="uri2 http://example.org/ns2b.xsd"/>
        </ns1:outer>

Again, it's a hint.  The processor can honor the one in the import, in the 
instance, neither, etc.
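
As a rough illustration of the instance-side hint, the Python sketch below collects namespace/location pairs from an instance document. It assumes the attribute is written as xsi:schemaLocation in the http://www.w3.org/2001/XMLSchema-instance namespace and holds whitespace-separated pairs; the helper name and the "first hint per namespace wins" rule are choices made for this example, not requirements.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

INSTANCE = """
<ns1:outer xmlns:ns1="uri1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <ns2:inner xmlns:ns2="uri2"
               xsi:schemaLocation="uri2 http://example.org/ns2b.xsd"/>
</ns1:outer>
""".strip()


def schema_location_hints(root):
    """Collect {namespace: location} from every xsi:schemaLocation in the tree."""
    hints = {}
    for elem in root.iter():
        value = elem.get("{%s}schemaLocation" % XSI)
        if value is None:
            continue
        tokens = value.split()  # whitespace-separated namespace/location pairs
        for ns, loc in zip(tokens[0::2], tokens[1::2]):
            hints.setdefault(ns, loc)  # keep the first hint seen (a choice made here)
    return hints


root = ET.fromstring(INSTANCE)
print(schema_location_hints(root))
```

What the processor then does with the collected pairs is its own business: it can load them, prefer the location given in the import, or ignore both.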

Now consider:

        <schema xmlns="http://www.w3.org/2001/XMLSchema"
                targetNamespace="uri1" xmlns:ns1="uri1" xmlns:ns2="uri2">
                <import namespace="uri2"
                        schemaLocation="http://example.org/ns2.xsd"/>

                <element name="outer">
                        <complexType>
                                <sequence>
                                        <any processContents="lax"/>
                                </sequence>
                        </complexType>
                </element>
        </schema>

This says that outer can have most any contents.  Does ns2:inner get 
validated?  Well, if the processor chooses to find a schema, perhaps from 
one of the schemaLocation hints, then the inner element does get 
validated.  "Lax" says: validate if you have an element declaration, 
otherwise don't worry about it.  "strict" (instead of lax) means "you 
better have an element declaration, if not fail".  "skip" means don't 
validate the inner element even if you could.
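
The three processContents settings can be summarized as a small decision function. This Python sketch is a simplified model of the behaviour described above, not the full assessment rules; the function name and the three result strings are invented for the example.

```python
def assess_wildcard_child(process_contents, has_declaration):
    """Say what happens to an element matched by an <any> wildcard.

    Returns "validate", "not assessed", or "invalid".
    """
    if process_contents == "skip":
        return "not assessed"   # never validated, even if a declaration exists
    if has_declaration:
        return "validate"       # lax and strict both validate when they can
    if process_contents == "lax":
        return "not assessed"   # no declaration available: don't worry about it
    if process_contents == "strict":
        return "invalid"        # no declaration available: fail
    raise ValueError("unknown processContents: %r" % (process_contents,))


# Print the outcome for each setting, with and without a declaration for the child.
for mode in ("lax", "strict", "skip"):
    for found in (True, False):
        print(mode, found, assess_wildcard_child(mode, found))
```

Here has_declaration stands for "the processor managed to find an element declaration for ns2:inner", by whatever means it chose.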

To really understand this, you should find a good tutorial on schema, or 
maybe even do the hard work of reading the spec (it is hard in this area.) 
 I hope you can see that the design provides quite a bit of power for 
dealing with the situations you've raised.   Many of them do arise in 
various uses of XML.  I hope this is helpful.

------------------------------------------------------------------
Noah Mendelsohn                              Voice: 1-617-693-4036
IBM Corporation                                Fax: 1-617-693-8676
One Rogers Street
Cambridge, MA 02142
------------------------------------------------------------------







"Sanjay Dahiya, Noida" <sanjay@noida.hcltech.com>
Sent by: xmlschema-dev-request@w3.org
09/04/2002 11:55 AM

 
        To:     xmlschema-dev@w3.org
        cc:     (bcc: Noah Mendelsohn/Cambridge/IBM)
        Subject:        XML schema validation and namespaces


Hi All

Correct me if I am wrong anywhere in the following.
One can compose an XML schema using multiple schemas and multiple 
namespaces so that definitions can be reused (makes perfect sense). 
For this purpose an XML schema can have references to other schemas which 
might have different namespaces. This is done by 'import' / 'include' tags 
in the schema definition. Q1: what are the possible ways of doing the 
same?
The instance document referring to this schema would be validated by 
loading the other schemas and looking for corresponding definitions in those 
schemas. (good so far)

Now an XML instance document can also contain references to multiple 
namespaces and schemas using schemaLocation and
noNamespaceSchemaLocation (why multiple??) 
Q2: An XML document (which of course must have a root element) would have 
its definition in one schema only. The child elements which refer 
to other schemas must be referenced from that schema only.
Second, for a validating parser, what would be the precise set of rules for 
locating the definition of an element that is mentioned in the instance 
document?
Now to make things worse, each element can refer to a namespace and supply 
the prefix right there (no idea where it is going now!!)  Q3: now how 
would the validating parser locate the definition of this element? And Q4: 
why would someone put a namespace on an element in the instance document 
and not in its schema?
It looks like some of the constructs have been designed with non-validating 
parsers in mind and others for validating ones. Could someone clear things up 
a bit here?
thanks and regards
Sanjay
Received on Friday, 6 September 2002 03:53:48 UTC