
Re: Round 2: How an XML instance document references an XML Schema

From: <Noah_Mendelsohn@lotus.com>
Date: Tue, 4 Jan 2000 11:10:23 -0500
To: "Roger L. Costello" <costello@mitre.org>
Cc: costello@mitre.org, jcs@mitre.org, msc@mitre.org, www-xml-schema-comments@w3c.org, xml-dev@ic.ac.uk
Message-ID: <OF872987F6.AF8ABC58-ON8525685C.0057D4C4@lotus.com>
You are very much on the right track, but there are a few details that 
should be clarified:

Case 1:
-------

>> Thus, for this case the value of
>> schemaLocation is:

           schemaLocation="urn:person-schema
                           urn:person-schema/person-schema.xsd"

>> A Schema-validating parser will use the URI to fetch the schema
>> document, and then will verify that the targetNamespace value matches
>> the namespace in schemaLocation.

Here there is a subtlety, because schemaLocation is a hint.  The correct 
statement would be:

"A Schema-validating parser that honors schemaLocation will use the URI to 
fetch the schema
document, and then will verify that the targetNamespace value matches the 
namespace in schemaLocation."

Let me attempt to rephrase what Andrew described in his note: the choice 
of which schema to use ultimately lies with the consumer.  If you as a 
consumer wish to rely on the schemaLocation idiom, then you should 
purchase/use processors that will honor it for you.  The reason that some 
other processors might not provide that service for you is that they are 
designed to run in environments where it is impractical or undesirable to 
allow the document author to force reference to, and use of, some particular 
schema document.  The schemas working draft is designed to allow both 
these types of processors to formally claim conformance with the 
specification, while ensuring interoperability among those processors that 
do choose to honor the attribute.
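
To make the corrected statement concrete, here is a minimal sketch in 
Python of what a processor that honors schemaLocation might do.  It is an 
illustration only, not anything taken from the working draft: the names 
check_hints, fetch and FAKE_WEB are hypothetical, and the "fetch" is just a 
dictionary lookup standing in for real retrieval.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/1999/XMLSchema/instance"

def schema_location_pairs(root):
    """Return the (namespace, schema URI) pairs found in xsi:schemaLocation."""
    tokens = root.get("{%s}schemaLocation" % XSI, "").split()
    if len(tokens) % 2:
        raise ValueError("schemaLocation must hold namespace/URI pairs")
    return list(zip(tokens[0::2], tokens[1::2]))

def target_namespace_matches(schema_text, namespace):
    """Check that a fetched schema document declares the expected targetNamespace."""
    return ET.fromstring(schema_text).get("targetNamespace") == namespace

def check_hints(root, fetch, honor_schema_location=True):
    """Hypothetical processor step: follow the hints only if configured to."""
    if not honor_schema_location:
        return []  # a conforming processor may ignore the hint entirely
    return [(ns, uri, target_namespace_matches(fetch(uri), ns))
            for ns, uri in schema_location_pairs(root)]

INSTANCE = """<Person xmlns="urn:person-schema"
        xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
        xsi:schemaLocation="urn:person-schema
                            urn:person-schema/person-schema.xsd">
    <fname>Helen</fname>
    <lname>Jones</lname>
</Person>"""

# Stand-in for the network: schema URI -> schema document text (assumption).
FAKE_WEB = {
    "urn:person-schema/person-schema.xsd":
        '<schema xmlns="http://www.w3.org/1999/XMLSchema" '
        'targetNamespace="urn:person-schema"/>',
}

root = ET.fromstring(INSTANCE)
honored = check_hints(root, FAKE_WEB.get)
ignored = check_hints(root, FAKE_WEB.get, honor_schema_location=False)
print(honored)
print(ignored)
```

Both settings of honor_schema_location correspond to processors that could 
claim conformance; only the first gives you the service you are after.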

I believe that in all other respects, your explanation of case 1 is 
correct.

Case 2:
-------

>> The only element in the instance document which has the same namespace
>> as the schema namespace is fname.  Thus, it is the only element which
>> will get schema-validated.

If you really understand what I said above, then you will see why this is 
not quite correct.  Which schema to use, and how much to validate, are at 
the discretion of the processor, possibly based on parameters or other 
tailoring provided by the invoking application.  So, it is quite possible 
that your case 2 document is being consumed by an application that 
specifically supplies the schema(s) to be used for the elements not 
governed by a schemaLocation (which in your example happen to be 
elements which are not in any namespace), and/or chooses to honor or not 
honor the schemaLocation attribute for urn:person-schema.  Indeed, if the 
application had some particular reason to do so, it could direct the 
processor specifically to use some other schema components to validate 
elements and attributes from urn:person-schema. 

That said, it will be quite common for people to acquire and use 
processors that will honor the schemaLocation attribute, and that will 
signal errors or provide other indication when the designated schema is 
inaccessible.
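
One way to picture the division of responsibility described above is the 
following rough sketch.  The function choose_schema, its parameters, and 
the file names local-person.xsd and strict-person.xsd are all hypothetical; 
the point is only that application-supplied choices come first and the 
document's hint is consulted afterwards, if at all.

```python
def choose_schema(namespace, hints, overrides=None, honor_hints=True):
    """Pick a schema document for a namespace; the consumer has the last word.

    namespace is None for elements that are not in any namespace.
    hints holds what the instance's schemaLocation suggested.
    overrides holds what the invoking application insists on.
    """
    overrides = overrides or {}
    if namespace in overrides:
        return overrides[namespace]      # the application's choice wins
    if honor_hints and namespace in hints:
        return hints[namespace]          # fall back on the document's hint
    return None                          # no schema selected for this namespace

# The hint carried by the case 2 instance document.
hints = {"urn:person-schema": "urn:person-schema/person-schema.xsd"}

# A processor that simply honors the hint.
print(choose_schema("urn:person-schema", hints))

# An application that supplies a schema for the no-namespace elements
# (Person, lname) and ignores schemaLocation altogether.
app = {None: "local-person.xsd"}
print(choose_schema(None, hints, overrides=app, honor_hints=False))
print(choose_schema("urn:person-schema", hints, overrides=app, honor_hints=False))

# An application that substitutes its own components for urn:person-schema.
print(choose_schema("urn:person-schema", hints,
                    overrides={"urn:person-schema": "strict-person.xsd"}))
```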

Case 3:
-------

The spirit of this example is correct, modulo the point repeated twice 
above.

While we are on the subject, it is worth noting that we wrestled long and 
hard with the scoping of schemaLocation attributes.  The problem we have 
is that the Namespaces Recommendation gives us the xmlns: mechanism, but 
gives us no explicit means to provide modifiers on it.  So, it is not 
possible for us within an xmlns: attribute to actually designate the 
schema.  Worse (for our purposes) is that the following are absolutely 
equivalent documents according to the Namespaces Recommendation:

        <a:e xmlns:a="auri">
                <a:f>
                </a:f>
                <a:f>
                </a:f>
        </a:e>

... and ...

        <a:e xmlns:a="auri">
                <aa:f xmlns:aa="auri">
                </aa:f>
                <a:f>
                </a:f>
        </a:e>
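
The equivalence of the two documents above can be checked with any 
namespace-aware parser.  As a small sketch, Python's standard 
xml.etree.ElementTree reports each element by its expanded name 
(namespace name plus local name), so the prefixes a: and aa: drop out of 
the comparison.  The helper name expanded_names is my own.

```python
import xml.etree.ElementTree as ET

DOC_A = """<a:e xmlns:a="auri">
    <a:f>
    </a:f>
    <a:f>
    </a:f>
</a:e>"""

DOC_B = """<a:e xmlns:a="auri">
    <aa:f xmlns:aa="auri">
    </aa:f>
    <a:f>
    </a:f>
</a:e>"""

def expanded_names(text):
    """List every element's expanded name, in document order."""
    return [element.tag for element in ET.fromstring(text).iter()]

print(expanded_names(DOC_A))   # ['{auri}e', '{auri}f', '{auri}f']
print(expanded_names(DOC_A) == expanded_names(DOC_B))   # True
```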


So the question arose for us, what would we do with constructions along 
the lines of:

        <a:e xmlns:a="auri" schemaLocation="auri aschema1">
                <aa:f xmlns:aa="auri" schemaLocation="auri aschema2">
                </aa:f>
                <a:f schemaLocation="auri aschema3">
                </a:f>
        </a:e>

all of which are syntactically correct according to the schemas working 
draft?  The answer we came up with is that there may be at most one 
definition or declaration for any schema construction during the course of 
a single validation.  In principle, the schema processor can honor at most 
one of the above schemaLocation attributes, presuming that the several 
schemas include conflicting declarations for the same elements, 
attributes, etc. In practice, constructions of this sort are strongly 
discouraged.
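
A tool could warn about such constructions before validation begins.  The 
following is only a sketch under my own assumptions: it spells the 
attribute out as xsi:schemaLocation, the function names are hypothetical, 
and it merely flags namespaces for which more than one distinct location 
is hinted.  Whether the schemas really contain conflicting declarations 
can only be decided by reading them.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XSI = "http://www.w3.org/1999/XMLSchema/instance"

DOC = """<a:e xmlns:a="auri"
     xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
     xsi:schemaLocation="auri aschema1">
    <aa:f xmlns:aa="auri" xsi:schemaLocation="auri aschema2"/>
    <a:f xsi:schemaLocation="auri aschema3"/>
</a:e>"""

def hints_by_namespace(text):
    """Collect every hinted schema location in the document, per namespace."""
    found = defaultdict(list)
    for element in ET.fromstring(text).iter():
        tokens = element.get("{%s}schemaLocation" % XSI, "").split()
        for namespace, uri in zip(tokens[0::2], tokens[1::2]):
            if uri not in found[namespace]:
                found[namespace].append(uri)
    return dict(found)

def namespaces_with_multiple_hints(text):
    """Namespaces for which the processor could honor at most one hint."""
    return sorted(ns for ns, uris in hints_by_namespace(text).items()
                  if len(uris) > 1)

print(hints_by_namespace(DOC))
print(namespaces_with_multiple_hints(DOC))
```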

There are several reasons for the restriction above, including: (1) it 
would significantly complicate the implementation of processors to require 
that the definitions be pushed and popped during the validation of a 
single document, and (2) we wanted to admit implementations in which 
various combinations of particular schemas were precompiled, particularly 
for high performance at the server.  Such compilation is facilitated if 
the definitions for elements, attributes, etc. are stable through the 
validation.  One way to combine such compilation with the use of the 
schemaLocation attribute is to have the processor check that the 
schemaLocation supplied in any instance agrees with the one used at the 
time that schemas were precompiled, but such a check is at the discretion of 
the processor: as always, the processor and application are free to 
validate against whatever schema they choose to use.
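
The optional agreement check mentioned above might look something like 
the sketch below.  Everything here is an assumption made for illustration: 
the PRECOMPILED table, the function name, and the second location 
http://example.org/other.xsd do not come from the working draft.

```python
# Namespace -> location of the schema document used at precompile time
# (hypothetical server configuration).
PRECOMPILED = {"urn:person-schema": "urn:person-schema/person-schema.xsd"}

def disagreements(instance_hints, precompiled):
    """Return (namespace, hinted URI, compiled URI) for every hint that
    names a different document than the one that was precompiled."""
    mismatches = []
    for namespace, uri in instance_hints:
        compiled = precompiled.get(namespace)
        if compiled is not None and compiled != uri:
            mismatches.append((namespace, uri, compiled))
    return mismatches

# Hints as (namespace, URI) pairs taken from an instance's schemaLocation.
agreeing = [("urn:person-schema", "urn:person-schema/person-schema.xsd")]
differing = [("urn:person-schema", "http://example.org/other.xsd")]

print(disagreements(agreeing, PRECOMPILED))
print(disagreements(differing, PRECOMPILED))
```

A processor that performs the check could signal an error on a non-empty 
result; one that does not simply validates against the precompiled schemas.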

I hope that you find these explanations to be helpful.

------------------------------------------------------------------------
Noah Mendelsohn                                    Voice: 1-617-693-4036
Lotus Development Corp.                            Fax: 1-617-693-8676
One Rogers Street
Cambridge, MA 02142
------------------------------------------------------------------------






"Roger L. Costello" <costello@mitre.org>
Sent by: www-xml-schema-comments-request@w3.org
01/04/00 09:22 AM

 
        To:     xml-dev@ic.ac.uk
        cc:     www-xml-schema-comments@w3c.org, "Schneider,John C." <jcs@mitre.org>, 
"Cokus,Michael S." <msc@mitre.org>, "Costello,Roger L." 
<costello@mitre.org>, (bcc: Noah Mendelsohn/CAM/Lotus)
        Subject:        Round 2: How an XML instance document references an XML Schema

Hi Folks,

There has been a considerable amount of discussion (and confusion) on
how an XML instance document indicates the XML Schema(s) that it
conforms to.  I am not sure that it is yet clear in people's minds
how to do it.  I will take a stab at explaining it, based upon the
discussions.  However, we really need this to be verified by someone
from the Schema WG.

[Henry, I haven't fully digested your most recent message.  Hopefully
the following is consistent with what you said.]

[Also, thanks a lot to Henry Thompson, Andrew Layman, and Rick Jelliffe 
for taking the time to answer my endless barrage of questions.  I hope
that these questions and their answers are useful to all.]

Case 1.  Entire instance document conforms to a single XML Schema

Let's use the example that Gabe Beged-Dov gave yesterday.  Here's the
skeleton of the XML Schema:

<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "structures.dtd">
<schema xmlns="http://www.w3.org/1999/XMLSchema"
               targetNamespace="urn:person-schema">
    ...
</schema>

Let's assume that the URI for this schema is: 

      urn:person-schema/person-schema.xsd

Thus the namespace for the elements and attributes that are declared in
person-schema.xsd is urn:person-schema.

An XML instance document that wishes to indicate that all or part of it
conforms to person-schema.xsd must use the attribute, schemaLocation. 
The value of schemaLocation must include a pair of values - the
namespace (urn:person-schema) and the URI to the Schema
(urn:person-schema/person-schema.xsd).  Thus, for this case the value of
schemaLocation is:

           schemaLocation="urn:person-schema
                           urn:person-schema/person-schema.xsd"

A Schema-validating parser will use the URI to fetch the schema
document, and then will verify that the targetNamespace value matches
the namespace in schemaLocation.

The schemaLocation attribute is defined in the schema instance
namespace.  So, to use it in our instance document we first need to
define a qualifier for the schema instance namespace and then prefix
schemaLocation:

          xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
          xsi:schemaLocation="urn:person-schema
                              urn:person-schema/person-schema.xsd"


Now then, is that all that's needed in the XML instance document -
simply add schemaLocation as an attribute to the root element, i.e.

<?xml version="1.0"?>
<Person xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
        xsi:schemaLocation="urn:person-schema
                            urn:person-schema/person-schema.xsd">
    <fname>Helen</fname>
    <lname>Jones</lname>
</Person>

Based upon Andrew Layman's messages yesterday, the answer is no.  I
believe that I now understand why.  In the above instance document we
have not declared a namespace for the elements - Person, fname, and
lname.  Thus, they are in the document's namespace.  However, with the
schemaLocation attribute we are asserting that the elements declared in
the schema are in the urn:person-schema namespace.  Thus, in our
instance document we must make a namespace declaration to indicate that
the elements in the instance document also are in the urn:person-schema
namespace.  Since we want to declare that all the instance document
elements come from the urn:person-schema namespace, we can use it as the
default namespace.  Thus, our instance document looks like this:

<?xml version="1.0"?>
<Person xmlns="urn:person-schema"
        xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
        xsi:schemaLocation="urn:person-schema
                            urn:person-schema/person-schema.xsd">
    <fname>Helen</fname>
    <lname>Jones</lname>
</Person>

Using the default namespace declaration, all the elements in the
instance document have the same namespace as the schema namespace. 
Thus, the entire instance document will get schema-validated.

Case 2.  Part of the instance document conforms to a single XML Schema

Let's use the same schema as above and the same instance document. 
However, in this case let's suppose that we just want to validate
"fname" against the schema.  What would the instance document look like?

As usual we use the schemaLocation attribute to indicate the schema that
we are using.  In the instance document we need to distinguish between
those elements that are in the document namespace versus the fname
element which is in the urn:person-schema.  We can do this anywhere, but
for simplicity let's do it at the root element:

<?xml version="1.0"?>
<Person xmlns:p="urn:person-schema"
        xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
        xsi:schemaLocation="urn:person-schema
                            urn:person-schema/person-schema.xsd">
    <p:fname>Helen</p:fname>
    <lname>Jones</lname>
</Person>

The only element in the instance document which has the same namespace
as the schema namespace is fname.  Thus, it is the only element which
will get schema-validated.

Case 3.  Instance document conforms to multiple XML Schemas

Let's suppose that we have a second schema.  This second schema
specializes in defining last names (I know, it's silly):

<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "structures.dtd">
<schema xmlns="http://www.w3.org/1999/XMLSchema"
               targetNamespace="urn:last-name-schema">
    ...
</schema>

Note that this second schema's namespace is:

       urn:last-name-schema

Let's continue to use the same instance document.  However, let's assume
that we want to validate fname against the first schema and lname
against the second schema.  For the Person element, we don't want any
validation.

Our schemaLocation attribute now will have two pairs of values - the
first pair is for the first schema and the second pair is for the second
schema.  We will declare the two different namespaces and prefix fname
and lname appropriately.  Thus, the instance document is:

<?xml version="1.0"?>
<Person xmlns:p="urn:person-schema"
        xmlns:l="urn:last-name-schema"
        xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
        xsi:schemaLocation="urn:person-schema
                            urn:person-schema/person-schema.xsd
                            urn:last-name-schema
                            urn:last-name-schema/last-name-schema.xsd">
    <p:fname>Helen</p:fname>
    <l:lname>Jones</l:lname>
</Person>
 
Well, I am getting tired of writing.  Hopefully this makes sense.  Even
more, hopefully it is correct.  Comments?  /Roger
Received on Tuesday, 4 January 2000 11:05:27 GMT
