Re: XML Schema compliance from Jeni Tennison on 2003-07-07 (xmlschema-dev@w3.org from July 2003)

From: Jeni Tennison <jeni@jenitennison.com>
Date: Mon, 7 Jul 2003 09:52:03 +0100
To: "Savas Parastatidis" <Savas.Parastatidis@newcastle.ac.uk>
CC: "Henry S. Thompson" <ht@cogsci.ed.ac.uk>, xmlschema-dev@w3.org, "Jim Webber" <jim.webber@arjuna.com>, "Paul Watson" <Paul.Watson@newcastle.ac.uk>
Message-ID: <129390263899.20030707095203@jenitennison.com>

Hi Savas,

> I understand this. The XML Schema processor needs to know about the
> "element declarations", the "type definitions" and so on. However,
> in order to validate a schema that I am writing, like the following,
> it needs to know about the structure of the <xs:element> (the
> instance as you call it).
>
> So, when I write
>
> <xs:element name="foo" type="xs:string"/>
>
> the processor knows what <xs:element> means.

Sure. When a processor looks at an XML Schema written in the XML
representation for XML Schema, there are a whole bunch of things that
it needs to check -- all the constraints that are listed in the Rec.
as "XML Representation Constraints". Some of these constraints are
constraints that can be represented in an XML Schema (the
Schema-for-Schema). Some of them aren't.

XML Schema validators *could* check some of the XML Representation
Constraints by validating the schema you're using against the
Schema-for-Schema, and then by checking the additional constraints
separately. I suspect that most of them don't do this, partly because
the process of parsing and compiling a schema is laborious, so it
makes sense to use an internal version, and partly because the
Schema-for-Schema is a very special case that deserves special
attention.
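
As a rough illustration of an "additional constraint" of that kind,
the sketch below checks that no <xs:element> carries both "default"
and "fixed", which is one of the XML Representation Constraints on
element declarations and, as far as I know, not something a 1.0
schema grammar can express. The function name and the sample schema
are made up for the example; this is not how any particular validator
does it.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"


def elements_with_default_and_fixed(schema_text):
    """Return the names of xs:element declarations that carry both
    'default' and 'fixed' (hypothetical helper, for illustration)."""
    root = ET.fromstring(schema_text)
    offenders = []
    # iter() walks the whole document, so nested declarations are seen too.
    for el in root.iter(f"{{{XS}}}element"):
        if "default" in el.attrib and "fixed" in el.attrib:
            offenders.append(el.get("name"))
    return offenders


# Made-up sample schema: "bar" breaks the constraint, "foo" does not.
SAMPLE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="foo" type="xs:string"/>
  <xs:element name="bar" type="xs:string" default="a" fixed="b"/>
</xs:schema>"""

print(elements_with_default_and_fixed(SAMPLE))
```

A check like this would run alongside (or instead of) validation
against the Schema-for-Schema; either way it only looks at the XML
representation of the schema.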

In any case, what the validator is interested in is the set of schema
components that your schema contains. Once it's checked the XML
representation of your schema, there's absolutely no need for it to
keep the schema components of the Schema-for-Schema hanging around
taking up memory. The information about the XML representation of your
schema is redundant once the validator has determined the schema
components it contains; so is the information about the
Schema-for-Schema that may or may not have been used to validate that
XML representation.

So while yes, of course the validator has to know what an <xs:element>
element should look like, (a) that information doesn't necessarily
come from the Schema-for-Schema and (b) that information is long gone
by the time the validator gets round to validating your instance
against your schema.

> By including
>
> <xs:import namespace="http://www.w3.org/2001/XMLSchema" 
>            schemaLocation="http://www.w3.org/2001/XMLSchema.xsd" />
>
> in your schema, you introduce the xs:element type under the "type
> definitions" infoset component. This is fine. However, you also
> introduce the xs:element element under the "element declarations"
> infoset component, which already exists. Shouldn't xml schema
> processors disallow this?

I think that you are mixing up two infosets that are completely
separate:

  (1) the infoset from the Schema-for-Schema, which a validator might
      use to check the XML representation of your schema

  (2) the infoset generated from your schema

Processors should not combine these two infosets. Think of the
pipeline:

  XML schema  -->  (parse & compile)  -->  schema infoset
                          ^
                          |
                Schema-for-Schema infoset

The Schema-for-Schema infoset, if it gets used at all, gets used
to check the XML representation of your schema. The schema infoset
that's used to validate your instance is the result of the parsing and
compilation of the XML representation of your schema. The
Schema-for-Schema infoset, if it exists, can be thrown away once your
schema has been checked, and there's no way that it should pollute
your schema infoset.
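
A minimal sketch of that separation, under heavy simplifying
assumptions: KNOWN_ELEMENT_ATTRS is a hypothetical and deliberately
incomplete stand-in for whatever the processor knows about the XML
representation (whether or not it came from the Schema-for-Schema),
and compile_schema is an invented name, not a real API. The point is
only that the checking knowledge is used inside the compile step and
nothing about xs:element itself ends up in the result.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical, incomplete stand-in for the processor's knowledge of what
# a top-level <xs:element> may look like.
KNOWN_ELEMENT_ATTRS = {"name", "type", "default", "fixed", "nillable", "abstract"}


def compile_schema(schema_text):
    """Check the XML representation, then return only the components
    declared by the schema being compiled."""
    root = ET.fromstring(schema_text)
    element_declarations = {}
    # findall() with a plain tag matches direct children only, i.e. the
    # top-level element declarations.
    for el in root.findall(f"{{{XS}}}element"):
        unknown = set(el.attrib) - KNOWN_ELEMENT_ATTRS
        if unknown:
            raise ValueError(
                f"unexpected attributes on xs:element: {sorted(unknown)}"
            )
        element_declarations[el.get("name")] = el.get("type")
    # The checking knowledge is not part of what gets returned.
    return {"element declarations": element_declarations}


components = compile_schema(
    '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">'
    '<xs:element name="foo" type="xs:string"/>'
    "</xs:schema>"
)
print(components)
```

The returned set has an entry for "foo" and none for "element", so
there is nothing for an imported declaration to clash with.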

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/

Received on Monday, 7 July 2003 04:52:25 UTC