Re: Should this schema be invalid? from Jeni Tennison on 2002-08-01 (xmlschema-dev@w3.org from August 2002)

From: Jeni Tennison <jeni@jenitennison.com>
Date: Thu, 1 Aug 2002 14:44:23 +0100
To: xmlschema-dev@w3.org, Brenda Bell <bbell@juicesoftware.com>
Message-ID: <10176064354.20020801144423@jenitennison.com>

Hi Brenda,

>> As I see it, the mapping is:
>> 
>>   (name, age, #other*) -> (name, age, (ssn, #other*))
>> 
>> The sequence (ssn, #other*) counts as pointless because it has a
>> minOccurs and maxOccurs of 1 and it appears in another sequence, so
>> the mapping is actually:
>> 
>>   (name, age, #other*) -> (name, age, ssn, #other*)
>> 
>> The best functional mapping is then:
>> 
>>   name    -> name
>>   age     -> age
>>           -> ssn
>>   #other* -> #other*
>> 
>> which isn't valid because there's no particle in the base type for the
>> ssn particle to map on to.
>
> I tried to work through this one before I saw your response and got
> as far as ignoring the pointless sequence... but I still came up
> with the restriction as valid because the bounds of xs:any on the
> base type was 0 to unbounded.

Perhaps you can describe how you got there, because it would
definitely be a more pleasant conclusion to reach! ;) I don't think
that you get to that level of checking, because you get stuck on the
sequence-sequence mapping of:

  (name, age, #other*) -> (name, age, ssn, #other*)

This mapping is governed by the dreaded "Schema Component Constraint:
Particle Derivation OK (All:All,Sequence:Sequence -- Recurse)", which
says:

  For an all or sequence group particle to be a ·valid restriction· of
  another group particle with the same {compositor} all of the
  following must be true:

  1 R's occurrence range is a valid restriction of B's occurrence
    range as defined by Occurrence Range OK (§3.9.6).

  2 There is a complete ·order-preserving· functional mapping from
    the particles in the {particles} of R to the particles in the
    {particles} of B such that all of the following must be true:

  2.1 Each particle in the {particles} of R is a ·valid
      restriction· of the particle in the {particles} of B it maps to
      as defined by Particle Valid (Restriction) (§3.9.6).

  2.2 All particles in the {particles} of B which are not mapped
      to by any particle in the {particles} of R are ·emptiable· as
      defined by Particle Emptiable (§3.9.6).
  ...
  [Definition:] A complete functional mapping is order-preserving if
  each particle r in the domain R maps to a particle b in the range B
  which follows (not necessarily immediately) the particle in the
  range B mapped to by the predecessor of r, if any, where
  "predecessor" and "follows" are defined with respect to the order of
  the lists which constitute R and B.

                       http://www.w3.org/TR/xmlschema-1/#rcase-Recurse

Here, R is the sequence (name, age, ssn, #other*) and B is the
sequence (name, age, #other*). We need to find a complete
order-preserving functional mapping from the particles of R to those
in B. In other words, for each particle in (name, age, ssn, #other*)
we need to identify a particle in (name, age, #other*) such that its
preceding particle maps to a particle that precedes the one we're
mapping to.

So name -> name is OK because name doesn't have a preceding particle.
age -> age is OK because its predecessor in the restriction, name,
maps to name in the base type, which precedes age there. ssn can map
to #other* with no problems in a similar way, but then #other* in the
restricted type doesn't have a particle left onto which it can map.

Basically, it's impossible to map 4 particles in a sequence in the
restricted type to 3 particles in a sequence in the base type (or more
generally N particles in a sequence in the restricted type to M
particles in a sequence in the base type if M is less than N), as I
understand the constraint.
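
To make that reading concrete, here is a minimal Python sketch of the
search for an order-preserving mapping, under the strict reading
described above (each particle of R must map to a particle of B that
comes after the one its predecessor mapped to). The names
find_mapping, restricts and emptiable are hypothetical, and the two
predicates are crude stand-ins for the real Particle Valid
(Restriction) and Particle Emptiable checks: they only know that any
element may map onto the wildcard and that the wildcard is emptiable.

```python
WILDCARD = "#other*"  # stand-in for the xs:any particle with minOccurs=0


def restricts(r, b):
    # Assumption: a particle restricts an identical particle,
    # and anything is allowed to map onto the wildcard.
    return r == b or b == WILDCARD


def emptiable(b):
    # Assumption: only the wildcard (0 to unbounded) is emptiable here.
    return b == WILDCARD


def find_mapping(R, B, i=0, j=0):
    """Return a list of B indices, one per particle in R[i:], or None."""
    if i == len(R):
        # Clause 2.2: every unmapped particle left in B must be emptiable.
        return [] if all(emptiable(b) for b in B[j:]) else None
    for k in range(j, len(B)):
        if restricts(R[i], B[k]):
            rest = find_mapping(R, B, i + 1, k + 1)
            if rest is not None:
                return [k] + rest
        if not emptiable(B[k]):
            # B[k] cannot be skipped, so no later target is allowed.
            return None
    return None


base = ["name", "age", WILDCARD]
print(find_mapping(["name", "age", "ssn", WILDCARD], base))  # prints None
```

With these stand-in predicates, (name, age, ssn) on its own would map
to the indices [0, 1, 2], but the four-particle sequence fails because
the trailing #other* has nothing left to map to.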

> I got more confused when I referenced Vlist's who says "any content
> valid per the restricted type must also be valid per the base type".

Sure, but Eric doesn't say (or I don't think he says) "if all content
that's valid per the restricted type is valid per the base type, then
it's a valid restriction". We might wish that we could say that, and
that might even have been the intention of the XML Schema WG, but the
particle derivation rules mean that we can't.

> In this case, that's true so I ran a couple of tests in XMLSpy...
> one with the original schema as posted and one where I physically
> removed the pointless sequence. It accepted both as valid.

You could view that as a feature or a bug. I don't think it's
compliant with the letter of the spec, so I'd view it as a bug. Both
Xerces-J and MSXML generate errors under the same circumstances.

> I then went back to the spec and decided that my confusion stems
> from this statement: "Any pointless occurrences of <sequence>,
> <choice> or <all> are ignored". I can read this two different
> ways... that the entire sequence is ignored and mapping applies to
> what's left... or that the pointless sequence is simply "verbose"
> such that you remove it from the equation and view its members as
> part of its containing sequence?

I think it has to be read as the latter. If it were the former then it
would be easy to "hide" parts of the content model that you didn't
want to be checked against the base content model by slipping them
into a sequence, and I don't think that's the intention.
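
The "remove it from the equation" reading can be sketched in Python as
follows. This is only an illustration of the one case discussed in
this thread (a nested sequence with minOccurs and maxOccurs of 1
inside another sequence); the Seq class and flatten function are
hypothetical names, and the other pointlessness cases in the spec are
not handled.

```python
from dataclasses import dataclass


@dataclass
class Seq:
    children: list
    min_occurs: int = 1
    max_occurs: int = 1


def flatten(seq):
    """Splice pointless nested sequences into their parent sequence."""
    out = []
    for child in seq.children:
        if isinstance(child, Seq):
            inner = flatten(child)
            if child.min_occurs == 1 and child.max_occurs == 1:
                # Pointless: its members become members of the parent.
                out.extend(inner.children)
            else:
                # Occurrence range matters, so keep the group as a unit.
                out.append(inner)
        else:
            out.append(child)
    return Seq(out, seq.min_occurs, seq.max_occurs)


restricted = Seq(["name", "age", Seq(["ssn", "#other*"])])
print(flatten(restricted).children)  # prints ['name', 'age', 'ssn', '#other*']
```

The members of the inner sequence stay in the content model and still
have to be mapped against the base type, rather than disappearing.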

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/
Received on Thursday, 1 August 2002 09:44:25 UTC