Re: Multiple XML schema files for a common target namespace (PROV-ISSUE-608)

On Tue, Feb 5, 2013 at 12:30 AM, Stephan Zednik <zednis@rpi.edu> wrote:
> https://dvcs.w3.org/hg/prov/file/tip/xml/schema/extensions/prov-dictionary.xsd
> https://dvcs.w3.org/hg/prov/file/tip/xml/schema/extensions/prov-links.xsd
>
> Both extension schemas include the prov-core schema and use
> substitutionGroup to extend the prov:abstractElement abstract element.
>
> prov.xsd includes the core schema and all extension schemas.


I think this is a very clean solution.

documentElements defines anything allowed at the top-level <document>
or in a <bundle>.

Extensions, our own and third-party, simply define a type (if
they need to) and a new element. To make it fit inside the document,
they just declare the substitution group:

   <xs:element name="emptyDictionary" type="prov:EmptyDictionary"
               substitutionGroup="prov:abstractElement" />

Therefore they don't need to redefine prov:Dictionary or anything
magic like that.
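
As a minimal sketch of what a third-party extension could look like
(the ex: namespace, the Annotation type and the relative
schemaLocation are all hypothetical, and it assumes
prov:abstractElement is declared without a type that would constrain
its substitutes):

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:prov="http://www.w3.org/ns/prov#"
           xmlns:ex="http://example.org/ext#"
           targetNamespace="http://example.org/ext#"
           elementFormDefault="qualified">

  <!-- Assumed local copy of the core schema -->
  <xs:import namespace="http://www.w3.org/ns/prov#"
             schemaLocation="prov-core.xsd"/>

  <!-- Hypothetical extension type -->
  <xs:complexType name="Annotation">
    <xs:sequence>
      <xs:element name="note" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Allowed wherever prov:abstractElement is allowed -->
  <xs:element name="annotation" type="ex:Annotation"
              substitutionGroup="prov:abstractElement"/>
</xs:schema>

Because such an extension lives in its own namespace it would use
xs:import rather than xs:include; our own extensions share the prov
namespace and can include the core directly.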

It's good that we build our own extensions the same way third parties
should, because our schemas will be the primary source for copy/paste.


We provide prov.xsd, which simply includes the core and all our known
extensions. For a user this is something easily downloaded and
modified if needed, and for most cases simply referring to prov.xsd
should be sufficient. There is a slight cost: a developer who wants to
mirror the XSDs but still support all the schemas (at time of writing)
has to download each of the included schemas as well. In some cases
this can be a pain, with multiple files including and importing at
different levels - but here the hierarchy is rather flat, and the
files listed in prov.xsd are all you'll need to download.

Keeping the schemas from the different notes separate - I would expect
nothing less. If, say, prov-links.xsd causes you problems, simply
delete it locally.
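
For illustration, prov.xsd then amounts to little more than the
following sketch (the relative schemaLocation values are assumed from
the repository layout and may differ from the real file):

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:prov="http://www.w3.org/ns/prov#"
           targetNamespace="http://www.w3.org/ns/prov#">
  <!-- Core definitions, including prov:abstractElement -->
  <xs:include schemaLocation="prov-core.xsd"/>
  <!-- One include per extension note -->
  <xs:include schemaLocation="extensions/prov-dictionary.xsd"/>
  <xs:include schemaLocation="extensions/prov-links.xsd"/>
</xs:schema>

Dropping an extension from a local mirror would then be a matter of
removing its xs:include line along with the file.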



This is however a bit odd:

<xs:group name="documentElements">
  <xs:sequence>
    <xs:choice minOccurs="0" maxOccurs="unbounded">
      <xs:element ref="prov:entity"/>
      <xs:element ref="prov:activity"/>
      <xs:element ref="prov:wasGeneratedBy"/>
      <xs:any namespace="##other"/>
      <!-- .. -->
    </xs:choice>
    <xs:element ref="prov:abstractElement" minOccurs="0" maxOccurs="unbounded"/>
  </xs:sequence>
</xs:group>


This means I can write <activity>, <entity> etc. in any order and as
many times as I want - but any extension substituting for
prov:abstractElement must be placed after these, at the end.

However:

<xs:complexType name="Document">
  <xs:sequence>
    <xs:choice maxOccurs="unbounded">
      <xs:group ref="prov:documentElements"/>
      <xs:element ref="prov:bundle"/>
    </xs:choice>
  </xs:sequence>
</xs:complexType>

This means I am also allowed to combine multiple
prov:documentElements groups and bundles in any order, and hence I
would be allowed to start the document with an extension substituting
for prov:abstractElement, then add an <entity>, etc., a bundle, and
then another entity.
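
For example, if I read the two definitions correctly, an instance like
this sketch would be accepted (attributes and child content are left
out for brevity, and emptyDictionary stands in for any extension
element):

<prov:document xmlns:prov="http://www.w3.org/ns/prov#">
  <prov:emptyDictionary/>   <!-- extension first -->
  <prov:entity/>
  <prov:bundle/>
  <prov:entity/>            <!-- core element again, after a bundle -->
</prov:document>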


Is this intended?  (I know there was some discussion about ordering,
but I don't remember the details).

If so - should not the abstract element be moved inside the xs:choice,
where it would no longer need its own min/max?  (Or does this cause
some silly XSD validity problem with single interpretations etc?).
Note that prov:Bundle does not allow this interleaving, since it
references the group only once:

<xs:complexType name="Bundle">
  <xs:complexContent>
    <xs:extension base="prov:Entity">
      <xs:sequence>
        <xs:group ref="prov:documentElements"/>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
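
If the ordering restriction is not intended, moving the abstract
element into the choice could look like this sketch (untested; the
elided members stay as they are, and I have not checked how extension
elements from other namespaces interact with the ##other wildcard):

<xs:group name="documentElements">
  <xs:sequence>
    <xs:choice minOccurs="0" maxOccurs="unbounded">
      <xs:element ref="prov:entity"/>
      <xs:element ref="prov:activity"/>
      <xs:element ref="prov:wasGeneratedBy"/>
      <xs:any namespace="##other"/>
      <!-- .. -->
      <!-- Extensions may now appear anywhere among the core elements -->
      <xs:element ref="prov:abstractElement"/>
    </xs:choice>
  </xs:sequence>
</xs:group>
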


So if we are to keep the restriction, I suggest changing the Document type:


<xs:complexType name="Document">
  <xs:sequence>
    <xs:group ref="prov:documentElements"/>
    <!-- minOccurs="0" keeps documents without bundles valid -->
    <xs:element ref="prov:bundle" minOccurs="0" maxOccurs="unbounded"/>
  </xs:sequence>
</xs:complexType>

(Untested! This needs an xs:sequence rather than an xs:choice: a
choice would accept either the documentElements or the bundles, but
not both.)

Now bundles are required to be at the bottom, after all the normal
elements such as prov:entity and after all the extensions.

-- 
Stian Soiland-Reyes, myGrid team
School of Computer Science
The University of Manchester

Received on Tuesday, 5 February 2013 14:18:38 UTC