- From: Stephan Zednik <zednis@rpi.edu>
- Date: Tue, 12 Feb 2013 13:57:04 -0700
- To: Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>
- Cc: W3C provenance WG <public-prov-wg@w3.org>
A summary of the possible changes based on this discussion. I am in favor of all three listed changes.
1) rename prov:abstractElement to prov:internalElement (or similar) to make it clear we do not expect non-PROV extensions to use this element.
2) add processContents="lax" on all xs:any elements.
3) change the definition of prov:Bundle to the following (bundleElements name is not final):
<xs:complexType name="Bundle">
  <xs:complexContent>
    <xs:extension base="prov:Entity">
      <xs:sequence>
        <xs:element name="bundleElements" minOccurs="0">
          <xs:complexType>
            <xs:sequence maxOccurs="unbounded">
              <xs:group ref="prov:documentElements"/>
              <xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
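Change 2 depends on how XML Schema treats content matched by an xs:any wildcard. As a rough sketch of the three processContents modes (the function name and return values are made up for illustration and are not part of any schema tool): with "lax", an element is validated only when a declaration for it can be found, so elements from a namespace that has no schema are let through as long as they are well-formed.

```python
def wildcard_outcome(process_contents: str, has_declaration: bool) -> str:
    """Describe what a validator does with an element matched by xs:any.

    Hypothetical helper for illustration only; real validators implement
    this decision internally.
    """
    if process_contents not in ("strict", "lax", "skip"):
        raise ValueError(f"unknown processContents value: {process_contents!r}")
    if process_contents == "skip":
        return "accept"    # only well-formedness is required
    if has_declaration:
        return "validate"  # strict and lax both validate declared elements
    if process_contents == "lax":
        return "accept"    # no declaration available: let it through
    return "reject"        # strict requires a declaration


# An element from a namespace without a schema, under a lax wildcard:
print(wildcard_outcome("lax", has_declaration=False))  # accept
```

Under "strict" the same element would be rejected, which is the behaviour that currently blocks schema-less vocabularies.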
With the updated Bundle complexType the PROV-XML serialization for a bundle would look like this:
<prov:document
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:ex="http://example.com/ns/ex#"
    xmlns:prov="http://www.w3.org/ns/prov#">
  <prov:person prov:id="bob"/>
  <ex:label>outside-bundle-label</ex:label>
  <prov:activity prov:id="a1"/>
  <prov:bundle prov:id="bundle1">
    <prov:label>bundle1</prov:label>
    <ex:label>label-on-bundle-entity</ex:label>
    <prov:bundleElements>
      <ex:label>in-bundle-label</ex:label>
      <prov:entity prov:id="ex:report1">
        <prov:type xsi:type="xsd:QName">report</prov:type>
        <ex:version>1</ex:version>
      </prov:entity>
      <ex:version>1.0.0</ex:version>
      <prov:wasGeneratedBy>
        <prov:entity prov:ref="ex:report1"/>
        <prov:activity prov:ref="a1"/>
        <prov:time>2012-05-24T10:00:01</prov:time>
      </prov:wasGeneratedBy>
      <ex:content>foo</ex:content>
    </prov:bundleElements>
  </prov:bundle>
</prov:document>
I used elements from the namespace "ex" to show how non-PROV elements can be used within a bundle and as PROV attributes on the bundle entity.
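To illustrate why the wrapper element helps, here is a minimal sketch of how a consumer might tell attributes on the bundle entity apart from bundle contents. It uses Python's standard xml.etree.ElementTree on a trimmed copy of the example; the split_bundle helper is hypothetical, and the bundleElements name is the provisional one from the proposal.

```python
import xml.etree.ElementTree as ET

PROV = "http://www.w3.org/ns/prov#"

# Trimmed copy of the example document, kept small for illustration.
DOC = """\
<prov:document
    xmlns:ex="http://example.com/ns/ex#"
    xmlns:prov="http://www.w3.org/ns/prov#">
  <prov:bundle prov:id="bundle1">
    <prov:label>bundle1</prov:label>
    <ex:label>label-on-bundle-entity</ex:label>
    <prov:bundleElements>
      <ex:label>in-bundle-label</ex:label>
      <prov:entity prov:id="ex:report1"/>
    </prov:bundleElements>
  </prov:bundle>
</prov:document>
"""


def split_bundle(bundle):
    """Return (attributes on the bundle entity, contents of the bundle).

    Hypothetical helper: everything outside the wrapper element describes
    the bundle entity, everything inside it belongs to the bundle.
    """
    wrapper_tag = f"{{{PROV}}}bundleElements"  # provisional element name
    attributes = [child for child in bundle if child.tag != wrapper_tag]
    wrapper = bundle.find(wrapper_tag)
    contents = list(wrapper) if wrapper is not None else []
    return attributes, contents


root = ET.fromstring(DOC)
bundle = root.find(f"{{{PROV}}}bundle")
attributes, contents = split_bundle(bundle)
print([el.text for el in attributes])  # labels attached to the bundle entity
print([el.tag for el in contents])     # elements that are part of the bundle
```

With the flat layout, the two ex:label elements could not be told apart this way; with the wrapper, position alone decides.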
--Stephan
On Feb 12, 2013, at 12:49 PM, Stephan Zednik <zednis@rpi.edu> wrote:
> Comments in-line, last two comments are the most important.
>
>
> On Feb 12, 2013, at 7:29 AM, Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk> wrote:
>
>> On Tue, Feb 5, 2013 at 7:29 PM, Stephan Zednik <zednis@rpi.edu> wrote:
>>> This does not follow the pattern Stian suggested of updating Document so
>>> that bundles are required at the bottom of the document.
>>>
>>> Stian, does this make sense? Do you still prefer the other pattern you
>>> suggested in the earlier email?
>>
>> Well to me it does not really matter if xs:any can appear anywhere in
>> <document> or just at the bottom of the <document> - but I think your
>> current solution means that you are allowed to put anything anywhere
>> in <document>, but in <bundle> you can only put the extensions after
>> <prov:value> but before the documentelements, which is a bit odd.
>>
>> It might be 'cleaner' to only allow extension stuff at the bottom, but
>> that could make it tricky for the bundle as it (now) specializes the
>> prov:Entity type and therefore the additional elements of Bundle come
>> below the <xs:any> from entity.
>>
>
> Yes, originally this worked because we had multiple xs:any in the prov:Bundle (inherited from both prov:Entity and prov:documentElements), but that violated the "unique particle attribution" rule, which caused xjc to fail to generate Java classes from the schema.
>
> We changed the schema to work well with xjc but in doing so introduced the odd restriction you have noted. I am still playing around with it to try to come up with a solution.
>
>>
>>
>>
>>> Also, I think that we put the abstract element after the choice in
>>> documentElements because it caused problems with schema validation, but I
>>> can double check on that and see if it can be included in the choice.
>>
>> I know, those things can get tricky.. it's another problem with XSD
>> and its particle separation.
>>
>>
>> I tried some example of making an extension:
>>
>> <https://dvcs.w3.org/hg/prov/file/0bb02b43e80b/xml/examples>
>>
>> Here in <custom.xsd> I was *NOT* able to use
>> substitutionGroup="prov:abstractElement", because I get:
>>
>> Can't include the substitutionGroup as it causes:
>> "http://www.w3.org/ns/prov#":abstractElement
>> and WC[##other:"http://www.w3.org/ns/prov#"] (or elements from their
>> substitution group) violate "Unique Particle Attribution".
>>
>>
>> Basically this means that the only way to use the
>> substitutionGroup="prov:abstractElement" is to stay within the PROV
>> namespace. This might not be obvious to someone looking at our
>> schema. So I'm having doubts now.
>
> We can try to make this clearer in the Note. The abstractElement is only intended to be used with substitutionGroups that are in the PROV namespace.
>
>>
>>
>> However, the general extension mechanism through xsd:any does work well,
>> and can also validate my non-PROV elements (see <custom-example.xml>), even
>> when I inserted those elements inside <prov:document>.
>>
>>
>> In <with-extensions.xml> I tried reusing some off-the-shelf schemas:
>> XHTML, MathML and DC Terms. This works fine thanks to xs:any as well.
>> I was even able to do nested inclusion reusing prov: elements, ie:
>>
>> <prov:document>
>> <mathml:annotation-xml>
>> <prov:wasAttributedTo>
>> <prov:entity prov:ref="formula"></prov:entity>
>> <prov:agent prov:ref="fred"/>
>> <dcterms:description>blalalla</dcterms:description>
>> <!-- ... -->
>>
>> (Those internal prov: elements should probably in most cases NOT be
>> considered part of the <prov:document> !)
>>
>> Now you can argue whether this would make sense or not, but that is
>> the downside of xsd:any - anything (in non-prov namespaces, in this
>> case) is allowed, not just content that should make sense by
>> declaration of substitution groups. The more xsd:any you have, the less
>> you have a schema and the more you just have lots of fragmented types.
>>
>
> I think we are very limited in what we can say about how non-PROV extensions integrate with PROV.
>
>>
>>
>> However I was unable to reuse namespaces like FOAF, because it does
>> not have an XSD schema. So sadly this is not allowed:
>>
>> <prov:person prov:id="johndoe">
>> <foaf:name>John Doe</foaf:name>
>> </prov:person>
>>
>> I think this is too strict, and I suggest changing the xsd:any of
>> <prov:entity> and friends to processContents="lax" - this would only
>> validate against a schema if it's known.
>
>
>>
>> We could rename prov:abstractElement to prov:internal or something to
>> make it less 'tempting' for external use.
>>
>
> I am ok with this.
>
>>
>>
>>
>> We could in theory get rid of the whole documentElements and use only xs:any:
>>
>>
>> <xs:element name="document" type="prov:Document" />
>> <xs:complexType name="Document">
>>   <xs:choice maxOccurs="unbounded">
>>     <xs:any namespace="##targetNamespace" processContents="strict" />
>>     <xs:any namespace="##other" processContents="lax" />
>>   </xs:choice>
>> </xs:complexType>
>>
>> And then no substitution groups are needed in our PROV extensions; any
>> declared <xs:element> would be allowed.
>
> If I understand this correctly, this would allow PROV attribute elements to be used on the document.
>
>> For consistency I've set
>> processContents="lax" even for content of <prov:document> but we might
>> want to instead say that it should be strict, to encourage
>> PROV-extensions (rather than just providing attributes) to at least
>> declare a schema.
>
> I agree that PROV extensions should declare a schema.
>
>>
>>
>> This would mean you could also insert <prov:value> inside
>> <prov:document> and so we would have to ensure that only "proper"
>> elements are declared as named <xs:element>. I tried changing them to
>> xs:group's and group refs which works fine.
>>
>>
>>
>> The above is quite tricky to get to work inside a <prov:bundle>
>> because all its prov elements are optional, and we get a clash between
>> those and the optional xs:any in the prov namespace.
>>
>> This is a bit odd anyway because <prov:bundle> plays a dual role with
>> both being a way to say an entity which is a bundle, but also just
>> lists its content flatly, and so we can't know if something listed is
>> part of the bundle or an attribute of the bundle - especially for
>> extensions.
>>
>> Saying something is a bundle could also be done as:
>>
>> <prov:entity>
>> <prov:type>prov:Bundle</prov:type>
>> </prov:entity>
>>
>> (I am a bit confused now, as the PROV-XML document says this is how
>> it should be done)
>
> We made a change to the types some time ago which is reflected in the editors' draft.
>
> https://dvcs.w3.org/hg/prov/raw-file/default/xml/prov-xml.html
>
> Since Bundles are specializations of Entity, prov:Bundle extends prov:Entity.
>
>>
>>
>> .. but I know the XML schema has similar 'helpers' for types like
>> prov:Person and prov:Revision so let's assume we keep the
>> <prov:bundle> entity.
>>
>> I then would propose changing the bundle to be:
>>
>> <prov:bundle>
>>   <prov:label>A bundle</prov:label>
>>   <dcterms:description>Still not part of the bundle</dcterms:description>
>>   <prov:provenanceDescriptions>
>>     <!-- the bundle content -->
>>     <prov:activity />
>>     <!-- .. -->
>>   </prov:provenanceDescriptions>
>> </prov:bundle>
>>
>
> I like this.
>
>> (We can argue about the name prov:provenanceDescriptions - I went for
>> something close to PROV-DM)
>>
>>
>> So this works fine:
>>
>> <xs:complexType name="Bundle">
>>   <xs:complexContent>
>>     <xs:extension base="prov:Entity">
>>       <xs:sequence>
>>         <xs:element name="provenanceDescriptions" minOccurs="0">
>>           <xs:complexType>
>>             <xs:choice minOccurs="0" maxOccurs="unbounded">
>>               <xs:any namespace="##targetNamespace" processContents="strict" />
>>               <xs:any namespace="##other" processContents="lax" />
>>             </xs:choice>
>>           </xs:complexType>
>>         </xs:element>
>>       </xs:sequence>
>>     </xs:extension>
>>   </xs:complexContent>
>> </xs:complexType>
>>
>>
>> Now the xsd:any from prov:Entity does not cause any problems, except
>> that they have to be stated BEFORE <prov:provenanceDescriptions>. To
>> change this we would have to do a copy/paste from prov:Entity instead
>> and move the xsd:any down.
>
> I am OK with this.
>
> What does the group think?
>
>>
>>
>>
>> So it's possible, and not that unclean, to get rid of the substitution
>> groups, but it would allow non-PROV garbage (ie. schema elements which
>> were not intended as PROV extensions, like my MathML example above)
>> within <prov:document> and <prov:bundle>.
>>
>> I don't know what the group's thoughts are on extensions we should allow
>> for those, but at least it would be consistent with what PROV-N allows
>> - and then perhaps any PROV-N document could be translatable to
>> PROV-XML even without knowing the extensions.
>>
>
> I am ok with the substitution groups as they are.
>
> If you can present a desirable use case that is disallowed by the current modeling with substitution groups and supported by an alternate modeling, then I will consider it. I don't want to make a late change without an example use case to consider.
>
> --Stephan
>
>>
>> If you wish I can commit my version of the schemas which does the
>> above (but slightly tidied up), either to the tip or a new branch.
>>
>>
>> --
>> Stian Soiland-Reyes, myGrid team
>> School of Computer Science
>> The University of Manchester
>>
>
>
>
Received on Tuesday, 12 February 2013 20:57:42 UTC