- From: Stephan Zednik <zednis@rpi.edu>
- Date: Tue, 12 Feb 2013 15:40:48 -0700
- To: Luc Moreau <l.moreau@ecs.soton.ac.uk>
- Cc: public-prov-wg@w3.org
- Message-Id: <9F0A0303-690E-4461-B11A-073299ADCC13@rpi.edu>
On Feb 12, 2013, at 3:14 PM, Luc Moreau <l.moreau@ecs.soton.ac.uk> wrote:

> Hi Stephan,
>
> Thanks for the explanation on lax. Yes this seems reasonable.
>
> In your new proposed schema, the bundleElements element corresponds to the bundle construct
> in prov-n. The difference is that bundleElements are allowed inside entity, whereas the prov-n
> bundle construct is only allowed at the toplevel of a document.
>
> One strong requirement of part of the WG membership was to avoid nesting of bundles.
> With this, you have introduced nesting of bundles.
> An entity containing a bundleElements occurring inside another bundleElements.

Is this requirement in the DM? Is this requirement defined outside of the recommendation documents? On the wiki perhaps?

>
> I think it's a significant departure from the dm.
>
> Also, personally, I find it useful to be able to return bundles, as a response to a provenance query.

Is there any guarantee that the bundle entity will be a part of the returned bundle?

This is how just a bundle would look as PROV-XML:

<prov:document
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:ex="http://example.com/ns/ex#"
    xmlns:prov="http://www.w3.org/ns/prov#">

  <prov:bundle prov:id="bundle1">

    <prov:label>bundle1</prov:label>
    <ex:label>label-on-bundle-entity</ex:label>

    <prov:bundleElements>

      <ex:label>in-bundle-label</ex:label>

      <prov:entity prov:id="ex:report1">
        <prov:type xsi:type="xsd:QName">report</prov:type>
        <ex:version>1</ex:version>
      </prov:entity>

      <ex:version>1.0.0</ex:version>

      <prov:wasGeneratedBy>
        <prov:entity prov:ref="ex:report1"/>
        <prov:activity prov:ref="a1"/>
        <prov:time>2012-05-24T10:00:01</prov:time>
      </prov:wasGeneratedBy>

      <ex:content>foo</ex:content>

    </prov:bundleElements>

  </prov:bundle>

</prov:document>

I think it would be easy enough to construct a bundle as a response to a provenance query.

> With the proposed schema change, they would now be nested inside an entity. Why this extra level of
> nesting?

The schema previously had what the DM calls the bundleConstructor as an implicit child of a bundle entity, so this issue has been present with the PROV-XML bundle representation for some time. This is the natural way to model bundles in XML, but it does introduce the possibility of nesting bundles.

The nesting issue could be corrected if we remove prov:bundle from documentElements and add it to the sequence in prov:Document. Then bundles would not be nestable, but you would also not be able to define a bundle entity inside a bundle.

The current modeling places the bundle entity outside the scope of the bundle container. If this is wrong and we always want the bundle entity to be defined within the scope of the bundle entity then we should use the modeling you suggest of defining a prov:bundleConstructor element which is a member of the prov:Document sequence but not the documentElements sequence.

We should probably pick a scope for the bundle entity to provide direction. Is the bundle entity inside or outside the bundleConstructor? (It's probably too late to ask for a rename to bundleContainer, correct?)

--Stephan

>
> So given the above, I am not supportive of the change.
>
> Luc
>
> On 12/02/13 21:54, Stephan Zednik wrote:
>>
>> On Feb 12, 2013, at 2:09 PM, Luc Moreau <l.moreau@ecs.soton.ac.uk> wrote:
>>
>>> Hi Stephan,
>>>
>>> Response interleaved.
>>>
>>> On 12/02/13 20:57, Stephan Zednik wrote:
>>>> A summary of the possible changes based on this discussion. I am in favor of all three listed changes.
>>>>
>>>> 1) rename prov:abstractElement to prov:internalElement (or similar) to make it clear we do not expect non-PROV extensions to use this element.
>>>
>>> It's good.
>>>> 2) add processContents="lax" on all xs:any elements.
>>> What was the problem with the current definition, what does this allow us to do?
>>
>> If a non-PROV namespace does not have a corresponding schema then the document will fail to validate.
>>
>> processContents  Optional. Specifies how the XML processor should handle validation against the elements specified by this any element. Can be set to one of the following:
>>   strict - the XML processor must obtain the schema for the required namespaces and validate the elements (this is default)
>>   lax - same as strict but; if the schema cannot be obtained, no errors will occur
>>   skip - The XML processor does not attempt to validate any elements from the specified namespaces
>>
>>
>> This loosens our validation requirements for non-PROV elements.
>>
>> Stian's use case example was to use some FOAF elements but validation failed because he had not specified a FOAF schema.
>>
>>>
>>>> 3) change the definition of prov:Bundle to the following (bundleElements name is not final)
>>>>
>>>> <xs:complexType name="Bundle">
>>>>   <xs:complexContent>
>>>>     <xs:extension base="prov:Entity">
>>>>       <xs:sequence>
>>>>         <xs:element name="bundleElements" minOccurs="0">
>>>>           <xs:complexType>
>>>>             <xs:sequence maxOccurs="unbounded">
>>>>               <xs:group ref="prov:documentElements"/>
>>>>               <xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
>>>>             </xs:sequence>
>>>>           </xs:complexType>
>>>>         </xs:element>
>>>>       </xs:sequence>
>>>>     </xs:extension>
>>>>   </xs:complexContent>
>>>> </xs:complexType>
>>>
>>> To me, this does not correspond to prov-dm.
>>> I regard the bundle construct as distinct from the entity construct.
>>
>> Well, a Bundle is an Entity so the Bundle complexType extending the Entity complexType is good.
>>
>> How then to have what the PROV-DM calls the 'bundle constructor'?
>>
>> I think of the prov:bundleElements as the bundle constructor and I believe that it corresponds to PROV-DM.
>>
>> An alternative option would be to make a new element prov:bundleConstructor and put it in the documentElements sequence. This may be more like PROV-N, but is less like XML.
>>
>> The PROV-DM does not specify a serialization or syntax so a XML-native approach should be ok. I think having the bundle constructor as an XML element of a Bundle makes sense in XML.
>>
>> --Stephan
>>
>>>
>>>
>>> Luc
>>>
>>>> With the updated Bundle complexType the PROV-XML serialization for a bundle would look like this
>>>>
>>>> <prov:document
>>>>     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>>>     xmlns:xsd="http://www.w3.org/2001/XMLSchema"
>>>>     xmlns:ex="http://example.com/ns/ex#"
>>>>     xmlns:prov="http://www.w3.org/ns/prov#">
>>>>
>>>>   <prov:person prov:id="bob"/>
>>>>
>>>>   <ex:label>outside-bundle-label</ex:label>
>>>>
>>>>   <prov:activity prov:id="a1"/>
>>>>
>>>>   <prov:bundle prov:id="bundle1">
>>>>
>>>>     <prov:label>bundle1</prov:label>
>>>>     <ex:label>label-on-bundle-entity</ex:label>
>>>>
>>>>     <prov:bundleElements>
>>>>
>>>>       <ex:label>in-bundle-label</ex:label>
>>>>
>>>>       <prov:entity prov:id="ex:report1">
>>>>         <prov:type xsi:type="xsd:QName">report</prov:type>
>>>>         <ex:version>1</ex:version>
>>>>       </prov:entity>
>>>>
>>>>       <ex:version>1.0.0</ex:version>
>>>>
>>>>       <prov:wasGeneratedBy>
>>>>         <prov:entity prov:ref="ex:report1"/>
>>>>         <prov:activity prov:ref="a1"/>
>>>>         <prov:time>2012-05-24T10:00:01</prov:time>
>>>>       </prov:wasGeneratedBy>
>>>>
>>>>       <ex:content>foo</ex:content>
>>>>
>>>>     </prov:bundleElements>
>>>>
>>>>   </prov:bundle>
>>>>
>>>> </prov:document>
>>>>
>>>> I used elements from the namespace "ex" to show how non-PROV elements can be used within a bundle and as PROV attributes on the bundle entity.
>>>>
>>>> --Stephan
>>>>
>>>> On Feb 12, 2013, at 12:49 PM, Stephan Zednik <zednis@rpi.edu> wrote:
>>>>
>>>>> Comments in-line, last two comments are the most important.
>>>>>
>>>>>
>>>>> On Feb 12, 2013, at 7:29 AM, Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk> wrote:
>>>>>
>>>>>> On Tue, Feb 5, 2013 at 7:29 PM, Stephan Zednik <zednis@rpi.edu> wrote:
>>>>>>> This does not follow the pattern Stian suggested of updating Document so
>>>>>>> that bundles are required at the bottom of the document.
>>>>>>>
>>>>>>> Stian, does this make sense? Do you still prefer the other pattern you
>>>>>>> suggested in the earlier email?
>>>>>> Well to me it does not really matter if xs:any can appear anywhere in
>>>>>> <document> or just at the bottom of the <document> - but I think your
>>>>>> current solution means that you are allowed to put anything anywhere
>>>>>> in <document>, but in <bundle> you can only put the extensions after
>>>>>> <prov:value> but before the documentelements, which is a bit odd.
>>>>>>
>>>>>> It might be 'cleaner' to only allow extension stuff at the bottom, but
>>>>>> that could make it tricky for the bundle as it (now) specializes the
>>>>>> prov:Entity type and therefore the additional elements of Bundle come
>>>>>> below the <xs:any> from entity.
>>>>>>
>>>>> Yes, originally this worked because we had multiple xs:any in the prov:Bundle (inherited from both prov:Entity and prov:documentElements) but we violated the "unique particle attribution" rule which caused xjc to fail to generate java classes from the schema.
>>>>>
>>>>> We changed the schema to work well with xjc but in doing so introduced the odd restriction you have noted. I am still playing around with it to try to come up with a solution.
>>>>>
>>>>>>
>>>>>>
>>>>>>> Also, I think that we put the abstract element after the choice in document
>>>>>>> Elements because it caused problems with schema validation, but I can double
>>>>>>> check on that and see if it can be included in the choice.
>>>>>> I know, those things can get tricky.. it's another problem with XSD
>>>>>> and its particle separation.
>>>>>>
>>>>>>
>>>>>> I tried some example of making an extension:
>>>>>>
>>>>>> <https://dvcs.w3.org/hg/prov/file/0bb02b43e80b/xml/examples>
>>>>>>
>>>>>> Here in <custom.xsd> I was *NOT* able to use
>>>>>> substitutionGroup="prov:abstractElement", because I get:
>>>>>>
>>>>>>   Can't include the substitutionGroup as it causes:
>>>>>>   "http://www.w3.org/ns/prov#":abstractElement
>>>>>>   and WC[##other:"http://www.w3.org/ns/prov#"] (or elements from their
>>>>>>   substitution
>>>>>>   group) violate "Unique Particle Attribution".
>>>>>>
>>>>>>
>>>>>> Basically this means that the only way to use the
>>>>>> substitutionGroup="prov:abstractElement" is to stay within the PROV
>>>>>> namespace. This might not be obvious to someone looking at our
>>>>>> schema. So I'm having doubts now.
>>>>> We can try to make this more clear in the Note. The abstractElement is only intended to be used with substitutionGroups that are in the PROV Namespace.
>>>>>
>>>>>>
>>>>>> However, the general extension mechanism through xsd:any does work well,
>>>>>> and can validate also my non-prov elements -<custom-example.xml>, even
>>>>>> when I inserted those elements inside <prov:document>.
>>>>>>
>>>>>>
>>>>>> In <with-extensions.xml> I tried reusing some schemas off the shelf,
>>>>>> XHTML, MathML and DC Terms. This works fine thanks to xs:any as well.
>>>>>> I was even able to do nested inclusion reusing prov: elements, ie:
>>>>>>
>>>>>> <prov:document>
>>>>>>   <mathml:annotation-xml>
>>>>>>     <prov:wasAttributedTo>
>>>>>>       <prov:entity prov:ref="formula"></prov:entity>
>>>>>>       <prov:agent prov:ref="fred"/>
>>>>>>       <dcterms:description>blalalla</dcterms:description>
>>>>>>       <!-- ... -->
>>>>>>
>>>>>> (Those internal prov: elements should probably in most cases NOT be
>>>>>> considered part of the <prov:document> !)
>>>>>>
>>>>>> Now you can argue whether this would make sense or not, but that is
>>>>>> the downside of xsd:any - anything (in non-prov namespaces, in this
>>>>>> case) is allowed, not just content that should make sense by
>>>>>> declaration of substitution groups. The more xsd:any - the less you
>>>>>> have a schema and more you just have lots of fragmented types.
>>>>>>
>>>>> I think we are very limited in what we can say about how non-PROV extensions integrate with PROV.
>>>>>
>>>>>>
>>>>>> However I was unable to reuse namespaces like FOAF, because it does
>>>>>> not have an XSD schema. So sadly this is not allowed:
>>>>>>
>>>>>> <prov:person prov:id="johndoe">
>>>>>>   <foaf:name>John Doe</foaf:name>
>>>>>> </prov:person>
>>>>>>
>>>>>> I think this is too strict, and I suggest changing the xsd:any of
>>>>>> <prov:entity> and friends to processContents="lax" - this would only
>>>>>> validate against a schema if it's known.
>>>>>
>>>>>> We could rename prov:abstractElement to prov:internal or something to
>>>>>> make it less 'tempting' for external use.
>>>>>>
>>>>> I am ok with this.
>>>>>
>>>>>>
>>>>>>
>>>>>> We could in theory get rid of the whole documentElements and use only xs:any:
>>>>>>
>>>>>>
>>>>>> <xs:element name="document" type="prov:Document" />
>>>>>> <xs:complexType name="Document">
>>>>>>   <xs:choice maxOccurs="unbounded">
>>>>>>     <xs:any namespace="##targetNamespace" processContents="strict" />
>>>>>>     <xs:any namespace="##other" processContents="lax" />
>>>>>>   </xs:choice>
>>>>>> </xs:complexType>
>>>>>>
>>>>>> And then no substitution groups are needed in our PROV extensions, any
>>>>>> declared <xs:element> would be allowed.
>>>>> If I understand this correctly, this would allow PROV attribute elements to be used on the document.
>>>>>
>>>>>> For consistency I've set
>>>>>> processContents=lax even for content of <prov:document> but we might
>>>>>> want to instead say that it should be strict, to encourage
>>>>>> PROV-extensions (rather than just providing attributes) to at least
>>>>>> declare a schema.
>>>>> I agree that PROV extensions should declare a schema.
>>>>>
>>>>>>
>>>>>> This would mean you could also insert <prov:value> inside
>>>>>> <prov:document> and so we would have to ensure that only "proper"
>>>>>> elements are declared as named <xs:element>. I tried changing them to
>>>>>> xs:group's and group refs which works fine.
>>>>>>
>>>>>>
>>>>>>
>>>>>> The above is quite tricky to get to work inside a <prov:bundle>
>>>>>> because all its prov elements are optional, and we get a clash between
>>>>>> those and the optional xs:any in the prov namespace.
>>>>>>
>>>>>> This is a bit odd anyway because <prov:bundle> plays a dual role with
>>>>>> both being a way to say an entity which is a bundle, but also just
>>>>>> lists its content flatly, and so we can't know if something listed is
>>>>>> part of the bundle or an attribute of the bundle - especially for
>>>>>> extensions.
>>>>>>
>>>>>> Saying something is a bundle could also be done as:
>>>>>>
>>>>>> <prov:entity>
>>>>>>   <prov:type>prov:Bundle</prov:type>
>>>>>> </prov:entity>
>>>>>>
>>>>>> (I am a bit confused now, as the PROV-XML document says this is how
>>>>>> it should be done)
>>>>> We made a change to the types some time ago which is reflected in the editors' draft.
>>>>>
>>>>> https://dvcs.w3.org/hg/prov/raw-file/default/xml/prov-xml.html
>>>>>
>>>>> Since Bundles are specializations of Entity, prov:Bundle extends prov:Entity.
>>>>>
>>>>>>
>>>>>> .. but I know the XML schema has similar 'helpers' for types like
>>>>>> prov:Person and prov:Revision so let's assume we keep the
>>>>>> <prov:bundle> entity.
>>>>>>
>>>>>> I then would propose changing the bundle to be:
>>>>>>
>>>>>> <prov:bundle>
>>>>>>   <prov:label>A bundle</prov:label>
>>>>>>   <dcterms:description>Still not part of the bundle</dcterms:description>
>>>>>>   <prov:provenanceDescriptions>
>>>>>>     <!-- the bundle content -->
>>>>>>     <prov:activity />
>>>>>>     <!-- .. -->
>>>>>>   </prov:provenanceDescriptions>
>>>>>> </prov:bundle>
>>>>>>
>>>>> I like this.
>>>>>
>>>>>> (We can argue about the name prov:provenanceDescriptions - I went for
>>>>>> something close to PROV-DM)
>>>>>>
>>>>>>
>>>>>> So this works fine:
>>>>>>
>>>>>> <xs:complexType name="Bundle">
>>>>>>   <xs:complexContent>
>>>>>>     <xs:extension base="prov:Entity">
>>>>>>       <xs:sequence>
>>>>>>         <xs:element name="provenanceDescriptions" minOccurs="0">
>>>>>>           <xs:complexType>
>>>>>>             <xs:choice minOccurs="0" maxOccurs="unbounded">
>>>>>>               <xs:any namespace="##targetNamespace" processContents="strict" />
>>>>>>               <xs:any namespace="##other" processContents="lax" />
>>>>>>             </xs:choice>
>>>>>>           </xs:complexType>
>>>>>>         </xs:element>
>>>>>>       </xs:sequence>
>>>>>>     </xs:extension>
>>>>>>   </xs:complexContent>
>>>>>> </xs:complexType>
>>>>>>
>>>>>>
>>>>>> Now the xsd:any from prov:Entity does not cause any problems, except
>>>>>> that they have to be stated BEFORE <prov:provenanceDescriptions>. To
>>>>>> change this we would have to do a copy/paste from prov:Entity instead
>>>>>> and move the xsd:any down.
>>>>> I am OK with this.
>>>>>
>>>>> What does the group think?
>>>>>
>>>>>>
>>>>>>
>>>>>> So it's possible, and not that unclean, to get rid of the substitution
>>>>>> groups, but it would allow non-PROV garbage (ie. schema elements which
>>>>>> were not intended as PROV extensions, like my MathML example above)
>>>>>> within <prov:document> and <prov:bundle>.
>>>>>>
>>>>>> I don't know what the group's thoughts are on extensions we should allow
>>>>>> for those, but at least it would be consistent with what PROV-N allows
>>>>>> - and then perhaps any PROV-N document could be translatable to
>>>>>> PROV-XML even without knowing the extensions.
>>>>>>
>>>>> I am ok with the substitution groups as they are.
>>>>>
>>>>> If you can present a desirable use case that is disallowed by the current modeling with substitution groups and supported by an alternate modeling then I will consider it. I don't want to make a late change without an example use case to consider.
>>>>>
>>>>> --Stephan
>>>>>
>>>>>> If you wish I can commit my version of the schemas which does the
>>>>>> above (but slightly tidied up), either to the tip or a new branch.
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Stian Soiland-Reyes, myGrid team
>>>>>> School of Computer Science
>>>>>> The University of Manchester
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>> --
>>> Professor Luc Moreau
>>> Electronics and Computer Science   tel:   +44 23 8059 4487
>>> University of Southampton          fax:   +44 23 8059 2865
>>> Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
>>> United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
>>>
>>>
>>>
>>
>
> --
> Professor Luc Moreau
> Electronics and Computer Science   tel:   +44 23 8059 4487
> University of Southampton          fax:   +44 23 8059 2865
> Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
> United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
>
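
The nesting concern discussed in this thread (a bundle appearing inside another bundle's bundleElements) can also be checked outside the schema. What follows is only a minimal sketch under stated assumptions: it assumes the bundleElements modeling proposed above (a name the thread says is not final), it uses just the Python standard library, and the function name nested_bundle_ids and the sample document DOC are hypothetical illustrations, not part of PROV-XML or of any existing tool.

```python
import xml.etree.ElementTree as ET

PROV = "http://www.w3.org/ns/prov#"
BUNDLE = f"{{{PROV}}}bundle"
BUNDLE_ELEMENTS = f"{{{PROV}}}bundleElements"  # assumed name, not final
PROV_ID = f"{{{PROV}}}id"

# Hypothetical sample: bundle2 is nested inside bundle1's bundleElements.
DOC = """<prov:document
    xmlns:ex="http://example.com/ns/ex#"
    xmlns:prov="http://www.w3.org/ns/prov#">
  <prov:bundle prov:id="bundle1">
    <prov:label>bundle1</prov:label>
    <prov:bundleElements>
      <prov:entity prov:id="ex:report1"/>
      <prov:bundle prov:id="bundle2">
        <prov:bundleElements>
          <prov:entity prov:id="ex:report2"/>
        </prov:bundleElements>
      </prov:bundle>
    </prov:bundleElements>
  </prov:bundle>
</prov:document>"""


def nested_bundle_ids(xml_text):
    """Return the prov:id of every prov:bundle found inside a bundleElements."""
    root = ET.fromstring(xml_text)
    found = []

    def walk(elem, inside_container):
        # A bundle is "nested" if any ancestor is a bundleElements container.
        if elem.tag == BUNDLE and inside_container:
            found.append(elem.get(PROV_ID))
        inside_container = inside_container or elem.tag == BUNDLE_ELEMENTS
        for child in elem:
            walk(child, inside_container)

    walk(root, False)
    return found


print(nested_bundle_ids(DOC))  # ['bundle2']
```

A check like this only reports nesting; whether nesting should be ruled out by the schema itself (for example by moving prov:bundle out of documentElements, as suggested above) remains the open modeling question.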
Received on Tuesday, 12 February 2013 22:41:18 UTC