- From: Luc Moreau <l.moreau@ecs.soton.ac.uk>
- Date: Tue, 12 Feb 2013 23:01:34 +0000
- To: Stephan Zednik <zednis@rpi.edu>
- CC: public-prov-wg@w3.org
- Message-ID: <EMEW3|8b43ab276fc6888a5ef2aba16212a826p1BN2s08l.moreau|ecs.soton.ac.uk|511AC9CE>
Hi Stephan,

On 12/02/13 22:40, Stephan Zednik wrote:
>
> On Feb 12, 2013, at 3:14 PM, Luc Moreau <l.moreau@ecs.soton.ac.uk> wrote:
>
>> Hi Stephan,
>>
>> Thanks for the explanation on lax. Yes this seems reasonable.
>>
>> In your new proposed schema, the bundleElements element corresponds
>> to the bundle construct in prov-n. The difference is that
>> bundleElements are allowed inside entity, whereas the prov-n bundle
>> construct is only allowed at the toplevel of a document.
>>
>> One strong requirement of part of the WG membership was to avoid
>> nesting of bundles.
>> With this, you have introduced nesting of bundles.
>> An entity containing a bundleElements occurring inside another
>> bundleElements.
>
> Is this requirement in the DM? Is this requirement defined outside of
> the recommendation documents? On the wiki perhaps?
>

First sentence in
http://www.w3.org/TR/2012/CR-prov-n-20121211/#component4 is

    Bundles cannot be nested because a bundle is not an expression,
    and therefore cannot occur inside another bundle.

In prov-dm, given the definition of entity:
http://www.w3.org/TR/2012/CR-prov-dm-20121211/#term-entity
I don't see where provenance descriptions contained in a bundle can
occur.

>>
>> I think it's a significant departure from the dm.
>>
>> Also, personally, I find it useful to be able to return bundles, as a
>> response to a provenance query.
>
> Is there any guarantee that the bundle entity will be a part of the
> returned bundle?
>

What do you mean?

> This is how just a bundle would look as PROV-XML:
>
> <prov:document
>     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>     xmlns:xsd="http://www.w3.org/2001/XMLSchema"
>     xmlns:ex="http://example.com/ns/ex#"
>     xmlns:prov="http://www.w3.org/ns/prov#">
>
>   <prov:bundle prov:id="bundle1">
>     <prov:label>bundle1</prov:label>
>     <ex:label>label-on-bundle-entity</ex:label>
>     <prov:bundleElements>
>       <ex:label>in-bundle-label</ex:label>
>       <prov:entity prov:id="ex:report1">
>         <prov:type xsi:type="xsd:QName">report</prov:type>
>         <ex:version>1</ex:version>
>       </prov:entity>
>       <ex:version>1.0.0</ex:version>
>       <prov:wasGeneratedBy>
>         <prov:entity prov:ref="ex:report1"/>
>         <prov:activity prov:ref="a1"/>
>         <prov:time>2012-05-24T10:00:01</prov:time>
>       </prov:wasGeneratedBy>
>       <ex:content>foo</ex:content>
>     </prov:bundleElements>
>   </prov:bundle>
> </prov:document>

Remember this is just one syntax. It is still possible to write

<prov:entity prov:id="bundle1">
  <prov:type>prov:Bundle</prov:type>
  ...
</prov:entity>

And obviously, all the variants outside prov namespace.

>
> I think it would be easy enough to construct a bundle as a response to
> a provenance query.
>
>> With the proposed schema change, they would now be nested inside an
>> entity. Why this extra level of nesting?
>
> The schema previously had what the DM calls the bundleConstructor as
> an implicit child of a bundle entity so this issue has been present
> with PROV-XML bundle representation for some time.
>

I don't think so. The xml schema was aligned to the prov-n grammar,
with bundle allowed inside document only.

> This is the natural way to model bundles in XML, but it does introduce
> the possibility of nesting bundles. The nesting issue could be
> corrected if we remove prov:bundle from documentElements and add it to
> the sequence in prov:Document. Then bundles would not be nestable,
> but you would also not be able to define a bundle entity inside a bundle.
>
> The current modeling makes a bundle entity outside the scope of the
> bundle container. If this is wrong and we always want the bundle
> entity to be defined within the scope of the bundle entity then we
> should use the modeling you suggest of defining a
> prov:bundleConstructor element which is a member of the prov:Document
> sequence but not the documentElements sequence.
>

I have use cases where the bundle entity is outside the bundle, and
others where it is inside.

> We should probably pick a scope for the bundle entity to provide
> direction. Is the bundle entity inside or outside the
> bundleConstructor (it's probably too late to ask for a rename to
> bundleContainer, correct?)

The schema should not make that decision and should let asserters
decide where they want the bundle entity.

Luc

>
> --Stephan
>
>>
>> So given the above, I am not supportive of the change.
>>
>> Luc
>>
>> On 12/02/13 21:54, Stephan Zednik wrote:
>>>
>>> On Feb 12, 2013, at 2:09 PM, Luc Moreau <l.moreau@ecs.soton.ac.uk>
>>> wrote:
>>>
>>>> Hi Stephan,
>>>>
>>>> Response interleaved.
>>>>
>>>> On 12/02/13 20:57, Stephan Zednik wrote:
>>>>> A summary of the possible changes based on this discussion. I am
>>>>> in favor of all three listed changes.
>>>>>
>>>>> 1) rename prov:abstractElement to prov:internalElement (or
>>>>> similar) to make it clear we do not expect non-PROV extensions to
>>>>> use this element.
>>>>
>>>> It's good.
>>>>> 2) add processContents="lax" on all xs:any elements.
>>>> What was the problem with the current definition, what does this
>>>> allow us to do?
>>>
>>> If a non-PROV namespace does not have a corresponding schema then
>>> the document will fail to validate.
>>>
>>> processContents Optional. Specifies how the XML processor should
>>> handle validation against the elements specified by this any
>>> element.
>>> Can be set to one of the following:
>>>
>>> * strict - the XML processor must obtain the schema for the
>>>   required namespaces and validate the elements (this is default)
>>> * lax - same as strict, but if the schema cannot be obtained, no
>>>   errors will occur
>>> * skip - the XML processor does not attempt to validate any
>>>   elements from the specified namespaces
>>>
>>> This loosens our validation requirements for non-PROV elements.
>>>
>>> Stian's use case example was to use some FOAF elements but
>>> validation failed because he had not specified a FOAF schema.
>>>
>>>>
>>>>> 3) change the definition of prov:Bundle to the following
>>>>> (bundleElements name is not final)
>>>>>
>>>>> <xs:complexType name="Bundle">
>>>>>   <xs:complexContent>
>>>>>     <xs:extension base="prov:Entity">
>>>>>       <xs:sequence>
>>>>>         <xs:element name="bundleElements" minOccurs="0">
>>>>>           <xs:complexType>
>>>>>             <xs:sequence maxOccurs="unbounded">
>>>>>               <xs:group ref="prov:documentElements"/>
>>>>>               <xs:any namespace="##other" processContents="lax"
>>>>>                       minOccurs="0" maxOccurs="unbounded"/>
>>>>>             </xs:sequence>
>>>>>           </xs:complexType>
>>>>>         </xs:element>
>>>>>       </xs:sequence>
>>>>>     </xs:extension>
>>>>>   </xs:complexContent>
>>>>> </xs:complexType>
>>>>
>>>> To me, this does not correspond to prov-dm.
>>>> I regard the bundle construct as distinct from the entity construct.
>>>
>>> Well, a Bundle is an Entity so the Bundle complexType extending the
>>> Entity complexType is good.
>>>
>>> How then to have what the PROV-DM calls the 'bundle constructor'?
>>>
>>> I think of the prov:bundleElements as the bundle constructor and I
>>> believe that it corresponds to PROV-DM.
>>>
>>> An alternative option would be to make a new element
>>> prov:bundleConstructor and put it in the documentElements sequence.
>>> This may be more like PROV-N, but is less like XML.
>>>
>>> The PROV-DM does not specify a serialization or syntax so an
>>> XML-native approach should be ok.
>>> I think having the bundle constructor as an XML element of a Bundle
>>> makes sense in XML.
>>>
>>> --Stephan
>>>
>>>>
>>>>
>>>> Luc
>>>>
>>>>> With the updated Bundle complexType the PROV-XML serialization for
>>>>> a bundle would look like this
>>>>>
>>>>> <prov:document
>>>>>     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>>>>     xmlns:xsd="http://www.w3.org/2001/XMLSchema"
>>>>>     xmlns:ex="http://example.com/ns/ex#"
>>>>>     xmlns:prov="http://www.w3.org/ns/prov#">
>>>>>
>>>>>   <prov:person prov:id="bob"/>
>>>>>
>>>>>   <ex:label>outside-bundle-label</ex:label>
>>>>>
>>>>>   <prov:activity prov:id="a1"/>
>>>>>
>>>>>   <prov:bundle prov:id="bundle1">
>>>>>
>>>>>     <prov:label>bundle1</prov:label>
>>>>>     <ex:label>label-on-bundle-entity</ex:label>
>>>>>
>>>>>     <prov:bundleElements>
>>>>>
>>>>>       <ex:label>in-bundle-label</ex:label>
>>>>>
>>>>>       <prov:entity prov:id="ex:report1">
>>>>>         <prov:type xsi:type="xsd:QName">report</prov:type>
>>>>>         <ex:version>1</ex:version>
>>>>>       </prov:entity>
>>>>>
>>>>>       <ex:version>1.0.0</ex:version>
>>>>>
>>>>>       <prov:wasGeneratedBy>
>>>>>         <prov:entity prov:ref="ex:report1"/>
>>>>>         <prov:activity prov:ref="a1"/>
>>>>>         <prov:time>2012-05-24T10:00:01</prov:time>
>>>>>       </prov:wasGeneratedBy>
>>>>>
>>>>>       <ex:content>foo</ex:content>
>>>>>
>>>>>     </prov:bundleElements>
>>>>>
>>>>>   </prov:bundle>
>>>>>
>>>>> </prov:document>
>>>>>
>>>>> I used elements from the namespace "ex" to show how non-PROV
>>>>> elements can be used within a bundle and as PROV attributes on the
>>>>> bundle entity.
>>>>>
>>>>> --Stephan
>>>>>
>>>>> On Feb 12, 2013, at 12:49 PM, Stephan Zednik <zednis@rpi.edu>
>>>>> wrote:
>>>>>
>>>>>> Comments in-line, last two comments are the most important.
>>>>>>
>>>>>>
>>>>>> On Feb 12, 2013, at 7:29 AM, Stian Soiland-Reyes
>>>>>> <soiland-reyes@cs.manchester.ac.uk> wrote:
>>>>>>
>>>>>>> On Tue, Feb 5, 2013 at 7:29 PM, Stephan Zednik <zednis@rpi.edu>
>>>>>>> wrote:
>>>>>>>> This does not follow the pattern Stian suggested of updating
>>>>>>>> Document so that bundles are required at the bottom of the
>>>>>>>> document.
>>>>>>>>
>>>>>>>> Stian, does this make sense? Do you still prefer the other
>>>>>>>> pattern you suggested in the earlier email?
>>>>>>> Well to me it does not really matter if xs:any can appear
>>>>>>> anywhere in <document> or just at the bottom of the <document> -
>>>>>>> but I think your current solution means that you are allowed to
>>>>>>> put anything anywhere in <document>, but in <bundle> you can
>>>>>>> only put the extensions after <prov:value> but before the
>>>>>>> documentelements, which is a bit odd.
>>>>>>>
>>>>>>> It might be 'cleaner' to only allow extension stuff at the
>>>>>>> bottom, but that could make it tricky for the bundle as it (now)
>>>>>>> specializes the prov:Entity type and therefore the additional
>>>>>>> elements of Bundle come below the <xs:any> from entity.
>>>>>>>
>>>>>> Yes, originally this worked because we had multiple xs:any in the
>>>>>> prov:Bundle (inherited from both prov:Entity and
>>>>>> prov:documentElements) but we violated the "unique particle
>>>>>> attribution" rule which caused xjc to fail to generate java
>>>>>> classes from the schema.
>>>>>>
>>>>>> We changed the schema to work well with xjc but in doing so
>>>>>> introduced the odd restriction you have noted. I am still
>>>>>> playing around with it to try to come up with a solution.
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Also, I think that we put the abstract element after the choice
>>>>>>>> in documentElements because it caused problems with schema
>>>>>>>> validation, but I can double check on that and see if it can be
>>>>>>>> included in the choice.
>>>>>>> I know, those things can get tricky.. it's another problem with
>>>>>>> XSD and its particle separation.
>>>>>>>
>>>>>>>
>>>>>>> I tried some examples of making an extension:
>>>>>>>
>>>>>>> <https://dvcs.w3.org/hg/prov/file/0bb02b43e80b/xml/examples>
>>>>>>>
>>>>>>> Here in <custom.xsd> I was *NOT* able to use
>>>>>>> substitutionGroup="prov:abstractElement", because I get:
>>>>>>>
>>>>>>>     Can't include the substitutionGroup as it causes:
>>>>>>>     "http://www.w3.org/ns/prov#":abstractElement
>>>>>>>     and WC[##other:"http://www.w3.org/ns/prov#"] (or elements
>>>>>>>     from their substitution group) violate "Unique Particle
>>>>>>>     Attribution".
>>>>>>>
>>>>>>>
>>>>>>> Basically this means that the only way to use the
>>>>>>> substitutionGroup="prov:abstractElement" is to stay within the
>>>>>>> PROV namespace. This might not be obvious to someone looking at
>>>>>>> our schema. So I'm having doubts now.
>>>>>> We can try to make this more clear in the Note. The
>>>>>> abstractElement is only intended to be used with
>>>>>> substitutionGroups that are in the PROV Namespace.
>>>>>>
>>>>>>>
>>>>>>> However, the general extension mechanism through xsd:any does
>>>>>>> work well, and can validate also my non-prov elements
>>>>>>> - <custom-example.xml>, even when I inserted those elements
>>>>>>> inside <prov:document>.
>>>>>>>
>>>>>>>
>>>>>>> In <with-extensions.xml> I tried reusing some schemas off the
>>>>>>> shelf, XHTML, MathML and DC Terms. This works fine thanks to
>>>>>>> xs:any as well.
>>>>>>> I was even able to do nested inclusion reusing prov: elements, ie:
>>>>>>>
>>>>>>> <prov:document>
>>>>>>>   <mathml:annotation-xml>
>>>>>>>     <prov:wasAttributedTo>
>>>>>>>       <prov:entity prov:ref="formula"></prov:entity>
>>>>>>>       <prov:agent prov:ref="fred"/>
>>>>>>>       <dcterms:description>blalalla</dcterms:description>
>>>>>>>       <!-- ... -->
>>>>>>>
>>>>>>> (Those internal prov: elements should probably in most cases NOT
>>>>>>> be considered part of the <prov:document> !)
>>>>>>>
>>>>>>> Now you can argue whether this would make sense or not, but that
>>>>>>> is the downside of xsd:any - anything (in non-prov namespaces, in
>>>>>>> this case) is allowed, not just content that should make sense by
>>>>>>> declaration of substitution groups. The more xsd:any - the less
>>>>>>> you have a schema and more you just have lots of fragmented
>>>>>>> types.
>>>>>>>
>>>>>> I think we are very limited in what we can say about how non-PROV
>>>>>> extensions integrate with PROV.
>>>>>>
>>>>>>>
>>>>>>> However I was unable to reuse namespaces like FOAF, because it
>>>>>>> does not have an XSD schema. So sadly this is not allowed:
>>>>>>>
>>>>>>> <prov:person prov:id="johndoe">
>>>>>>>   <foaf:name>John Doe</foaf:name>
>>>>>>> </prov:person>
>>>>>>>
>>>>>>> I think this is too strict, and I suggest changing the xsd:any of
>>>>>>> <prov:entity> and friends to processContents="lax" - this would
>>>>>>> only validate against a schema if it's known.
>>>>>>
>>>>>>> We could rename prov:abstractElement to prov:internal or
>>>>>>> something to make it less 'tempting' for external use.
>>>>>>>
>>>>>> I am ok with this.
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> We could in theory get rid of the whole documentElements and use
>>>>>>> only xs:any:
>>>>>>>
>>>>>>>
>>>>>>> <xs:element name="document" type="prov:Document" />
>>>>>>> <xs:complexType name="Document">
>>>>>>>   <xs:choice maxOccurs="unbounded">
>>>>>>>     <xs:any namespace="##targetNamespace" processContents="strict" />
>>>>>>>     <xs:any namespace="##other" processContents="lax" />
>>>>>>>   </xs:choice>
>>>>>>> </xs:complexType>
>>>>>>>
>>>>>>> And then no substitution groups are needed in our PROV
>>>>>>> extensions, any declared <xs:element> would be allowed.
>>>>>> If I understand this correctly, this would allow PROV attribute
>>>>>> elements to be used on the document.
>>>>>>
>>>>>>> For consistency I've set processContents=lax even for content of
>>>>>>> <prov:document> but we might want to instead say that it should
>>>>>>> be strict, to encourage PROV-extensions (rather than just
>>>>>>> providing attributes) to at least declare a schema.
>>>>>> I agree that PROV extensions should declare a schema.
>>>>>>
>>>>>>>
>>>>>>> This would mean you could also insert <prov:value> inside
>>>>>>> <prov:document> and so we would have to ensure that only "proper"
>>>>>>> elements are declared as named <xs:element>. I tried changing
>>>>>>> them to xs:group's and group refs which works fine.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> The above is quite tricky to get to work inside a <prov:bundle>
>>>>>>> because all its prov elements are optional, and we get a clash
>>>>>>> between those and the optional xs:any in the prov namespace.
>>>>>>>
>>>>>>> This is a bit odd anyway because <prov:bundle> plays a dual role:
>>>>>>> it is both a way to say an entity is a bundle, and it also just
>>>>>>> lists its content flatly, and so we can't know if something
>>>>>>> listed is part of the bundle or an attribute of the bundle -
>>>>>>> especially for extensions.
>>>>>>>
>>>>>>> Saying something is a bundle could also be done as:
>>>>>>>
>>>>>>> <prov:entity>
>>>>>>>   <prov:type>prov:Bundle</prov:type>
>>>>>>> </prov:entity>
>>>>>>>
>>>>>>> (I am a bit confused now, as the PROV-XML document says this is
>>>>>>> how it should be done)
>>>>>> We made a change to the types some time ago which is reflected in
>>>>>> the editors' draft.
>>>>>>
>>>>>> https://dvcs.w3.org/hg/prov/raw-file/default/xml/prov-xml.html
>>>>>>
>>>>>> Since Bundles are specializations of Entity, prov:Bundle extends
>>>>>> prov:Entity.
>>>>>>
>>>>>>>
>>>>>>> .. but I know the XML schema has similar 'helpers' for types like
>>>>>>> prov:Person and prov:Revision so let's assume we keep the
>>>>>>> <prov:bundle> entity.
>>>>>>>
>>>>>>> I then would propose changing the bundle to be:
>>>>>>>
>>>>>>> <prov:bundle>
>>>>>>>   <prov:label>A bundle</prov:label>
>>>>>>>   <dcterms:description>Still not part of the
>>>>>>>     bundle</dcterms:description>
>>>>>>>   <prov:provenanceDescriptions>
>>>>>>>     <!-- the bundle content -->
>>>>>>>     <prov:activity />
>>>>>>>     <!-- .. -->
>>>>>>>   </prov:provenanceDescriptions>
>>>>>>> </prov:bundle>
>>>>>>>
>>>>>> I like this.
>>>>>>
>>>>>>> (We can argue about the name prov:provenanceDescriptions - I
>>>>>>> went for something close to PROV-DM)
>>>>>>>
>>>>>>>
>>>>>>> So this works fine:
>>>>>>>
>>>>>>> <xs:complexType name="Bundle">
>>>>>>>   <xs:complexContent>
>>>>>>>     <xs:extension base="prov:Entity">
>>>>>>>       <xs:sequence>
>>>>>>>         <xs:element name="provenanceDescriptions" minOccurs="0">
>>>>>>>           <xs:complexType>
>>>>>>>             <xs:choice minOccurs="0" maxOccurs="unbounded">
>>>>>>>               <xs:any namespace="##targetNamespace" processContents="strict" />
>>>>>>>               <xs:any namespace="##other" processContents="lax" />
>>>>>>>             </xs:choice>
>>>>>>>           </xs:complexType>
>>>>>>>         </xs:element>
>>>>>>>       </xs:sequence>
>>>>>>>     </xs:extension>
>>>>>>>   </xs:complexContent>
>>>>>>> </xs:complexType>
>>>>>>>
>>>>>>>
>>>>>>> Now the xsd:any from prov:Entity does not cause any problems,
>>>>>>> except that they have to be stated BEFORE
>>>>>>> <prov:provenanceDescriptions>. To change this we would have to do
>>>>>>> a copy/paste from prov:Entity instead and move the xsd:any down.
>>>>>> I am OK with this.
>>>>>>
>>>>>> What does the group think?
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> So it's possible, and not that unclean, to get rid of the
>>>>>>> substitution groups, but it would allow non-PROV garbage (ie.
>>>>>>> schema elements which were not intended as PROV extensions, like
>>>>>>> my MathML example above) within <prov:document> and
>>>>>>> <prov:bundle>.
>>>>>>>
>>>>>>> I don't know what the group's thoughts are on extensions we
>>>>>>> should allow for those, but at least it would be consistent with
>>>>>>> what PROV-N allows - and then perhaps any PROV-N document could
>>>>>>> be translatable to PROV-XML even without knowing the extensions.
>>>>>>>
>>>>>> I am ok with the substitution groups as they are.
>>>>>>
>>>>>> If you can present a desirable use case that is disallowed by the
>>>>>> current modeling with substitution groups and supported by an
>>>>>> alternate modeling then I will consider it. I don't want to make
>>>>>> a late change without an example use case to consider.
>>>>>>
>>>>>> --Stephan
>>>>>>
>>>>>>> If you wish I can commit my version of the schemas which does the
>>>>>>> above (but slightly tidied up), either to the tip or a new branch.
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Stian Soiland-Reyes, myGrid team
>>>>>>> School of Computer Science
>>>>>>> The University of Manchester
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>> --
>>>> Professor Luc Moreau
>>>> Electronics and Computer Science   tel:   +44 23 8059 4487
>>>> University of Southampton          fax:   +44 23 8059 2865
>>>> Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
>>>> United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
>>>>
>>>>
>>>>
>>>
>>
>> --
>> Professor Luc Moreau
>> Electronics and Computer Science   tel:   +44 23 8059 4487
>> University of Southampton          fax:   +44 23 8059 2865
>> Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
>> United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
>>
>

--
Professor Luc Moreau
Electronics and Computer Science   tel:   +44 23 8059 4487
University of Southampton          fax:   +44 23 8059 2865
Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
Received on Tuesday, 12 February 2013 23:03:24 UTC
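
The no-nesting rule quoted from PROV-N in this message ("Bundles cannot be nested ...") can be illustrated with a small check over a PROV-XML instance. What follows is only a sketch under stated assumptions: it uses Python's standard xml.etree.ElementTree, the element names prov:bundle and prov:bundleElements are taken from the proposal discussed in the thread, and the function name nested_bundle_ids and the two sample documents are hypothetical, introduced purely for illustration. It looks only at prov:bundle elements, so a bundle written as a prov:entity with prov:type prov:Bundle would go unnoticed.

```python
import xml.etree.ElementTree as ET

PROV_NS = "http://www.w3.org/ns/prov#"
BUNDLE_TAG = f"{{{PROV_NS}}}bundle"   # ElementTree spells prov:bundle this way
ID_ATTR = f"{{{PROV_NS}}}id"          # and the prov:id attribute this way


def nested_bundle_ids(xml_text):
    """Return the prov:id of every prov:bundle found inside another prov:bundle."""
    root = ET.fromstring(xml_text)
    nested = []

    def walk(element, inside_bundle):
        is_bundle = element.tag == BUNDLE_TAG
        if is_bundle and inside_bundle:
            nested.append(element.get(ID_ATTR))
        # Everything below a bundle counts as "inside" it, however deep.
        for child in element:
            walk(child, inside_bundle or is_bundle)

    walk(root, False)
    return nested


# Hypothetical sample: one bundle at the top level of the document.
FLAT = """<prov:document xmlns:prov="http://www.w3.org/ns/prov#">
  <prov:bundle prov:id="bundle1">
    <prov:bundleElements>
      <prov:entity prov:id="ex:report1"/>
    </prov:bundleElements>
  </prov:bundle>
</prov:document>"""

# Hypothetical sample: a second bundle placed inside bundleElements,
# which is the nesting the proposed schema would have permitted.
NESTED = """<prov:document xmlns:prov="http://www.w3.org/ns/prov#">
  <prov:bundle prov:id="bundle1">
    <prov:bundleElements>
      <prov:bundle prov:id="bundle2">
        <prov:bundleElements/>
      </prov:bundle>
    </prov:bundleElements>
  </prov:bundle>
</prov:document>"""

if __name__ == "__main__":
    print(nested_bundle_ids(FLAT))
    print(nested_bundle_ids(NESTED))
```

A check like this sits outside the schema, which fits the position taken above: the schema itself need not decide where the bundle entity lives, while a separate pass can still flag bundles that end up nested.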