- From: Luc Moreau <l.moreau@ecs.soton.ac.uk>
- Date: Tue, 12 Feb 2013 22:14:55 +0000
- To: Stephan Zednik <zednis@rpi.edu>
- CC: public-prov-wg@w3.org
- Message-ID: <EMEW3|888d5531cbf243a35049d0958439b2abp1BMF508l.moreau|ecs.soton.ac.uk|511ABEDF>
Hi Stephan,

Thanks for the explanation on lax. Yes, this seems reasonable.

In your newly proposed schema, the bundleElements element corresponds to
the bundle construct in prov-n. The difference is that bundleElements is
allowed inside an entity, whereas the prov-n bundle construct is only
allowed at the top level of a document.

One strong requirement of part of the WG membership was to avoid nesting
of bundles. With this, you have introduced nesting of bundles: an entity
containing a bundleElements can occur inside another bundleElements.
I think it's a significant departure from the dm.

Also, personally, I find it useful to be able to return bundles as a
response to a provenance query. With the proposed schema change, they
would now be nested inside an entity. Why this extra level of nesting?

So given the above, I am not supportive of the change.

Luc

On 12/02/13 21:54, Stephan Zednik wrote:
>
> On Feb 12, 2013, at 2:09 PM, Luc Moreau <l.moreau@ecs.soton.ac.uk> wrote:
>
>> Hi Stephan,
>>
>> Response interleaved.
>>
>> On 12/02/13 20:57, Stephan Zednik wrote:
>>> A summary of the possible changes based on this discussion. I am in
>>> favor of all three listed changes.
>>>
>>> 1) rename prov:abstractElement to prov:internalElement (or similar)
>>> to make it clear we do not expect non-PROV extensions to use this
>>> element.
>>
>> It's good.
>>> 2) add processContents="lax" on all xs:any elements.
>> What was the problem with the current definition, what does this
>> allow us to do?
>
> If a non-PROV namespace does not have a corresponding schema then the
> document will fail to validate.
>
> processContents Optional. Specifies how the XML processor should
> handle validation against the elements specified by this any element.
> Can be set to one of the following:
>
>   * strict - the XML processor must obtain the schema for the required
>     namespaces and validate the elements (this is default)
>   * lax - same as strict, but if the schema cannot be obtained, no
>     errors will occur
>   * skip - the XML processor does not attempt to validate any elements
>     from the specified namespaces
>
> This loosens our validation requirements for non-PROV elements.
>
> Stian's use case example was to use some FOAF elements but validation
> failed because he had not specified a FOAF schema.
>
>>
>>> 3) change the definition of prov:Bundle to the following
>>> (bundleElements name is not final)
>>>
>>> <xs:complexType name="Bundle">
>>>   <xs:complexContent>
>>>     <xs:extension base="prov:Entity">
>>>       <xs:sequence>
>>>         <xs:element name="bundleElements" minOccurs="0">
>>>           <xs:complexType>
>>>             <xs:sequence maxOccurs="unbounded">
>>>               <xs:group ref="prov:documentElements"/>
>>>               <xs:any namespace="##other" processContents="lax"
>>>                       minOccurs="0" maxOccurs="unbounded"/>
>>>             </xs:sequence>
>>>           </xs:complexType>
>>>         </xs:element>
>>>       </xs:sequence>
>>>     </xs:extension>
>>>   </xs:complexContent>
>>> </xs:complexType>
>>
>> To me, this does not correspond to prov-dm.
>> I regard the bundle construct as distinct from the entity construct.
>
> Well, a Bundle is an Entity so the Bundle complexType extending the
> Entity complexType is good.
>
> How then to have what the PROV-DM calls the 'bundle constructor'?
>
> I think of the prov:bundleElements as the bundle constructor and I
> believe that it corresponds to PROV-DM.
>
> An alternative option would be to make a new element
> prov:bundleConstructor and put it in the documentElements sequence.
> This may be more like PROV-N, but is less like XML.
>
> The PROV-DM does not specify a serialization or syntax so an XML-native
> approach should be ok. I think having the bundle constructor as an
> XML element of a Bundle makes sense in XML.
>
> --Stephan
>
>>
>>
>> Luc
>>
>>> With the updated Bundle complexType the PROV-XML serialization for a
>>> bundle would look like this
>>>
>>> <prov:document
>>>     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>>     xmlns:xsd="http://www.w3.org/2001/XMLSchema"
>>>     xmlns:ex="http://example.com/ns/ex#"
>>>     xmlns:prov="http://www.w3.org/ns/prov#">
>>>
>>>   <prov:person prov:id="bob"/>
>>>
>>>   <ex:label>outside-bundle-label</ex:label>
>>>
>>>   <prov:activity prov:id="a1"/>
>>>
>>>   <prov:bundle prov:id="bundle1">
>>>
>>>     <prov:label>bundle1</prov:label>
>>>     <ex:label>label-on-bundle-entity</ex:label>
>>>
>>>     <prov:bundleElements>
>>>
>>>       <ex:label>in-bundle-label</ex:label>
>>>
>>>       <prov:entity prov:id="ex:report1">
>>>         <prov:type xsi:type="xsd:QName">report</prov:type>
>>>         <ex:version>1</ex:version>
>>>       </prov:entity>
>>>
>>>       <ex:version>1.0.0</ex:version>
>>>
>>>       <prov:wasGeneratedBy>
>>>         <prov:entity prov:ref="ex:report1"/>
>>>         <prov:activity prov:ref="a1"/>
>>>         <prov:time>2012-05-24T10:00:01</prov:time>
>>>       </prov:wasGeneratedBy>
>>>
>>>       <ex:content>foo</ex:content>
>>>
>>>     </prov:bundleElements>
>>>
>>>   </prov:bundle>
>>>
>>> </prov:document>
>>>
>>> I used elements from the namespace "ex" to show how non-PROV
>>> elements can be used within a bundle and as PROV attributes on the
>>> bundle entity.
>>>
>>> --Stephan
>>>
>>> On Feb 12, 2013, at 12:49 PM, Stephan Zednik <zednis@rpi.edu> wrote:
>>>
>>>> Comments in-line, last two comments are the most important.
>>>>
>>>>
>>>> On Feb 12, 2013, at 7:29 AM, Stian Soiland-Reyes
>>>> <soiland-reyes@cs.manchester.ac.uk> wrote:
>>>>
>>>>> On Tue, Feb 5, 2013 at 7:29 PM, Stephan Zednik <zednis@rpi.edu> wrote:
>>>>>> This does not follow the pattern Stian suggested of updating
>>>>>> Document so that bundles are required at the bottom of the
>>>>>> document.
>>>>>>
>>>>>> Stian, does this make sense?
>>>>>> Do you still prefer the other pattern you suggested in the
>>>>>> earlier email?
>>>>> Well to me it does not really matter if xs:any can appear anywhere in
>>>>> <document> or just at the bottom of the <document> - but I think your
>>>>> current solution means that you are allowed to put anything anywhere
>>>>> in <document>, but in <bundle> you can only put the extensions after
>>>>> <prov:value> but before the documentElements, which is a bit odd.
>>>>>
>>>>> It might be 'cleaner' to only allow extension stuff at the bottom, but
>>>>> that could make it tricky for the bundle as it (now) specializes the
>>>>> prov:Entity type and therefore the additional elements of Bundle come
>>>>> below the <xs:any> from entity.
>>>>>
>>>> Yes, originally this worked because we had multiple xs:any in the
>>>> prov:Bundle (inherited from both prov:Entity and
>>>> prov:documentElements) but we violated the "unique particle
>>>> attribution" rule which caused xjc to fail to generate Java classes
>>>> from the schema.
>>>>
>>>> We changed the schema to work well with xjc but in doing so
>>>> introduced the odd restriction you have noted. I am still playing
>>>> around with it to try to come up with a solution.
>>>>
>>>>>
>>>>>
>>>>>> Also, I think that we put the abstract element after the choice
>>>>>> in documentElements because it caused problems with schema
>>>>>> validation, but I can double check on that and see if it can be
>>>>>> included in the choice.
>>>>> I know, those things can get tricky.. it's another problem with XSD
>>>>> and its particle separation.
>>>>>
>>>>>
>>>>> I tried some examples of making an extension:
>>>>>
>>>>> <https://dvcs.w3.org/hg/prov/file/0bb02b43e80b/xml/examples>
>>>>>
>>>>> Here in <custom.xsd> I was *NOT* able to use
>>>>> substitutionGroup="prov:abstractElement", because I get:
>>>>>
>>>>>   Can't include the substitutionGroup as it causes:
>>>>>   "http://www.w3.org/ns/prov#":abstractElement
>>>>>   and WC[##other:"http://www.w3.org/ns/prov#"] (or elements from
>>>>>   their substitution group) violate "Unique Particle Attribution".
>>>>>
>>>>>
>>>>> Basically this means that the only way to use the
>>>>> substitutionGroup="prov:abstractElement" is to stay within the PROV
>>>>> namespace. This might not be obvious to someone looking at our
>>>>> schema. So I'm having doubts now.
>>>> We can try to make this more clear in the Note. The
>>>> abstractElement is only intended to be used with
>>>> substitutionGroups that are in the PROV namespace.
>>>>
>>>>>
>>>>> However, the general extension mechanism through xsd:any does work
>>>>> well, and can also validate my non-prov elements in
>>>>> <custom-example.xml>, even when I inserted those elements inside
>>>>> <prov:document>.
>>>>>
>>>>>
>>>>> In <with-extensions.xml> I tried reusing some schemas off the shelf:
>>>>> XHTML, MathML and DC Terms. This works fine thanks to xs:any as well.
>>>>> I was even able to do nested inclusion reusing prov: elements, i.e.:
>>>>>
>>>>> <prov:document>
>>>>>   <mathml:annotation-xml>
>>>>>     <prov:wasAttributedTo>
>>>>>       <prov:entity prov:ref="formula"></prov:entity>
>>>>>       <prov:agent prov:ref="fred"/>
>>>>>       <dcterms:description>blalalla</dcterms:description>
>>>>>       <!-- ... -->
>>>>>
>>>>> (Those internal prov: elements should probably in most cases NOT be
>>>>> considered part of the <prov:document>!)
>>>>>
>>>>> Now you can argue whether this would make sense or not, but that is
>>>>> the downside of xsd:any - anything (in non-prov namespaces, in this
>>>>> case) is allowed, not just content that should make sense by
>>>>> declaration of substitution groups. The more xsd:any - the less you
>>>>> have a schema and more you just have lots of fragmented types.
>>>>>
>>>> I think we are very limited in what we can say about how non-PROV
>>>> extensions integrate with PROV.
>>>>
>>>>>
>>>>> However I was unable to reuse namespaces like FOAF, because it does
>>>>> not have an XSD schema. So sadly this is not allowed:
>>>>>
>>>>> <prov:person prov:id="johndoe">
>>>>>   <foaf:name>John Doe</foaf:name>
>>>>> </prov:person>
>>>>>
>>>>> I think this is too strict, and I suggest changing the xsd:any of
>>>>> <prov:entity> and friends to processContents="lax" - this would only
>>>>> validate against a schema if it's known.
>>>>
>>>>> We could rename prov:abstractElement to prov:internal or something to
>>>>> make it less 'tempting' for external use.
>>>>>
>>>> I am ok with this.
>>>>
>>>>>
>>>>>
>>>>> We could in theory get rid of the whole documentElements and use
>>>>> only xs:any:
>>>>>
>>>>>
>>>>> <xs:element name="document" type="prov:Document" />
>>>>> <xs:complexType name="Document">
>>>>>   <xs:choice maxOccurs="unbounded">
>>>>>     <xs:any namespace="##targetNamespace" processContents="strict" />
>>>>>     <xs:any namespace="##other" processContents="lax" />
>>>>>   </xs:choice>
>>>>> </xs:complexType>
>>>>>
>>>>> And then no substitution groups are needed in our PROV extensions;
>>>>> any declared <xs:element> would be allowed.
>>>> If I understand this correctly, this would allow PROV attribute
>>>> elements to be used on the document.
>>>>
>>>>> For consistency I've set
>>>>> processContents=lax even for content of <prov:document> but we might
>>>>> want to instead say that it should be strict, to encourage
>>>>> PROV-extensions (rather than just providing attributes) to at least
>>>>> declare a schema.
>>>> I agree that PROV extensions should declare a schema.
>>>>
>>>>>
>>>>> This would mean you could also insert <prov:value> inside
>>>>> <prov:document> and so we would have to ensure that only "proper"
>>>>> elements are declared as named <xs:element>. I tried changing them to
>>>>> xs:group's and group refs which works fine.
>>>>>
>>>>>
>>>>>
>>>>> The above is quite tricky to get to work inside a <prov:bundle>
>>>>> because all its prov elements are optional, and we get a clash between
>>>>> those and the optional xs:any in the prov namespace.
>>>>>
>>>>> This is a bit odd anyway because <prov:bundle> plays a dual role: it
>>>>> is a way to say an entity is a bundle, but it also just lists its
>>>>> content flatly, and so we can't know if something listed is
>>>>> part of the bundle or an attribute of the bundle - especially for
>>>>> extensions.
>>>>>
>>>>> Saying something is a bundle could also be done as:
>>>>>
>>>>> <prov:entity>
>>>>>   <prov:type>prov:Bundle</prov:type>
>>>>> </prov:entity>
>>>>>
>>>>> (I am a bit confused now, as the PROV-XML document says this is how
>>>>> it should be done)
>>>> We made a change to the types some time ago which is reflected in
>>>> the editors' draft.
>>>>
>>>> https://dvcs.w3.org/hg/prov/raw-file/default/xml/prov-xml.html
>>>>
>>>> Since Bundles are specializations of Entity, prov:Bundle extends
>>>> prov:Entity.
>>>>
>>>>>
>>>>> .. but I know the XML schema has similar 'helpers' for types like
>>>>> prov:Person and prov:Revision so let's assume we keep the
>>>>> <prov:bundle> entity.
>>>>>
>>>>> I then would propose changing the bundle to be:
>>>>>
>>>>> <prov:bundle>
>>>>>   <prov:label>A bundle</prov:label>
>>>>>   <dcterms:description>Still not part of the
>>>>>     bundle</dcterms:description>
>>>>>   <prov:provenanceDescriptions>
>>>>>     <!-- the bundle content -->
>>>>>     <prov:activity />
>>>>>     <!-- .. -->
>>>>>   </prov:provenanceDescriptions>
>>>>> </prov:bundle>
>>>>>
>>>> I like this.
>>>>
>>>>> (We can argue about the name prov:provenanceDescriptions - I went for
>>>>> something close to PROV-DM)
>>>>>
>>>>>
>>>>> So this works fine:
>>>>>
>>>>> <xs:complexType name="Bundle">
>>>>>   <xs:complexContent>
>>>>>     <xs:extension base="prov:Entity">
>>>>>       <xs:sequence>
>>>>>         <xs:element name="provenanceDescriptions" minOccurs="0">
>>>>>           <xs:complexType>
>>>>>             <xs:choice minOccurs="0" maxOccurs="unbounded">
>>>>>               <xs:any namespace="##targetNamespace" processContents="strict" />
>>>>>               <xs:any namespace="##other" processContents="lax" />
>>>>>             </xs:choice>
>>>>>           </xs:complexType>
>>>>>         </xs:element>
>>>>>       </xs:sequence>
>>>>>     </xs:extension>
>>>>>   </xs:complexContent>
>>>>> </xs:complexType>
>>>>>
>>>>>
>>>>> Now the xsd:any from prov:Entity does not cause any problems, except
>>>>> that they have to be stated BEFORE <prov:provenanceDescriptions>. To
>>>>> change this we would have to do a copy/paste from prov:Entity instead
>>>>> and move the xsd:any down.
>>>> I am OK with this.
>>>>
>>>> What does the group think?
>>>>
>>>>>
>>>>>
>>>>> So it's possible, and not that unclean, to get rid of the substitution
>>>>> groups, but it would allow non-PROV garbage (i.e. schema elements which
>>>>> were not intended as PROV extensions, like my MathML example above)
>>>>> within <prov:document> and <prov:bundle>.
>>>>>
>>>>> I don't know what the group's thoughts are on which extensions we
>>>>> should allow for those, but at least it would be consistent with what
>>>>> PROV-N allows - and then perhaps any PROV-N document could be
>>>>> translatable to PROV-XML even without knowing the extensions.
>>>>>
>>>> I am ok with the substitution groups as they are.
>>>>
>>>> If you can present a desirable use case that is disallowed by the
>>>> current modeling with substitution groups and supported by an
>>>> alternate modeling then I will consider it. I don't want to make a
>>>> late change without an example use case to consider.
>>>>
>>>> --Stephan
>>>>
>>>>> If you wish I can commit my version of the schemas which does the
>>>>> above (but slightly tidied up), either to the tip or a new branch.
>>>>>
>>>>>
>>>>> --
>>>>> Stian Soiland-Reyes, myGrid team
>>>>> School of Computer Science
>>>>> The University of Manchester
>>>>>
>>>>
>>>>
>>>
>>
>> --
>> Professor Luc Moreau
>> Electronics and Computer Science   tel:   +44 23 8059 4487
>> University of Southampton          fax:   +44 23 8059 2865
>> Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
>> United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
>>
>>
>

--
Professor Luc Moreau
Electronics and Computer Science   tel:   +44 23 8059 4487
University of Southampton          fax:   +44 23 8059 2865
Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
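
To make the nesting concern raised at the top of this message concrete, here is a minimal sketch of an instance that the proposed Bundle type would appear to accept. It assumes that the prov:documentElements group admits prov:bundle, which the quoted proposal does not show, and the identifiers ex:outer, ex:inner and ex:e1 are invented purely for illustration.

```xml
<prov:document
    xmlns:prov="http://www.w3.org/ns/prov#"
    xmlns:ex="http://example.com/ns/ex#">

  <!-- Hypothetical instance; assumes prov:documentElements admits prov:bundle -->
  <prov:bundle prov:id="ex:outer">
    <prov:bundleElements>

      <!-- The inner bundle is an entity that carries its own bundleElements,
           so one bundle ends up nested inside another -->
      <prov:bundle prov:id="ex:inner">
        <prov:bundleElements>
          <prov:entity prov:id="ex:e1"/>
        </prov:bundleElements>
      </prov:bundle>

    </prov:bundleElements>
  </prov:bundle>

</prov:document>
```

In prov-n, by contrast, the bundle construct is only allowed at the top level of a document, so a structure like this would have no prov-n counterpart.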
Received on Tuesday, 12 February 2013 22:15:40 UTC