- From: Luc Moreau <l.moreau@ecs.soton.ac.uk>
- Date: Tue, 12 Feb 2013 21:09:07 +0000
- To: public-prov-wg@w3.org
Hi Stephan,

Response interleaved.

On 12/02/13 20:57, Stephan Zednik wrote:
> A summary of the possible changes based on this discussion. I am in favor of all three listed changes.
>
> 1) rename prov:abstractElement to prov:internalElement (or similar) to make it clear we do not expect non-PROV extensions to use this element.

It's good.

> 2) add processContents="lax" on all xs:any elements.

What was the problem with the current definition? What does this allow us to do?

> 3) change the definition of prov:Bundle to the following (bundleElements name is not final)
>
> <xs:complexType name="Bundle">
>   <xs:complexContent>
>     <xs:extension base="prov:Entity">
>       <xs:sequence>
>         <xs:element name="bundleElements" minOccurs="0">
>           <xs:complexType>
>             <xs:sequence maxOccurs="unbounded">
>               <xs:group ref="prov:documentElements"/>
>               <xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
>             </xs:sequence>
>           </xs:complexType>
>         </xs:element>
>       </xs:sequence>
>     </xs:extension>
>   </xs:complexContent>
> </xs:complexType>

To me, this does not correspond to PROV-DM. I regard the bundle construct as distinct from the entity construct.
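
For comparison, here is a minimal sketch of a schema in which the bundle construct is declared on its own instead of extending prov:Entity. Everything in it is an assumption made for illustration: the type name BundleConstructor, the stub documentElements group and the stub prov:id attribute are hypothetical stand-ins, not the definitions in the working group's schema and not an agreed design.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:prov="http://www.w3.org/ns/prov#"
           targetNamespace="http://www.w3.org/ns/prov#"
           elementFormDefault="qualified">

  <!-- Stub for illustration only: the real documentElements group is larger. -->
  <xs:group name="documentElements">
    <xs:choice>
      <xs:element name="entity" type="xs:anyType"/>
      <xs:element name="activity" type="xs:anyType"/>
    </xs:choice>
  </xs:group>

  <!-- Stub identifier attribute, assumed here to be a QName. -->
  <xs:attribute name="id" type="xs:QName"/>

  <!-- Hypothetical name: a bundle as a named set of descriptions,
       declared on its own instead of extending prov:Entity. -->
  <xs:complexType name="BundleConstructor">
    <xs:choice minOccurs="0" maxOccurs="unbounded">
      <xs:group ref="prov:documentElements"/>
      <!-- Non-PROV content, validated only when a declaration is available. -->
      <xs:any namespace="##other" processContents="lax"/>
    </xs:choice>
    <xs:attribute ref="prov:id" use="required"/>
  </xs:complexType>

  <xs:element name="bundle" type="prov:BundleConstructor"/>
</xs:schema>
```

Under this reading, every child of the bundle element is bundle content, and attributes describing the bundle as an entity would presumably be stated on a separate prov:entity carrying the same identifier.
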
Luc

> With the updated Bundle complexType the PROV-XML serialization for a bundle would look like this
>
> <prov:document
>     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>     xmlns:xsd="http://www.w3.org/2001/XMLSchema"
>     xmlns:ex="http://example.com/ns/ex#"
>     xmlns:prov="http://www.w3.org/ns/prov#">
>
>   <prov:person prov:id="bob"/>
>
>   <ex:label>outside-bundle-label</ex:label>
>
>   <prov:activity prov:id="a1"/>
>
>   <prov:bundle prov:id="bundle1">
>
>     <prov:label>bundle1</prov:label>
>     <ex:label>label-on-bundle-entity</ex:label>
>
>     <prov:bundleElements>
>
>       <ex:label>in-bundle-label</ex:label>
>
>       <prov:entity prov:id="ex:report1">
>         <prov:type xsi:type="xsd:QName">report</prov:type>
>         <ex:version>1</ex:version>
>       </prov:entity>
>
>       <ex:version>1.0.0</ex:version>
>
>       <prov:wasGeneratedBy>
>         <prov:entity prov:ref="ex:report1"/>
>         <prov:activity prov:ref="a1"/>
>         <prov:time>2012-05-24T10:00:01</prov:time>
>       </prov:wasGeneratedBy>
>
>       <ex:content>foo</ex:content>
>
>     </prov:bundleElements>
>
>   </prov:bundle>
>
> </prov:document>
>
> I used elements from the namespace "ex" to show how non-PROV elements can be used within a bundle and as PROV attributes on the bundle entity.
>
> --Stephan
>
> On Feb 12, 2013, at 12:49 PM, Stephan Zednik <zednis@rpi.edu> wrote:
>
>> Comments in-line, last two comments are the most important.
>>
>>
>> On Feb 12, 2013, at 7:29 AM, Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk> wrote:
>>
>>> On Tue, Feb 5, 2013 at 7:29 PM, Stephan Zednik <zednis@rpi.edu> wrote:
>>>> This does not follow the pattern Stian suggested of updating Document so
>>>> that bundles are required at the bottom of the document.
>>>>
>>>> Stian, does this make sense? Do you still prefer the other pattern you
>>>> suggested in the earlier email?
>>> Well to me it does not really matter if xs:any can appear anywhere in
>>> <document> or just at the bottom of the <document> - but I think your
>>> current solution means that you are allowed to put anything anywhere
>>> in <document>, but in <bundle> you can only put the extensions after
>>> <prov:value> but before the documentElements, which is a bit odd.
>>>
>>> It might be 'cleaner' to only allow extension stuff at the bottom, but
>>> that could make it tricky for the bundle as it (now) specializes the
>>> prov:Entity type and therefore the additional elements of Bundle come
>>> below the <xs:any> from entity.
>>>
>> Yes, originally this worked because we had multiple xs:any in the prov:Bundle (inherited from both prov:Entity and prov:documentElements) but we violated the "unique particle attribution" rule which caused xjc to fail to generate Java classes from the schema.
>>
>> We changed the schema to work well with xjc but in doing so introduced the odd restriction you have noted. I am still playing around with it to try to come up with a solution.
>>
>>>
>>>
>>>> Also, I think that we put the abstract element after the choice in
>>>> documentElements because it caused problems with schema validation, but I can
>>>> double check on that and see if it can be included in the choice.
>>> I know, those things can get tricky.. it's another problem with XSD
>>> and its particle separation.
>>>
>>>
>>> I tried some examples of making an extension:
>>>
>>> <https://dvcs.w3.org/hg/prov/file/0bb02b43e80b/xml/examples>
>>>
>>> Here in <custom.xsd> I was *NOT* able to use
>>> substitutionGroup="prov:abstractElement", because I get:
>>>
>>> Can't include the substitutionGroup as it causes:
>>> "http://www.w3.org/ns/prov#":abstractElement
>>> and WC[##other:"http://www.w3.org/ns/prov#"] (or elements from their
>>> substitution
>>> group) violate "Unique Particle Attribution".
>>>
>>>
>>> Basically this means that the only way to use the
>>> substitutionGroup="prov:abstractElement" is to stay within the PROV
>>> namespace. This might not be obvious to someone looking at our
>>> schema. So I'm having doubts now.
>> We can try to make this more clear in the Note. The abstractElement is only intended to be used with substitutionGroups that are in the PROV namespace.
>>
>>>
>>> However, the general extension mechanism through xsd:any does work well,
>>> and can also validate my non-prov elements - <custom-example.xml>, even
>>> when I inserted those elements inside <prov:document>.
>>>
>>>
>>> In <with-extensions.xml> I tried reusing some schemas off the shelf,
>>> XHTML, MathML and DC Terms. This works fine thanks to xs:any as well.
>>> I was even able to do nested inclusion reusing prov: elements, i.e.:
>>>
>>> <prov:document>
>>>   <mathml:annotation-xml>
>>>     <prov:wasAttributedTo>
>>>       <prov:entity prov:ref="formula"></prov:entity>
>>>       <prov:agent prov:ref="fred"/>
>>>       <dcterms:description>blalalla</dcterms:description>
>>>       <!-- ... -->
>>>
>>> (Those internal prov: elements should probably in most cases NOT be
>>> considered part of the <prov:document> !)
>>>
>>> Now you can argue whether this would make sense or not, but that is
>>> the downside of xsd:any - anything (in non-prov namespaces, in this
>>> case) is allowed, not just content that should make sense by
>>> declaration of substitution groups. The more xsd:any - the less you
>>> have a schema and more you just have lots of fragmented types.
>>>
>> I think we are very limited in what we can say about how non-PROV extensions integrate with PROV.
>>
>>>
>>> However I was unable to reuse namespaces like FOAF, because it does
>>> not have an XSD schema.
>>> So sadly this is not allowed:
>>>
>>> <prov:person prov:id="johndoe">
>>>   <foaf:name>John Doe</foaf:name>
>>> </prov:person>
>>>
>>> I think this is too strict, and I suggest changing the xsd:any of
>>> <prov:entity> and friends to processContents="lax" - this would only
>>> validate against a schema if it's known.
>>
>>> We could rename prov:abstractElement to prov:internal or something to
>>> make it less 'tempting' for external use.
>>>
>> I am ok with this.
>>
>>>
>>>
>>> We could in theory get rid of the whole documentElements and use only xs:any:
>>>
>>>
>>> <xs:element name="document" type="prov:Document" />
>>> <xs:complexType name="Document">
>>>   <xs:choice maxOccurs="unbounded">
>>>     <xs:any namespace="##targetNamespace" processContents="strict" />
>>>     <xs:any namespace="##other" processContents="lax" />
>>>   </xs:choice>
>>> </xs:complexType>
>>>
>>> And then no substitution groups are needed in our PROV extensions, any
>>> declared <xs:element> would be allowed.
>> If I understand this correctly, this would allow PROV attribute elements to be used on the document.
>>
>>> For consistency I've set
>>> processContents="lax" even for content of <prov:document> but we might
>>> want to instead say that it should be strict, to encourage
>>> PROV-extensions (rather than just providing attributes) to at least
>>> declare a schema.
>> I agree that PROV extensions should declare a schema.
>>
>>>
>>> This would mean you could also insert <prov:value> inside
>>> <prov:document> and so we would have to ensure that only "proper"
>>> elements are declared as named <xs:element>. I tried changing them to
>>> xs:group's and group refs which works fine.
>>>
>>>
>>>
>>> The above is quite tricky to get to work inside a <prov:bundle>
>>> because all its prov elements are optional, and we get a clash between
>>> those and the optional xs:any in the prov namespace.
>>>
>>> This is a bit odd anyway because <prov:bundle> plays a dual role with
>>> both being a way to say an entity which is a bundle, but also just
>>> lists its content flatly, and so we can't know if something listed is
>>> part of the bundle or an attribute of the bundle - especially for
>>> extensions.
>>>
>>> Saying something is a bundle could also be done as:
>>>
>>> <prov:entity>
>>>   <prov:type>prov:Bundle</prov:type>
>>> </prov:entity>
>>>
>>> (I am a bit confused now, as the PROV-XML document says this is how
>>> it should be done)
>> We made a change to the types some time ago which is reflected in the editors' draft.
>>
>> https://dvcs.w3.org/hg/prov/raw-file/default/xml/prov-xml.html
>>
>> Since Bundles are specializations of Entity, prov:Bundle extends prov:Entity.
>>
>>>
>>> .. but I know the XML schema has similar 'helpers' for types like
>>> prov:Person and prov:Revision so let's assume we keep the
>>> <prov:bundle> entity.
>>>
>>> I then would propose changing the bundle to be:
>>>
>>> <prov:bundle>
>>>   <prov:label>A bundle</prov:label>
>>>   <dcterms:description>Still not part of the bundle</dcterms:description>
>>>   <prov:provenanceDescriptions>
>>>     <!-- the bundle content -->
>>>     <prov:activity />
>>>     <!-- .. -->
>>>   </prov:provenanceDescriptions>
>>> </prov:bundle>
>>>
>> I like this.
>>
>>> (We can argue about the name prov:provenanceDescriptions - I went for
>>> something close to PROV-DM)
>>>
>>>
>>> So this works fine:
>>>
>>> <xs:complexType name="Bundle">
>>>   <xs:complexContent>
>>>     <xs:extension base="prov:Entity">
>>>       <xs:sequence>
>>>         <xs:element name="provenanceDescriptions" minOccurs="0">
>>>           <xs:complexType>
>>>             <xs:choice minOccurs="0" maxOccurs="unbounded">
>>>               <xs:any namespace="##targetNamespace" processContents="strict" />
>>>               <xs:any namespace="##other" processContents="lax" />
>>>             </xs:choice>
>>>           </xs:complexType>
>>>         </xs:element>
>>>       </xs:sequence>
>>>     </xs:extension>
>>>   </xs:complexContent>
>>> </xs:complexType>
>>>
>>>
>>> Now the xsd:any from prov:Entity does not cause any problems, except
>>> that they have to be stated BEFORE <prov:provenanceDescriptions>. To
>>> change this we would have to do a copy/paste from prov:Entity instead
>>> and move the xsd:any down.
>> I am OK with this.
>>
>> What does the group think?
>>
>>>
>>>
>>> So it's possible, and not that unclean, to get rid of the substitution
>>> groups, but it would allow non-PROV garbage (i.e. schema elements which
>>> were not intended as PROV extensions, like my MathML example above)
>>> within <prov:document> and <prov:bundle>.
>>>
>>> I don't know what the group's thoughts are on extensions we should allow
>>> for those, but at least it would be consistent with what PROV-N allows
>>> - and then perhaps any PROV-N document could be translatable to
>>> PROV-XML even without knowing the extensions.
>>>
>> I am ok with the substitution groups as they are.
>>
>> If you can present a desirable use case that is disallowed by the current modeling with substitution groups and supported by an alternate modeling then I will consider it. I don't want to make a late change without an example use case to consider.
>>
>> --Stephan
>>
>>> If you wish I can commit my version of the schemas which does the
>>> above (but slightly tidied up), either to the tip or a new branch.
>>>
>>>
>>> --
>>> Stian Soiland-Reyes, myGrid team
>>> School of Computer Science
>>> The University of Manchester
>>>
>>
>>
>

--
Professor Luc Moreau
Electronics and Computer Science   tel:   +44 23 8059 4487
University of Southampton          fax:   +44 23 8059 2865
Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
Received on Tuesday, 12 February 2013 21:10:00 UTC