- From: Stephan Zednik <zednis@rpi.edu>
- Date: Tue, 12 Feb 2013 13:57:04 -0700
- To: Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>
- Cc: W3C provenance WG <public-prov-wg@w3.org>
A summary of the possible changes based on this discussion. I am in favor of all three listed changes.

1) rename prov:abstractElement to prov:internalElement (or similar) to make it clear we do not expect non-PROV extensions to use this element.

2) add processContents="lax" on all xs:any elements.

3) change the definition of prov:Bundle to the following (bundleElements name is not final)

    <xs:complexType name="Bundle">
      <xs:complexContent>
        <xs:extension base="prov:Entity">
          <xs:sequence>
            <xs:element name="bundleElements" minOccurs="0">
              <xs:complexType>
                <xs:sequence maxOccurs="unbounded">
                  <xs:group ref="prov:documentElements"/>
                  <xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
                </xs:sequence>
              </xs:complexType>
            </xs:element>
          </xs:sequence>
        </xs:extension>
      </xs:complexContent>
    </xs:complexType>

With the updated Bundle complexType the PROV-XML serialization for a bundle would look like this:

    <prov:document
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns:xsd="http://www.w3.org/2001/XMLSchema"
        xmlns:ex="http://example.com/ns/ex#"
        xmlns:prov="http://www.w3.org/ns/prov#">

      <prov:person prov:id="bob"/>

      <ex:label>outside-bundle-label</ex:label>

      <prov:activity prov:id="a1"/>

      <prov:bundle prov:id="bundle1">
        <prov:label>bundle1</prov:label>
        <ex:label>label-on-bundle-entity</ex:label>
        <prov:bundleElements>
          <ex:label>in-bundle-label</ex:label>
          <prov:entity prov:id="ex:report1">
            <prov:type xsi:type="xsd:QName">report</prov:type>
            <ex:version>1</ex:version>
          </prov:entity>
          <ex:version>1.0.0</ex:version>
          <prov:wasGeneratedBy>
            <prov:entity prov:ref="ex:report1"/>
            <prov:activity prov:ref="a1"/>
            <prov:time>2012-05-24T10:00:01</prov:time>
          </prov:wasGeneratedBy>
          <ex:content>foo</ex:content>
        </prov:bundleElements>
      </prov:bundle>

    </prov:document>

I used elements from the namespace "ex" to show how non-PROV elements can be used within a bundle and as PROV attributes on the bundle entity.

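As an illustration of change 2, here is a minimal sketch of an instance that a lax wildcard should admit. It assumes the xs:any on prov:person is declared with namespace="##other" and processContents="lax" (the exact declaration in the schema may differ); the surrounding document and the identifier are illustrative only. FOAF publishes no XSD, so under "lax" the foaf:name element would be accepted without being validated, whereas under "strict" it would be rejected for lack of a declaration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: assumes the wildcard on prov:person uses processContents="lax". -->
<prov:document
    xmlns:prov="http://www.w3.org/ns/prov#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <prov:person prov:id="johndoe">
    <!-- No schema is known for the FOAF namespace; "lax" skips validation of this element. -->
    <foaf:name>John Doe</foaf:name>
  </prov:person>
</prov:document>
```
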
--Stephan

On Feb 12, 2013, at 12:49 PM, Stephan Zednik <zednis@rpi.edu> wrote:

> Comments in-line, last two comments are the most important.
>
>
> On Feb 12, 2013, at 7:29 AM, Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk> wrote:
>
>> On Tue, Feb 5, 2013 at 7:29 PM, Stephan Zednik <zednis@rpi.edu> wrote:
>>> This does not follow the pattern Stian suggested of updating Document so
>>> that bundles are required at the bottom of the document.
>>>
>>> Stian, does this make sense? Do you still prefer the other pattern you
>>> suggested in the earlier email?
>>
>> Well to me it does not really matter if xs:any can appear anywhere in
>> <document> or just at the bottom of the <document> - but I think your
>> current solution means that you are allowed to put anything anywhere
>> in <document>, but in <bundle> you can only put the extensions after
>> <prov:value> but before the documentelements, which is a bit odd.
>>
>> It might be 'cleaner' to only allow extension stuff at the bottom, but
>> that could make it tricky for the bundle as it (now) specializes the
>> prov:Entity type and therefore the additional elements of Bundle come
>> below the <xs:any> from entity.
>>
>
> Yes, originally this worked because we had multiple xs:any in the prov:Bundle (inherited from both prov:Entity and prov:documentElements) but we violated the "unique particle attribution" rule which caused xjc to fail to generate java classes from the schema.
>
> We changed the schema to work well with xjc but in doing so introduced the odd restriction you have noted. I am still playing around with it to try to come up with a solution.
>
>>
>>
>>
>>> Also, I think that we put the abstract element after the choice in document
>>> Elements because it caused problems with schema validation, but I can double
>>> check on that and see if it can be included in the choice.
>>
>> I know, those things can get tricky.. it's another problem with XSD
>> and its particle separation.
>>
>>
>> I tried some examples of making an extension:
>>
>> <https://dvcs.w3.org/hg/prov/file/0bb02b43e80b/xml/examples>
>>
>> Here in <custom.xsd> I was *NOT* able to use
>> substitutionGroup="prov:abstractElement", because I get:
>>
>> Can't include the substitutionGroup as it causes:
>> "http://www.w3.org/ns/prov#":abstractElement
>> and WC[##other:"http://www.w3.org/ns/prov#"] (or elements from their
>> substitution
>> group) violate "Unique Particle Attribution".
>>
>>
>> Basically this means that the only way to use the
>> substitutionGroup="prov:abstractElement" is to stay within the PROV
>> namespace. This might not be obvious to someone looking at our
>> schema. So I'm having doubts now.
>
> We can try to make this more clear in the Note. The abstractElement is only intended to be used with substitutionGroups that are in the PROV namespace.
>
>>
>>
>> However, the general extension mechanism through xsd:any does work well,
>> and can validate also my non-prov elements - <custom-example.xml>, even
>> when I inserted those elements inside <prov:document>.
>>
>>
>> In <with-extensions.xml> I tried reusing some schemas off the shelf,
>> XHTML, MathML and DC Terms. This works fine thanks to xs:any as well.
>> I was even able to do nested inclusion reusing prov: elements, ie:
>>
>> <prov:document>
>>   <mathml:annotation-xml>
>>     <prov:wasAttributedTo>
>>       <prov:entity prov:ref="formula"></prov:entity>
>>       <prov:agent prov:ref="fred"/>
>>       <dcterms:description>blalalla</dcterms:description>
>>       <!-- ... -->
>>
>> (Those internal prov: elements should probably in most cases NOT be
>> considered part of the <prov:document> !)
>>
>> Now you can argue whether this would make sense or not, but that is
>> the downside of xsd:any - anything (in non-prov namespaces, in this
>> case) is allowed, not just content that should make sense by
>> declaration of substitution groups. The more xsd:any - the less you
>> have a schema and more you just have lots of fragmented types.
>>
>
> I think we are very limited in what we can say about how non-PROV extensions integrate with PROV.
>
>>
>>
>> However I was unable to reuse namespaces like FOAF, because it does
>> not have an XSD schema. So sadly this is not allowed:
>>
>> <prov:person prov:id="johndoe">
>>   <foaf:name>John Doe</foaf:name>
>> </prov:person>
>>
>> I think this is too strict, and I suggest changing the xsd:any of
>> <prov:entity> and friends to processContents="lax" - this would only
>> validate against a schema if it's known.
>
>
>>
>> We could rename prov:abstractElement to prov:internal or something to
>> make it less 'tempting' for external use.
>>
>
> I am ok with this.
>
>>
>>
>>
>> We could in theory get rid of the whole documentElements and use only xs:any:
>>
>>
>> <xs:element name="document" type="prov:Document" />
>> <xs:complexType name="Document">
>>   <xs:choice maxOccurs="unbounded">
>>     <xs:any namespace="##targetNamespace" processContents="strict" />
>>     <xs:any namespace="##other" processContents="lax" />
>>   </xs:choice>
>> </xs:complexType>
>>
>> And then no substitution groups are needed in our PROV extensions, any
>> declared <xs:element> would be allowed.
>
> If I understand this correctly, this would allow PROV attribute elements to be used on the document.
>
>> For consistency I've set
>> processContents=lax even for content of <prov:document> but we might
>> want to instead say that it should be strict, to encourage
>> PROV-extensions (rather than just providing attributes) to at least
>> declare a schema.
>
> I agree that PROV extensions should declare a schema.
>
>>
>>
>> This would mean you could also insert <prov:value> inside
>> <prov:document> and so we would have to ensure that only "proper"
>> elements are declared as named <xs:element>. I tried changing them to
>> xs:group's and group refs which works fine.
>>
>>
>>
>> The above is quite tricky to get to work inside a <prov:bundle>
>> because all its prov elements are optional, and we get a clash between
>> those and the optional xs:any in the prov namespace.
>>
>> This is a bit odd anyway because <prov:bundle> plays a dual role with
>> both being a way to say an entity which is a bundle, but also just
>> lists its content flatly, and so we can't know if something listed is
>> part of the bundle or an attribute of the bundle - especially for
>> extensions.
>>
>> Saying something is a bundle could also be done as:
>>
>> <prov:entity>
>>   <prov:type>prov:Bundle</prov:type>
>> </prov:entity>
>>
>> (I am a bit confused now, as the PROV-XML document says this is how
>> it should be done)
>
> We made a change to the types some time ago which is reflected in the editors' draft.
>
> https://dvcs.w3.org/hg/prov/raw-file/default/xml/prov-xml.html
>
> Since Bundles are specializations of Entity, prov:Bundle extends prov:Entity.
>
>>
>>
>> .. but I know the XML schema has similar 'helpers' for types like
>> prov:Person and prov:Revision so let's assume we keep the
>> <prov:bundle> entity.
>>
>> I then would propose changing the bundle to be:
>>
>> <prov:bundle>
>>   <prov:label>A bundle</prov:label>
>>   <dcterms:description>Still not part of the bundle</dcterms:description>
>>   <prov:provenanceDescriptions>
>>     <!-- the bundle content -->
>>     <prov:activity />
>>     <!-- .. -->
>>   </prov:provenanceDescriptions>
>> </prov:bundle>
>>
>
> I like this.
>
>> (We can argue about the name prov:provenanceDescriptions - I went for
>> something close to PROV-DM)
>>
>>
>> So this works fine:
>>
>> <xs:complexType name="Bundle">
>>   <xs:complexContent>
>>     <xs:extension base="prov:Entity">
>>       <xs:sequence>
>>         <xs:element name="provenanceDescriptions" minOccurs="0">
>>           <xs:complexType>
>>             <xs:choice minOccurs="0" maxOccurs="unbounded">
>>               <xs:any namespace="##targetNamespace" processContents="strict" />
>>               <xs:any namespace="##other" processContents="lax" />
>>             </xs:choice>
>>           </xs:complexType>
>>         </xs:element>
>>       </xs:sequence>
>>     </xs:extension>
>>   </xs:complexContent>
>> </xs:complexType>
>>
>>
>> Now the xsd:any from prov:Entity does not cause any problems, except
>> that they have to be stated BEFORE <prov:provenanceDescriptions>. To
>> change this we would have to do a copy/paste from prov:Entity instead
>> and move the xsd:any down.
>
> I am OK with this.
>
> What does the group think?
>
>>
>>
>>
>> So it's possible, and not that unclean, to get rid of the substitution
>> groups, but it would allow non-PROV garbage (ie. schema elements which
>> were not intended as PROV extensions, like my MathML example above)
>> within <prov:document> and <prov:bundle>.
>>
>> I don't know what the group's thoughts are on extensions we should allow
>> for those, but at least it would be consistent with what PROV-N allows
>> - and then perhaps any PROV-N document could be translatable to
>> PROV-XML even without knowing the extensions.
>>
>
> I am ok with the substitution groups as they are.
>
> If you can present a desirable use case that is disallowed by the current modeling with substitution groups and supported by an alternate modeling then I will consider it. I don't want to make a late change without an example use case to consider.
>
> --Stephan
>
>>
>> If you wish I can commit my version of the schemas which does the
>> above (but slightly tidied up), either to the tip or a new branch.
>>
>>
>> --
>> Stian Soiland-Reyes, myGrid team
>> School of Computer Science
>> The University of Manchester
>>
>
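
For readers less familiar with the "Unique Particle Attribution" rule mentioned in the quoted thread, here is a minimal, hypothetical schema that shows the kind of ambiguity involved. It is not the earlier PROV schema; the target namespace and the type name are made up for illustration. Two optional wildcards over the same namespace set appear in one sequence, so an element from a foreign namespace could be attributed to either particle, and schema processors are expected to reject the type.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical demo schema; names and namespace are illustrative, not from PROV-XML. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/ns/upa-demo#"
           elementFormDefault="qualified">
  <xs:complexType name="Ambiguous">
    <xs:sequence>
      <!-- First wildcard: any number of elements from other namespaces. -->
      <xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
      <!-- Second wildcard overlaps the first: a foreign element matches both particles. -->
      <xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```

Keeping a single wildcard per content model, or separating the wildcards behind a required wrapper element such as bundleElements, removes the overlap.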
Received on Tuesday, 12 February 2013 20:57:42 UTC