- From: Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>
- Date: Tue, 12 Feb 2013 14:29:51 +0000
- To: Stephan Zednik <zednis@rpi.edu>
- Cc: W3C provenance WG <public-prov-wg@w3.org>
On Tue, Feb 5, 2013 at 7:29 PM, Stephan Zednik <zednis@rpi.edu> wrote:

> This does not follow the pattern Stian suggested of updating Document so
> that bundles are required at the bottom of the document.
>
> Stian, does this make sense? Do you still prefer the other pattern you
> suggested in the earlier email?

Well, to me it does not really matter whether xs:any can appear anywhere in <document> or just at the bottom of the <document>. But I think your current solution means that you are allowed to put anything anywhere in <document>, while in <bundle> you can only put the extensions after <prov:value> but before the documentElements, which is a bit odd.

It might be 'cleaner' to only allow extension stuff at the bottom, but that could make it tricky for the bundle, as it (now) specializes the prov:Entity type and therefore the additional elements of Bundle come below the <xs:any> from Entity.

> Also, I think that we put the abstract element after the choice in document
> Elements because it caused problems with schema validation, but I can double
> check on that and see if it can be included in the choice.

I know, those things can get tricky. It's another problem with XSD and its particle separation.

I tried some examples of making an extension:

<https://dvcs.w3.org/hg/prov/file/0bb02b43e80b/xml/examples>

Here in <custom.xsd> I was *NOT* able to use substitutionGroup="prov:abstractElement", because I get:

    "http://www.w3.org/ns/prov#":abstractElement and
    WC[##other:"http://www.w3.org/ns/prov#"] (or elements from their
    substitution group) violate "Unique Particle Attribution".

Basically this means that the only way to use substitutionGroup="prov:abstractElement" is to stay within the PROV namespace. This might not be obvious to someone looking at our schema. So I'm having doubts now.
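As a rough sketch of the kind of declaration that runs into this error (this is not the actual content of <custom.xsd>; the namespace http://example.com/custom#, the element name customElement and the schemaLocation are hypothetical), an extension schema would look something like:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:prov="http://www.w3.org/ns/prov#"
           targetNamespace="http://example.com/custom#"
           elementFormDefault="qualified">

  <!-- schemaLocation is an assumption; point it at a local copy of the PROV schema -->
  <xs:import namespace="http://www.w3.org/ns/prov#" schemaLocation="prov.xsd"/>

  <!-- Hypothetical extension element in a non-PROV namespace.
       No type is given, so it takes the type of the substitution group head.
       Because it lives outside the PROV namespace, it can match both the
       prov:abstractElement particle and the ##other wildcard in the same
       content model, which is the "Unique Particle Attribution" clash. -->
  <xs:element name="customElement"
              substitutionGroup="prov:abstractElement"/>
</xs:schema>
```

An element declared this way inside the PROV namespace itself would not match the ##other wildcard, which is why the substitution group only works for elements that stay within PROV.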
However, the general extension mechanism through xsd:any does work well, and can also validate my non-PROV elements (<custom-example.xml>), even when I inserted those elements inside <prov:document>.

In <with-extensions.xml> I tried reusing some schemas off the shelf: XHTML, MathML and DC Terms. This works fine thanks to xs:any as well. I was even able to do nested inclusion reusing prov: elements, i.e.:

    <prov:document>
      <mathml:annotation-xml>
        <prov:wasAttributedTo>
          <prov:entity prov:ref="formula"></prov:entity>
          <prov:agent prov:ref="fred"/>
          <dcterms:description>blalalla</dcterms:description>
          <!-- ... -->

(Those internal prov: elements should probably in most cases NOT be considered part of the <prov:document>!)

Now you can argue whether this would make sense or not, but that is the downside of xsd:any: anything (in non-PROV namespaces, in this case) is allowed, not just content that should make sense by declaration of substitution groups. The more xsd:any you have, the less you have a schema and the more you just have lots of fragmented types.

However, I was unable to reuse namespaces like FOAF, because it does not have an XSD schema. So sadly this is not allowed:

    <prov:person prov:id="johndoe">
      <foaf:name>John Doe</foaf:name>
    </prov:person>

I think this is too strict, and I suggest changing the xsd:any of <prov:entity> and friends to processContents="lax". This would only validate against a schema if one is known.

We could rename prov:abstractElement to prov:internal or something to make it less 'tempting' for external use.

We could in theory get rid of the whole documentElements and use only xs:any:

    <xs:element name="document" type="prov:Document" />
    <xs:complexType name="Document">
      <xs:choice maxOccurs="unbounded">
        <xs:any namespace="##targetNamespace" processContents="strict" />
        <xs:any namespace="##other" processContents="lax" />
      </xs:choice>
    </xs:complexType>

And then no substitution groups are needed in our PROV extensions; any declared <xs:element> would be allowed.
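Under that xs:any-only Document type, an extension schema could be as small as the sketch below. The namespace http://example.com/custom# and the element name customElement are hypothetical, chosen only for illustration:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/custom#"
           elementFormDefault="qualified">

  <!-- Hypothetical extension element: a plain global declaration,
       with no substitutionGroup and no import of the PROV schema. -->
  <xs:element name="customElement" type="xs:string"/>
</xs:schema>
```

Inside <prov:document>, such an element would match the ##other wildcard. With processContents="lax" it is validated against this declaration when the validator has the schema loaded, and skipped otherwise.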
For consistency I've set processContents="lax" even for content of <prov:document>, but we might want to instead say that it should be strict, to encourage PROV extensions (rather than just providing attributes) to at least declare a schema.

This would mean you could also insert <prov:value> inside <prov:document>, and so we would have to ensure that only "proper" elements are declared as named <xs:element>. I tried changing them to xs:group's and group refs, which works fine.

The above is quite tricky to get to work inside a <prov:bundle>, because all its prov elements are optional, and we get a clash between those and the optional xs:any in the prov namespace.

This is a bit odd anyway, because <prov:bundle> plays a dual role: it is both a way to say that an entity is a bundle, and a flat list of the bundle's content. So we can't know if something listed is part of the bundle or an attribute of the bundle, especially for extensions.

Saying something is a bundle could also be done as:

    <prov:entity>
      <prov:type>prov:Bundle</prov:type>
    </prov:entity>

(I am a bit confused now, as the PROV-XML document says this is how it should be done.)

But I know the XML schema has similar 'helpers' for types like prov:Person and prov:Revision, so let's assume we keep the <prov:bundle> entity.

I would then propose changing the bundle to be:

    <prov:bundle>
      <prov:label>A bundle</prov:label>
      <dcterms:description>Still not part of the bundle</dcterms:description>
      <prov:provenanceDescriptions>
        <!-- the bundle content -->
        <prov:activity />
        <!-- .. -->
      </prov:provenanceDescriptions>
    </prov:bundle>

(We can argue about the name prov:provenanceDescriptions. I went for something close to PROV-DM.)

So this works fine:

    <xs:complexType name="Bundle">
      <xs:complexContent>
        <xs:extension base="prov:Entity">
          <xs:sequence>
            <xs:element name="provenanceDescriptions" minOccurs="0">
              <xs:complexType>
                <xs:choice minOccurs="0" maxOccurs="unbounded">
                  <xs:any namespace="##targetNamespace" processContents="strict" />
                  <xs:any namespace="##other" processContents="lax" />
                </xs:choice>
              </xs:complexType>
            </xs:element>
          </xs:sequence>
        </xs:extension>
      </xs:complexContent>
    </xs:complexType>

Now the xsd:any from prov:Entity does not cause any problems, except that the extension elements have to be stated BEFORE <prov:provenanceDescriptions>. To change this we would have to do a copy/paste from prov:Entity instead and move the xsd:any down.

So it's possible, and not that unclean, to get rid of the substitution groups, but it would allow non-PROV garbage (i.e. schema elements which were not intended as PROV extensions, like my MathML example above) within <prov:document> and <prov:bundle>.

I don't know what the group's thoughts are on which extensions we should allow for those, but at least it would be consistent with what PROV-N allows, and then perhaps any PROV-N document could be translatable to PROV-XML even without knowing the extensions.

If you wish I can commit my version of the schemas which does the above (but slightly tidied up), either to the tip or to a new branch.

-- 
Stian Soiland-Reyes, myGrid team
School of Computer Science
The University of Manchester
Received on Tuesday, 12 February 2013 14:30:39 UTC