- From: Paul Groth <pgroth@gmail.com>
- Date: Mon, 5 Nov 2012 21:52:14 -0500
- To: Luc Moreau <l.moreau@ecs.soton.ac.uk>
- Cc: "public-prov-wg@w3.org" <public-prov-wg@w3.org>
- Message-ID: <CAJCyKRoybQU2U8uQ6S3ND1FCmYwV7V34z4szOjQoEbXC0M0JVg@mail.gmail.com>
Hello,

I have reviewed prov-xml (http://dvcs.w3.org/hg/prov/raw-file/default/xml/prov-xml.html). The document can be released as a first public working draft. My detailed comments are below; many of them, I now realize, echo James' comments. The key issues are leveraging XML Schema (XSD) and explaining design decisions.

Regards
Paul

---Detailed Comments---

==Abstract==

The abstract could be shorter. Suggested revision:

Provenance is information about entities, activities, and people involved in producing a piece of data or thing, which can be used to form assessments about its quality, reliability or trustworthiness. PROV-DM is the conceptual data model that forms a basis for the W3C provenance (PROV) family of specifications. It defines concepts for expressing provenance information, enabling interchange. This document introduces an XML schema for the PROV data model (PROV-DM), allowing instances of the PROV data model to be serialized in XML.

==Introduction==

I like the focused nature of the document: it does not spend much space justifying design choices. However, this should be clearly stated in the introduction. I would add a sentence such as: "The goal of this specification is to provide a succinct definition of the XML form of PROV-DM; we therefore refer the reader to PROV-DM for overall justification and context for the definitions presented here."

Also, I would link out to each of the concepts in the DM when they are presented within the document.

==2.1.1 Entity==

In the example you have ex:version, which I think may be confusing because we have revision in PROV.

==Use of prov:type for terms within the dataset==

For all subtypes defined in PROV, the spec says that one should use the prov:type construct, e.g. <prov:wasDerivedFrom> <prov:type>prov:Revision</prov:type></prov:wasDerivedFrom>. What is the rationale for that choice? Why doesn't one see <prov:wasRevisionOf>? Clearly this is a pattern used throughout the document. I think it deserves a small paragraph explaining why the approach was taken, especially as XML Schema supports the definition of subtypes through xsd:extension.

==Other Patterns==

I think there are a couple of other patterns used within the schema design. Adding a section on those patterns might help the reader understand the approach more easily. The patterns I see are:

1) XML ids and refs are used to express the provenance graph.
2) In type definitions, required provenance elements are presented first, then optional provenance elements, then application-specific elements.
3) prov:attributes are interpreted as extra non-provenance elements within complex types (e.g. <xs:any namespace="##other"/>). I assume this is why specialization and alternate do not have extensibility points.

Also, can you define the "salami slice XSD design pattern" in the text?

==prov:id==

I was a bit confused by prov:id. Can you give some examples of what can go in prov:id? It is defined as a QName, so does that mean I can't put a full URL in? Your example of prov:id (prov:id="tr:WD-prov-dm-20111215") uses tr:, which is not a declared namespace prefix. Is this just a mistake? It would be good to see an example linking out beyond the scope of one document.
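
To make the prov:id question concrete, here is a minimal sketch of the two checks I have in mind: whether a value has the lexical shape of a QName, and whether its prefix is declared. It is not taken from the schema. The NCName pattern is an ASCII-only approximation, and the example.org URL and the set of declared prefixes are assumptions chosen for illustration.

```python
import re

# ASCII-only approximation of an XML NCName. The real production also allows
# many non-ASCII characters; this is enough to illustrate the point.
_NCNAME = r"[A-Za-z_][A-Za-z0-9._-]*"
_QNAME = re.compile(rf"(?:{_NCNAME}:)?{_NCNAME}")


def looks_like_qname(value: str) -> bool:
    """Return True if value has the lexical shape 'prefix:local' or 'local'."""
    return _QNAME.fullmatch(value) is not None


def undeclared_prefix(value: str, declared: set) -> "str | None":
    """Return the prefix of value if it is not in declared, otherwise None."""
    prefix, sep, _ = value.partition(":")
    if sep and prefix not in declared:
        return prefix
    return None


# The identifier from the prov-xml example is QName-shaped.
print(looks_like_qname("tr:WD-prov-dm-20111215"))  # True

# A full URL (hypothetical) is not: '/' cannot appear in an NCName.
print(looks_like_qname("http://example.org/doc/WD-prov-dm-20111215"))  # False

# Assuming only these prefixes are declared, 'tr' is reported as missing.
print(undeclared_prefix("tr:WD-prov-dm-20111215", {"prov", "ex"}))  # tr
```

If that reading is right, a full URL would have to be split into a declared prefix plus a local name before it could appear in prov:id.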
Received on Tuesday, 6 November 2012 02:52:42 UTC