Re: prov-xml review for release as a FPWD from Stephan Zednik on 2012-11-06 (public-prov-wg@w3.org from November 2012)

From: Stephan Zednik <zednis@rpi.edu>
Date: Mon, 5 Nov 2012 23:57:45 -0700
To: "pgroth@gmail.com" <pgroth@gmail.com>
Cc: Luc Moreau <l.moreau@ecs.soton.ac.uk>, "public-prov-wg@w3.org" <public-prov-wg@w3.org>
Message-Id: <344F8FB9-A3E6-40CA-A324-739B1009C8E2@rpi.edu>

Thanks Paul.

--Stephan

On Nov 5, 2012, at 7:52 PM, Paul Groth <pgroth@gmail.com> wrote:

> Hello,
> 
> I have reviewed prov-xml (http://dvcs.w3.org/hg/prov/raw-file/default/xml/prov-xml.html).
> 
> The document can be released as a first public working draft. My detailed comments are below; many of them, I now realize, echo some of James' comments. The key issues are leveraging XSD schema features and explaining design decisions.
> 
> Regards
> Paul
> 
> ---Detailed Comments--
> 
> ==Abstract==
> 
> The abstract could be shorter. Suggested revision:
> 
> Provenance is information about entities, activities, and people involved in producing a piece of data or thing, which can be used to form assessments about its quality, reliability or trustworthiness. PROV-DM is the conceptual data model that forms a basis for the W3C provenance (PROV) family of specifications. It defines concepts for expressing provenance information, enabling interchange. This document introduces an XML schema for the PROV data model (PROV-DM), allowing instances of the PROV data model to be serialized in XML.
> 
> ==Introduction==
> I like the focused nature of the document: there is not a lot of justification around design choices, etc. However, this should be clearly stated in the introduction. I would add a sentence something like: "The goal of this specification is to provide a succinct definition of the XML form of PROV-DM; we therefore refer the reader to PROV-DM for overall justification and context for the definitions presented here."
> 
> Also, I would link out to each of the concepts in the DM when they are presented within the document.
> 
> ==2.1.1 Entity==
> In the example you have ex:version, which I think may be confusing because we have revision in PROV.
> 
> ==Use of prov:type for terms within the dataset==
> 
> For all subtypes defined in PROV, the spec says that one should use the prov:type construct, e.g. <prov:wasDerivedFrom> <prov:type>prov:Revision</prov:type></prov:wasDerivedFrom>. I was wondering what the rationale for that choice is. Why doesn't one see <prov:wasRevisionOf>?
> 
> Clearly this is a pattern used throughout the document. I think this pattern deserves a small paragraph explaining why the approach was taken. This is especially true as XML Schema supports the definition of subtypes through xsd:extension.
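> 
> To make the contrast concrete, here is a rough sketch of the two styles. The ex: identifiers, the child element names, the hypothetical <prov:wasRevisionOf> element and the schema type names are illustrative assumptions on my part, not taken from the draft:
> 
> ```xml
> <!-- Style used in the draft: generic relation, subtype given via prov:type -->
> <prov:wasDerivedFrom>
>   <prov:generatedEntity prov:ref="ex:report-v2"/>
>   <prov:usedEntity prov:ref="ex:report-v1"/>
>   <prov:type>prov:Revision</prov:type>
> </prov:wasDerivedFrom>
> 
> <!-- Hypothetical alternative: a dedicated element for the subtype -->
> <prov:wasRevisionOf>
>   <prov:generatedEntity prov:ref="ex:report-v2"/>
>   <prov:usedEntity prov:ref="ex:report-v1"/>
> </prov:wasRevisionOf>
> 
> <!-- Schema side of the alternative (type names assumed): the subtype
>      is derived from the base relation type by extension -->
> <xs:complexType name="Revision">
>   <xs:complexContent>
>     <xs:extension base="prov:Derivation"/>
>   </xs:complexContent>
> </xs:complexType>
> ```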
> 
> ==Other Patterns==
> I think there are a couple of other patterns used within the schema design. Maybe adding a section on those patterns would help the reader more easily understand the approach. The patterns I see are:
> 1) use xml ids and refs to express the provenance graph
> 2) in type definitions required provenance elements are presented first, then optional provenance elements, then application specific elements
> 3) prov:attributes are interpreted as extra non-provenance elements within complex types (e.g. <xs:any namespace="##other"/>). I assume this is why specialization and alternate do not have extensibility points.
> 4) can you define the "salami slice XSD design pattern" in the text?
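> 
> On 4), my understanding of the term, which may not match what you intend, is that every element is declared globally and content models are composed by reference. A minimal sketch under that assumption (the example namespace and element names are made up):
> 
> ```xml
> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
>            xmlns:ex="http://example.com/ns#"
>            targetNamespace="http://example.com/ns#"
>            elementFormDefault="qualified">
> 
>   <!-- Each element is its own global "slice" ... -->
>   <xs:element name="label" type="xs:string"/>
> 
>   <xs:element name="thing">
>     <xs:complexType>
>       <xs:sequence>
>         <!-- ... and is reused elsewhere by reference rather than redeclared -->
>         <xs:element ref="ex:label" minOccurs="0" maxOccurs="unbounded"/>
>       </xs:sequence>
>     </xs:complexType>
>   </xs:element>
> 
> </xs:schema>
> ```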
> 
> ==prov:id==
> I was a bit confused by prov:id. Can you give some examples of what can go in prov:id? It's defined as a QName, so does that mean I can't put a full URL in? Your example of prov:id (prov:id="tr:WD-prov-dm-20111215") uses the tr: prefix, which is not declared as a namespace. Is this just a mistake? It would be good to see an example linking out beyond the scope of one document.
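> 
> For instance, I would expect something like the following to work once the prefix is declared. The tr: namespace URI, the ex: names and the <prov:document> wrapper are my assumptions, not taken from the draft:
> 
> ```xml
> <prov:document xmlns:prov="http://www.w3.org/ns/prov#"
>                xmlns:ex="http://example.com/ns#"
>                xmlns:tr="http://www.w3.org/TR/2011/">
> 
>   <!-- The QName resolves through the tr: declaration above, so the
>        identifier points outside this document -->
>   <prov:entity prov:id="tr:WD-prov-dm-20111215"/>
> 
>   <!-- An identifier local to the example namespace -->
>   <prov:entity prov:id="ex:draft-notes"/>
> 
> </prov:document>
> ```
> 
> A bare URL would not fit the QName lexical form, since characters such as "/" are not allowed in the local part.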
>

Received on Tuesday, 6 November 2012 06:58:13 UTC