Re: [dxwg] Provenance information [RPIF]

@larsgsvensson the differentiation you describe matches how I describe things, so I agree with your general characterisations.

I do agree that defining a *process* can be tricky, but if we stick to the "provenance that we want", not a "provenance that could be modelled", then we can usually do something sensible. For the example you give of a servlet validating something, I would model it thus:

* the *process* as a `prov:Activity` - starting and ending with the processing of the RDF of interest, regardless of any other jobs it may be doing (we don't care about those)
* the servlet as a `prov:SoftwareAgent`, if that's important to know, or perhaps the server itself
  * the choice of which Agent to model will come down to which facts are most important to know for a Use Case, such as recording info for potential process recreation
* the *input* of the RDF file being validated as a `prov:Entity`
* the *input* of a SHACL file as a `prov:Entity` - not a `prov:Plan`
  * here the SHACL file is not instructing the Activity. It determines the validation assessment, but the conduct of the Activity itself is, in fact, guided by the code that applies the validation to the data, i.e. the SHACL file to the input RDF.
* the *output* of the validation task - pass, fail, error messages etc. - as a `prov:Entity` that `prov:wasDerivedFrom` the two *inputs* AND the `prov:Plan` (the code) that instructed that the SHACL *input* be applied to the RDF *input*
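
The modelling in the list above can be sketched as a handful of triples. This is only an illustration, not a definitive encoding: every `http://example.org/` IRI and both function names are hypothetical, and the `prov:used`, `prov:wasGeneratedBy` and `prov:wasAssociatedWith` links are my assumption about how the pieces would be wired together, since the list only names the classes and `prov:wasDerivedFrom`.

```python
# Sketch only: all example.org IRIs below are hypothetical placeholders.
PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"


def validation_provenance():
    """Return (subject, predicate, object) IRI triples for one validation run."""
    activity = EX + "validation-run-1"    # the process of interest
    agent = EX + "validation-servlet"     # or the server itself
    data = EX + "input-data.ttl"          # RDF being validated
    shapes = EX + "shapes.ttl"            # SHACL file: an input, not a Plan
    plan = EX + "validator-code"          # code that applies SHACL to RDF
    report = EX + "validation-report-1"   # pass, fail, error messages

    return [
        # Types, as named in the list above.
        (activity, RDF_TYPE, PROV + "Activity"),
        (agent, RDF_TYPE, PROV + "SoftwareAgent"),
        (data, RDF_TYPE, PROV + "Entity"),
        (shapes, RDF_TYPE, PROV + "Entity"),
        (plan, RDF_TYPE, PROV + "Plan"),
        (report, RDF_TYPE, PROV + "Entity"),
        # Assumed wiring between Activity, Agent and Entities.
        (activity, PROV + "wasAssociatedWith", agent),
        (activity, PROV + "used", data),
        (activity, PROV + "used", shapes),
        (report, PROV + "wasGeneratedBy", activity),
        # Output derived from both inputs AND the Plan.
        (report, PROV + "wasDerivedFrom", data),
        (report, PROV + "wasDerivedFrom", shapes),
        (report, PROV + "wasDerivedFrom", plan),
    ]


def to_ntriples(triples):
    """Serialise IRI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


if __name__ == "__main__":
    print(to_ntriples(validation_provenance()))
```

PROV-O also offers `prov:qualifiedAssociation` with `prov:hadPlan` for attaching a Plan to an Activity directly; it is left out here to keep the sketch close to the list.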

So this modelling will allow someone to see when (`Activity`) something (whichever `Agent`) did what (`Plan`) with what inputs (`Entity` x 2) and what output (`Entity`). Sure, you could model things differently but what's the Use Case? 

-- 
GitHub Notification of comment by nicholascar
Please view or discuss this issue at https://github.com/w3c/dxwg/issues/76#issuecomment-394080012 using your GitHub account

Received on Saturday, 2 June 2018 11:26:32 UTC