Meeting Record HCLS SciDiscourse call Monday February 20, 2012: Yolanda Gil and Paul Groth discuss Prov

http://www.w3.org/2012/02/20-hcls2-minutes

   [1]W3C

                              Scientific Discourse

20 Feb 2012

Attendees

   Present
          +1.619.252.aaaa, pgroth, EricP, RichBoyce, +1.310.279.aabb,
          [IPcaller], Yolanda, Jodi, +44.777.500.aacc, +1.617.947.aadd

   Regrets

   Chair
          Anita

   Scribe
          ericP

Contents

     * [2]Topics
         1. [3]Use Case 1
         2. [4]Use Case 2
         3. [5]Use Case 3
     * [6]Summary of Action Items
     __________________________________________________________________

   <Anita> Did you dial into the UK or France? My colleagues are having
   problems dialing in

   <pgroth> helena deus?

   <pgroth> [7]http://www.few.vu.nl/~pgroth/prov-overview-update-pg.pdf

   <boycer> please repost link to slides for those of us who came in late
   to the chat

   <Anita> prov-overview-update-pg.pdf [View next to chat]

   <jodi> [8]http://www.few.vu.nl/~pgroth/prov-overview-update-pg.pdf

   <boycer> thanks!

   [slide 2]

   <Anita> Will also post to meeting wiki at
   [9]http://www.w3.org/wiki/HCLSIG/SWANSIOC/Actions/RhetoricalStructure/m
   eetings/20120220

   <scribe> scribenick: ericP

   [slide 3]

   <Anita> ericP In 'note that the topic has the link to the slides' -
   where/what is this 'topic' that you speak of?

   <Gully> repost slide location?

   pgroth: Science needs repeatability and reproducibility

   <pgroth> [10]http://www.few.vu.nl/~pgroth/prov-overview-update-pg.pdf

   [slide 4]

   pgroth: [timbl's quote about the "Oh yeah?" button] [actually comes
   from Dan Connolly, I believe]

   [slide 5]

   pgroth: many folks want prov representations immediately
   ... also, many implementations

   <YolandaGil> We took Tim's quote from in his book as I recall

   <Anita> ericP do you have the details on the 'other' way to dial in?

   pgroth: the incubator group produced a literature review and a WG

   [slide 8]

   <pgroth> slide 8

   pgroth: wide membership: industry, government, science

   [slide 9]

   pgroth: at the top of slide 8 are serializations of e.g. OWL encoding
   (Prov-O), XML (Prov-XML), JSON, ...
   ... these express the provenance data model (Prov-DM)
   ... one can access the DM via Prov-AQ

   <Anita> Prov AQ = Provenance Accessing Query - how do I go from
   resource to Provenance

   pgroth: we also have a provenance primer

   [slide 10]

   pgroth: DM has a small set of classes and relationships

   [slide 11 - provenance example]

   pgroth: [re: slide 11] we need a process to describe the chart which
   Alice produced

   [slide 12]

   pgroth: "Entity" describes a chart, CSV file, car, ...

   [slide 13]

   [slide 14]

   pgroth: Prov helps us know who did what

   <Anita> Agent takes an active role in an activity

   <Anita> (Small thought: you could model the entire process of doing
   science in this model!)

   pgroth: Prov talks about Entities, Agents and Activities

   [slide 16]

   pgroth: Activities can produce Entities

   [slide 17]

   <Anita> 'Generation is the production of a new entity by an activity'
   (by an agent, correct?)

   pgroth: Usage is when an Activity consumes an entity

   [slide 18]

   pgroth: sometimes you want to describe the derivation without
   describing the exact process.
   ... e.g. "Chart1 was derived from Entity1" without going into details

   [slide 19]

   pgroth: want to link an Activity to an Agent
   ... e.g. "Alice is responsible for the Excel analysis"

   [slide 20]

   pgroth: responsibility of Alice. analysis used a CSV file, chart was
   derived from the CSV file

   @@1: when you write out workflows for e.g. experiments, you have lots
   of intermediate activities

   scribe: i found having an activity-to-activity relationship is useful
   ... do you have a representation of that intermediate structure?
   otherwise representation gets laborious

   [slide 21]

   pgroth: i think you're describing something we call wasInformedBy
   ... wasAttributedTo is fundamental
   ... we offer extra constructs.
   ... there may be many constructs which you want, but we need a small
   set to achieve interop
   ... that's a trade-off frequently discussed in the group

   [slide 22]

   pgroth: it's easy to write this in Turtle
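
   The slide-20 statements (Alice is responsible for the Excel analysis,
   the analysis used a CSV file, the chart was derived from the CSV file)
   can be sketched in Python as a few triples printed as Turtle. This is an
   illustration, not the content of slide 22: the ex: identifiers are
   hypothetical, and the prov: property names are assumed from the PROV-O
   vocabulary as later published, whereas the ontology was still volatile
   at the time of this call.

```python
# Sketch of the slide-20 example as PROV statements, serialized as Turtle.
# Assumptions: the ex: names are made up for illustration; the prov: terms
# follow the later published PROV-O vocabulary, not the 2012 draft.

PREFIXES = {
    "prov": "http://www.w3.org/ns/prov#",
    "ex": "http://example.org/",
}

# (subject, predicate, object) triples written with prefixed names.
TRIPLES = [
    ("ex:alice", "a", "prov:Agent"),
    ("ex:csvFile", "a", "prov:Entity"),
    ("ex:chart1", "a", "prov:Entity"),
    ("ex:excelAnalysis", "a", "prov:Activity"),
    ("ex:excelAnalysis", "prov:used", "ex:csvFile"),
    ("ex:excelAnalysis", "prov:wasAssociatedWith", "ex:alice"),
    ("ex:chart1", "prov:wasGeneratedBy", "ex:excelAnalysis"),
    ("ex:chart1", "prov:wasDerivedFrom", "ex:csvFile"),
]


def to_turtle(triples, prefixes):
    """Return a Turtle document, grouping predicates under each subject."""
    lines = [f"@prefix {p}: <{iri}> ." for p, iri in prefixes.items()]
    lines.append("")
    by_subject = {}
    for s, p, o in triples:
        by_subject.setdefault(s, []).append((p, o))
    for s, pairs in by_subject.items():
        body = " ;\n    ".join(f"{p} {o}" for p, o in pairs)
        lines.append(f"{s} {body} .")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_turtle(TRIPLES, PREFIXES))
```

   The last triple records the derivation directly, without going through
   the activity, which is the shortcut described for slide 18.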

   [slide 23]

   [slide 24]

   pgroth: we're working on simplifying the explanations
   ... released a draft last year. got the feedback that we need a simpler
   explanation
   ... Ontology is available but volatile at this point
   ... we're working with Dublin Core on a document describing the
   relationship between Prov and DC
   ... preparing for deeper community feedback
   ... aiming for Rec by the end of 2012

   [slide 25]

   pgroth: want to hear how in HCLS you need to extend this model
   ... there are some implementations, and if you want to implement, we're
   anxious to work with you

   <Anita> This is a good time to give feedback to the PROV model

   pgroth: feedback is useful now

   1+

   <pgroth> go ahead

   Tim: please use a transcript of this presentation as documentation as
   it was very clear

   <Anita> DavidShotton did you want to be added to the speaker queue?

   Tim: Roles would be useful

   <pgroth> err... actually...

   <Anita> Tim: Can you add roles to the (core) model?

   Tim: I realize that Roles don't belong in the Core
   ... e.g. role as a presenter, convener, etc.
   ... need it for a project

   <Anita> Tim: I have a project I need that in

   <DavidShotton> We are developing role ontologies

   <Anita> pgroth: roles are in the spec

   pgroth: we do have roles. i should add to this explanation
   ... we have a placeholder for entities with respect to activities

   <Anita> pgroth: have a placeholder - entities wrt activity: e.g. chart
   plays-role-of output

   <Anita> pgroth: but we don't define any roles

   pgroth: we have a construct called prov:Role, but don't supply any
   kinds of Role
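
   Since Prov supplies prov:Role but no kinds of Role, a project has to
   bring its own. A minimal Python sketch of the "chart plays-role-of
   output" case follows. The ROLES set and the ex: names are hypothetical,
   and the qualified-generation terms (prov:qualifiedGeneration,
   prov:Generation, prov:activity, prov:hadRole) are assumed from the
   PROV-O vocabulary as later published, not from the draft discussed here.

```python
from dataclasses import dataclass

# Hypothetical, project-defined roles; Prov itself defines none of these.
ROLES = {"ex:output", "ex:inputData", "ex:presenter"}


@dataclass(frozen=True)
class QualifiedGeneration:
    """An entity generated by an activity while playing a given role."""

    entity: str
    activity: str
    role: str

    def to_turtle(self):
        """Render the statement as a Turtle fragment (prefixes omitted)."""
        if self.role not in ROLES:
            raise ValueError(f"unknown role: {self.role}")
        return (
            f"{self.role} a prov:Role .\n"
            f"{self.entity} prov:qualifiedGeneration [\n"
            f"    a prov:Generation ;\n"
            f"    prov:activity {self.activity} ;\n"
            f"    prov:hadRole {self.role}\n"
            f"] ."
        )


if __name__ == "__main__":
    print(QualifiedGeneration("ex:chart1", "ex:excelAnalysis", "ex:output").to_turtle())
```

   Checking the role against a project-maintained set is only one possible
   design; a role ontology could serve the same purpose.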

   <pgroth>
   [11]http://dvcs.w3.org/hg/prov/raw-file/default/primer/Primer.html

   <DavidShotton> I would distinguish the STATUS of a document as an
   output from the ROLE of an AGENT

   pgroth: you'll find Roles in the primer

   Tim

   <Anita> Tim: implementations: we started working with Susanna Sansone
   on an OWL model to encapsulate ISA-TAB data for experiments;

   Tim: re: implementations and use, we're working with Susanna Sansone on
   the ISA representation of genetic experiments
   ... we started saying "we need to subclass OBI" but migrated to Prov

   <Anita> Tim: incorporating provenance as a motivator for that; much
   more useful to take Provenance as a motivator for experiments

   <Zakim> ericP, you wanted to ask about Prov-AQ vs. SPARQL over the RDF
   representation

   <Anita> So - if you have a question please type in 'q+' and then your
   question

   pgroth: Prov-AQ tells you how to go from a resource to its associated
   provenance
   ... we have a number of ways

   <Anita> EricP asks question about Prov-AQ

   pgroth: .. what metadata you need to embed in HTML to get to the
   provenance
   ... .. use of HTTP protocol to look up a prov service or set of data

   <Anita> Are there other questions?

   pgroth: another part of Prov-AQ is a query endpoint
   ... .. you POST or GET and we give you back some prov info
   ... also discusses SPARQL patterns
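
   The HTTP route from a resource to its provenance can be sketched as
   reading a Link header on the response. This Python sketch assumes the
   link relation IRI used in the PROV-AQ note as later published
   (http://www.w3.org/ns/prov#has_provenance); the draft current at this
   call may have used a different relation name, and the example URIs are
   made up. It handles only quoted parameter values and makes no network
   request.

```python
import re

# Assumed relation IRI, from the later published PROV-AQ note.
HAS_PROVENANCE = "http://www.w3.org/ns/prov#has_provenance"

# One link-value: <target> followed by zero or more ;name="value" params.
_LINK = re.compile(r'<([^>]*)>\s*((?:;\s*[\w-]+\s*=\s*"[^"]*"\s*)*)')
_PARAM = re.compile(r';\s*([\w-]+)\s*=\s*"([^"]*)"')


def provenance_links(link_header):
    """Return the provenance URIs advertised in an HTTP Link header value."""
    found = []
    for target, params in _LINK.findall(link_header):
        attrs = dict(_PARAM.findall(params))
        # rel may carry several space-separated relation types.
        if HAS_PROVENANCE in attrs.get("rel", "").split():
            found.append(target)
    return found


# Hypothetical header value a server might send for a chart resource.
header = (
    '<http://example.org/prov/chart1>; '
    'rel="http://www.w3.org/ns/prov#has_provenance"; '
    'anchor="http://example.org/chart1", '
    '<http://example.org/style.css>; rel="stylesheet"'
)

if __name__ == "__main__":
    print(provenance_links(header))
```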

   DavidShotton: excellent work. the simplicity is the key thing
   ... people have tried to capture e.g. time-dependent Roles, but that
   complicates things

   pgroth: we've seen two broad uses:
   ... .. i've got a web doc or publication and i want to assign some
   prov info.

   <Anita> paulG 1) web document, want to assign Provenance information;
   need simple document

   pgroth: .. i want to track in a detailed fashion provenance in an
   automated system

   <Anita> paulG: 2) You want to track in lots of detail provenance in
   automated system, fixed or time-dependent things, Paul on Feb 20th at
   16:43 in Amsterdam

   pgroth: in latter, i want to talk about e.g. "Paul at 4:21 while he's
   in Europe"
   ... supporting both of these views has been a challenge to the group
   ... we think we've got it with this model of starting with a simple
   model but having additions you can use

   <Zakim> Anita, you wanted to say current implementations anywhere?
   Also: publishers can apply somehow?

   Anita: there might be provenance stored in workflow tools
   ... could the prov model expose that in e.g. supplementary material in
   a publication?

   <Anita> Sorry really two questions: first what are implementations?
   Second can we store workflow data in publication as Prov?

   pgroth: Workflow4Ever (Taverna) provenance info is exposed via the Prov
   model
   ... there are some tools for mapping from OPM to Prov

   <DavidShotton> For encoding time-related roles, see
   [12]http://imageweb.zoo.ox.ac.uk/pub/2012/cerif/Shotton&Peroni_PRO-and-
   PSO.ppt

   pgroth: not many implementations yet, still in a bit of flux

   <pgroth> [13]http://www.w3.org/2011/prov/wiki/TavernaProvenance

   <pgroth> ack, I forgot about wings!

   Anita: we've been talking with many folks about capturing workflows

   <pgroth> sorry yolanda :-)

   <david_r_newman> [14]http://www.wf4ever-project.org/

   Anita: and presenting in a way which can be easily used

   <DavidShotton> Workflows 4 Ever [15]http://www.wf4ever-project.org/

   Anita: prov model is generic enough that it could be used in many
   domains

   Tim: i expected another use case:
   ... .. when you guys publish, there's workflow applied to treating the
   manuscript

   Anita: good point, we don't have good ways of capturing that

   <pgroth> [16]https://github.com/lucmoreau/ProvToolbox

   Anita: could start earlier at e.g. figures and charts
   ... i don't know of any publisher who has that

   <DavidShotton> Simple post hoc capture of publishing workflows: The
   Publishing Workflows Ontology [17]http://purl.org/spar/pwo/

   Tim: this is a goal of ISA-RDF
   ... .. define metadata in a way which standardizes workflow and
   provenance

   <pgroth> [18]https://github.com/INCF/ProvenanceLibrary

   <Anita> Tim: say if you publish something and you have a figure - be
   able to click on it and go to a data repository that has primary data,
   steps and provenance

   Tim: we're hoping by having a standard model, you could be able to e.g.
   click on a heatmap and get back to the source data

   YolandaGil: i thought Tim was discussing the e.g. review process
   ... there's also the processing to generate a certain figure

   <Anita> YolandaGil: Tim is capturing review process, is one workflow;
   other one is kind of processing that took place to generate a certain
   figure

   YolandaGil: or the lab experiments and steps you took to obtain the
   data in the first place
   ... these are different steps, but they are connected

   <Anita> Or link to gully's work: steps taken in the lab, observational
   assertions, interpretational assertions...

   YolandaGil: reviewers often don't have a good way to check the work
   that they are reviewing
   ... this could help the reviewer inspect and detect errors

   <Anita> YolandaGil: easier to inspect errors and insufficiencies in the
   papers; helps us review (and reproduce! - Anita) what was done

   YolandaGil: the issue of credit is crucial to scientists

   <Anita> YolandaGil credit is intertwined with provenance - different
   people do different things: who did what?

   YolandaGil: with today's level of collaboration/re-use, they want to
   assign credit with precise detail

   <Anita> ORCID is interested in developing a taxonomy of microcredit
   attribution

   YolandaGil: if we were able to easily record prov on a dataset,
   samples, etc, we'd end up with extensive provenance records
   ... gives us the movie-level credits

   <Anita> It would be interesting to link this - Amy Brand at Harvard is
   leading this, my colleague Mike Taylor is working on it in practice

   Anita: ORCID is working on a model for micro-credit for authors
   working on a paper
   ... use case came from Provost, the evaluation side

   <Anita> [19]http://about.orcid.org/

   YolandaGil: you may not be able to anticipate the credits you want to
   include
   ... prov model would provide a substrate

   Tim: there's a little workshop on this

   <Anita> [20]http://about.orcid.org/civicrm/event/info?id=4&reset=1
   ORCID workshop

   <Anita> Tim: Yolanda can you attend the ORCID event?

   <jodi> definitely a good time to hear about this!

   Anita: many thanks. you'll hear from us

   <boycer> one sec

   <YolandaGil> Thank you for inviting us to talk about provenance!

Use Case 1

   <pgroth> thanks everyone

   boycer: on track wrt milestones

   <Anita> Use case 1: developing demo:
   [21]https://docs.google.com/document/d/1QpW-axtGL7Tuhd_Zcaf30a4-s4S_lve
   IJtpKL5QBLfQ/edit?hl=en_US

   boycer: Anita, Joey and Anita, we've developed a proof-of-concept of
   the product inserts

   <pgroth> If you have any questions, let us know. We really want
   feedback

   boycer: linked to claims at ClinicalTrials.gov, medical pathways, and
   drug-drug interactions

Use Case 2

   <Anita> Use case 2: Boycer is presenting at C-SHALS this week about Use
   Case 1 - [22]http://www.iscb.org/cshals2012-program/

   DavidShotton: no progress since last meeting

   <Anita> Sorry that's Use case 1: Boycer is presenting at C-SHALS this
   week about Use Case 1 - [23]http://www.iscb.org/cshals2012-program/

   <Anita> DavidShotton: no news on UC 2, need to talk to Tim

   <Anita> Tim: we have model we've been developing with classes and
   object properties, not yet datatype props get started on discussions,
   happy to chat

   Tim: i can share with you the latest and greatest, added object
   properties and starting on data properties

Use Case 3

   Anita: joanne and I have been developing UC3 for a class which Deb
   McGuinness is teaching
   ... students will provide a portal to link adolescent antidepressants
   to drug interaction and some proprietary Elsevier data

   DavidShotton: there are two meetings in Cambridge MA related to Roles:
   ... Wellcome Trust and then ORCID the next day
   ... 16 and 17 May

   Tim: they are specifically joined

   Anita: next joint meeting in 4 weeks
   ... soliciting thoughts for presentations

Summary of Action Items

   [End of minutes]
     __________________________________________________________________


    Minutes formatted by David Booth's [24]scribe.perl version 1.136
    ([25]CVS log)
    $Date: 2012/02/20 16:17:07 $

References

   1. http://www.w3.org/
   2. http://www.w3.org/2012/02/20-hcls2-minutes#agenda
   3. http://www.w3.org/2012/02/20-hcls2-minutes#item01
   4. http://www.w3.org/2012/02/20-hcls2-minutes#item02
   5. http://www.w3.org/2012/02/20-hcls2-minutes#item03
   6. http://www.w3.org/2012/02/20-hcls2-minutes#ActionSummary
   7. http://www.few.vu.nl/~pgroth/prov-overview-update-pg.pdf
   8. http://www.few.vu.nl/~pgroth/prov-overview-update-pg.pdf
   9. http://www.w3.org/wiki/HCLSIG/SWANSIOC/Actions/RhetoricalStructure/meetings/20120220
  10. http://www.few.vu.nl/~pgroth/prov-overview-update-pg.pdf
  11. http://dvcs.w3.org/hg/prov/raw-file/default/primer/Primer.html
  12. http://imageweb.zoo.ox.ac.uk/pub/2012/cerif/Shotton&Peroni_PRO-and-PSO.ppt
  13. http://www.w3.org/2011/prov/wiki/TavernaProvenance
  14. http://www.wf4ever-project.org/
  15. http://www.wf4ever-project.org/
  16. https://github.com/lucmoreau/ProvToolbox
  17. http://purl.org/spar/pwo/
  18. https://github.com/INCF/ProvenanceLibrary
  19. http://about.orcid.org/
  20. http://about.orcid.org/civicrm/event/info?id=4&reset=1
  21. https://docs.google.com/document/d/1QpW-axtGL7Tuhd_Zcaf30a4-s4S_lveIJtpKL5QBLfQ/edit?hl=en_US
  22. http://www.iscb.org/cshals2012-program/
  23. http://www.iscb.org/cshals2012-program/
  24. http://dev.w3.org/cvsweb/~checkout~/2002/scribe/scribedoc.htm
  25. http://dev.w3.org/cvsweb/2002/scribe/

-- 
-ericP

Received on Monday, 20 February 2012 16:24:19 UTC