- From: Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>
- Date: Tue, 29 Jan 2013 13:59:02 +0000
- To: Antoine Isaac <aisaac@few.vu.nl>
- Cc: public-openannotation@w3.org
---

This is getting off topic, but it's good to hear there is interest!

On Mon, Jan 28, 2013 at 10:13 PM, Antoine Isaac <aisaac@few.vu.nl> wrote:

> Without even criticizing the model a single second, I see indeed
> distinctions like "digital resource", "digital artifact", etc. I've fought
> with these for too long in my domain, and I can see cans of worms flashing
> around and long reading and discussions coming...

Yes, that is a big can of worms, not too dissimilar from the httpRange-14 discussions (about whether resources and their representations are the same 'thing' or not).

In PAV we simply try to say that authorship/contribution has to do with the knowledge or content that is represented ("IP" if you like, although I hate the term), and "creation" has to do with making the digital form this takes (not necessarily the exact representation, like RDF/XML vs Turtle).

How this split is realized, if at all, is domain and application specific. For instance, it's quite straightforward for a Word document into which I typed a chapter from Lord of the Rings: that Word document is pav:authoredBy J. R. R. Tolkien and pav:createdBy Stian, and it was pav:createdWith Word.

In PROV terms, you can think of authorship as something that belongs to a more general, abstract entity that the "digital resource" is a prov:specializationOf.

Similarly for annotations, if I take the author's handwritten notes in the original Lord of the Rings manuscript and formalize them as oa:Annotations, then those annotations are pav:authoredBy :Tolkien and pav:createdBy :Stian.

However, this gets trickier the moment the knowledge itself is a digital thing rather than something which is merely represented with digital concepts; for instance an ontological model, an RDF dataset, or a spreadsheet that calculates mortgage payments. For simple cases the creator and author are just the same person, so there is no problem, and you might want to represent only one of those.
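The Word document example can be sketched as a handful of statements. This is only an illustration in plain Python, not an official PAV serialization: the `ex:` identifiers, the `Triple` type and the helper functions are hypothetical, and only the `pav:` and `prov:` property names come from the message above.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One subject-predicate-object statement, held as prefixed names."""
    subject: str
    predicate: str
    obj: str


# Hypothetical identifiers, invented for this sketch.
DOC = "ex:lotr-chapter.docx"   # the digital resource (the Word file)
WORK = "ex:lotr-chapter"       # the more general, abstract entity

triples = [
    # Authorship concerns the content that is represented.
    Triple(DOC, "pav:authoredBy", "ex:Tolkien"),
    # Creation concerns making the digital form.
    Triple(DOC, "pav:createdBy", "ex:Stian"),
    Triple(DOC, "pav:createdWith", "ex:Word"),
    # PROV view: the file specializes the abstract work.
    Triple(DOC, "prov:specializationOf", WORK),
]


def objects(subject: str, predicate: str) -> list[str]:
    """Return every object stated for the given subject and predicate."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]


def to_lines(statements: list[Triple]) -> str:
    """Render the statements one per line in a Turtle-like notation."""
    return "\n".join(f"{t.subject} {t.predicate} {t.obj} ." for t in statements)


print(to_lines(triples))
```

Asking `objects(DOC, "pav:authoredBy")` and `objects(DOC, "pav:createdBy")` then gives different agents for the same file, which is the split between content and digital form described above.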
The distinction can come into play when one talks about transformations of formats and similar, for which PAV provides more specialized terms, like pav:importedFrom and pav:importedBy.

So if you made the spreadsheet in Excel and I just copy it and put it on my website, then you are still both the author and creator, and I mark the provenance to the original using pav:retrievedFrom and my role using pav:retrievedBy.

If I then saved it in OpenOffice format, then you are still the author of my OO spreadsheet, while I am now the creator (as here I consider the workings of the spreadsheet as the 'knowledge'). pav:retrievedFrom changes to pav:importedFrom.

However, if I also needed to fix a formula in the spreadsheet to make it work in OpenOffice, then I also become a curator (pav:curatedBy).

(In a different domain it could be that a spreadsheet contains survey data imported from a CSV which was extracted from a survey database; here the authorship relates to the survey data, while creation might deal with making it into a tabular format, no matter if it has been converted from CSV to XLS.)

If I add a bit of new functionality, then I am a contributor (pav:contributedBy), and the OO spreadsheet is now just pav:derivedFrom the original rather than imported from it. If that functionality is "significant", then I would now also be an author.

If your bit is superseded by my 3rd version, then you remain only as an author of the spreadsheet that my spreadsheet was pav:derivedFrom.

.. and with that I think I explained almost the whole model... *copy to paper*.

--
Stian Soiland-Reyes, myGrid team
School of Computer Science
The University of Manchester
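The spreadsheet scenario above can be read as a sequence of stages, each changing which PAV properties apply. The following is a minimal sketch of that reading, not part of PAV itself: the `Stage` enum, the `provenance` function and the `ex:` identifiers are hypothetical. It also assumes that the stages are cumulative, that pav:curatedBy persists once a formula has been fixed, and that pav:importedBy points at the person who converted the file. The final case, where the original author's part is superseded, is left out.

```python
from enum import IntEnum


class Stage(IntEnum):
    COPIED = 1       # republished unchanged on another website
    CONVERTED = 2    # saved in OpenOffice format
    FIXED = 3        # a formula repaired so it works after conversion
    EXTENDED = 4     # a bit of new functionality added
    SIGNIFICANT = 5  # the new functionality is "significant"


# Hypothetical identifiers, invented for this sketch.
YOU, ME, ORIGINAL = "ex:you", "ex:me", "ex:original.xls"


def provenance(stage: Stage) -> dict[str, set[str]]:
    """Return the PAV statements about my copy of the spreadsheet."""
    p = {"pav:authoredBy": {YOU}}
    if stage == Stage.COPIED:
        # A plain copy: you remain both author and creator.
        p["pav:createdBy"] = {YOU}
        p["pav:retrievedFrom"] = {ORIGINAL}
        p["pav:retrievedBy"] = {ME}
        return p
    # From conversion onwards I made the digital form.
    p["pav:createdBy"] = {ME}
    if stage >= Stage.FIXED:
        p["pav:curatedBy"] = {ME}
    if stage >= Stage.EXTENDED:
        # New functionality: derived from, no longer imported from.
        p["pav:contributedBy"] = {ME}
        p["pav:derivedFrom"] = {ORIGINAL}
    else:
        p["pav:importedFrom"] = {ORIGINAL}
        p["pav:importedBy"] = {ME}  # assumption: the converter is the importer
    if stage >= Stage.SIGNIFICANT:
        p["pav:authoredBy"].add(ME)
    return p


for stage in Stage:
    statements = provenance(stage)
    print(stage.name)
    for prop in sorted(statements):
        print(f"  {prop} {', '.join(sorted(statements[prop]))}")
```

Running the loop prints the property set for each stage, which makes it easy to see where pav:retrievedFrom gives way to pav:importedFrom and then to pav:derivedFrom.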
Received on Tuesday, 29 January 2013 13:59:54 UTC