- From: Satya Sahoo <satya.sahoo@case.edu>
- Date: Thu, 19 Apr 2012 11:35:45 -0400
- To: Paolo Missier <Paolo.Missier@ncl.ac.uk>
- Cc: "public-prov-wg@w3.org" <public-prov-wg@w3.org>
- Message-ID: <CAOMwk6zB_+BDhVm_ASJw8v2cX_Hh9hNTvNEuR1EssgKqq87n2w@mail.gmail.com>
Hi Paolo,
Similar to Jun, my use case also requires 2 and not just 1.

On Thu, Apr 19, 2012 at 3:55 AM, Paolo Missier <Paolo.Missier@ncl.ac.uk> wrote:

> Good morning. Catching up by picking on latest mail for continuity.
> My thoughts:
>
> - I like Tim's proposal to rename the current form of Collections as
> Dictionaries, as it's what they are -- and the current DM text does
> acknowledge that.
>
> So let's consider:
>
> 1- static sets (or multisets), like those in Satya & Jun's examples. These
> are necessary but, I will argue, not sufficient. They can answer the
> questions: "what does set s contain?", and "is x a member of s?"
>
> 2- sets (or multisets) that are subject to updates. In general the
> question you want to support is: "how did this set reach its current
> state?"
> Specific examples given in the past include: programs that manipulate
> lists (or even simpler sets), or more mundane ones: monitoring people going
> in and out of a building, counting people who board a plane, etc.
>
> 3- dictionaries, which are sets with more interesting properties (you can
> use the keys for indexing, you can simulate ordered lists by encoding the
> position in the key, etc.)
> Incidentally, I find that user-defined keys are more general than
> conventionally-imposed key names as in RDF (rdf:_1 etc.).
> You can now track dependencies of the form: "entity with key k1 in
> dictionary d1 was derived from entity with key k2 in dictionary d2", where
> keys guarantee uniqueness.
>
> (All of the above can be nested structures simply by assuming, as we have
> done, that elements can be sets themselves)
>
> So what I observe is that 3 subsumes 2 subsumes 1.
>

I did not understand this - as I see it, 1 subsumes 2 subsumes 3 (since 3 is the most specialized version of 1).

> One possibility is to have a Set type for 1 and 2 (I see no point having a
> specific type for 1), and Dictionary for 3. This is done using prov:type.
>
> But then again, why not just have Dictionary. It minimizes the number of
> definitions. If all I need is a set (2), I can just have pairs (e,e) as
> members -- no need to invent keys. If I only need (1), I don't use
> insert/removal.
>

I would say we should have the more generic version and allow users to define their own specialized constructs.

Thanks.

Best,
Satya

> Additional thoughts?
>
> -Paolo
>
> On 4/19/12 6:31 AM, Luc Moreau wrote:
>
> Hi Tim,
>
> Your position in favour of prov:dictionary is really clear.
>
> Two questions:
>
> 1. Is prov:dictionary an essential feature of prov-dm and should stay
> in the prov-dm document?
>
> 2. What about Jun/Satya's request for a simple membership property?
> Should it be added to prov-dm?
>
> Professor Luc Moreau
> Electronics and Computer Science
> University of Southampton
> Southampton SO17 1BJ
> United Kingdom
>
> On 18 Apr 2012, at 23:08, "Timothy Lebo" <lebot@rpi.edu> wrote:
>
> Luc,
>
> On Apr 18, 2012, at 4:19 PM, Luc Moreau wrote:
>
> Dear all,
>
> I just wanted to throw a few ideas/questions to defend collections as they
> currently are.
>
> 1. prov:Collection is similar to rdfs:Container [1]:
> the properties rdf:_1, rdf:_2, ... [2] map naturally to keys in
> prov:Collection.
>
> I don't see how these map.
> In prov:Collection, keys have values chosen by the user -- rdfs:Container
> imposes the rdf:_N "value" for the "key".
> rdfs:Container doesn't support keys.
>
> I think there is consensus that prov:Collection as it stands is _more_
> than set membership.
> I argue that this more expressive construct is incredibly useful but
> misleadingly named.
>
> 2. RDF collections [3] can also be described by prov:Collection, using
> rdf:first and rdf:rest
> as keys for a collection of two elements, and allowing nesting of
> collections.
>
> Although it's true that one can reproduce an rdf:List using the current
> definition of prov:Collection,
> I'm not sure this provides "nesting" in any useful form.
> It also shows how prov:Collection is a more general construct than
> rdf:List.
>
> So a few questions:
>
> 1. Is it being suggested that rdfs:Container and rdf:List are not
> appropriate, and we
> should look at other forms of "collections"?
>
> I'm suggesting we rename "collection" to "dictionary". The confusion is
> occurring when people read prov:Collection definitions as if it is set
> membership, which it is not optimized for.
> The capabilities that it _is_ optimized for are very useful, should stay,
> will be used heavily, but should be renamed to something less misleading.
>
> 2. Has the prov-o ontology encoded prov-dm collections in a way that is
> lightweight enough?
> Could we for instance restrict the keys to be mapped to properties
> such as rdf:_1, rdf:_2?
>
> I'm not sure why we want to contort the eloquence of the Dictionary into
> something that is less expressive (rdfs:Container), and which has been
> disregarded for practical uses during the decade that it has been available.
>
> I however acknowledge that prov:Collection is not "natural" to model a
> set.
>
> prov:Dictionary!
>
> I suppose that
> like "rdf:Bag class is used conventionally to indicate to a human reader
> that the container is intended to be unordered",
> we would need a similar notion for expressing sets with prov:Collection.
>
> We should leave modeling sets to SIOC and RDFS and focus on giving the
> community something that it doesn't have -- a construct that lets us encode
> the provenance of function calls with multiple inputs and multiple outputs.
>
> We don't have a set membership construct and we shouldn't encourage
> people to misuse a dictionary to model a set.
>
> -Tim
>
> Cheers,
> Luc
>
> [1] http://www.w3.org/TR/rdf-schema/#ch_container
> [2] http://www.w3.org/TR/rdf-schema/#ch_containermembershipproperty
> [3] http://www.w3.org/TR/rdf-schema/#ch_collectionvocab
>
> On 18/04/12 19:39, Stephan Zednik wrote:
>
> On Apr 18, 2012, at 12:24 PM, Timothy Lebo wrote:
>
> I've had similar concerns that the definitions for collections are "too
> heavyweight" to manage the membership of sets.
>
> But while ignoring its name and looking at the modeling construct it
> provides, it's clear that this construct will be very useful in many real
> provenance problems (for example, the very ubiquitous need for provenance
> of function calls with their argument names and bindings).
>
> Perhaps we can avoid the "too heavyweight for set membership" concerns
> raised by Satya and Jun by renaming what we have (prov:Collection) to
> something more appropriate, like prov:Dictionary?
>
> +1
>
> Jim is right that you can model collections with enumerated classes, but
> I am not sure about stating the provenance of a collection defined by an
> enumerated class.
>
> We could also define a much simpler prov:Collection class that does not
> force map/dictionary conventions to go along with prov:Dictionary.
>
> --Stephan
>
> -Tim
>
> On Apr 18, 2012, at 2:12 PM, Jim McCusker wrote:
>
> I think a set of key-value pairs is what's known as a map or dictionary. A
> collection is a set of things with a defined membership. In OWL it would
> probably be represented as an enumerated class.
>
> Jim
>
> On Wed, Apr 18, 2012 at 1:20 PM, Jun Zhao <jun.zhao@zoo.ox.ac.uk> wrote:
>
>> Dear all,
>>
>> I concur with what Satya wrote. And the example I had in mind is
>> collection type of entities on the blog sphere of the Web.
>>
>> As we all know SIOC is a widely used vocabulary to describe entities in
>> the online community sites, like blogs, wikis, etc. It has the concept of
>> sioc:Container, which is defined as "a high-level concept used to group
>> content Items together". The relationships between a sioc:Container and the
>> sioc:Items or sioc:Posts that belong to it are described using
>> sioc:container_of and sioc:has_container properties.
>>
>> The provenance of a sioc:Container could be who is/are responsible for
>> the container, who created this container, and when.
>>
>> The provenance of a sioc:Post could include when the post was
>> published, when it was modified, by whom, based on which other posts,
>> document or data.
>>
>> As you see, I am struggling to see how the key-value pair kind of
>> structure could play in the above simple scenario. But please correct me if
>> I am wrong.
>>
>> HTH,
>>
>> Jun
>>
>> On 18/04/2012 18:35, Satya Sahoo wrote:
>>
>>> Hi all,
>>> The issue I had raised last week is that collection is an important
>>> provenance construct, but the assumption of only key-value pair based
>>> collection is too narrow and the relations derivedByInsertionFrom,
>>> Derivation-by-Removal are over-specifications that are not required.
>>>
>>> I have collected the following examples for collection, which only require
>>> the definition of the collection in DM5 (collection of entities) and they
>>> don't have (a) a key-value structure, and (b) derivedByInsertionFrom,
>>> derivedByRemovalFrom relations are not needed:
>>> 1. Cell line is a collection of cells used in many biomedical experiments.
>>> The provenance of the cell line (as a collection) includes: who submitted
>>> the cell line, what method was used to authenticate the cell line, when was
>>> the given cell line contaminated? The provenance of the cells in a cell
>>> line includes: what is the source of the cells (e.g. organism)?
>>>
>>> 2. A patient cohort is a collection of patients satisfying some constraints
>>> for a research study. The provenance of the cohort includes: what
>>> eligibility criteria were used to identify the cohort, when was the cohort
>>> identified? The provenance of the patients in a cohort may include their
>>> health provider etc.
>>>
>>> Hope this helps our discussion.
>>>
>>> Thanks.
>>>
>>> Best,
>>> Satya
>>>
>>> On Thu, Apr 12, 2012 at 5:06 PM, Luc Moreau <L.Moreau@ecs.soton.ac.uk> wrote:
>>>
>>>> Hi Jun and Satya,
>>>>
>>>> Following today's call, ACTION-76 [1] and ACTION-77 [2] were raised
>>>> against you, as we agreed.
>>>>
>>>> Cheers,
>>>> Luc
>>>>
>>>> [1] https://www.w3.org/2011/prov/track/actions/76
>>>> [2] https://www.w3.org/2011/prov/track/actions/77
>
> --
> Jim McCusker
> Programmer Analyst
> Krauthammer Lab, Pathology Informatics
> Yale School of Medicine
> james.mccusker@yale.edu | (203) 785-6330
> http://krauthammerlab.med.yale.edu
>
> PhD Student
> Tetherless World Constellation
> Rensselaer Polytechnic Institute
> mccusj@cs.rpi.edu
> http://tw.rpi.edu
>
> --
> ----------- ~oo~ --------------
> Paolo Missier - Paolo.Missier@newcastle.ac.uk, pmissier@acm.org
> School of Computing Science, Newcastle University, UK
> http://www.cs.ncl.ac.uk/people/Paolo.Missier
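
Editorial illustration (not part of the original message): the three cases discussed in this thread (static sets, sets subject to updates, and dictionaries) can be sketched in a few lines of Python. This is only a rough sketch of Paolo's suggestion that a set might be encoded as a dictionary of (e, e) pairs. The names `Dictionary`, `set_insert`, `members`, and `history` are hypothetical and are not PROV-DM or PROV-O terms; the `derived_from` link only loosely mirrors the insertion and removal derivations mentioned above.

```python
from dataclasses import dataclass
from typing import Dict, Hashable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Dictionary:
    """One immutable state of a key-value collection (hypothetical model)."""
    entries: Tuple[Tuple[Hashable, Hashable], ...] = ()
    derived_from: Optional["Dictionary"] = None  # previous state, if any
    change: str = "empty"                        # how this state was reached

    def as_dict(self) -> Dict[Hashable, Hashable]:
        return dict(self.entries)

    def insert(self, key: Hashable, value: Hashable) -> "Dictionary":
        """Return a new state derived from this one by insertion."""
        updated = self.as_dict()
        updated[key] = value
        return Dictionary(tuple(updated.items()), self, f"insert {key!r}")

    def remove(self, key: Hashable) -> "Dictionary":
        """Return a new state derived from this one by removal."""
        updated = self.as_dict()
        updated.pop(key, None)
        return Dictionary(tuple(updated.items()), self, f"remove {key!r}")

    def history(self) -> List[str]:
        """Case 2: answer 'how did this collection reach its current state?'"""
        state: Optional["Dictionary"] = self
        steps: List[str] = []
        while state is not None:
            steps.append(state.change)
            state = state.derived_from
        return list(reversed(steps))


def set_insert(d: Dictionary, element: Hashable) -> Dictionary:
    """Encode set membership as the pair (e, e), so no key has to be invented."""
    return d.insert(element, element)


def members(d: Dictionary) -> Set[Hashable]:
    """Case 1: answer 'what does the set contain?' and 'is x a member?'"""
    return set(d.as_dict().values())


empty = Dictionary()

# Cases 1 and 2: a patient cohort treated as a plain set that changes over time.
cohort = set_insert(set_insert(empty, "patient-1"), "patient-2")
cohort_later = cohort.remove("patient-1")
print(members(cohort))
print(cohort_later.history())

# Case 3: user-chosen keys, e.g. the argument names of a function call.
call_args = empty.insert("x", 3).insert("y", 4)
print(call_args.as_dict())
```

Whether the (e, e) encoding is acceptable in practice, or whether a separate lightweight membership construct is preferable, is exactly the open question in the thread; the sketch only shows that the encoding is mechanically possible.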
Received on Thursday, 19 April 2012 15:36:27 UTC