Re: PROV-ISSUE-65 (domain-specific-info): How is domain specific data combined with the generic model [Conceptual Model]

Hi Luc,

OK, I'll give it a go. I can raise these as separate issues if you
think they are substantial and independent enough to warrant that.

Requirement 1.

Add a section on Expressing Domain-Specific Information (or similar).
I think this might be helpful to keep the model itself clean and
general and bring together all the information on how users might
include domain-specific data in their assertions. I'll refer to this
as EDSI in the requirements below.

Requirement 2.

Decide how domain-specific types of entities, process executions,
derivations, etc. can be included in PIL data, and express this in the
EDSI section. At the moment, this typing is only shown for entities,
via an attribute such as '[ type: "File" ]', whereas process
executions, derivations, etc. have no attributes at all.

We could resolve this in different ways.
 (a) All concepts have attributes and the "type" attribute key is
standardised, e.g. isDerivedFrom(e2,e1, [ type:
"ex:isLaterVersionOf"]).
 (b) Only entities have attributes (to keep the ivpOf concept clear),
but all concepts have annotations and the "type" annotation key is
standardised, e.g. entity (e0, [ location: "/shared/crime.txt" ], [
type: "File" ]).
 (c) All concepts have identifiers and the type information is outside
PIL, e.g. give an example of using RDF for these external typing
assertions: "e0 rdf:type ex:File; derivation1 rdf:type dc:isVersionOf"
 (d) PIL allows sub-types of concepts to be expressed and then used in
place of those concepts, e.g. subtype (ex:File, pil:Entity); ex:File
(e0, [ location: "/shared/crime.txt", creator: "Alice" ])

I can see pros and cons to each.
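
To make the difference between (a) and (d) a little more concrete,
here is a minimal Python sketch. It is only an illustration under my
own assumptions: the Assertion class, TYPE_KEY, SUBTYPES and
generic_concept are hypothetical names, not anything defined in the
model.

```python
from dataclasses import dataclass, field

# Hypothetical names throughout; none of this is defined by the PIL model.
TYPE_KEY = "type"  # option (a): a single standardised attribute key


@dataclass
class Assertion:
    """Any PIL concept (entity, process execution, derivation, ...)."""
    concept: str                  # e.g. "entity" or "isDerivedFrom"
    args: tuple                   # identifiers the assertion refers to
    attributes: dict = field(default_factory=dict)

    def domain_type(self):
        # Returns None when no domain-specific type was asserted.
        return self.attributes.get(TYPE_KEY)


# Option (a): every concept, not only entities, carries attributes.
e0 = Assertion("entity", ("e0",),
               {TYPE_KEY: "File", "location": "/shared/crime.txt"})
d1 = Assertion("isDerivedFrom", ("e2", "e1"),
               {TYPE_KEY: "ex:isLaterVersionOf"})

# Option (d): declared sub-types stand in for the generic concept.
SUBTYPES = {"ex:File": "pil:Entity"}


def generic_concept(name):
    # Walk up the (assumed acyclic) sub-type chain to the generic concept.
    while name in SUBTYPES:
        name = SUBTYPES[name]
    return name
```

Under (a) a consumer asks each assertion for its type; under (d) it
resolves the sub-type name back to the generic concept before
interpreting the assertion.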

Requirement 3.

Decide whether attribute names are from a limited set defined in the
model, or are domain-specific and extensible, and express this where
attributes are specified (e.g. with the definition of Entity). If
domain-specific, show the namespace in the examples, e.g. entity(e0, [
pil:type="File", geo:location="/shared/crime.txt", dc:creator="Alice"
])
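
If attribute names are domain-specific, a consumer would need some way
to resolve the prefixes. A rough Python sketch of what I mean follows.
The namespace URIs are placeholders under example.org, and the
NAMESPACES table and expand function are hypothetical, chosen only for
illustration.

```python
# Placeholder namespace URIs; real vocabularies would supply their own.
NAMESPACES = {
    "pil": "http://example.org/pil#",
    "geo": "http://example.org/geo#",
    "dc": "http://example.org/dc#",
}


def expand(qname, namespaces=NAMESPACES):
    """Expand a 'prefix:local' attribute name to a full name."""
    prefix, sep, local = qname.partition(":")
    if not sep or not local:
        raise ValueError(f"attribute name is not namespaced: {qname!r}")
    if prefix not in namespaces:
        raise KeyError(f"undeclared namespace prefix: {prefix!r}")
    return namespaces[prefix] + local


# The attribute list from the example above, as a Python dict.
attributes = {
    "pil:type": "File",
    "geo:location": "/shared/crime.txt",
    "dc:creator": "Alice",
}

# Same values, keyed by fully expanded attribute names.
expanded = {expand(name): value for name, value in attributes.items()}
```

An un-namespaced name such as "type" is rejected here, which is one way
of forcing the question of whether it belongs to the model or to a
domain.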

Requirement 4.

Use only one of the terms "characteristic" and "attribute" in the
document if they denote the same thing, or make the distinction clear
if they do not.

Requirement 5.

Show by example how the "numerous ways in which location can be
specified" should be expressed. For instance, once Requirement 3 is
resolved, say that there can be domain-specific attribute names drawn
from different ontologies, each denoting location in some form, e.g. [
geo:hasCoordinates="23426,4567" ].

Requirement 6.

State explicitly, in the EDSI section, that the only way to include
domain-specific data in PIL assertions is through the mechanisms
described in the requirements above, if that is true. If all concepts
are given identifiers, then state that these identifiers can be used
to make additional (non-provenance) assertions outside of PIL.
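
As a sketch of what such external assertions could look like, assuming
every concept has an identifier, here is a small Python illustration
that keeps the typing statements from option (c) outside the PIL data.
The identifiers, the triple layout and the helper names (about,
dangling) are my own assumptions, not part of PIL.

```python
# Identifiers assumed to be minted by the PIL assertions themselves.
pil_identifiers = {"e0", "derivation1"}

# External (non-provenance) statements as (subject, predicate, object).
external = [
    ("e0", "rdf:type", "ex:File"),
    ("derivation1", "rdf:type", "dc:isVersionOf"),
]


def about(identifier, triples):
    """Return the (predicate, object) pairs stated about one identifier."""
    return [(p, o) for s, p, o in triples if s == identifier]


def dangling(triples, known):
    """Return subjects that do not match any known PIL identifier."""
    return sorted({s for s, _, _ in triples if s not in known})
```

The dangling check is there because external statements are only
useful if their subjects really are identifiers used in the PIL data.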

Thanks,
Simon

On 22 August 2011 22:22, Luc Moreau <L.Moreau@ecs.soton.ac.uk> wrote:
> Hi Simon,
>
> I think this is a very important issue, but it's not clear to me how
> to address it.
>
> Can it be broken down into different requirements?
>
> Cheers,
> Luc
>
> On 29/07/11 16:00, Provenance Working Group Issue Tracker wrote:
>> PROV-ISSUE-65 (domain-specific-info): How is domain specific data combined with the generic model [Conceptual Model]
>>
>> http://www.w3.org/2011/prov/track/issues/65
>>
>> Raised by: Simon Miles
>> On product: Conceptual Model
>>
>> Any provenance data will be a mixture of PIL constructs and domain-specific information, e.g. file names, the Royal Society's membership, the event of the RS's foundation, etc. By domain-specific, I just mean things not defined in the conceptual model. It is not clear in the current document where this domain-specific information goes.
>>
>> There are a couple of hints about where it might go:
>>
>> 1. In the example, the attribute values appear to be domain-specific, e.g. "Alice" is not a generic part of the model. The attribute names might be domain-specific, as I don't think "type", "location", "creator" or "content" are defined in the model, but that might be a mistake in the model. Can attribute types be domain-specific?
>>
>> 2. Section 5.12 says that "there are numerous ways in which location can be specified", suggesting that it is made a domain-specific issue. I'm not clear whether the list of examples, "coordinate, address..." are examples of attribute types or something else. It is said that "Location is an OPTIONAL characteristics of BOB". I'm not sure if "characteristic" is related to "attribute", and if this is implying a generic attribute type called "location".
>>
>> But are there additional ways to include domain-specific information other than attribute types and values? It may be trivial to address, but seems important to make explicit, else it is not clear how to apply the language in practice.
>>
>> Thanks,
>> Simon
>>
>>
>>
>>
>
>



-- 
Dr Simon Miles
Lecturer, Department of Informatics
King's College London, WC2R 2LS, UK
+44 (0)20 7848 1166

Received on Wednesday, 24 August 2011 11:41:01 UTC