Re: comment: Defining N-ary Relations on the Semantic Web from Natasha Noy on 2004-08-10 (public-swbp-wg@w3.org from August 2004)

From: Natasha Noy <noy@SMI.Stanford.EDU>
Date: Tue, 10 Aug 2004 12:53:12 -0700
To: Frank Manola <fmanola@acm.org>
Cc: public-swbp-wg@w3.org
Message-Id: <E9BADAC2-EB06-11D8-AEA1-000A958B5C28@smi.stanford.edu>
> 1.  Under the "Representation Pattern" heading, the text between the 
> first two figures, if interpreted strictly, appears to only cover the 
> first two use cases.  Perhaps it could read something like:  "We would 
> like to have another individual or simple value C (and possibly 
> additional individuals or values in the case of Use Case 3) to be part 
> of this relation):"?

Good point -- will add in the next draft.

> 2.  Just below the figures:  "A common solution to representing n-ary 
> relations such as these is to create an individual which stands for an 
> instance of the relation and relates the things that are involved in 
> that instance of the relation."  And just below that: "In the first 
> case...one of the individuals in the relation (say, A) is 
> distinguished from others in that it is the *originator* of the 
> relation."
>
> Here the text introduces new terminology "instance of the relation" 
> and "originator" in places where there already is terminology to cover 
> these concepts ("object", "owner of the relation", and "relationship" 
> are also used in various places later in the text).  An instance of a 
> relation in RDF is a "statement" ("tuple" could be used too, as per 
> relational database terminology).  The "originator" of such a 
> statement or tuple is the "subject" (note also that if, as the text 
> says, you're choosing an individual, then even if you retained this 
> "originator" terminology it wouldn't be the "originator of the 
> relation", but rather the "originator of an *instance* of the 
> relation").

I agree that the fewer undefined or confusing terms we use, the better 
(hard to argue with that, I guess). As for "instance of the relation," 
however, I think it is clearer than "statement" in this particular 
case. What we want to emphasize is not that it is just any RDF 
statement, but rather that we create this relation instance here.

I am not crazy about the word "originator" myself. Perhaps "subject" 
would indeed be clearer.

> 3.  Just below, in the initial description of pattern 1: "...here, the 
> instance of the relation itself is a property of A, with the value 
> that is a complex object in itself, relating several values and 
> individuals."  This is a bit confusing (particularly with reference to 
> the second of the two diagrams above it).  For one thing, it's not 
> clear how an instance of a relation (i.e., a statement) can be a 
> property (property value, perhaps).  For another, in the normal binary 
> relation, the "instance of the relation" is considered to include the 
> originator (A in this case).  But the new individuals being created 
> (in pattern 1) *don't* include the originator.  Having tried several 
> alternative descriptions of this sort myself, I appreciate how hard it 
> is to come up with concise descriptions of these patterns here.  I 
> suspect it may be better to simply jump directly to describing these 
> patterns using examples, as the material under the "Pattern 1" and 
> "Pattern 2" headings does, rather than trying for these abstract 
> summaries).

Well, these summaries do sometimes help to orient the reader. You are 
right, though, that we should use "property value" rather than 
"property" in that sense. Perhaps that would make things clearer.

> 4.  The first paragraph under the "Pattern 1" heading introduces 
> another new term "relation object" which should be introduced more 
> explicitly, assuming it's needed (NB:  this is not the same as "an 
> object of a relation", a phrase also used in the same paragraph).

You are right, there are too many "objects" in that paragraph -- we'll 
try to fix it.

> Also, under the "Pattern 1" and "Pattern 2" headings, introducing 
> concepts like "Diagnosis_Relation_1" and "Temperature_Relation_1" may 
> help emphasize that these in some sense represent instances of 
> relations, but I think that there should be some text pointing out how 
> often real life use cases often have corresponding concepts already.
>
> The "relation object" idea might be better introduced by something 
> like:  It is often possible to think of the relation among multiple 
> facts as a separate object.  Then the multiple facts can be 
> represented as describing that object.  This happens so often in real 
> life that there are often separate concepts (and names) for these 
> separate objects. Thus, it is possible to talk about a "diagnosis" 
> (instead of "diagnosis-relation").  This diagnosis can have various 
> properties that describe it (the value, probability, who made it, 
> when, etc.). Similarly, Steve may have a "temperature_reading".

This is certainly true, and one can choose different names for classes, 
etc. That said, I think it is important to distinguish the whole 
diagnosis relation, with its attributes, etc., from the diagnosis value 
itself (such as breast_tumor). Using just "diagnosis" for one or the 
other will obscure the point, I think.
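
As a rough illustration of that distinction (a sketch only, not text 
from the draft), the triples can be written as plain Python tuples. The 
property names diagnosis_value and diagnosis_probability, and the value 
"high", are assumptions made up for this example.

```python
# Triples as plain (subject, predicate, object) tuples.
# diagnosis_value, diagnosis_probability and "high" are illustrative
# assumptions, not names taken from the draft.
triples = {
    ("Christine", "has_diagnosis", "Diagnosis_Relation_1"),
    ("Diagnosis_Relation_1", "diagnosis_value", "breast_tumor"),
    ("Diagnosis_Relation_1", "diagnosis_probability", "high"),
}


def objects(subject, predicate):
    """Return the objects of all triples with this subject and predicate."""
    return {o for (s, p, o) in triples if s == subject and p == predicate}


# The individual standing for the whole diagnosis relation ...
relation_individuals = objects("Christine", "has_diagnosis")

# ... is a different thing from the diagnosis value it carries.
diagnosis_values = {
    value
    for relation in relation_individuals
    for value in objects(relation, "diagnosis_value")
}
```

Here has_diagnosis leads to the relation individual, and the value 
breast_tumor is reached only through that individual.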

> 5.  It ought to be noted somewhere (it may be there and I've 
> overlooked it) that you can always reverse the "original" relation and 
> turn pattern 1 into pattern 2.  E.g., you can reverse "Christine 
> has_diagnosis diagnosis_1" to form "diagnosis_1 about_patient 
> Christine".  This is related to the bullet about inverse relations 
> under the "Considerations" heading, but makes a slightly different 
> point.

Good point. It's not there now. Should be added of course.
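
The reversal described in the comment could be sketched as follows, 
again with triples as plain Python tuples. The helper reverse_relation 
and the property diagnosis_value are hypothetical names used only for 
this example.

```python
def reverse_relation(triples, predicate, inverse_predicate):
    """Swap subject and object of every triple that uses `predicate`,
    renaming it to `inverse_predicate`; other triples are kept as-is."""
    reversed_triples = set()
    for s, p, o in triples:
        if p == predicate:
            reversed_triples.add((o, inverse_predicate, s))
        else:
            reversed_triples.add((s, p, o))
    return reversed_triples


# Pattern 1: the patient points at the individual for the relation.
# diagnosis_value is an assumed property name.
pattern_1 = {
    ("Christine", "has_diagnosis", "diagnosis_1"),
    ("diagnosis_1", "diagnosis_value", "breast_tumor"),
}

# Pattern 2: the relation individual points at every participant.
pattern_2 = reverse_relation(pattern_1, "has_diagnosis", "about_patient")
```

After the reversal, all links start at diagnosis_1, which is the shape 
of pattern 2.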

> 6.  Some people are naturally going to think of using RDF reification 
> in these situations and, rather than avoiding the subject, the text 
> should explicitly point this out, and then go on to say why this is a 
> bad idea.  The primary reason it's a bad idea is that explicitly using 
> the reification vocabulary involves talking about RDF (or OWL) 
> statements (e.g., individuals are introduced having rdf:type 
> rdf:Statement) and, as the examples illustrate, more natural concepts 
> from the actual problem domain can generally be used instead.  E.g., 
> instead of defining individuals that are statements, define 
> individuals that are "diagnoses", "temperature readings", "purchases", 
> etc.  This can be looked on as a kind of "reification", but it 
> shouldn't be confused with the RDF concept (and its vocabulary).

Thank you very much for this paragraph. As has been noted earlier on 
the list, the document should contain a more detailed discussion of 
its relation to reification in RDF. This idea should certainly be part 
of that discussion.
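
To make the contrast in the comment concrete, here is a small sketch 
with triples as plain Python tuples. The rdf:Statement, rdf:subject, 
rdf:predicate and rdf:object terms are the RDF reification vocabulary; 
the individual names stmt_1 and diagnosis_1, the class 
Diagnosis_Relation, and the properties diagnosis_value and 
diagnosis_probability are assumptions for illustration only.

```python
# Using the RDF reification vocabulary: the new individual is a
# statement about a statement, typed rdf:Statement.
reified = {
    ("stmt_1", "rdf:type", "rdf:Statement"),
    ("stmt_1", "rdf:subject", "Christine"),
    ("stmt_1", "rdf:predicate", "has_diagnosis"),
    ("stmt_1", "rdf:object", "breast_tumor"),
    ("stmt_1", "diagnosis_probability", "high"),  # assumed property
}

# Using a concept from the problem domain instead: the new individual
# is typed with a domain class (class name is an assumption).
domain = {
    ("diagnosis_1", "rdf:type", "Diagnosis_Relation"),
    ("Christine", "has_diagnosis", "diagnosis_1"),
    ("diagnosis_1", "diagnosis_value", "breast_tumor"),
    ("diagnosis_1", "diagnosis_probability", "high"),
}


def types_used(triples):
    """Return every class that appears as the object of rdf:type."""
    return {o for (s, p, o) in triples if p == "rdf:type"}


reified_types = types_used(reified)
domain_types = types_used(domain)
```

The first set talks about an RDF statement; the second talks only about 
things from the medical domain.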

> 7. As this is part of a best practices activity, it seems to me that a 
> Note of this kind should explicitly point people to the relational 
> database design literature for examples and ideas (at least to a 
> standard textbook, such as Date's "An Introduction to Database 
> Systems").  On a more theoretical level, all the work on functional 
> dependencies and various "normal forms" is relevant to the sorts of 
> design practices being discussed here.  If a reference to the database 
> literature isn't considered relevant enough to this specific WD, it 
> certainly should be to any larger-scale document in which these 
> contents might be collected.  After all, the concepts being considered 
> here apply to more things than just those that people have classically 
> considered "ontologies".

I'll leave this to a larger document (perhaps that introduction to all 
OWP documents that we've talked about before).

Thank you very much for the thorough read and suggestions!

Natasha
Received on Tuesday, 10 August 2004 19:53:17 UTC