Re: [OEP] Comments on Specified Values Note

Alan,

Just to be sure you understand - Mike was actioned by the WG to review the 
document and confirm that only editorial changes had been made since the 
last working draft.  While Mike's suggestions look like improvements for 
the most part, you are under no obligation at this stage to make any more 
changes to the document. It was accepted by the WG as a note, pending only 
the review from Mike OK'ing changes since the last WD.

Mike, 

Thanks for the careful review; however, you have not yet completed your 
action. You must send a note to the WG indicating that only editorial 
changes have been made since the last WD. The last WD is at 
[http://www.w3.org/TR/swbp-specified-values/] (3 Aug).

-Chris

Dr. Christopher A. Welty, Knowledge Structures Group
IBM Watson Research Center, 19 Skyline Dr., Hawthorne, NY  10532     USA   
 
Voice: +1 914.784.7055,  IBM T/L: 863.7055, Fax: +1 914.784.7455
Email: welty@watson.ibm.com, Web: 
http://www.research.ibm.com/people/w/welty/



"Uschold, Michael F" <michael.f.uschold@boeing.com> 
Sent by: public-swbp-wg-request@w3.org
03/05/2005 12:37 PM

To: <rector@cs.man.ac.uk>
cc: <public-swbp-wg@w3.org>
Subject: Re: [OEP] Comments on Specified Values Note

GENERAL comments
Are you going to redo [some of] the figures using a nicer tool? Some of 
the text is literally unreadable on my 1600x1200 laptop screen. 
Contentwise, they are fine. Perhaps Natasha and her Mac-only diagramming 
tool have set too high a standard - I'm now spoiled :-) .

Below are a bunch of highly specific recommended changes; most are very 
minor, to do with fine tuning.  In a few places, I have added a bit more 
detail.  You may disagree with some, due to different writing styles. Take 
what you like, and leave the rest.

It would have been easier and faster to edit the file directly, but I did 
not want to edit a non-latest version, in case you are also working on it 
still. Also, this makes it easy for you to see exactly what my suggestions 
are, and ignore the ones you don't agree with.

NB: I did not check any of the referenced full code examples. Has someone 
else done that?

Do we want to have this and Natasha's note use the same naming 
conventions? You use underscore to separate names. In one sense, it is a 
good idea to use different conventions, showing that this is real life. It 
could also be confusing for folk reading more than one of our notes.  This 
is a general question for the OEP TF.  I lean towards a consistent naming 
convention; it would even make sense to say exactly what it is. Also, we 
should point out that we are not recommending this as the right one, just 
one among various that work just fine.

SPECIFIC comments
Abstract: 
Replace "officers and restricted" with "officers and be restricted"
---

Replace "specifies the constraints on the values that the property can 
take on" with 
"specifies what values the property can take on"
---

Replace "methods to represent" with "methods for representing"
OR: replace "methods" with "ways". IMHO: 'way' is preferred, since you use 
that phrase later: "There are at least three different ways to represent 
such specified collections of values"
---

GENERAL ISSUE
Replace "health person."   with "healthy person." 
---

You say: 
"It is a common requirement in developing ontologies to be able to 
represent notions such as a "small man", a "high ranking officer" or a 
"health person."  There are many such "features"..."

"health person" should be "healthy person". More substantively: 

This wording suggests that 'small man' is a feature. Is this what you 
mean?  Later you say: "it is necessary to specify the constraints on the 
values for the "feature" - e.g. that size may be "small", " This suggests 
the feature is 'size'. It's the difference between the property itself 
(size) and a property=value assertion (size=small).  I don't know which 
you mean. I'm not sure whether you want to gloss over this distinction 
here (with careful wording), or whether it is worth making. As it stands, 
it is a bit confusing.
---

Replace: 
*  As individuals whose enumeration makes up the parent class representing 
the feature;   (See pattern 1). 

with: 
*  As an enumerated set [or list?] of individuals (e.g. small, medium, 
large) which makes up the parent class representing the feature;   (See 
pattern 1). 

Also, add an example to the next bullet like I did for this one.

ALSO: in the phrase "representing the feature" I can't figure out which 
meaning of feature you are using (size? size=large?). It seems neither. 
Rather it seems to be referring to the value set of the property size. 
Frankly, I don't know what the phrase "the parent class representing 
the feature" means. Parent of what? Do you mean "the set of all possible 
values that the feature (e.g. size) can have"? This is the feature space, 
which you define later. What does parent have to do with this?
---

Because of this terminology confusion, I think you need to either:
  1. move the vocabulary section much earlier in the document.
  2. include a forward reference to the vocabulary section very early in 
     the document.
I prefer 1.
---

USE CASE EXAMPLES:

Replace "e.g. it should be inconsistent (unsatisfiable) to be both slender 
and obese or in good health and poor health" with 

"e.g. an inference engine will flag an error if an individual is declared 
to be both slender and obese or in good health and in poor health"
I replaced the theorem proving lingo with everyday language to convey the 
point.
---

Replace: "The others follow analogously." with "The solutions for the 
other examples are exactly analogous."
---

NOTE CONVENTIONS 

Much of this section could also go in an appendix. I don't have a 
preference, works ok either way. Your call.  Ideally this will be pulled 
out and used by more than one note.

I suggest adding a formal TODO on this point. See also my note to the list 
on this.
---

I would put the vocabulary first in the conventions section, since it is 
more important.
In fact, it might be better to put the vocabulary in the very beginning, 
because you use the terms right away.

Also, put partition last, as it is less important (as a bonus, it would 
also be in alpha order...)
---


Diagramming: 
I still don't know what an open or closed arrow means.  It is not obvious 
to me by looking at the different arrows. Perhaps you mean the shape of 
the arrowhead? The terms open and closed don't convey that to me. Is this a 
standard terminology for arrows that I am not familiar with? 
---

'arrowsor' and 'arrowsindicate' need spaces to separate words.
---

In: "*  Downwards facing braces are used to indicate pairwise disjointness 
between subclasses or owl:allDifferent for individuals. (All sibling 
classes are disjoint and all individuals of each type are different in 
these examples.)"

What is the owl symbol if the pairwise disjointness is between classes? 
Does it ever appear in a diagram? You might address this with something 
like the following:

"Downwards facing braces are used to indicate pairwise disjointness between 
subclasses or between individuals. These are represented in OWL as owl:?? 
and allDifferent respectively. (All sibling classes are disjoint and all 
individuals of each type are different in these examples.)"

This might be an incorrect wording for what you are trying to say.
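
For reference, a minimal sketch of how the two cases are usually written in OWL, in the N3 style used for the note's code. The class and individual names are borrowed from the health example, and the default namespace is a placeholder rather than anything taken from the note.

```turtle
@prefix :    <http://example.org/health#> .   # placeholder namespace (assumption)
@prefix owl: <http://www.w3.org/2002/07/owl#> .

# Pairwise disjointness between sibling classes: owl:disjointWith
:Good_health_value   owl:disjointWith :Medium_health_value , :Poor_health_value .
:Medium_health_value owl:disjointWith :Poor_health_value .

# Pairwise difference between individuals: owl:AllDifferent
[ a owl:AllDifferent ;
  owl:distinctMembers ( :good_health :medium_health :poor_health )
] .
```

Note the capitalisation: the OWL vocabulary term is owl:AllDifferent, used together with owl:distinctMembers.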
---

"SWBP" --> "SWBPD"  (tut tut)
---

Pattern 1: 

To focus on the positive first, replace: 
"The second is more complex but is more flexible.  Some classifiers also 
work more reliably with Pattern 2 than Pattern 1."

with

"The second pattern is more flexible, and some classifiers work more 
reliably with it than with Pattern 1. However, Pattern 2 is more complex."
---

PATTERN 1:

"In this approach, the class Health_Value is". 

Why a definite article? This is first mention of this class. Below I 
suggest a rewording for this pattern which says what to do in more detail. 
I feel that the readers of this note will find it easier to create their 
own examples if we give more details on exactly what needs to be done, 
rather than mainly just describing what is the case, once you have the 
pattern.  Both are needed. 

You may think this too wordy... I made similar changes to approach 4 in 
the classes as values note. I notice that you have very nice wording in 
the comments to this effect when presenting the code, that is excellent.

The sentence: "Values are sets of individuals." is slightly ambiguous. 
'values' here refers to the feature space, all possible values. But each 
individual is also a value. I suggest using the term feature space here. 

Overall, I suggest the following rewording. 
Replace:
"In this approach, the class Health_Value is considered as the enumeration 
of the individuals good_health, medium_health, and poor_health. Values are 
sets of individuals. To say that "John is is in good health", is to say 
that "John has the value good_health for health_status" This assumes that 
a value is just a unique symbol, and a value set is just a a set of such 
symbols."

with 

"In this approach the feature space is represented as an enumerated set of 
individual values.  The feature of interest is represented as an OWL 
property (e.g. health_status).  We wish to restrict the range of this 
property to be one of three values corresponding to good, medium and poor 
health. To achieve this, we create a class that represents the set of all 
possible values for the given feature, and we enumerate each value.  This 
class (e.g. Health_Value) is defined to consist of exactly three enumerated 
individuals: good_health, medium_health, and poor_health. The name 
includes the string "_Value" to indicate that this class was specifically 
created for representing values. Each enumerated individual is an instance 
of the value class. 

To say that "John is in good health", is to say that "John has the 
value good_health for health_status". For this pattern, a value is just a 
unique symbol, and a value class is just a set of such symbols." 
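
As a rough sketch of the steps described in the rewording above, in the N3 style used for the note's code. The default namespace is a placeholder, and the property is written has_health_status as in the note's Healthy_person definition; none of this has been checked against the note's full code examples.

```turtle
@prefix :     <http://example.org/health#> .   # placeholder namespace (assumption)
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Create the value class and make it equal to the enumeration of the three values
:Health_Value
    a owl:Class ;
    owl:equivalentClass
        [ a owl:Class ;
          owl:oneOf ( :good_health :medium_health :poor_health )
        ] .

# Each enumerated individual is an instance of the value class
:good_health   a :Health_Value .
:medium_health a :Health_Value .
:poor_health   a :Health_Value .

# The three values are distinct symbols
[ a owl:AllDifferent ;
  owl:distinctMembers ( :good_health :medium_health :poor_health )
] .

# The feature is a property whose range is restricted to the value class
:has_health_status
    a owl:ObjectProperty ;
    rdfs:range :Health_Value .

# "John is in good health"
:John :has_health_status :good_health .
```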
---

Are the values 'sets' or 'lists' of individuals? Sometimes order DOES 
matter, so from a common sense perspective, they are not sets (although, 
if the order is not formalized, then from a model-theoretic viewpoint, I 
guess they are sets). Anyway, be consistent.
---

replace: 
{{The value set and make it equal to the enumeration of the three 
individual values}} 

with

{{Create the value class and make it equal to the enumeration of the three 
individual values}} 
---

The comment: {{Define each of the individual values as an individual of 
type Health_value}}
should be boldface, like the others. Check for other examples of font 
inconsistency.
---

Replace: "Many people find this the more intuitive approach." with
"Many people find this approach very intuitive."  because you have not 
introduced the other pattern yet.
---

Replace 
"There is no possibility of further subpartitioning of values." with 
"There is no possibility of further subpartitioning of values (e.g. 
good_health into very good and pristine health)."

Replace: 
"It is not possible, as it is for classes, to say that one individual is 
equivalent to the the union (disjunction) of two other individuals."

with 

"It is not possible to say that one individual is equivalent to the 
union (disjunction) of two other individuals.  An analogous statement can 
be made with classes."
This may just be style preference; you may not like it. It is longer, but 
[possibly] slightly more accurate.  The problem is, strictly speaking, you 
cannot, for classes, "say that one individual is equal to the union of two 
other individuals" You CAN say something very analogous.
---

Replace 
"Because individuals cannot overlap, if Health_Value is defined as 
equivalent to enumeration of one list " 

with 

"Because individuals cannot overlap, if Health_Value is defined as being 
equivalent to the enumeration of one list "
---

Replace 
"To do so would cause the reasoner to indicate a contradiction. (i.e that 
Health_Value was "unsatisfiable".)"

with

"To do so would cause the reasoner to find a contradiction and flag an 
error."

Unless you really want this theorem-prover speak (is it germane to the 
note?). Whatever you decide, use the same convention throughout; this 
occurs in more than one place.
---

The last disadvantage starts off sounding like an advantage. 
Replace: 

"The representation is in OWL-DL, and DL reasoners should eventually be 
expected to make correct inferences with individuals used in this way. 
However, neither FaCT nor Racer (the two most widespread open source 
reasoners in use today) perform all the expected inferences reliably. "

with

"Although the representation is in OWL-DL, currently neither FaCT nor 
Racer (the two most widespread open source reasoners in use today) 
reliably performs all the expected inferences.  In the future, reasoners 
will likely be able to make correct inferences with individuals used in 
this way. "

Or shorten the last sentence even more:
"This limitation will likely go away with improved reasoners."

---

PATTERN 2: 

In the title of this pattern, and in the first sentence, you say 'feature' 
when I think you mean 'feature space'.
---

I don't think the word 'continuous' is accurate because it might not be 
continuous, it could still be discrete. Or if strictly speaking it is 
accurate, it might be misleading.
---

You say: "the Good_health_values partition". The vocabulary definition 
does not define partition as a noun, just as a verb.  You need to amend 
the definition if you want to use 'partition' as one of the mutually 
disjoint collectively covering subclasses. 
---

The sentence: "Theoretically, there is an individual health value, 
Johns_health, but all we know about it is that it lies someplace in the 
Good_health_value partition. " will likely confuse people. I had to puzzle 
for a while about it, and after reading the code, realized that these two 
variants are just like the two variants of approach 4 in the classes as 
values note [exactly analogous, I think].  I think it may be clearer to 
present pattern 2 here as I [re-]presented approach 4 in the other note. 
That is:  Just describe the first variant as if it is THE approach. And 
then say, by the way, there is a minor variant... This would entail 
1) removing the sentence: "There are two variants presented: one in which 
the individual Johns_health is explicitly represented, the other in which 
it is implied by an existential restriction." or moving it to later, as a 
summary remark. 
2) removing the header: "Representation variant 1: Using a fact about the 
individual"
3) adding an introductory remark for the 2nd variant.
4) do something about the Venn diagram. It will need to be moved to after 
the variant, or simplified to leave out the restriction. The restriction 
is pretty confusing, in that diagram, so I suggest the latter option. Have 
every arrow represent a simple instance of the binary relation: 
has_health_status.  Maybe add more named individuals? 

btw: The 'J' in "Johns_health" is a different font.
---

The Healthy_Person class seems an optional, rather than essential part of 
this pattern.  Perhaps this could be indicated?
---

Here is a suggested rewrite taking into account the above comments; it is 
modeled after the rewrite for pattern 1.  I also emphasize what is similar 
and different about this pattern and the first.
Replace:
"In this approach we consider the feature as a class representing a 
continuous space that is partitioned by the values in the collection of 
values. To say that "John is in good health" is to say that his health is 
inside the Good_health_values partition of the Health_value feature. 
Theoretically, there is an individual health value, Johns_health, but all 
we know about it is that it lies someplace in the Good_health_value 
partition. The cass Healthy_Person is the class of all those persons who 
have a health in the Good_health_value partition."
with 
===
"In this approach, the feature space is represented as a value class 
partitioned by a group of subclasses (e.g. Good_health_value, 
Poor_health_value).  As for pattern 1, we represent the feature of 
interest (has_health_status) as an OWL property, and we create a value 
class (Health_Value) that is used to restrict the range of the property. 
The key difference is in how we represent the values and the value class. 
The allowed values (good, medium and poor health) are represented as 
subclasses, not individuals. The value class is defined as the union of 
these subclasses.

Using this pattern, to say that "John is in good health" is to say that 
the value representing his health_status (e.g. Johns_health)  is contained 
in the Good_health_value partition of the Health_Value feature space. 
Formally, Johns_health is an instance of the value class: 
Good_health_value.

We can also create a class called Healthy_Person, defined to be the class 
of all persons whose health_status has a value in the Good_health_value 
partition. In this case, the classifier can infer that John is in that 
class."
===
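
A minimal sketch of what the suggested wording describes, in the N3 style used for the note's code. The default namespace and the Medium_health_value class name are assumptions made for illustration; the other names follow the wording above.

```turtle
@prefix :     <http://example.org/health#> .   # placeholder namespace (assumption)
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# The value class is the union of the subclasses that partition it
:Health_Value
    a owl:Class ;
    owl:equivalentClass
        [ a owl:Class ;
          owl:unionOf ( :Good_health_value :Medium_health_value :Poor_health_value )
        ] .

# The allowed values are subclasses, made pairwise disjoint
:Good_health_value
    a owl:Class ;
    rdfs:subClassOf :Health_Value ;
    owl:disjointWith :Medium_health_value , :Poor_health_value .
:Medium_health_value
    a owl:Class ;
    rdfs:subClassOf :Health_Value ;
    owl:disjointWith :Poor_health_value .
:Poor_health_value
    a owl:Class ;
    rdfs:subClassOf :Health_Value .

# The feature is a property whose range is the value class
:has_health_status
    a owl:ObjectProperty ;
    rdfs:range :Health_Value .

# "John is in good health": his health value lies in the Good_health_value partition
:Johns_health a :Good_health_value .
:John
    a :Person ;
    :has_health_status :Johns_health .
```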

NB: Although I spent a lot of time on this, I am still not entirely happy 
with this wording; it loses some of the simplicity of your text, in the 
attempt to be more precise. Perhaps there is a best-of-both-worlds middle 
ground that you can find?

BTW: you are using plural for the names of some of the classes. I fixed 
them, they are correct in the figures (singular). 
---

Not all of the comments in the Code for this example are the same font.
---

You use the term 'parent value class'. It is a parent class, from a 
graph-theoretic point of view, but from the user's point of view, it is 
the class representing the main/whole feature space. The 'child' value 
classes represent portions of the whole feature space. 
I suggest not using the word parent. [minor style point]
---

Replace: "you make the individual a type of the restriction."
with "you make the individual an instance of the restriction."
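
In N3 terms, making the individual an instance of the restriction might look like the sketch below. The default namespace is a placeholder, and this is an illustration of the wording rather than a copy of the note's variant 2 code.

```turtle
@prefix :    <http://example.org/health#> .   # placeholder namespace (assumption)
@prefix owl: <http://www.w3.org/2002/07/owl#> .

# John is an instance of the anonymous class of things that have some
# health status in Good_health_value; no Johns_health individual is named.
:John
    a :Person ;
    a [ a owl:Restriction ;
        owl:onProperty :has_health_status ;
        owl:someValuesFrom :Good_health_value
      ] .
```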
---


Fig 3: you refer to "lists of values" change to sets?
---

The wording: {{The disjoint axioms make the subclasses partitioning}} 
seems a bit awkward. Maybe you were doing a dance with terminology...
Perhaps try: 
{{The disjoint axioms make the subclasses partition the value class: 
Health_Value}} 

Also, move this comment to immediately after the following comment, because 
it applies to all three subclass definitions:
"{{Define each of the subclasses that make up the partitioon and make them 
pairwise disjoint}}"

Better yet, skip that comment, it is probably unnecessary, given the one 
above. 

And fix: 'partitioon'
---

There is a missing full stop after the definition of 
"medium_health_value".
---

In the comment for defining healthy person, note that it is not mandatory 
for the pattern (or if it IS, can you tell me why?). 
---

Suggested indenting tweak to make it clearer which two classes are being 
intersected. It took me a while to puzzle it out. This might be too much 
work in general, if automatically generated...
:Healthy_person
   a    owl:Class ;
   owl:equivalentClass
        [ a       owl:Class ;
          owl:intersectionOf (:Person 
                              [ a       owl:Restriction ;
                                owl:onProperty :has_health_status ;
                                owl:someValuesFrom :Good_health_value
                              ])
        ] .
---

Figure 4: minor tweak. Highlight exactly what is different with this 
figure, compared to figure 2.  Make the new part blue, or in bold?  Maybe 
add text somewhere or caption the figure to say exactly what changed with 
this variant:
  1. instance now implied, not explicit (noted by dotted lines)
  2. new arrow from john to good_health_value
[maybe overkill, it is pretty clear just by looking at it...]
---

reasoner and reasonable in same sentence (stylistic quibble). maybe use 
inference engine or classifier instead of reasoner?
---

Give a simple example of alternate partitionings for the same feature 
space. A real life example that turned up in Boeing was dividing up the 
fuselage into [front, middle, rear] and [front, back]. Each was a 
partition. The problem arose because we needed to do a semantic mapping, 
which of course is inherently ambiguous.
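
A hypothetical sketch of two alternate partitionings of one feature space, loosely modeled on the fuselage example. All class names and the namespace are invented for illustration and do not come from any Boeing ontology or from the note.

```turtle
@prefix :    <http://example.org/airframe#> .  # placeholder namespace (assumption)
@prefix owl: <http://www.w3.org/2002/07/owl#> .

:Fuselage_Section a owl:Class .

# First partitioning of the feature space: three sections
:Fuselage_Section owl:equivalentClass
    [ a owl:Class ;
      owl:unionOf ( :Front_of_three :Middle_of_three :Rear_of_three )
    ] .
:Front_of_three  owl:disjointWith :Middle_of_three , :Rear_of_three .
:Middle_of_three owl:disjointWith :Rear_of_three .

# Second partitioning of the same feature space: two sections
:Fuselage_Section owl:equivalentClass
    [ a owl:Class ;
      owl:unionOf ( :Front_of_two :Back_of_two )
    ] .
:Front_of_two owl:disjointWith :Back_of_two .
```

Nothing here says how Middle_of_three relates to Front_of_two or Back_of_two, which is where the mapping ambiguity comes from.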
---

The following seems contradictory:
"If variant 2 is to be used as part of a database schema or similar, then 
a convention for creating anonymous instances in the database is required. 
(Logicians call such anonymous instances "skolem constants".) In practice, 
this can usually be ignored."

If you can ignore it, then a convention is not required. 

This point is fairly lost on me, I can't follow what you are getting at.
---

WELL, I don't know what happened, someone must have slipped a much finer 
toothed comb in my pocket - I didn't think there would be this much 
additional feedback...

Despite all the verbiage, I really like the note, all this is just fine 
tuning.

Mike

Received on Monday, 7 March 2005 00:17:44 UTC