RE: [OEP] Classes as Values - A detailed review

Comments inline, [MFU]  

-----Original Message-----
From: Natasha Noy [mailto:noy@smi.stanford.edu] 
Sent: Wednesday, March 02, 2005 11:45 PM
To: Uschold, Michael F
Cc: public-swbp-wg@w3.org
Subject: Re: [OEP] Classes as Values - A detailed review


Mike,

Thank you so much for your excellent and insightful comments! I agree  
with most of them, as always.

> If you think this translates to a short review, you are in for a shock
> :-) .

Indeed :) 12 pages, it's longer than the document itself! :) 

[MFU]  I was hoping you would not notice that. Don't you dare do a word count comparison!!
--

Maybe it  
would almost have taken you less time to just re-write the document :)  
I've posted a new draft at the location that the Editor's draft link on
the OEP page points to [1]

[MFU] the link below is broken, I found it from the OEP page.
--

I'll start with the one major suggestion that I did not address at the  
moment and would like to get some feedback from the group on. I'll  
reply to the rest of your comments (most of which I simply followed)  
after that.

> The considerations for each approach should be identified up front as
> evaluation criteria, and then each considered for each approach.

[excellent suggestions on how exactly to do this, snipped]

> It would be good to have a table listing the approaches and the 
> criteria summarizing how each approach does according to the various 
> criteria.
>
> Going this route will entail re-structuring and re-writing much of the
> text in the considerations sections.  Most or all of the points being 
> made will remain. I believe it will make the note significantly more 
> clear and useful to readers.

I completely agree -- it would certainly make for a much better  
document! Where were you 10 months ago? :) Seriously though, as you  
mention, following this suggestion would mean a complete rewrite of the  
prose in the document, with major possibility for new contentious  
points, re-review, etc. If you are willing to take on that task, I'll  
be happy to help. I am not sure, however, that I feel motivated enough to do
it myself, given that the document has been in WD status for more than 6
months now and has received few major comments.

[MFU] Very understandable. I will take a stab at making some of the changes that can be made by mostly adding rather than major re-arranging.  For example: I can change all the summary sections to be specific, as in my comments. I can just swap them in as they are, even if I do not add the explicit criteria up front.  I'd also like to redo approach 4 as per my outline, unless you have strong objections.
---

I tried to address most of your other comments in the draft available  
now. My thoughts/questions on some of them below.


> Same comment as to Alan Rector on the specified values note: Bite the
> bullet and where it is reasonably clear, call the considerations pros 
> and cons. Put the pros first and the cons last, and order the 5
> approaches so that [as much as possible] the cons for one approach
> motivate and naturally lead to pros of the next one.

They are ordered this way, as much as possible, anyway. Do you have  
suggestions for a different ordering? I would tend to leave things as  
"considerations" since in many cases these are neither pros nor cons.  
The only ones that I would classify as pros or cons are the ones that  
are so obvious already that the classification would be useless anyway.  
I would really want to remain non-judgmental (otherwise, this note will  
never be finished).

[MFU] This is a judgment call. I think most of the time, a consideration can be seen as a pro or a con, when taken alone. I feel that the notes will be more useful if we are upfront about that. Others may disagree. 
---

> Printing anomaly:  when my copy got printed, the second figure (which
> should be NUMBERED) was split across two pages. Other figures also got 
> split across pages.

I have no idea how to fix this. Are there any HTML gurus who could tell
me how to put images into an HTML document so that they don't get split
during printing?

I've added numbers to figures.

> The font for the heading for the APPROACH sections is less
> prominent than that for the next-level subheading. Boldface dominates, 
> even though the font is smaller.

I am not sure I understand what you are referring to. Approaches is h2  
heading, the subheadings are h3 or simply bold. Could you be more  
specific? Is this browser specific?

[MFU] This is because I was reading a hard copy w/ no color. In that format, the 3rd level heading: "Summary for Approach 1" is more prominent than the 2nd level heading: "Approach 2: ...".  When viewed from a browser, it looks fine because the latter is in blue, so that distinguishes it.
---

> Ideally, all the notes would use the
> same heading/subheading font conventions.

Ideally, yes :) I basically just use h1/h2/h3, etc, and let the W3C  
stylesheet figure it out. This should be pretty standard, no?

[MFU] Yes it should, and probably is.
---

> Also, I prefer numbered
> headings, easier to reference portions in the document, especially
> when there are no page numbers.

It's a matter of taste I guess: I don't particularly like numbering  
sections with 4 sentences in them, which most of them are in this  
document. Thus subheadings. You can reference each of the approaches  
through anchors though, e.g.:
<url>#1 for approach 1. Is that sufficient?

[MFU] Ugh. I have no idea how to do that. I just read the document online, and cut/paste things that I need to quote. Also, a url is not going to be any help if I'm reading the document offline.
---

> Also, it would be good for the figures to better reflect the
> similarities and differences between the different approaches. For 
> example, approaches 1 and 5 are virtually identical. This should be 
> evident in the figures as follows: if you had two powerpoint slides 
> and showed one right after the other, the only thing the viewer should 
> see that is different is:
> 	1.	the addition of the annotationProperty class with links
> from the dc:subject arrows
> 	2.	the arrow for dc:subject turned green
> Everything else should be exactly aligned.

I fixed this where I could. In some cases, trying to conform to this  
idea makes the diagrams either too strange or too cluttered. If you'd  
like to take a stab at this, please, go ahead.

[MFU] Fair point. I may have over-constrained the problem. There will be tradeoffs. It is a relatively minor enhancement, which might be a lot of work.  What format did you edit the diagrams in? I don't want to start from scratch. If the originals were in, say, PowerPoint, then I could fiddle around spending minimal amounts of time.
---

> It would also be good to have a summary table with all the figures
> 'side by side' for easy comparison. Make sure to put the ones that are 
> most similar to each other next to each other.

I am not sure how this would look in a single table. The figures need  
to be pretty large for any text to be readable. Again, if you'd like to  
mock it up, please, go ahead.

[MFU] Maybe all on the same page, ditto above remarks.

> SPECIFIC section by section comments
>
> ABSTRACT: give a specific example so reader can immediately relate to
> the issue. 'using classes as property values' is hard to relate to, it 
> is very abstract.
>
> I suggest something like this to replace the first part of the
> abstract:

I've re-written the abstract (also to allay your other concern below).

> stray quote in: "behavior of the" African lions"

Doesn't look stray to me.

[MFU] Strictly, I now see that you are right. It looks a bit odd, but seems correct. Does it really need to be quoted? This is a femto-point.
---

> It is not immediately clear what is being said in the following
> paragraph. I'm having trouble parsing the key sentence (with "own" in 
> it). ===
> "One goal of the web publisher is to enable maximum reuse of published
> information. It will be common on the Semantic Web to import and reuse
> other published ontologies. In doing so, it is important for web
> developers to preserve the original semantics of imported resources.
> Therefore, an important consideration in choosing a representation
> pattern in this case is the following: If the pattern requires a
> different interpretation of classes to be used as values, does the
> designer "own" the definitions of these classes (in this case, the
> hierarchy of animals) to change them according to the new
> interpretation? Are others already using this hierarchy of animals in
> their applications and will this change affect those applications?"
> ===
>
> I'm going to have a go rewording it, reflecting my best guess.

Your rewording makes a slightly different point than the one I was  
trying to make. I've changed the paragraph (borrowing some sentences  
from your rewording) - hopefully it's better now.

[MFU] I read it, it seems fine.

> OTHER USE CASE SCENARIOS
> This is very important, I suggest making this more prominent somehow, 
> as it is, it could easily be missed, especially since approach 2 
> refers to a hierarchy of subjects anyway (see also comments under 
> Approach 2 below).

Any suggestions on how to do this?

[MFU] The whole note and all the examples talk about subjects, and then you have a paragraph buried that says the note is not really about subjects. A way to help would be to include some examples (if not full code, then at least in text) that do NOT refer to subject at all.  It would also help to mention it several times for emphasis. But even so, you can say over and over "This note is not just about subject, it is more generally about classes as values". But frankly, after going over this note in great detail, I DON'T REALLY BELIEVE YOU. The note IS about subject, every single example (that I noticed) is about subject. Can you convince me otherwise?
---

> It is slightly jarring that you name the class of all books about
> animals as singular 'BookAboutAnimals' - although I see this is the 
> consistent naming convention. Might be worth mentioning this in a 
> footnote? [very minor point]

I think this is rather common. I don't think it's worth mentioning

[MFU] Fair enough. It might be worth putting somewhere, in a place applicable to many notes, that we use the convention of using singular for classes; as well as any naming conventions in general? Are there any that we want to talk about?
---

> The definition is a good example of why it would be useful to have a
> distinction between properSubclassOf and subclassOf. Then you don't
> need the union clause. I'm not sure about my OWL. If there is such a
> distinction, then this definition could be shortened by only using the
> subclassOf which includes the class itself (Animals).

This separation doesn't exist in OWL, thus we can't shorten it,  
unfortunately.

[MFU]  Fair enough. Minor point.

> I'm a strong advocate of fairly literal, but also readable English
> translations of all/most N3 expressions in these documents. For
> example, the bookaboutanimals would read:
>
> The class BookAboutAnimals is an OWL class. It is a subclass of Book.
> It is also a subclass of the class of all things whose subject
> property has a value that is either an animal, or is a member of the
> class of all things that are subclasses of Animal.
>
> If anyone questions the need for this, let me just say that it took me
> 3-4 minutes to puzzle out this English text from the N3. As a way to 
> teach people about OWL, having English definitions of all code 
> alongside, is going to be very helpful indeed. It helps me a lot, and 
> I already know a lot about OWL (though I am not so great at reading any
> of the raw syntaxes).

I agree on this particular example, and tried to add some text there.  
For the rest of the examples, I tried to give a "high-level"  
description of the OWL code, but not a literal one. I am a bit lazy and  
am not convinced that putting literal translations for all OWL code is  
useful everywhere. Mike, do you want to take a stab at it if you feel  
this is necessary?
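
To illustrate the kind of pairing under discussion, here is a sketch of how N3 for BookAboutAnimals might carry its English reading as comments. It is reconstructed from the English translation quoted above rather than copied from the note, so the exact shape is an assumption, and the `:` namespace is a placeholder. Since dc:subject takes a class as its value here, this reading would not fit within OWL DL, which is consistent with the later remark that Approach 2, unlike the previous approach, is compatible with OWL DL.

```turtle
@prefix :     <http://example.org/books#> .   # placeholder namespace (assumption)
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dc:   <http://purl.org/dc/elements/1.1/> .

# BookAboutAnimals is an OWL class and a subclass of Book.
:BookAboutAnimals
      a       owl:Class ;
      rdfs:subClassOf :Book ;
      # It is also a subclass of the class of all things whose
      # dc:subject has a value that is either ...
      rdfs:subClassOf
              [ a       owl:Class ;
                owl:unionOf (
                    # ... the class Animal itself, or ...
                    [ a owl:Restriction ;
                      owl:onProperty dc:subject ;
                      owl:hasValue :Animal ]
                    # ... something that is a subclass of Animal.
                    [ a owl:Restriction ;
                      owl:onProperty dc:subject ;
                      owl:someValuesFrom
                          [ a owl:Restriction ;
                            owl:onProperty rdfs:subClassOf ;
                            owl:hasValue :Animal ] ]
                )
              ] .
```

Keeping the English as comments next to each clause is only one possible layout; a separate prose paragraph after the code would serve the same purpose.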

> You don't say whether this is OWL-DL or OWL-Lite.

What does "it" refer to here?

[MFU] I don't see an "it" anywhere; you must mean the "this" :). I meant that you don't say whether approach 1 is in OWL-Lite or just OWL-DL.

> Maybe you could avoid 'subject' and say that this approach is about
> creating a parallel hierarchy of annotation individuals?

I changed the title for the approach. I am not sure it's better though.

> Minor point: you say
> "The resulting ontology is compatible with RDF Schema and OWL Lite
> (and hence OWL DL)" I had to think for a moment about the hence 
> clause, it seemed backwards at first, but then I saw it was correct. 
> For this audience, it might be better to keep it simple. It is also 
> compatible with OWL-Full, which you do not mention. I suggest 
> rewording to:
>
> "The resulting ontology is compatible with RDF Schema and all variants
> of OWL (Full, DL, and Lite)."
>
> It is not germane to this discussion that:
> IF it is compatible with OWL-Lite,
> THEN it is compatible with OWL-DL.
>
> If you wish to give the reader this information, make it a separate
> comment or footnote.  It would really belong in a discussion with the 
> definition of 'compatible' which is a blue term targeted for a 
> glossary definition.

I am not sure I agree. The key point here is that, unlike the previous  
approach, it *is* compatible with OWL DL and OWL Lite

[MFU] Precisely, which is exactly what my rewording says. I think it is sufficient. I don't think making the above IF/THEN point is  necessary here.

> APPROACH 3
> Overall, this approach is very clearly described.
>
> This approach assumes that there is a subject hierarchy. Is that true?
> Is there a more general view that is not subject-specific? Can it be 
> generalized? Or do we bite the bullet and say this note IS about 
> representing subject hierarchies.

No, it is not about subjects. Subjects are used as an example. Consider  
genre for annotating CDs, diseases for annotating clinical guidelines,  
others under the "Other use cases" section.

[MFU] OK, I'm starting to believe you (maybe). So the classes are genres, say Jazz, HardRock, etc. So you have a hierarchy of these categories, and represent them as classes. What is an instance of 'Jazz'?  If you say it is a jazz CD, then you don't need classes as values, you just classify the CDs directly as instances in the hierarchy. I don't see this as an example where you are likely to want classes as values. Am I missing something? Even if this is a good example that you can elaborate on and convince me, it is not sufficient to convince the reader in the current state. Such elaboration would be necessary.

> Remove last word in: "We can create a single class Subject and make
> all the subjects to be individuals that are instances of this class
> Subject"

Why?

[MFU] It is redundant, the meaning is totally clear from context.

> Might: "using individuals as surrogates for classes" be a better name
> for this approach? The current one may be too specific?

I don't know. Is it?

[MFU] Chris is going to help think of some better names, I will give that some thought too.

> This approach also entails creating a parallel hierarchy.  I think it
> would be good to show the parallel hierarchy in the figure, in a way 
> that really looks like a parallel hierarchy (e.g. the layout should be 
> more or less identical, so it is immediately obvious). This will
> involve a major re-structuring of the diagram.

This would be nice, but I honestly don't see how I can do it in a  
single diagram and still get the main point across.

[MFU] Fair enough, this is a small point anyway, fixing it will have minimal impact.

> If the DL reasoner cannot infer that "a book that has LionSubject as 
> the value for dc:subject is also about Animals" then what is the point 
> of mentioning that "Most DL reasoners will be able to infer transitive
> relations between subjects". Is this useful by itself?  If so, can it
> be factored into the requirements/criteria for evaluating the different
> approaches?

 Yes, if/when the document is refactored.

[MFU] Is it useful by itself? I don't get it. Suggest either skip that comment, or elaborate on it more.

> What is the import of this: "The resulting hierarchy of subjects is
> not related to or dependent on the class hierarchy representing the 
> same topics (in this case, animals), except through an annotation 
> property rdfs:seeAlso."  Is it good? helpful? Why? Relate it to one of 
> the evaluation criteria.

It depends on the requirements for your application. As it stands, it's  
just a fact.

[MFU] I was trying to understand why you chose this particular fact to include here. Can you say more about when/why a user might need to know about this. What kind of application would this be good/bad for?

> What is the import of this consideration:
> "This approach explicitly separates the subject terminology from the
> corresponding ontology. Many consider this separation a good modeling
> practice: the semantics of a subject Lion can be different from the 
> semantics of the class of lions. Having subjects in a separate 
> hierarchy, would allow us to define for example that the subject 
> Africa is a parent subject of the subject AfricanLion." Relate to one 
> or more evaluation criteria, does it relate to supporting a desirable 
> inference? does it impact on maintenance?

Again, depends on your requirements and application.

[MFU] Again, it would be good if this could be related explicitly to something that would matter to some users.
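
One way to make the Approach 3 points concrete is a small N3 sketch of a subject hierarchy kept separate from the animal classes. The property name `:parentSubject`, the individual names, and the `:` namespace are assumptions made for illustration; only the use of rdfs:seeAlso and the Africa/AfricanLion example come from the text quoted above.

```turtle
@prefix :     <http://example.org/books#> .   # placeholder namespace (assumption)
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

:Subject
      a       owl:Class .

# Hypothetical transitive property relating a subject to a broader one.
:parentSubject
      a       owl:ObjectProperty , owl:TransitiveProperty .

:AnimalSubject
      a       :Subject ;
      rdfs:seeAlso :Animal .          # the only link to the class hierarchy

:AfricaSubject
      a       :Subject .

:LionSubject
      a       :Subject ;
      rdfs:seeAlso :Lion ;
      :parentSubject :AnimalSubject .

:AfricanLionSubject
      a       :Subject ;
      rdfs:seeAlso :AfricanLion ;
      # Africa can be a parent subject here, although AfricanLion
      # could never be a subclass of a class Africa.
      :parentSubject :LionSubject , :AfricaSubject .
```

Under these assumptions a reasoner that supports transitive properties could conclude that :AfricanLionSubject has :AnimalSubject as a parent subject, but nothing here lets it conclude that a book with that subject is about the class Animal, which is the gap raised in the comments above.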


> APPROACH 4
> I found that this example was hard to grasp. Focusing on "unspecified
> members of a class" seems very obscure, and must be missing the main
> point, which is ??? - I'm not sure.  The main thing seems to be that
> the actual value that the property dc:subject has is an [implicit]
> unidentified instance of the class Lion and that the relationship of
> this [nonexistent implicit] value to the class Lion is rdf:type.  This
> is IMHO, rather obscure and many are likely to have little idea what
> you are talking about. The main problem is that the instance DOES NOT
> EXIST, so it needs to be explained differently. You might at the end
> mention that this representation approach corresponds to there being an
> implicit instance, but otherwise it is likely to be far too confusing.

I was first tempted to change the title of the approach to include  
"implicit". But then I am not sure "implicit" is the right word here.  
We don't actually know if the instance exists or not. All we are saying  
is that for a book about lions, one value for the subject property will  
be an instance of Lion. This is enough to classify it, but we don't  
actually say anything about whether or not this instance exists and is  
named, etc..

[MFU] another name to think of...
>
> Specifically, there is nothing corresponding to the following (from
> approach 2):
> :AfricanLionBook
>       a       :BookAboutAnimals ;
>       dc:subject :AfricanLionSubject .
>
> If there was, it would be:
> :AfricanLionBook
>       a       :BookAboutAnimals ;
>       dc:subject :UnidentifiedAfricanLion .

There won't be: we describe AfricanLionBook as a book where at least  
one subject is an instance of Lion (regardless of what else we know  
about that instance)

[MFU] Precisely my point, the description does not make this very clear. I will gladly re-draft this section, and you can see what you think of it.
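
For comparison with the approach 2 snippet quoted above, a sketch of how the Approach 4 description might be written in N3 follows. It assumes the anonymous-class variant and a placeholder `:` namespace, and it is not taken from the note itself. The point it tries to show is that no dc:subject value is ever named.

```turtle
@prefix :     <http://example.org/books#> .   # placeholder namespace (assumption)
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix dc:   <http://purl.org/dc/elements/1.1/> .

# AfricanLionBook is a book for which at least one dc:subject value
# is an instance of Lion. No such instance is named or asserted.
:AfricanLionBook
      a       :Book ;
      a       [ a owl:Restriction ;
                owl:onProperty dc:subject ;
                owl:someValuesFrom :Lion ] .
```

Note that there is no triple of the form `:AfricanLionBook dc:subject ...` at all, which is the difference from approach 2 being discussed.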

> This approach is designed to make it easy to leverage a DL reasoner to
> infer, for example, that a book whose subject is Lion also has subject
> Animal. In this approach, we create a parallel hierarchy of types of 
> books consisting of classes such as: BookAboutAnimals, BookAboutLions, 
> BookAboutAfricanLions. We then say that various instances of Books are 
> explicit members of one or more of these subject classes.

Actually, here we don't need to create a full parallel hierarchy:  
technically, we create only for the classes that have a book with this  
subject there. Thus, if we have no books about African Lions, we don't  
need to create the class, we can always do it on demand. This is not  
quite true for the subjects themselves in previous approaches.

[MFU] True, my wording is not quite right, it is close though... in that there is a parallel hierarchy that emerges as the user creates explicit classes.  If they choose the variant of using anonymous classes, then it does not exist at all. That is one reason I wanted to separate out more clearly that variant. 

> [before the Alternatively clause, put this text in:]
> By saying that LionsLifeInThePrideBook  is an instance of 
> BookAboutLions we are saying that it is a member of a class, all of 
> whose members have as their subject, at least one instance of the 
> class Lion. [this text might need fixed so it is strictly and 
> literally true, I might have a misreading of someValuesFrom, all the 
> more reason that these examples need English every time.]  In OWL, it 
> is not necessary to create any explicit instances of these classes.  
> In the figure, we list them as if they were explicit, and use dotted
> lines to denote that they may not actually exist.

Added - thanks! I haven't followed the rest of your suggestion here,  
since I thought it was just reiterating the point once again, perhaps  
making it a bit more confusing.

[MFU] See what you think with my forthcoming new draft of this approach.
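
The named-class variant described in the suggested text above could be sketched as follows. The class and individual names come from the quoted text; the `:` namespace is a placeholder, and the use of rdfs:subClassOf (rather than a full equivalence) is an assumption chosen to match the English reading "all of whose members have as their subject at least one instance of the class Lion".

```turtle
@prefix :     <http://example.org/books#> .   # placeholder namespace (assumption)
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dc:   <http://purl.org/dc/elements/1.1/> .

# Every member of BookAboutLions is a Book with at least one
# dc:subject value that is an instance of Lion.
:BookAboutLions
      a       owl:Class ;
      rdfs:subClassOf :Book ;
      rdfs:subClassOf
              [ a owl:Restriction ;
                owl:onProperty dc:subject ;
                owl:someValuesFrom :Lion ] .

# Membership alone carries the "about lions" meaning; the Lion
# instance stays implicit.
:LionsLifeInThePrideBook
      a       :BookAboutLions .
```

A class such as this only needs to exist once there is a book to put in it, so the parallel hierarchy of BookAbout... classes can grow on demand.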

> This approach is agnostic to the following issues:
> 	*	how to limit the range of the dc:subject
> <http://purl.org/dc/elements/1.1/subject>  values

Not really - it limits it explicitly in each case.

[MFU] ok.

> There are really two variants here, perhaps make that more explicit,
> figures for each? N3 and RDF/XML representations for each?
>
> The figure does not have the class(es): BookAbout(African)Lions. I
> think it should, to show the parallel hierarchy.

See above: you don't necessarily have a parallel hierarchy.

[MFU] ok.

> APPROACH 5
>
> Add the consideration: there are no non-standard semantic
> interpretations of this approach. Or no semantic interpretations that 
> differ from the original intent of an existing ontology that is being 
> re-used. Well, maybe it is non-standard to view dc:subject as an 
> annotation property...

Indeed, that's unclear. I am not sure we want to say that this is  
definitely standard.

Again, thanks a million for doing it this carefully! I look forward to  
the discussion tomorrow. Too bad I won't be able to be there in person!

[MFU] Good point. 

Natasha

[1]  
http://smi-web.stanford.edu/people/noy/ClassesAsValues/ClassesAsValues 
-2nd-WD.html

Received on Thursday, 3 March 2005 22:18:26 UTC