RE: [SE] Suggestion of new note from Uschold, Michael F on 2005-09-20 (public-swbp-wg@w3.org from September 2005)

From: Uschold, Michael F <michael.f.uschold@boeing.com>
Date: Mon, 19 Sep 2005 20:07:51 -0700
To: "Holger Knublauch" <holgi@stanford.edu>, <public-swbp-wg@w3.org>
Message-ID: <4301AFA5A72736428DA388B73676A381B4C8D0@XCH-NW-6V1.nw.nos.boeing.com>
Here are some comments on your note.

Mike


============================================
Mike Uschold
Tel: 425 865-3605              Fax: 425 865-2965
============================================


GENERAL: ============
This is turning out to be a very useful note indeed. Much good work has gone into it, and what is there so far is mostly very good, in stark contrast to earlier efforts. Congratulations to the TF. 
--

There is a problem with the ambiguity of the term 'restriction', which is used both in the common English sense (i.e. to limit) and in the strict OWL sense of a class defined on the basis of a 'restriction' (in the common English sense). Tough problem to deal with.
--

All of the differences in the table at the end should be illustrated with simple examples.

RDF is used all over the place when it should be: RDF Schema.
--

The opening example is very good, illustrating how independently built ontologies might be used together. It is also very good to use it throughout the note.
--

The note should have a better introduction. It starts by diving right in, w/o any context setting. Say much earlier what is the storyline/contents/overview of the paper as well as outline the specific objectives. The latter can be accomplished by the sentence used in the email announcing this draft (see below).

Here is some sample text that attempts to describe the overall story and motivates the note: 

==
Great progress has been made in the use of models in software engineering; the benefits are (blah blah blah).  Recent MDA-based software development tools move this forward significantly, addressing some of the common issues in software engineering, such as models being used only at the beginning and getting out of date as the code develops. However, there are still challenges. <name them, like interoperability>

Independently the Semantic Web community has been developing and maturing technology that can be used to build and reason with models.  The languages offer some advantages over UML, the de facto standard modeling language for object-oriented software development.

This note is intended to act as a Semantic 
Web Primer for software developers with background in object-oriented 
languages like UML and Java.  Our goal is to clarify the differences 
between RDF/OWL and OO languages, and to attract more mainstream 
developers to add Semantic Web technology to their routine tool kit.
==
--


The INTRODUCTION is overall very good, many of the classic problems are introduced and described, thus motivating the note overall. It would be great to pull out the key points and put them in a list of bullets.  The main items in the list should  be mentioned explicitly at the beginning, and again at the end. It is the main motivation for using models/ontologies and is thus central to the note.
--

SECTION 2: APPLICATIONS:
Much good material, I especially like the figure. I felt a bit lost reading this section, because there is so much material, it is hard to know what to do with it. It could use a better story line. One way is to organize this section according to the benefits you claim can be had, then you can use that to motivate specific content. There may also be a need for a general overview of certain things, not attached to any specific objective. 
--

SECTION 3: RDF/OWL
I think this section would be improved by a more systematic introduction to the similarities and differences.  That might look like this:
The main comparisons between Semantic Web and object-oriented languages are:
classes vs. classes
properties vs. attributes
domain constraints vs. attaching an attribute to a class
range constraints vs. ??
open world vs. closed world

Actually it is pretty good as it is, this might be a minor point, either ignored, or tidied up fairly easily.  You do this sort of thing later, with more chatty text earlier.

For example, you talk about classes, then get a bit sidetracked, then come back to properties and attributes.  The current presentation is nice and chatty, so that may be a tradeoff.
--


============ SPECIFIC: ============
INTRODUCTION: 
In the opening example, there is a lot of 'we may do this' and 'one may do that', which sounds weak. Better to say: here is one typical software construction scenario; we do this, we do that, we do the next thing.  Rephrase accordingly.  
--

Do you want this sentence to suggest that if the program is successful, you may not want to reuse parts of it? If not, reword.
"If our system is not so successful or the system it ported to a different platform, we may at least want to reuse parts of it."
--

Wording glitch:
"The highest potential for reuse and interoperability in our example scenario would have the UML diagram."

Perhaps you mean: 
"The UML diagram offers the highest potential for reuse and interoperability in our example scenario."
--

In the following sentence, you say 'could'. Isn't this done routinely now with MDA software support packages, like Rational Rose? 

"The UML model is on a higher level of abstraction and could be used to derive implementation code for various purposes."

You also say: 
"Furthermore, UML diagrams are typically only maintained as intermediate artifacts in the development life cycle, used as the foundation for the implementation but then put into drawers, where they are inaccessible to other developers."

This is not true for MDA software development, as I understand it. That's the whole point.  Perhaps my facts are off?
--

Replace "crafted from the scratch"  with "crafted from scratch at the start" or similar.
--

Awkward sentence, hard to figure what you are trying to say.
"In a nutshell, Semantic Web based development suggests to design domain models in Web-based object-oriented languages such as OWL and RDF."

Do you mean:  
"In a nutshell, the Semantic Web community has produced an alternative set of languages and tools for developing, maintaining and using domain models for software engineering. At the core are the languages OWL and RDF."
--

This sentence is potentially very misleading:
"The OWL models themselves encode much of their meaning (also known as semantics), so that applications can discover and access appropriate models dynamically."

What exactly do you mean? Can you give an example? What this sentence suggests is, in general, just plain wrong: it is very difficult for HUMANS to look at another OWL model and decide exactly what the intended meaning is for defined concepts and relations.  No computer is going to be able to do that. Also, the 'intended' meaning is different from the actual meaning in the formal sense, i.e. as determined from the axioms.
The matter is complex and subtle, getting into model theory etc, which you surely want to avoid in this note.

I suggest backing off from the above statement.  Give an example of what EXACTLY you mean, and then say it in a clear way;  the result will likely be a very weak version of the above statement.  
--

The following sentence may need more elaboration as a way to introduce important concepts.
"The richness of the Semantic Web representation languages makes it easier to build reusable, quality domain models, because additional reasoning services such as consistency checking and classification can be exploited. At the same time, OWL and RDF operate on similar structures like object-oriented languages, and therefore can be relatively seamlessly integrated with traditional software components."

As it stands, it is a bit too full with buzzwords etc.  Address the following questions that readers may ask:
*	what is 'richness' and how does richness make it easier to build reusable, quality domain models?
*	what is a quality domain model?
*	how does classification help?

What the sentence does say, though it is clouded by being so long and carrying so much content, is that consistency checking helps build more accurate models. 
--

SECTION 2: Application Development 

"be easier analyzed" -> "more easily analyzed"
--

"internet contents" -> "internet content" (several occurrences)
--

It seems you should make a distinction between object-oriented software languages like Java, C++, etc. and object-oriented modeling languages like UML [and frame-based representation languages that pre-dated OWL].
--

"For example, RDF can be used to define that the class Product has a property hasPrice which takes values of type float."

I think you mean RDF Schema, not RDF (this occurs countless times)
--

"a HTML page" -> "an HTML page" - this occurs in various places.
--

" can be linked into the Semantic Web"  is a highly ambiguous statement, not least because 'Semantic Web' is ambiguous. Try to say something less contentious, and still accurate. e.g.
"can be published on the Web just as any HTML page is published".
--

You say: 
"For example, a HTML page showing a certain product could encode metadata to link back to the corresponding entity in an RDF model. Or, providers of certain products can instantiate the RDF classes to announce their portfolio to shopping agents."

but you fail to motivate why doing so is useful, i.e. so what? Relate the answer back to the advantages of Semantic Web for software development from the introduction.
--

Figure 2:  Overall, very nice figure illustrating some key things at a very general level.

Minor point: you seem to be suggesting that OWL files will be displayed to look like UML models. This is fine, but potentially misleading - readers might assume that this is the standard way to view them.  Be clear about the fact that UML diagrams are only one way to show [parts of] an OWL file. There is a lot that is different about OWL and UML, as well as many similarities.
--

"While some of this could also be achieved using traditional XML-based approaches"

Some of what, exactly?
--

This is a very dangerously misleading statement, even though it is in a sense true:
"Since [the] basic structure [of Semantic Web languages] is in a sense object-oriented, it is possible to define subclasses and generalizations of concepts "

The sentence is backwards: it does not follow that because OWL is object-oriented, it has these things, rather, because OWL has these things, it has a similarity with object-oriented models.  The basic common threads for both OWL and object-oriented languages include: classes, subclasses, properties, inheritance.  

The truth is, at its heart, OWL is NOT AN OBJECT-ORIENTED LANGUAGE, but the precise difference is somewhat subtle.  One core difference is:
OBJECT-ORIENTED: a property is fundamentally attached to a class
OWL: a property is NOT fundamentally attached to a class, it has its own existence. Associations with classes are made via the domain and range constraints. There are other key differences, which I see are covered in a later section.
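
A minimal sketch of this difference in Python. The names (Product, hasPrice, item42) and the encoding of statements as plain tuples are assumptions made for illustration; they are not taken from the note, and a real application would use an RDF library rather than tuples. The rule applied is the RDF Schema domain rule: if a property has domain C and some subject uses that property, the subject can be inferred to be a C.

```python
from dataclasses import dataclass


# Object-oriented view: the attribute 'price' exists only as part of the class.
@dataclass
class Product:
    price: float


# OWL / RDF Schema view: the property is declared on its own, as statements.
# All names here are hypothetical.
triples = {
    ("hasPrice", "rdf:type", "rdf:Property"),
    ("hasPrice", "rdfs:domain", "Product"),
    ("hasPrice", "rdfs:range", "xsd:float"),
    ("item42", "hasPrice", "9.99"),
}


def infer_types(statements):
    """Apply the rdfs:domain rule to a set of (subject, property, object) tuples."""
    domains = {(s, o) for (s, p, o) in statements if p == "rdfs:domain"}
    inferred = set()
    for prop, cls in domains:
        for s, p, o in statements:
            if p == prop:
                # The subject uses 'prop', so it is inferred to belong to 'cls'.
                inferred.add((s, "rdf:type", cls))
    return inferred


inferred = infer_types(triples)
```

Note that item42 was never declared to be a Product; the membership follows from the domain statement. In the object-oriented version there is no way to give a price to something that is not already a Product.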
--

This sentence is in a sense, true, but also potentially very misleading and thus an easy target for being accused of being hype:
"This means that whenever a model of a certain domain has been published on the Web, then others are able to build upon it, and thus to establish a network of domain knowledge."

While this is the promise of the Semantic Web, the truth is that connecting things up willy-nilly is, by itself, not that useful; you need to make sure that the semantics of things is consistent. This topic is taken up more in the semantic integration and interoperability note of the OEP group.
--

"much easier" -> "much more easily"
--

"Furthermore, it is far more likely that an application-independent reusable component (such as a shopping basket application or a credit card handling Web Service) can be integrated."

Far more likely than what?
--

This is another promise of the Web that is fantastically hard to achieve in practice; be careful not to hype too much: 
"This means that OWL models are not only limited to defining classes and their attributes, but can also encode the intended "meaning" of these classes, so that the classes can be unambiguously shared between groups of humans or machines."
--

What is important is that the LANGUAGE has a formal semantics, so that the precise meaning of subClassOf or transitivity is unambiguous (at least to a logician who can read and understand a model theory).  Removing such ambiguity is important for getting interoperability when people build tools and reasoners to support the language. 

It is AN ENTIRELY DIFFERENT MATTER to claim that defining a term like "process" or "product" in your OWL ontology will be unambiguous to some agent on the Semantic Web.

On this point, here are a few paragraphs from a position statement [to be?] published in the first issue of the Journal of Applied Ontology:

===
It is often blithely assumed that representing the semantics explicitly and formally, removes all ambiguity in meaning, thus enabling machines to be programmed to automatically discover the meaning and behave appropriately.  Such claims are very misleading, or just plain false.  For example, terms defined in a logic-based representation language with a model-theoretic semantics are highly ambiguous.  Adding more axioms can rule out more and more models, however for many concepts, such as 'human being' or 'car' there are fuzzy boundaries. It will never be possible to add enough axioms to stamp out all ambiguity. Even if you could, it is not likely to be useful to have many dozens or hundreds of axioms for each concept - how would they be used?  

Another important point is that in practical terms, meaning often bottoms out in natural language rather than formal definitions. For example, to understand the precise meaning of terms defined in ontology representation languages, one needs to learn about model-theoretic semantics. This requires talking to someone or reading logic textbooks. Even if a full axiomatization were possible, it is computationally intractable to determine, say, whether the semantics of two terms is the same.  It is not even clear what it means for an agent to automatically determine the semantics of a term from a formal definition. How would the agent internally represent that meaning? 

Because full axiomatizations are neither possible nor desirable in many cases, we must again rely on natural language definitions to be more sure of what something means.  No agent is going to automatically determine what 'author' means from looking at an RDF Schema definition of the term.  Humans have to look at the natural language definitions and use that information to build the correct behavior into software to be compliant with the Dublin Core. 

A challenging research issue is to explore the boundaries of automated semantic processing vs. the need to manually encode the meaning in software. Can we understand and begin to automate how meaning is built up from smaller pieces, and what needs to be assumed? What needs to be done by humans, and what can be automated? How far can we go by simply agreeing on what things mean, and not having to rely on machine-processible semantics [see Uschold 2003]?

[Uschold 2003] Uschold, M. Where are the Semantics on the Semantic Web? AI Magazine, 24(3), pp 25-36. Fall 2003.
===
--


Re: Protégé and screenshots of tools.
It is certainly fair to use a screenshot of Protégé, based on its pervasive use. It might be good to also show one from another tool, one that has diagrams that look much more like UML. One example is Construct, from Cerebra. Dunno if it being a commercial product is problematic.
--

The following is not true, for MDA software development:
"In contrast to traditional object-oriented design methodologies, where analysis and design only leads to intermediate artifacts for code generation, the Semantic Web approach uses the same models for all stages from analysis, design, implementation to testing and even at run-time."
--

Why the 's' in 'RDFs'?  -- "so that all RDFs files could directly refer to each other."
--

This is an odd wording, using 'fill': "and fill the object with a specific price". I guess you mean assign a value?
--

How does open world contrast with UML / object-oriented? Give an example of a practical situation where this difference is important; the discussion in the note may be true, but it is too general.
--

This is judgmental: 
"However, there is little beyond that, and RDF alone would be a rather poor domain modeling language."
Perhaps better to say that RDF Schema has the basics for modeling, and is useful in some circumstances.  If there are specialized needs, then they can use a richer language. There was a lot of demand for OWL-Lite, which is nearly the same in expressivity as RDF-S.
--

Instead of a laundry list of added expressivity, motivate the need for them with some examples. This brings it to life.   Start with an example with some depth and detail to it, and show in that single example how the various features of the language are used and how they help. It is one thing to merely be ABLE to express something, it is another for that to add value somehow.  
--
Quoted Paragraph: 
"OWL adds language elements to express complex logical relationships between classes and properties. The central building blocks of these relationships are so-called restrictions, which are used to describe the characteristics of the property values at a certain class. OWL supports various types of restrictions"

The 1st sentence is not quite right, or does not emphasize the key aspect.  A cardinality constraint has nothing to do with a class. Also, what does it mean to be "at a class"? The rest of the paragraph is not quite true,  or is somewhat misleading. It seemingly jumbles together two very different things: constraints on a particular property (e.g. cardinality), independent of a class, and properties that are used to define classes i.e. restrictions.  

Perhaps the word 'restriction' is ambiguous. The heart of the matter (i.e. central building blocks) is indeed restrictions (in the common English usage) on aspects of properties. But 'restrictions' as per OWL, are class definitions, and class definitions are not the central building blocks, though they are quite important. Properties are at the heart of the matter.

Indeed, it is unfortunate that the term 'restriction' in OWL is defined to mean a class defined by the specification of a restriction of the possible values that one or more properties can have. It makes talking about this stuff in readable English nearly impossible.  

I would replace the paragraph above with something like: 
===
OWL adds the ability to express more information about properties, for example, you can specify that the value of the hasWheel property is equal to 2; that the tallerThan relationship is transitive, etc.  

OWL also allows you to create classes based on property values.  For example, you could define the class DogTrainer to be exactly those individuals who participate in the trainerOf relation where the participating individual is a member of the class Dog.  Classes may be defined using complex logical relationships between classes and properties. For example, you could define a class DogTrainerFromNewYork by adding an additional condition to the one used to define DogTrainer. The new condition would state that the individual is in the cityBornIn relationship with the instance of city: NewYork.  
==
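
The DogTrainer wording above can be sketched in Python as follows. This is only an illustration under stated assumptions: the individuals (rex, tom, alice, bob, carol) are made up, statements are plain tuples, and the functions simply look up what is stated rather than doing what an OWL reasoner does.

```python
# Hypothetical facts, written as (subject, property, object) statements.
facts = {
    ("rex", "type", "Dog"),
    ("tom", "type", "Cat"),
    ("alice", "trainerOf", "rex"),
    ("alice", "cityBornIn", "NewYork"),
    ("bob", "trainerOf", "rex"),
    ("bob", "cityBornIn", "Seattle"),
    ("carol", "trainerOf", "tom"),
}


def members(cls):
    """Individuals explicitly stated to be of the given class."""
    return {s for (s, p, o) in facts if p == "type" and o == cls}


def dog_trainers():
    """DogTrainer: individuals with at least one trainerOf value that is a Dog."""
    dogs = members("Dog")
    return {s for (s, p, o) in facts if p == "trainerOf" and o in dogs}


def dog_trainers_from_new_york():
    """DogTrainerFromNewYork: DogTrainer plus the condition cityBornIn NewYork."""
    born_in_ny = {s for (s, p, o) in facts if p == "cityBornIn" and o == "NewYork"}
    return dog_trainers() & born_in_ny
```

Nobody is declared to be a DogTrainer; membership follows from the property values. With these facts, alice and bob qualify as dog trainers and only alice as a dog trainer from New York, while carol does not qualify because tom is not a Dog.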

Here is a good example of the confusion that arises with the technical usage of 'restriction' to mean class.

"Cardinality restrictions limit the number of distinct values that a property can have at a certain class. "

What does this mean literally? It is hard to say.  What does it mean for a class to limit something? What does it mean to be "at a class?"  It is perhaps more accurate to say:

===
Cardinality can be used in a class definition to limit (i.e. restrict) the number of distinct values (of a certain class) that a property can have. A class defined in this way is called a cardinality restriction.
===
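
A rough sketch of a class defined by cardinality, again in Python with made-up data. It assumes that the listed values are all distinct and that the list is complete; an OWL reasoner assumes neither, so this shows only the idea of "a class defined by how many values a property has", not actual reasoner behavior. The names bike1, trike1 and hasWheel are hypothetical.

```python
from collections import defaultdict

# Hypothetical facts, written as (subject, property, object) statements.
facts = {
    ("bike1", "hasWheel", "w1"),
    ("bike1", "hasWheel", "w2"),
    ("trike1", "hasWheel", "w3"),
    ("trike1", "hasWheel", "w4"),
    ("trike1", "hasWheel", "w5"),
}


def cardinality_restriction(prop, n):
    """The class of individuals having exactly n distinct stated values for prop."""
    values = defaultdict(set)
    for s, p, o in facts:
        if p == prop:
            values[s].add(o)
    return {s for s, vals in values.items() if len(vals) == n}


two_wheeled = cardinality_restriction("hasWheel", 2)
```

The result of cardinality_restriction is itself a class (a set of individuals), which is the point of the rewording: the restriction is the class, not a check bolted onto some other class.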

This is very challenging wordsmithing, and the concepts themselves are rather peculiar at first sight. After 15 years doing KR and ontologies (but never having used a DL in earnest), it took me quite a while to get my head around these ideas. Therefore, it is critical to get this explanation right if readers are to be able to understand it.  The key is that an OWL Restriction is a class defined by restricting values of properties in certain conditions. 
--

The following sentence should come earlier in the description:
"It is important to understand that OWL restrictions are themselves classes (so-called anonymous classes)."
--

This is quite odd, I don't know what you mean:
"OWL reasoners can be used to determine the most appropriate types of individuals"
Most appropriate for what???
--
The idea of an anonymous class is probably best left unmentioned here, it is a bit subtle, and potentially very confusing.
--

The example below is not very compelling, can you think of a better one?  Why would a user care about "the most specific class that the particular order belongs to".  A simple rewording to be more user-focused, instead of reasoner focused would probably do the trick. 

"Now assume, a new user logs into the online shop and starts putting items into his shopping basket. Internally we will create blank instances of the Customer and PurchaseOrder classes. Later, when the user proceeds to the check out and enters his delivery address, we can ask a reasoner to classify the PurchaseOrder. This will give us the most specific class that the particular order belongs to (here, a DutyFreeOrder)."
--

In general, you need more examples that an object-oriented software developer can relate to. Why would they want to classify anything? It is easier to see why they might want to validate the model, but even then, that presumes that the model is useful for something else.
--

The following seem to be saying the same thing:
OBJECT-ORIENTED: Each individual has one class as its type.
OWL: Each individual can belong to multiple classes.

OBJECT-ORIENTED: Classes cannot share instances.
OWL: Individuals can belong to multiple classes.
--

Perhaps replace: "The list of classes is known at compile-time." with "The list of classes is fully known at compile-time, and cannot change after that."   
This is because the classes of an ontology are also known at compile time, but more can come later.
--

"built-time" should be "build-time"
--

I'm not sure what this distinction is:
OBJECT-ORIENTED: Instances can only take values for the attached properties. Values must be of the correct types defined for the properties.  	

OWL: Any instance can take arbitrary values for any property, but this may affect what reasoners can infer about their types.

For the first, do you mean: An instance of a given class can only take values for the properties attached to that class.

It is true for OWL as well that:  "Values must be of the correct types defined for the properties.", i.e. they must obey range constraints.

I really don't get how the 2nd statement relates to the first. Maybe each of these differences needs a nice simple example to illustrate it.
--

OBJECT-ORIENTED: Classes encode much of their meaning and behavior through imperative functions and methods.  	
OWL: Classes make their meaning explicit in terms of OWL statements. No procedural attachment is possible.

Use language consistently: if procedural attachment is the same as imperative functions and methods, then use just one phrase in both statements. That makes the comparison more explicit.
--

This does not make sense:
"Closed world: If something is not part of the model, then it is assumed to be false."

In what sense is a pickle false, just because it is not in the model? You mean if there is not enough information to prove a statement true, it is assumed to be false.
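
The contrast can be sketched in a few lines of Python. The facts and names (alice, bob, hasCreditCard) are hypothetical, and returning None for "unknown" is just one simple way to model an open-world answer.

```python
from typing import Optional

# Hypothetical set of statements known to the model.
known = {("alice", "hasCreditCard", "visa123")}


def holds_closed_world(statement) -> bool:
    """Closed world: a statement that cannot be shown true is taken to be false."""
    return statement in known


def holds_open_world(statement) -> Optional[bool]:
    """Open world: a statement that cannot be shown true is simply unknown (None)."""
    return True if statement in known else None


query = ("bob", "hasCreditCard", "amex9")
closed_answer = holds_closed_world(query)
open_answer = holds_open_world(query)
```

For the query about bob, the closed-world function answers False while the open-world function answers None. A practical consequence: under the closed-world reading an application could refuse bob's order outright, whereas under the open-world reading it can only conclude that it does not yet know.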
--

Overall this list has a ton of useful information. It could be improved by teasing out different distinctions a bit better. In various cases, it is not clear what is being compared.
--




>  -----Original Message-----
>  From: Holger Knublauch [mailto:holgi@stanford.edu] 
>  Sent: Monday, September 19, 2005 4:55 AM
>  To: public-swbp-wg@w3.org
>  Subject: [SE] Suggestion of new note
>  
>  
>  
>  The SETF is currently working on a note intended to act as a 
>  Semantic 
>  Web Primer for software developers with background in 
>  object-oriented 
>  languages like UML and Java.  Our goal is to clarify the differences 
>  between RDF/OWL and OO languages, and to attract more mainstream 
>  developers to add Semantic Web technology to their routine tool kit.
>  
>  The current draft of this note is available at
>  
>  http://www.knublauch.com/oop/2005/09/19
>  
>  We welcome comments of any sort.
>  
>  Please note that due to recent problems with the SMI email 
>  server, many 
>  messages addressed to me have been lost.  I therefore had to 
>  switch to a 
>  different stanford.edu address.  If you had sent a message 
>  to me in the 
>  last few days, please consider to resend it again.
>  
>  Regards,
>  Holger
>  
>  
>
Received on Tuesday, 20 September 2005 03:08:13 UTC