Re: Learning from other disciplines? from Pat Hayes on 2009-03-02 (public-awwsw@w3.org from March 2009)

From: Pat Hayes <phayes@ihmc.us>
Date: Mon, 2 Mar 2009 00:23:08 -0600
To: Alan Ruttenberg <alanruttenberg@gmail.com>
Cc: "Booth, David (HP Software - Boston)" <dbooth@hp.com>, Michael Hausenblas <michael.hausenblas@deri.org>, Jonathan Rees <jar@creativecommons.org>, AWWSW TF <public-awwsw@w3.org>
Message-Id: <F713B89D-951B-46D2-84E4-EC146E3BF83A@ihmc.us>
On Feb 26, 2009, at 10:29 PM, Alan Ruttenberg wrote:

>
>> , and yes, I
>> do think there are some rules of good modeling
>>
>> Can you write them down? Seriously, if you really think these  
>> exist, it
>> would be a valuable resource.
>
> There are some for the OBO Foundry: http://www.obofoundry.org/crit.shtml
> I often summarize:
>
> Document what terms mean by tracing them to elements in reality.

Not sure what that means. One general problem with appealing to  
reality when discussing modeling is that any account of reality is  
already a model and hence reflects a modeling choice. Thus, this  
advice can be circularly self-justifying: the model thinks of the  
world as containing, say, forms because forms are found in  
reality ...as perceived or conceptualized by the modeler, of course;  
but not perhaps as conceptualized by someone else. In this way, what  
are basically philosophical stances get justified as having an  
empirical grounding which in fact is entirely absent. OBO has a number  
of such arbitrary modeling choices incorporated into it.

> Or
> say when you are not.
> Have a theory of what an instance is.

That doesn't make sense to me, I'm afraid. (An instance of what?)

> Work out how instances stand in relation to one another – what are
> their properties.

Well, yes. But that is hardly a rule for good modeling, as it is a  
rule for any modeling at all.

> Define classes as instances with shared properties

Do you mean that a class is defined by properties shared by  
individuals, or held by individuals in common? If so, again that is  
simply a fact of logic, rather than a rule of good modeling. If you  
mean something else, I'm not following you.

> Figure out how to document and organize all this knowledge in a way
> that can be managed in a distributed manner.
> The product of this effort is an ontology
> Use the ontology to structure knowledge and data you wish to share  
> with others
>
> Others:
> Don't confuse words with the entities they denote

Agreed.

> (one of David's mistakes).
> Every term deserves a little thought. (how often is this one ignored?)

Actually, not all that often, in my experience. But several people can  
all think and still not all come to the same conclusion, and there are  
no empirical tests for deciding one over another.

>
> There are a bunch of things I've learned about building ontologies
> with others. For instance, don't argue about names - figure out
> definitions and then give names later.

I agree that this is good methodology, as arguments about named  
entities can go on for ever. (The same applies to drawing, by the way,  
which is quite amusing. One of the early lessons when learning to draw  
is to actively forget what it is that one is drawing, to refuse to  
allow oneself to classify it. Only then can you focus on drawing the  
appearances, which are far more complex and visually interesting than  
the schemata that we all use to classify the visual world.)

>
> Yes it would be great to have a resource of the various lessons we've
> all learned.
>
>>
>> , of which thinking is a
>> first prerequisite. This AKT example has been mentioned too many  
>> times
>> and it is clear that independent of any 'introduced' ambiguity, the
>> modeling choice is unexplained and incoherent from the getgo. No
>> amount of patching will fix it.
>>
>> Well, it sounds like the basis of the issue here is whether to model
>> something as a class or an individual. As a general question, this  
>> arises
>> quite frequently. The "best" thing to do has no obvious answer,  
>> because it
>> depends on the formalism you are using. There is no single, clear  
>> answer
>> which works in all cases. If you use a decently expressive  
>> language, the
>> question is meaningless: so it evidently is not an ontological  
>> question.
>
> The question is certainly not meaningless.

The question of whether something is a class or an individual is  
(literally) meaningless in a sufficiently rich first-order language.  
It is meaningless in CL, in particular: everything is an individual  
and everything is, or can be, a class. RDF and OWL-Full share this  
ontological liberality.
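
To make this concrete, here is a minimal sketch in plain Python (not a real RDF toolkit). The ex: names and the helper functions are hypothetical, invented only for illustration. It shows a set of triples in which one and the same term is used both as a class and as an individual, which is all that the RDF/OWL-Full liberality amounts to at the level of the data:

```python
# A tiny triple set: (subject, predicate, object).
# All ex: names are hypothetical and exist only for this illustration.
RDF_TYPE = "rdf:type"

triples = {
    ("ex:Fido", RDF_TYPE, "ex:Dog"),        # ex:Dog used as a class
    ("ex:Dog", RDF_TYPE, "ex:Species"),     # ex:Dog used as an individual
    ("ex:Species", RDF_TYPE, "rdfs:Class"),
}


def instances_of(cls, graph):
    """Return every term asserted to have rdf:type cls."""
    return {s for (s, p, o) in graph if p == RDF_TYPE and o == cls}


def types_of(term, graph):
    """Return every class that term is asserted to belong to."""
    return {o for (s, p, o) in graph if p == RDF_TYPE and s == term}


def plays_both_roles(term, graph):
    """True if term has at least one instance and at least one type."""
    return bool(instances_of(term, graph)) and bool(types_of(term, graph))
```

Nothing in the triple set itself marks ex:Dog as "really" a class or "really" an individual; both readings fall out of how the term is used.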

The Class/Individual distinction is a restriction built into  
conventional logics in the first half of the 20th century in order to  
avoid Russell-style paradoxes. We now know enough about logic to see  
that it is not in fact needed in order to avoid paradox, and can in  
fact be simply ignored. (I have found that this idea is often resisted  
most strongly by those who have become used to using the logical  
distinction to encode an ontological difference of some kind, along  
the lines Barry Smith outlines in his 'against fantology' paper. But  
that is just a category mistake; and once one gets used to using a  
radically untyped logic, the freedom of expression it allows you is  
very hard to give up.)

> It goes to the question: of
> what are you speaking.
>
>> I note that David has given *no* definitions for any of the terms he  
>> uses.
>> Shall we at least agree that minimal documentation of the terms one
>> uses, in natural language, is a universally applicable good modeling
>> practice? At least when attempting to write in a scholarly way about
>> modeling?
>>
>> Documentation, but not definition.
>
> Hardly. It reads to me like a recipe for a magic potion.

To ask for definitions of everything is naive. Most natural kind terms  
do not have definitions, and definitions are not required for writing  
ontologies.

>
>> If you believe that ambiguity can be entirely eliminated by "good  
>> practice",
>>
>> then I venture to suggest that your advice will be one of the  
>> enduring
>>
>> problems for future knowledge workers.
>>
>> I don't believe I said that. I said that in this case one need not
>> bring in any new terminology to see what's gone wrong - that applying
>> basic knowledge of how to model (starting with saying what your
>> instances mean) leads to a situation in which one need not introduce
>> Dave's hacky solution.
>>
>> And this will be true pretty much
>>
>> independently of the advice itself, because whatever you tell  
>> people to do,
>>
>> other people will choose, for excellent reasons, to do the opposite.
>>
>> Sure, and there will be people claiming that the earth was created
>> 5000 years ago. They just shouldn't be doing science.
>>
>> Bad analogy. Modeling isn't doing science; and there isn't a  
>> science of
>> modeling (yet).
>
> Modeling has to head in that direction. Maybe I think there's more of
> a shot at that than you do. At least we have to act like we're trying.

It doesn't have to head in that direction; in fact, I don't even think  
that "direction" makes sense. It would be pseudo-science at best, and  
we have altogether too much of that already.

> When I read some of your messages they strike me as advertising: don't
> worry, model how you like, no standards necessary, don't use any
> criteria to evaluate what you've done....

No standards of how to model, right. There are no such standards, and  
anyone who tries to set any up should be resisted with every means  
possible, because any such attempt will be one philosophical position  
attempting to dominate others, and there are no possible objective or  
empirical reasons to give any one philosophical position any special  
status. None of them are right or wrong in any objective sense, any  
more than OO programming is more objectively right than logic  
programming. They all have their strengths and weaknesses.

But as for using criteria to evaluate, there indeed we can make some  
useful comparisons, provided that we are clear as to what exactly it  
is that we are trying to achieve. I am all for pragmatic tests of  
utility and complexity of ontologies. But these have to be genuinely  
empirical in order to be more than just reiterations of prejudices;  
and part of the reason for the impression you get from my messages is  
that I feel quite strongly that it is far too early in the general  
game for any of us to be drawing hard empirical lessons about what is  
best, as we simply don't have enough accumulated experience yet. And  
one of the best ways to get some experience is for as many people as  
possible to be trying to model things in as many ways as possible, and  
we will all see which ways work well and which ways don't. Which is  
exactly what people developing SWeb applications using RDF are trying  
to do. No doubt many of these will be flops, because of poor modeling  
decisions. But I want to discover what works and what doesn't, rather  
than try to influence or, God forbid, legislate by fiat what I or  
anyone else thinks are "good" modeling styles. None of us know yet  
what is good and what isn't. The formalisms continue to surprise us.

Pat


>
> -Alan
>
>> Pat
>>
>> -Alan
>>
>>
>> Pat Hayes
>>
>>
>>
>>
>> Otherwise we land up with the same historical mess of incommensurate
>>
>> data dressed up in a brand new syntax.
>>
>> -Alan
>>
>>
>> I've updated the scenario to make this clearer:
>>
>> [[
>>
>> It is worth noting that in this particular scenario, the problem that
>>
>> Jann and Luke face could have been averted if they had initially  
>> modeled
>>
>> these genes as classes rather than as individuals, because then  
>> AKT1, AKT2
>>
>> and AKT3 could simply have been subclasses of AKT.  Indeed,  
>> modeling things
>>
>> as classes does help avert -- or at least postpone -- this kind of  
>> issue,
>>
>> though it may create other issues.  However, the point of this  
>> scenario is
>>
>> not to debate Jann's modeling decisions, it is to illustrate how this
>>
>> ambiguity issue can be addressed when it does arise.  Thus, we need  
>> to
>>
>> assume that, for whatever reason, Jann did what he did, and Katie's
>>
>> application then depended on Jann's definitions.
>>
>> ]]
>>
>>
>> Separately one has terminology. The word "AKT" was initially only a
>>
>> label for AKT. Later it also became a label for AKT1. No surprise
>>
>> again - words are notoriously ambiguous.
>>
>> Yes, that happened historically, but it is irrelevant to the point  
>> of the
>>
>> AKT scenario I described.  I've clarified the note at the beginning  
>> to
>>
>> indicate more clearly that the scenario I describe was merely  
>> *inspired* by
>>
>> the history of AKT, but the details are fictional: "This scenario was
>>
>> inspired from the actual history of AKT.  However, the names and  
>> other
>>
>> details are completely fictional."
>>
>>
>> There is no need to introduce these *completely undefined* relations
>>
>> s:isBroaderThan etc. There *is* a need to understand and use existing
>>
>> *defined* mechanisms, such as rdfs:subClassOf and rdfs:label, in this
>>
>> case.
>>
>> rdfs:comment wouldn't be a bad idea while we're at it.
>>
>> I don't understand why you say that these are undefined.  They are
>>
>> defined in a later section of the document, as noted, using  
>> rdfs:comment:
>>
>> http://dbooth.org/2007/splitting/#isBroaderThan
>>
>> [[
>>
>> s:isBroaderThan a rdf:Property ;
>>   rdfs:label "isBroaderThan" ;
>>   rdfs:comment """s:isBroaderThan indicates that the subject URI
>>       has a URI declaration that is broader than some URI declaration
>>       of the object URI.  (See s:isBroaderThanDeclaration.)
>>       This is a convenience property:
>>       Since a URI could have more than one URI declaration,
>>       this property makes weaker statements than
>>       s:isBroaderThanDeclaration. """ ;
>>   rdfs:domain xsd:anyURI ;
>>   rdfs:range xsd:anyURI .
>>
>> s:isNarrowerThan a rdf:Property ;
>>   rdfs:label "narrows" ;
>>   rdfs:comment """isNarrowerThan is the inverse of s:isBroaderThan.""" ;
>>   rdfs:domain xsd:anyURI ;
>>   rdfs:range xsd:anyURI .
>>
>> ]]
>>
>> Did you want some other kind of definition?
>>
>>
>>
>> David Booth, Ph.D.
>>
>> HP Software
>>
>> +1 617 629 8881 office  |  dbooth@hp.com
>>
>> http://www.hp.com/go/software
>>
>> Statements made herein represent the views of the author and do not
>>
>> necessarily represent the official views of HP unless explicitly so  
>> stated.
>>
>>
>>
>> -Alan
>>
>> On Thu, Feb 26, 2009 at 4:39 PM, Michael Hausenblas
>>
>> <michael.hausenblas@deri.org> wrote:
>>
>> Thanks David!
>>
>> Re http://dbooth.org/2007/splitting/ - yes, I'm aware of it
>>
>> (actually
>>
>> bookmarked it on delicious on 26 Jan 2009 ;) and of course
>>
>> I read it.
>>
>> I must admit that when I read your note I didn't really
>>
>> get/see this point.
>>
>> My bad, sorry.
>>
>> @Jonathan: as there are at least two people around that
>>
>> think into the same
>>
>> direction and maybe some more that could imagine this can
>>
>> solve some of our
>>
>> issues around httpRange, IR, etc. - how about adding it to
>>
>> the TAG F2F
>>
>> agenda? Or is it too late? Too vague?
>>
>> Cheers,
>>
>>     Michael
>>
>> --
>>
>> Dr. Michael Hausenblas
>>
>> DERI - Digital Enterprise Research Institute
>>
>> National University of Ireland, Lower Dangan,
>>
>> Galway, Ireland, Europe
>>
>> Tel. +353 91 495730
>>
>> http://sw-app.org/about.html
>>
>> http://webofdata.wordpress.com/
>>
>>
>> From: "Booth, David (HP Software - Boston)" <dbooth@hp.com>
>>
>> Date: Tue, 24 Feb 2009 22:06:53 +0000
>>
>> To: Michael Hausenblas <michael.hausenblas@deri.org>, AWWSW TF
>>
>> <public-awwsw@w3.org>
>>
>> Subject: RE: Learning from other disciplines?
>>
>> Michael,
>>
>> That sounds similar to what I've been arguing for quite a while:
>>
>>  (a) Ambiguity is unavoidable. (Pat Hayes has articulated
>>
>> this point much
>>
>> better than me though.)
>>
>>  (b) The ambiguity involved in failing to distinguish
>>
>> between an IR and a
>>
>> non-IR is not fundamentally different than other kinds of
>>
>> ambiguity.
>>
>>  (c) Something that is adequately clear and unambiguous to
>>
>> one application may
>>
>> be ambiguous to another application, because different
>>
>> apps have different
>>
>> needs.  A URI such as http://markbaker.ca/ that denotes
>>
>> both a person and a
>>
>> web page may be perfectly fine for an application that has
>>
>> no need to
>>
>> distinguish between IRs and non-IRs, but it may cause
>>
>> confusion and havoc to
>>
>> an application that relies on such a distinction.
>>
>>  (d) Therefore, there is no need to view such
>>
>> IR-versus-non-IR ambiguity as a
>>
>> violation of web architecture, though it may be a
>>
>> violation of good practice.
>>
>> These points are explained a bit further in
>>
>> http://dbooth.org/2007/splitting/#httpRange-14
>>
>>
>>
>> David Booth, Ph.D.
>>
>> HP Software
>>
>> +1 617 629 8881 office  |  dbooth@hp.com
>>
>> http://www.hp.com/go/software
>>
>> Statements made herein represent the views of the author and do not
>>
>> necessarily represent the official views of HP unless
>>
>> explicitly so stated.
>>
>>
>> -----Original Message-----
>>
>> From: public-awwsw-request@w3.org
>>
>> [mailto:public-awwsw-request@w3.org] On Behalf Of Michael
>>
>> Hausenblas
>>
>> Sent: Tuesday, February 24, 2009 8:39 AM
>>
>> To: AWWSW TF
>>
>> Subject: Learning from other disciplines?
>>
>>
>> All,
>>
>> This is a crazy idea, but please give it a thought before
>>
>> rejecting it ...
>>
>> As far as I gather 'we' sort of fail to agree if we
>>
>> should/can define IR and
>>
>> non-IR or even if we need to differentiate between documents
>>
>> and abstract
>>
>> things at all. One could now try to understand the problem
>>
>> from a totally
>>
>> different point of view by learning from quantum mechanics.
>>
>> You are surely aware of the wave-particle duality [1]? So why
>>
>> can't we try
>>
>> to apply the same idea here. We can say, for example,
>>
>> that for a given
>>
>> application/use case the distinction between IR and non-IR
>>
>> makes no sense at
>>
>> all and hence is useless; all that counts at the end of the
>>
>> day are some
>>
>> bytes and maybe some metadata that we can get over the wire.
>>
>> In other cases
>>
>> one thing may be abstract or one thing may be a document. The
>>
>> Web version of
>>
>> the 'wave-particle duality'-equivalent would then render sort of:
>>
>> ===
>>
>> The 'document-thing duality' addresses the inadequacy of
>>
>> classical concepts
>>
>> (from the operating system domain, software development,
>>
>> etc.) like
>>
>> "document" and "abstract thing" in fully describing the
>>
>> behaviour of
>>
>> Web-scale objects.
>>
>> ===
>>
>> Comments, anyone?
>>
>> Cheers,
>>
>>      Michael
>>
>> [1] http://en.wikipedia.org/wiki/Wave-particle_duality
>>
>> PS: Jonathan, thanks a lot for your detailed comments re the
>>
>> dependencies
>>
>> visualisation - I will address them in a separate mail (esp.
>>
>> the n^2 table
>>
>> approach - I like it ;)
>>
>> --
>>
>> Dr. Michael Hausenblas
>>
>> DERI - Digital Enterprise Research Institute
>>
>> National University of Ireland, Lower Dangan,
>>
>> Galway, Ireland, Europe
>>
>> Tel. +353 91 495730
>>
>> http://sw-app.org/about.html
>>
>> http://webofdata.wordpress.com/
>>
>>
>> ------------------------------------------------------------
>>
>> IHMC                                     (850)434 8903 or (650)494  
>> 3973
>>
>> 40 South Alcaniz St.           (850)202 4416   office
>>
>> Pensacola                            (850)202 4440   fax
>>
>> FL 32502                              (850)291 0667   mobile
>>
>> phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes

------------------------------------------------------------
IHMC                                     (850)434 8903 or (650)494 3973
40 South Alcaniz St.           (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Monday, 2 March 2009 06:24:33 UTC