Re: Integrating Disparate Information Systems

On Tue, Nov 9, 2010 at 11:39 AM, Kingsley Idehen <kidehen@openlinksw.com> wrote:
> On 11/9/10 10:23 AM, John F. Sowa wrote:
>
> John,
>
> Great response.  I am cc'ing the LOD mailing list as your comments are
> pertinent re. systems integration and the need to separate Logic from
> Syntax etc.
>
> Others: I encourage you to read on, and digest.

I have read it, and while it mentions a number of historical points
that might be of interest to younger folk, I also find that it clouds
a number of issues. Comments inline.

>
>> On 11/9/2010 1:24 AM, Alex Shkotin wrote:
>>>
>>> What do we need for our information systems to communicate properly?
>>> Integration? Alignment? Unification? Information system education?
>>
>> The first point I'd emphasize is that IT systems have been successfully
>> communicating for over a century.  Originally by punched cards, then
>> by paper tape, magnetic tape, direct connection, and telephone.

For a very limited set of pairs of systems. The movement now is to
make it much more likely that any given pair of systems can communicate
meaningfully. That is new. So I don't see what point this statement
is making.

>> When Arpanet was started in 1969, there had been a long history
>> of experience in data communication.  And the latest conventions
>> for the WWW are still based on extensions to those protocols.
>>
>> Those physical formats and layouts are very important for the
>> technology.  And they will remain buried in systems for ages
>> upon ages.
>>
>> But you never, ever want those formats to have the slightest
>> influence on the semantics.

Where do you see the influence of format on semantics being an issue
here? Any language is going to need an encoding. Here we are trying to
arrange things so that, at a minimum, there is at least one syntax
that any communicator can handle - a common denominator. How many of
the historical (and current) systems failed to communicate because of
stupid differences in syntax - bit ordering, choice of delimiters,
other arbitrary choices? We need, effectively, at least one arbitrary
choice that we all agree to work with.

But OWL, at least, has a straightforward translation to and from the
portion of logic it is capable of representing. That portion is not
constrained by the syntax, but by issues you discuss below.
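
For instance (sketching the standard correspondence, with purely
illustrative names), an OWL axiom written in functional syntax maps
directly onto an FOL sentence:

    SubClassOf( :Desk :Furniture )
        <=>  forall x ( Desk(x) -> Furniture(x) )

    ObjectPropertyDomain( :sitsAt :Person )
        <=>  forall x, y ( sitsAt(x, y) -> Person(x) )

Whether that axiom happens to be serialized in RDF/XML, Turtle, or
functional syntax makes no difference to the sentence it denotes.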

>> The decision to force OWL into the
>> same straitjacket as RDF was hopelessly misguided.

I see only minor inconveniences.


>> In fact, even
>> the decision to force decidability down the throats of every
>> ontologist was another profoundly misguided technology-driven
>> decision.  (Note the subtle semantic distinction between profound
>> and merely hopeless.)

There was no global decision to do anything of the sort. There was an
effort to create a standard. When a standard is created, people who
make decisions ask the people who work for them to work within that
standard, in the interest of interoperability. So there were thousands
of such decisions.

There are other standards. In my view it is interesting to analyze why
they are not as successful. Suggesting that this is due to some
conspiracy or the choice of a few doesn't give me confidence that a
deep analysis has been undertaken.

>>> What kind of language and dictionary do we need to write the question? SPARQL?
>>> What kind of language and dictionary do we need to write the answer? XML, CSV?
>>
>> Use whatever notation is appropriate for your application.

Here we agree.

>> But you must design the overall system in such a way that the choice for one
>> application is *invisible* to anybody who is designing or using some
>> other application.

The overall system? I really don't understand what you are referring to.
There is a standard syntax. Anyone can now write a tool that takes
their favorite syntax and translates it into some other syntax for
which a translator to RDF/XML has already been written. We are in a
culture of open source. Over time there will be enough translators
that, for all intents and purposes, there will be no reason why what
you suggest is not feasible. But are you suggesting this could or
should have happened from the outset? Standardization that serves all
needs?
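
To make that concrete, here is a minimal sketch of such a translator,
assuming the Python rdflib library (file names are just placeholders):

    # Minimal sketch: read a graph in one concrete syntax, write it
    # out in another. The triples themselves are untouched; only the
    # serialization changes.
    from rdflib import Graph

    g = Graph()
    g.parse("input.ttl", format="turtle")      # read Turtle
    g.serialize(destination="output.rdf",
                format="xml")                  # write RDF/XML

Any syntax rdflib (or a similar library) can parse gets a free ride
into every syntax it can serialize.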


>> Of course, there may be some cases where real-time constraints make it
>> necessary to avoid a conversion routine between two systems.  But that
>> is a very low-level optimization that should never affect the semantics.
>> For example, when was the last time that you thought about the packet
>> transmissions for your applications?  Some system programmers worry
>> about those things a lot.  But they're invisible at the semantic level.

As is the case for our current stack.

>>> Where is your SPARQL end point at least?
>>
>> When you are thinking about semantics, any thought about the
>> difference between SPARQL, SQL, or some bit-level access to data
>> is totally irrelevant.

Yes. Unfortunately we need a way to get to the semantics, and that way
is via syntax. So having one syntax to learn is much better than
having many.

>> Please remember that commercial DB systems
>> provide all those ways of accessing the data if some programmer
>> who works down at the bit level needs them.  But anybody who is
>> working on semantics should never think about them (except in
>> those very rare cases when they go down to the subbasement to
>> talk with system programmers about real-time constraints.)

And everyone who works in a commercial environment knows that you
can't only work on semantics. Inevitably there are other issues, like
performance, interoperability, teachability, maintainability... that
need to be worked on concurrently.

>>
>>> JS: "but every application will have... different vocabularies, and
>>> different
>>> dialects." Inside. But with a stranger we usually change language to
>>> common.
>>
>> Not necessarily.  Sometimes you learn their language, they learn
>> your language, or you bring a translator with you.
>>
>> But it's essential to distinguish three kinds of languages:
>> natural languages, computer languages, and logic.
>>
>> For NLs, translation is never exact because they all have hidden
>> ontology buried down in their lowest levels.  For computer languages,
>> the level of exactness depends on the amount of buried ontology.
>>
>> Some computer systems (such as the TCP/IP protocols) do translation
>> from strings to packets very fast because they don't impose any
>> constraints on the ontology.  Therefore, programmers above the
>> lowest system levels never think about those translations.
>>
>> For other systems, such as poorly designed software, the ontology
>> changes in subtle ways with every release and patch to any system.
>> (I won't name any names, but we've seen such things all too often.)
>>
>> But first order logic was *discovered* independently by Frege and
>> Peirce 130 years ago, and *exact* translation between their notations
>> and all the modern notations for FOL is guaranteed.
>>
>> Note the word 'discover'.  Frege and Peirce did not *invent* FOL.
>> My comment is that FOL was standardized by an authority that is
>> even higher than ISO -- namely, God.  (Please note the Bible,
>> John 1,1:  "In the beginning was the logos, and the logos was
>> with God, and God was the logos.")
>>
>> Nobody has to learn FOL, because it's buried inside their native
>> language, whatever it may be.  But some notations for FOL are less
>> readable than others.  That's why I recommend controlled NLs for
>> many purposes.

And I would tend to agree. But having FOL doesn't mean you can
communicate. FOL doesn't give you a mechanism to associate my name
with me, i.e. it doesn't come with a set of symbols that refer to the
stuff we care about out here in the world. So in addition to FOL we
need the ontology, so that when I say "desk" there is a chance that
the receiver of that message will understand desk. And even if you
have a shared ontology, if what you are writing amounts to full
unrestricted FOL, then there are severe practical issues with actually
using those expressions computationally.
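
To put the first point another way: the FOL sentence

    forall x ( Desk(x) -> Furniture(x) )

is satisfied by any interpretation whatsoever that maps the predicate
symbols in the right pattern; nothing in the logic ties "Desk" to the
things we sit at. What a shared ontology adds is an agreed symbol -
say the purely illustrative IRI http://example.org/ont#Desk - together
with axioms and documentation, so that sender and receiver have a
fighting chance of interpreting it the same way.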

>> But learning to write FOL is nontrivial, even in a controlled NL.
>> The reason for the difficulty is that people are used to the
>> flexibility of their native languages with all that built-in
>> ontology.  To write pure FOL requires a very strict discipline
>> to distinguish the logic from the implicit ontology.

What implicit ontology?
We are trying to make ontology explicit. And understandable. And shared.

>> Bottom line:  The distinction between logic and ontology is so
>> important that you should never confuse people with extraneous
>> issues about bit strings, angle brackets, or even decidability.

Certainly if you are trying to teach people the difference between
these two ideas then these things are irrelevant. But after we have
taught them someone needs to deal with these things.

IMO, a rather large confusion arises from somewhere else: namely that
we have named the thing we call OWL the "Web Ontology Language". There
is little that knowing OWL tells you about ontology. OWL is an
encoding of a decidable portion of logic. It should no more be called
an ontology language than a logic should be called an ontology. More
errors arise from the implication that learning OWL teaches you
something about ontology than problems are caused by any of the bit
strings, angle brackets, or design constraints such as decidability or
the effort to remain compatible with RDF.
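
(To illustrate that "decidable portion", with made-up predicate names:
the class of people who live in the same city as the company they work
for,

    { x | exists y, z ( worksFor(x, y) & locatedIn(y, z) & livesIn(x, z) ) }

is perfectly ordinary FOL, but it cannot be written as an OWL class
expression, because the three variables form a cycle rather than a
tree. Limits like that come from the choice of a decidable fragment,
not from any serialization.)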

-Alan


>>
>> John
>>
>> _________________________________________________________________
>> Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
>> Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
>> Unsubscribe: mailto:ontolog-forum-leave@ontolog.cim3.net
>> Shared Files: http://ontolog.cim3.net/file/
>> Community Wiki: http://ontolog.cim3.net/wiki/
>> To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J
>> To Post: mailto:ontolog-forum@ontolog.cim3.net
>>
>>
>
>
> --
>
> Regards,
>
> Kingsley Idehen
> President&  CEO
> OpenLink Software
> Web: http://www.openlinksw.com
> Weblog: http://www.openlinksw.com/blog/~kidehen
> Twitter/Identi.ca: kidehen
>
