RE: OWL "Sydney Syntax", structured english from John McClure on 2006-11-30 (public-owl-dev@w3.org from October to December 2006)

From: John McClure <jmcclure@hypergrove.com>
Date: Thu, 30 Nov 2006 13:00:54 -0800
To: "Adrian Walker" <adriandwalker@gmail.com>
Cc: <public-owl-dev@w3.org>
Message-ID: <MGEEIEEKKOMOLNHJAHMKMEDBEDAA.jmcclure@hypergrove.com>
Surely you don't mean to imply that I am not creating material aligned with
TimBL's vision! I am building (the second version of) an OWL ontology for
legal documents; and I am annotating XHTML markup of public statutes and
other legal material using a variant of RDF/A based on an ECMA syntax....
all vanilla but with a dash of nutmeg. See Legal-RDF.org for more
information if you want.

Your approach sounds not vanilla at all if, indeed, you make the claim that
there is "no vocabulary or grammar construction" -- I think that building
ontologies is hard work, which, I am hearing you suggest, is an unnecessary
task on the road towards fruition of the "vision". A little hyperbole
perhaps?

Anyway I suspect we'd agree that within document prose are definitions of
classes and properties. For instance, our Constitution defines a concept
called "Citizen".... should we ontologists be defining that concept within
our ontologies, referencing our own definition in any annotations of the
Constitution, or should our ontologies be referencing the concept as defined
in that document? I think that the proper answer is the latter if, indeed,
TimBL's Trust layer is ever to be more than a marketing idea.

Thanks,
John
  -----Original Message-----
  From: public-owl-dev-request@w3.org
[mailto:public-owl-dev-request@w3.org]On Behalf Of Adrian Walker
  Sent: Thursday, November 30, 2006 11:41 AM
  To: John McClure; public-owl-dev@w3.org
  Subject: Re: OWL "Sydney Syntax", structured english


  John --

  Thanks for your note, and congratulations on the design of the
hypergrove.com web site.

  It seems to me that there are two, partially overlapping Semantic Web
visions.

  The first concerns the kind of work you are doing, which I believe is
mainly about bringing order and accessibility to text documents.

  The second concerns what I take to be TimBL's other vision -- a web-wide
database of RDF triples.  So the data is structured (as triples), rather
than textual.

  I guess there is some commercial success in parsing text documents to
extract (meta)data.  However, automatically parsing English knowledge and
converting it to logic for reasoning seems to be a much harder task, at
least at the industrial strength level.

  Our Internet Business Logic work, with its lightweight approach to
English knowledge input, is mainly directed to reasoning over structured RDF
and other data, although there are some examples such as [1] that reason
about documents.

  So, the aspect of RDF that we mainly care about is that it allows you in
principle to freely mix and match structured data from different sources on
the web.  There's actually more to it than that, though [2].

  The example [3] is the closest we have got to document exchange so far.
As you may see, the ontological aspects are in executable English rules**,
rather than in OWL.

  There are also some small OWL-related examples, such as [4].

  Perhaps one place where our respective approaches begin to overlap is
this.  We do a form of information retrieval to try to tie an English
question that a user has typed in to the concepts that are currently
loaded into the system.

  Best regards,  -- Adrian

  **  As previously mentioned, the rules are open vocabulary, open syntax.

  [1]  www.reengineeringllc.com/demo_agents/RDFQueryLangComparison1.agent

  [2]  www.semantic-conference.com/program/sessions/S2.html

  [3]  www.reengineeringllc.com/demo_agents/SemanticResolution1.agent

  [4]  www.reengineeringllc.com/demo_agents/{OwlTest1 OwlResearchOnt
FeaReferenceModelOntology2}



  Internet Business Logic (R)
  Executable open vocabulary English
  Online at www.reengineeringllc.com
                                  Shared use is free
  Adrian Walker
  Reengineering
  Phone: USA 860 830 2085





  On 11/30/06, John McClure <jmcclure@hypergrove.com> wrote:
    Adrian,

    I am curious about this fascinating approach -- may I ask

    (1) if there is no ontology (your words: "no vocabulary or grammar
construction"), why do you care about the RDF which depends completely on
class and property definitions?  If your response is that "the approach
creates classes and properties as a consequence of the text analysis" then
is the resultant ontology ever stored? or re-used? or shared?

    (2) is "document exchange" out-of-scope (inapplicable) for this
approach, since there appears to be no contractual reference ontology
between publisher and consumer?

    Thanks much for your reply,
    John
      -----Original Message-----
      From: public-owl-dev-request@w3.org [mailto:
public-owl-dev-request@w3.org]On Behalf Of Adrian Walker
      Sent: Thursday, November 30, 2006 9:00 AM
      To: Pat Hayes; public-owl-dev@w3.org
      Subject: Re: OWL "Sydney Syntax", structured english


      Pat --

      You wrote...

      There have been several proposals for English-like
      syntaxes for logic, see for example John Sowa's 'structured English'.
      Again, one can make these look quite convincing by a deft choice of
      basic vocabulary, but they always become incomprehensible when one
      uses a slightly divergent one. The problem is that when it reads
      *almost* like English, any non-English constructions - nouns in place
      of verbs, the wrong preposition, etc., - become very intrusive and
      awkward. Some object-oriented programming notations claim similar
      transparency, and there have been proposals for English-y syntaxes
      for KRep notations, such as various frame-based systems which allow
      things like (Every Person who owns a donkey beats the donkey of
      self). I confess to not having citations ready for this, but such
      systems were developed at U. Texas, for example.

      Yes, there are many proposals to try to model enough of ordinary
English usage to make writing and running knowledge easier than with formal
notations.  The underlying idea in all of these is to parse with a grammar,
translate automatically to some form of logic, and to execute that.  There
are brave folks who also attempt the reverse translation, from logic to
English.

      As has been pointed out many times, this approach does not seem to work
outside of natural language research projects.  If it did work, it would
surely by now be a huge commercial success. It appears to encounter several
roadblocks, including the ones you mention.  The fact that English is a
moving target does not help.

      There is a different approach.  The approach is lightweight, and
seeks to go around the deep NL research problems involved, rather than
tackling them head on.  Roughly speaking, the approach is to assign an open
vocabulary, open syntax string to each predicate symbol in the underlying
logic.  If a predicate is n-ary, the corresponding string has n place
holders (or variables) such as "some-person" or "that-time".  There's more
to it than that, but that's the basic idea.

      This allows one to label predicates with strings such as

          so far as is known at this-time there is no evidence to suggest
that this-person is a terrorist

      (Actually the approach starts with the string, and invents an
arbitrary corresponding predicate, say p33(x,y), for computation.)
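
      A minimal sketch of the labeling idea described above, not the
Internet Business Logic implementation: it starts from the English string
and invents an arbitrary predicate symbol whose arity is the number of
distinct place holders.  The class name PredicateTable, the register method,
and the rule that a place holder is a hyphenated token beginning with
"some-", "this-" or "that-" are all assumptions made for illustration.

```python
import re
from itertools import count

# Assumption for illustration: a place holder is a hyphenated token that
# starts with "some-", "this-", or "that-" (for example "this-person").
PLACEHOLDER = re.compile(r"\b(?:some|this|that)-[a-z]+\b")


class PredicateTable:
    """Maps open-vocabulary sentence templates to invented predicate symbols."""

    def __init__(self):
        self._ids = count(1)
        self._by_template = {}

    def register(self, template):
        """Return (symbol, place_holders), inventing a symbol for a new template."""
        key = " ".join(template.split())  # normalize whitespace only
        if key not in self._by_template:
            # Keep each distinct place holder once, in order of first appearance.
            place_holders = list(dict.fromkeys(PLACEHOLDER.findall(key)))
            self._by_template[key] = (f"p{next(self._ids)}", place_holders)
        return self._by_template[key]


table = PredicateTable()
symbol, args = table.register(
    "so far as is known at this-time there is no evidence to suggest "
    "that this-person is a terrorist"
)
print(symbol, args)
```

      Nothing in the sketch parses the English between the place holders;
the rest of the string is treated as an opaque label, which is why no
dictionary or grammar is needed.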

      This lightweight approach means that there is no dictionary or
grammar construction -- at least in the usual 'structured English' sense.
It also means that one can use jargon, government acronyms, 'google' as a
verb, and so on.  Of course, this violates all sorts of expectations about
how one should compute using English syntax and semantics.  And it's of
zero interest to NL researchers, rightly so.

      But, if one is willing to accept the trade off involved, it actually
seems to be useful!

      As you may know, this is the approach taken for the author- and
user-interface of the Internet Business Logic system [1].  The system is
online, and shared use is free, so folks can check for themselves that they
can write this kind of English to a browser, and then run it.

      BTW, my PhD thesis subject was Chomsky grammars, and like many other
folks I have banged my head dutifully against the 'structured English' wall.
Great research topic.  Very hard to make it work at industrial strength.

      With apologies to Kendal,     -- Adrian

      [1] Internet Business Logic (R)
      Executable open vocabulary English
      Online at www.reengineeringllc.com
                                      Shared use is free
      Adrian Walker
      Reengineering
      Phone: USA 860 830 2085
Received on Thursday, 30 November 2006 21:00:51 UTC