Re: Rules WG -- draft charter -- NAF

Hi,

As far as I understand the discussion, there are three different issues:

1) Reasoning with the Closed World Assumption on a given graph
2) Naming a given graph
3) Collecting (or completing) a graph (data transclusion)

I think we all agree that doing 1) is easy.
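
For illustration (a quick sketch of mine, with made-up names): once the set of triples is fixed and enumerable, negation as failure is just a membership test.

```python
# Closed-world reasoning over a fixed, enumerated set of triples.
# All names here are made up for illustration.

graph = {
    ("Joe", "shoesize", "large"),
    ("Joe", "nickname", "the gorilla"),
}

def holds(s, p, o):
    """A triple is true iff it is present in the (closed) graph."""
    return (s, p, o) in graph

def naf(s, p, o):
    """Negation as failure: true iff the triple is absent from the graph."""
    return not holds(s, p, o)

# Under the Closed World Assumption anything not in the graph is false:
print(naf("Joe", "shirtsize", "large"))  # True - no shirtsize fact present
```

The hard part is not this membership test but pinning down which triples are in the graph in the first place.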

Regarding 2):
My 2 cents:
One observation is that the notion of a graph is independent of the notion
of a document. Jim gave the example of a document changing over time, where
the Closed World Assumption delivers different results today and tomorrow.
Therefore I argue that one should distinguish:
  - the location of a document (given by its URL)
  - the identifier of the graph

Following this route, the identifier of the graph (which I would like to
call a context, represented by a URI) should change whenever the graph
changes. The context identifier could be given by a (bijective) function of
the location (e.g., the URL, or a set of URLs) and time, or of any other
parameter which influences the graph in a given application. Please note
that the parameters of the function are entirely application-dependent -
one application might take time into account, another might not.
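
A rough sketch of such a function (the URI scheme and the choice of parameters are purely illustrative):

```python
# A Skolem-style context identifier: a function of application-chosen
# parameters (here: a URL and a date). Scheme and names are made up.
from urllib.parse import quote

def context_uri(url, date):
    """Map (url, date) to a context URI; distinct inputs yield distinct URIs."""
    # percent-encoding the URL keeps the mapping injective, so the
    # context URI can be mapped back to its parameters
    return "context:" + quote(url, safe="") + "@" + date

c1 = context_uri("http://www.decker.cx/stefan/", "2003-11-18")
c2 = context_uri("http://www.decker.cx/stefan/", "2003-11-19")
assert c1 != c2  # a changed graph (here: a new day) gets a new identifier
print(c1)
```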

A well-known way of realizing such bijective functions is Skolem functions,
which are used to generate new IDs (and those can then be bijectively
mapped to URIs). E.g., TRIPLE uses Skolem functions to compute new IDs for
contexts.

An example (using TRIPLE syntax):

stefan[rdf:type->foaf:person]@uri("11/18/2003","http://www.decker.cx/stefan/").

This would take care of the naming issue, and Jim's FOAF example.

Regarding 3) (collecting the graph):

The second observation is that Jim's first example:


Rule1 - if person(shoesize) != large then A
Rule2 - if person(shirtsize) != large then B
RULES-CLOSED-OVER http://www.foo.bar/document1.rdf

Joe[owl:class->person;shoesize ->large;nickname->"the gorilla"].
person[rdf:type->foo:human].

works because he assumes the graph has semantics - that somehow somebody
should know that the namespace foo: should be taken into account.
I think this is (as with all semantic issues) controversial at least - and,
as with most controversial issues, it should be layered. Maybe there are
semantic languages on top of RDF (and OWL is not, and will not be, the only
one) which need data transclusion (homage to Ted ;-).
But it is certainly not a given for all RDF data - which means people need
a choice.
If we build a rule language for RDF (which should enable the layering),
data transclusion could be an option, but it is not a given for all RDF
datasets.

So my suggestion is to start with no data transclusion in the rule
language, and to allow layers and primitives that enable data transclusion
- but this is language-dependent. OWL might require transclusion, UML might
not. Topic Maps in RDF might require transclusion, vCard in RDF might not.
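
What such a choice could look like in practice (a sketch; the document names and the rule for what counts as a reference are made up):

```python
# Data transclusion as an opt-in layer: load() returns either exactly the
# named document, or its transitive closure under references.
# fetch() stands in for dereferencing a URL; here it is a local table
# so the example is self-contained.

DOCS = {
    "doc1": {("Joe", "rdf:type", "person"), ("person", "seeAlso", "foo")},
    "foo":  {("person", "rdf:type", "human")},
}

def fetch(url):
    return DOCS.get(url, set())

def load(url, transclude=False):
    """Return the graph at url, optionally following references transitively."""
    graph, seen, todo = set(), set(), [url]
    while todo:
        u = todo.pop()
        if u in seen:
            continue
        seen.add(u)
        triples = fetch(u)
        graph |= triples
        if transclude:
            # follow references to other documents (here: seeAlso objects)
            todo += [o for (_, p, o) in triples if p == "seeAlso"]
    return graph

print(len(load("doc1")))                   # 2 - just the named document
print(len(load("doc1", transclude=True)))  # 3 - plus the foo: facts
```

A closed-world rule engine would then run over exactly the set that load() returned - whether to transclude is decided per language or application, not baked into RDF.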

Does this resolve the issues?

Best,
         Stefan
At 02:23 AM 11/18/2003, Jim Hendler wrote:

>At 1:46 +0000 11/18/03, Stefan Decker wrote:
>>Please apologize my ignorance - what is hard about doing closed-world
>>reasoning on a given RDF graph?
>>
>>Best,
>>          Stefan
>>
>
>Tell me how you define what a given RDF graph is on the WWW --  if all the 
>facts are on a single document, then it isn't too hard - but if it is 
>linked to other documents (other than the RDF namespace) it gets 
>hard.  RDF is designed to be open, merged, and able to reference other RDF 
>-- those are its main features -- but once in a graph form, and using 
>pointers elsewhere, then you get a lot of issues that need to be resolved 
>-- in an earlier message I gave a simple example -- here's another one -- 
>suppose I point you to the Foafbot results and claim they are closed -- 
>but then tomorrow a new scrape is made and some new stuff is there, is 
>that the same closed document?  This could be solved using some sort of 
>timeouts and etc - but how do you do that?  Without a normative way of 
>handling time, how do you represent the graph being closed at time T?
>  In short, I repeat, there's nothing unsolvable about doing this -- but 
> it is not as obvious as it appears, and without some sort of solutions 
> already offered, I worry it is premature to try to standardize.
>
>
>
>>At 01:38 AM 11/18/2003, Jim Hendler wrote:
>>>Ben - I think you miss my point - I didn't say figuring out a way to do
>>>NAF would be a bad thing, I said it would be a very HARD thing, and one
>>>for which there is no current de facto solution -- WOWG looked for a way
>>>to do this, and realized we would not be able to do it -- I don't see why
>>>the rules group would expect success unless they could start from an
>>>existing solution -- and I've seen no proposal with a solution that seems
>>>workable.  If it's going to be part of the charter, then I would want to
>>>see at least 1 workable solution before the WG starts...
>>>   -JH
>>>p.s. WOWG's objective, which we didn't achieve, is mentioned in our
>>>requirements [1]
>>>
>>>O3. Ability to state closed worlds
>>>Due to the size and rate of change on the Web, the closed-world
>>>assumption (which states that anything that cannot be inferred is
>>>assumed to be false) is inappropriate. However, there are many
>>>situations where closed-world information would be useful. Therefore,
>>>the language must be able to state that a given ontology can be
>>>regarded as complete. This would then sanction additional inferences to
>>>be drawn from that ontology. The precise semantics of such a statement
>>>(and the corresponding set of inferences) remains to be defined, but
>>>examples might include assuming complete property information about
>>>individuals, assuming completeness of class-membership, and assuming
>>>exhaustiveness of subclasses.
>>>Motivation: Shared ontologies, Inconsistency detection
>>>
>>>[1] http://www.w3.org/TR/webont-req/#section-objectives
>>>
>>>
>>>
>>>At 12:13 -0500 11/15/03, Benjamin Grosof wrote:
>>>  >Hi Jim and all,
>>>  >
>>>  >At 03:18 PM 11/14/2003 -0500, Jim Hendler wrote:
>>>  >>Ben-
>>>  >>  I agree w/Sandro - NAF requires identifying a set of facts it
>>>  >> works over (the domain) - but RDF graphs, by their very nature, are
>>>  >> open -- so what sounds easy suddenly becomes very hard.  We attacked
>>>  >> this problem in WebOnt (see our reqs document and issues lists -
>>>  >> sorry, I'm on a slow connection, don't have the URIs, but they are
>>>  >> one link from http://www.w3.org/2001/sw/WebOnt) - we wanted a way to
>>>  >> have a local unique names assumption - but couldn't solve the
>>>  >> problem -- I bet the local domain naming is at least as hard,
>>>  >> probably harder
>>>  >
>>>  >Would you please send me specific links when you can? I looked at the OWL
>>>  >requirements and issues list documents and I couldn't   easily figure out
>>>  >which parts of them you were referring to.
>>>  >
>>>  >>  here's an example, tell me what you would do
>>>  >>
>>>  >>You say
>>>  >>  Rule1 - if person(shoesize) != large then A
>>>  >>  Rule2 - if person(shirtsize) != large then B
>>>  >>  RULES-CLOSED-OVER http://www.foo.bar/document1.rdf
>>>  >>
>>>  >>and that seems fine,  but document1 includes
>>>  >>   :Joe owl:class :person.
>>>  >>   :Joe shoesize :large.
>>>  >>   :Joe nickname "the gorilla".
>>>  >>  :person rdf:type foo:human.
>>>  >>
>>>  >>now, foo is a namespace document which contains a bunch of facts
>>>  >>about humans.
>>>  >>It is clear that A is false, because the document you're closed over
>>>  >>says his shoesize is large.
>>>  >>But what about B being true?   We see that this document doesn't
>>>  >>include that his shirtsize is large, but what is on foo:?  Maybe it
>>>  >>says anyone with the nickname "the gorilla" wears a large shirt,
>>>  >>maybe it refers to another document, ad infinitum.
>>>  >>  So when there is a web of graphs referring to terms in other
>>>  >> graphs, etc - how do you know where things stop?  (see
>>>  >> www-sw-meaning for a lot more discussion of this issue!)
>>>  >>  this is also only one simple manifestation of this problem -- when
>>>  >> you talk about documents that are changing, scraped, etc. (all of
>>>  >> which come up on the web) it gets even uglier
>>>  >>
>>>  >>  Sandro put it well - it's not that we cannot do NAF, it's that
>>>  >> designing the mechanism for defining the bounds of a graph on the
>>>  >> web is still an unsolved problem --
>>>  >
>>>  >Thanks for the example, it helps.
>>>  >I think you've put your finger right on the nub of the problem.
>>>  >I was indeed presuming that there is a mechanism to define the bounds of
>>>  >the knowledge base / graph, i.e., to well-define the set of premises.
>>>  >
>>>  >>  if the rules group has to solve it to make progress, that is risky
>>>  >> business....
>>>  >
>>>  >I think the Semantic Web needs to solve it in an initial fashion, and
>>>  >quite soon.  There's a tremendous overambitiousness in thinking that
>>>  >this is *not* critical path.  It's not so hard to do, either -- in the
>>>  >following sense.  Programming languages "solved" it long ago with
>>>  >mechanisms that check transitively for inclusion (such as the "make"
>>>  >facility in C).  The obvious approach is to just use that type of idea
>>>  >for the Semantic Web.  Thus if the transitive closure of the "import"
>>>  >chains cannot be determined and meet the usual criteria of
>>>  >well-definedness then there is a KB scope violation of a "system-ish"
>>>  >nature.  This will force people to define more carefully exactly which
>>>  >portions of other KB's that they are importing -- including via more
>>>  >contentful module mechanisms within KB's -- and to do integrity
>>>  >checking on transitive closures of inclusion both initially when KB's
>>>  >are developed and periodically/dynamically as KB's are
>>>  >maintained/updated.
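
Benjamin's "make"-style check could be sketched like this (a rough illustration only; the KB names and the IMPORTS table are made up, and fetching real documents is out of scope):

```python
# Transitive closure of a KB's import chain, flagging a "KB scope
# violation" when an imported module cannot be resolved.

IMPORTS = {
    "kb:main":   ["kb:people", "kb:shirts"],
    "kb:people": ["kb:core"],
    "kb:shirts": ["kb:core", "kb:missing"],  # kb:missing is unresolvable
    "kb:core":   [],
}

def imports_closure(root):
    """Return (closure, violations) for the import chain rooted at root."""
    closure, violations, todo = set(), set(), [root]
    while todo:
        kb = todo.pop()
        if kb in closure:
            continue  # already visited, so cycles terminate
        if kb not in IMPORTS:
            violations.add(kb)  # unresolved import: KB scope violation
            continue
        closure.add(kb)
        todo += IMPORTS[kb]
    return closure, violations

closure, violations = imports_closure("kb:main")
print(sorted(closure))     # the well-defined scope of kb:main
print(sorted(violations))  # ['kb:missing'] - the author must fix the scope
```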
>>>  >
>>>  >  I know that some don't like the idea of having to do this.  I think
>>>  >the alternative of not being allowed to define such scoping is,
>>>  >however, extremely undesirable.  The idea of "all RDF anywhere on the
>>>  >web" as something I would want to always *have to* use as my KB's
>>>  >scope is a complete non-starter practically -- consider issues of
>>>  >data/knowledge quality alone!  (I'm tempted to say it's ridiculous.
>>>  >People talk about "trust" on the Semantic Web.  The most basic
>>>  >mechanism for trust is simply to know what set of premises the
>>>  >inferences were drawn from.  We'll be laughed out of town in most
>>>  >practical IT settings if we don't have a good story about this aspect
>>>  >of things.)
>>>  >
>>>  >If we take the approach I'm suggesting (and others have suggested it
>>>  >too) then we don't have to get fancy about deep philosophy and
>>>  >unplumbed territory of "social meaning", or wait for more research on
>>>  >"trust", to just get going on doing over the Web the kind of KR that
>>>  >has been proven useful in decades of practical applications (and for a
>>>  >number of years in multi-agent systems).  We can then proceed
>>>  >incrementally/evolutionarily over time, as we develop further use
>>>  >cases and techniques, to open things up by having more implicit and
>>>  >relaxed mechanisms for importing / scoping the KB's/graphs.   We
>>>  >should start with what we know works, in short, and then work to
>>>  >improve upon it in the direction of reducing the burden of defining
>>>  >inclusion/import scoping.  As a practical matter, if there is a KB
>>>  >scope violation cf. above, then that doesn't mean we can't/won't do
>>>  >inferencing, depending on the purpose and kind of inferencing -- some
>>>  >kind of inferencing may be useful even when there is a violation.
>>>  >
>>>  >If we do it that way, we can have/do nonmon/NAF on the Semantic Web
>>>  >essentially today, and develop additional techniques later for making the
>>>  >scoping more flexible and convenient.
>>>  >
>>>  >
>>>  >>  -JH
>>>  >>p.s. Note that the OWL group rejected the solution that we could use the
>>>  >>imports closure and define everything else as not included, because that
>>>  >>would limit you to only those things defined in the DL profile, not all
>>>  >>OWL and all RDF documents
>>>  >
>>>  >I'm confused by this.  "All OWL and all RDF documents" is way too big
>>>  >-- see above my comment about "all RDF on the Web".  When you say "DL
>>>  >profile" I presume you mean the set of OWL imports statements.  What's
>>>  >the point of an imports mechanism in OWL if everything else is
>>>  >included?  Perhaps I'm not understanding what you're saying.
>>>  >
>>>  >In any event, the way to go is to define (a given KB as) importing of
>>>  >RDF as well as OWL (and soon, more generally, semantic web rules
>>>  >knowledge base modules as well), in the imports profile, and stick to
>>>  >the transitive closure for most purposes.  Does that require extending
>>>  >the current imports mechanism of OWL, e.g., to define a boundaried RDF
>>>  >graph as imported?
>>>  >
>>>  >>-- the rules language would have to face that same issue, but also deal
>>>  >>with all things findable by Xquery ... yow!
>>>  >
>>>  >I don't see what XQuery has to do with it (at least not directly), if
>>>  >we're talking RDF stuff.  XQuery is certainly related to Semantic Web
>>>  >Rules (indeed, I was one of the first to press this point to the W3C
>>>  >team; back in March 2001 I presented to them about it), but I don't
>>>  >see that Rules "have to... deal with all things findable by XQuery".
>>>  >More pertinent to the main topic here is that XQuery deals quite
>>>  >ambitiously with very large scale databases and as I understand it
>>>  >(from early versions I looked at) has a well-defined boundary of what
>>>  >is queried over.  That's thus probably further evidence towards the
>>>  >usefulness of my scoping suggestion about imports closure.
>>>  >
>>>  >Benjamin
>>>  >
>>>  >>--
>>>  >>Professor James Hendler                   http://www.cs.umd.edu/users/hendler
>>>  >>Director, Semantic Web and Agent Technologies     301-405-2696
>>>  >>Maryland Information and Network Dynamics Lab.    301-405-6707 (Fax)
>>>  >>Univ of Maryland, College Park, MD 20742          240-277-3388 (Cell)
>>>  >
>>>  >________________________________________________________________________
>>>  >Prof. Benjamin Grosof
>>>  >Web Technologies for E-Commerce, Business Policies, E-Contracting, Rules,
>>>  >XML, Agents, Semantic Web Services
>>>  >MIT Sloan School of Management, Information Technology group
>>>  >http://ebusiness.mit.edu/bgrosof or http://www.mit.edu/~bgrosof
>>>  >
>>>
>>>
>>>--
>>>Professor James Hendler                   http://www.cs.umd.edu/users/hendler
>>>Director, Semantic Web and Agent Technologies     301-405-2696
>>>Maryland Information and Network Dynamics Lab.    301-405-6707 (Fax)
>>>Univ of Maryland, College Park, MD 20742          240-277-3388 (Cell)
>>
>>
>>
>>--
>>http://www.isi.edu/~stefan
>>
>>
>
>--
>Professor James Hendler                   http://www.cs.umd.edu/users/hendler
>Director, Semantic Web and Agent Technologies     301-405-2696
>Maryland Information and Network Dynamics Lab.    301-405-6707 (Fax)
>Univ of Maryland, College Park, MD 20742          240-277-3388 (Cell)



--
http://www.isi.edu/~stefan

Received on Tuesday, 18 November 2003 07:16:11 UTC