Re: types of conformance

Bijan,

I think we are largely on the same page.  I need to think more about some 
parts of this, but I will try to clarify some things now.

you wrote:

> On Jan 3, 2007, at 6:23 PM, Ed Barkmeyer wrote:
> [snip]
> 
>> This is not at all what I meant.  My expectation was that we would
>> define specific sets of features as well-defined "sublanguages" of
>> RIF that have known well-defined semantics.  E.g., a "core RIF", and
>> "core + features X and Y" and "core + feature Z" and "everything" =
>> core + W, X, Y, and Z.  A tool can conform to one or more of the
>> standard sublanguages, but not to an arbitrary collection of RIF
>> features.
> 
> Do you mean that a tool can "implement" such a sublanguage (in
> Michael's terms)?

Yes.  That was badly stated.  I meant that a tool can conform by exhibiting 
the required behaviors with respect to a given standard sublanguage.  And 
while we have not agreed on exactly what "required behavior" means, it seems 
to be what Michael calls "implementing" the sublanguage.

> Presumably, a document could "conform" to such simply by using those
> features. You mean we should call out specific combinations as "key"?

A document can conform by containing only those features and by meeting the 
representation requirements for them.
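
To make that concrete, here is a toy sketch in Python of the subset check I 
have in mind; the feature names and sublanguage definitions are invented, 
purely for illustration:

    # A document conforms to a sublanguage if it uses no feature outside
    # that sublanguage (representation requirements aside).
    CORE = frozenset({"core"})
    CORE_XY = CORE | {"X", "Y"}   # "core + features X and Y"
    CORE_Z = CORE | {"Z"}         # "core + feature Z"

    def document_conforms(features_used, sublanguage):
        # True iff every feature the document uses is in the sublanguage.
        return features_used <= sublanguage

    print(document_conforms({"core", "X"}, CORE_XY))  # True
    print(document_conforms({"core", "Z"}, CORE_XY))  # False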

I did mean that, for tools, it is better to define "key combinations" of 
features, chosen so that each combination has a high probability of being 
implemented in full by multiple tools.  If we allow arbitrary combinations 
over a sizeable set (10+) of optional features, we may expect most RIF 
rulesets to be readable only by the tool that wrote them.
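
The arithmetic behind that worry, as a short Python sketch (the counts come 
from the "10+" above and the 2^12 in my earlier message quoted below):

    # With n independently optional features, a ruleset may use any
    # subset of them, so there are 2**n possible feature combinations.
    for n in (10, 12):
        print(n, "optional features ->", 2 ** n, "combinations")
    # 10 optional features -> 1024 combinations
    # 12 optional features -> 4096 combinations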

>> The "bottom line" is what I think of as the "core" sublanguage --  
>> minimum compliance.  If we have just the "core" and 12  independently 
>> optional features, we will have in fact defined 2^12  distinct 
>> sublanguages, and in all likelihood they will have no  consistently 
>> definable semantics.
> 
> I don't understand this. In most cases, if core + all 12 options forms
> a coherent language, which doesn't seem hard to do, then all the
> subsets are coherent as well. (Assuming some sane notion of "feature"
> of course :))

It did seem to me to be hard to do, based on the RuleML experience, but I bow 
to your superior knowledge in this area.  It may also be that we can define a 
coherent language that defies implementation -- with all of these features 
present, it may be extremely difficult to build an engine that correctly 
processes all reasonable rulesets that use certain combinations of them.

> I see many reasons why one might reject this approach, but the likelihood
> of a lack of "consistently definable semantics" eludes me still. Clarify?

I guess my real problem is ensuring that when I create a ruleset and post it 
on the Web, a potential user can know whether his tool will actually be able 
to process it as intended.  In many cases, there may be more than one way for 
me to construct an effective ruleset for the purpose at hand, but each of 
these involves different combinations of features.  (If the tool doesn't have 
X, there is a work-around that uses Y.)  If some sufficient set of features is 
a defined sublanguage, I will use that set in creating the ruleset.  But if I 
have to guess whether more potential users will have tools that support X and 
Z vs. Y and Q, I have a problem.  Put another way, I would like to know what 
features I should NOT use if I don't absolutely need them, because using them 
reduces the number of implementations that can process my ruleset.
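
A sketch of the guessing game I mean, in Python; the tool profiles here are 
entirely invented for illustration:

    # Invented tool profiles: the optional features each tool supports.
    tools = {
        "toolA": {"X", "Z"},
        "toolB": {"Y", "Q"},
        "toolC": {"X", "Y", "Z"},
    }

    def who_can_read(ruleset_features):
        # Tools whose supported features cover everything the ruleset uses.
        return [name for name, supported in tools.items()
                if ruleset_features <= supported]

    # Two equally effective encodings of the same rules:
    print(who_can_read({"X", "Z"}))  # ['toolA', 'toolC']
    print(who_can_read({"Y", "Q"}))  # ['toolB']

Without named sublanguages, nothing tells me which encoding reaches more 
users.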

It may be that the issue is not so much the complexity of the semantics or 
even the complexity of the supporting implementation, but just the relative 
commonality of support for given feature combinations.  But I would prefer to 
see those combinations defined as specific sublanguages.  The partial ordering 
arises from the possibility that such "useful" combinations may not be totally 
ordered by inclusion.  I simply observed that in the process of defining 
sublanguages, it is not a requirement that they all share a single consistent 
semantics, unless there is to be one "superlanguage" that has all features 
simultaneously.
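
In Python terms, set inclusion gives exactly such a partial order; again the 
feature names are placeholders:

    # Two "useful" combinations, neither contained in the other, so the
    # ordering of sublanguages by inclusion is partial, not total.
    a = {"core", "X", "Z"}
    b = {"core", "Y", "Q"}
    print(a <= b, b <= a)  # False False -- incomparable
    print(a <= a | b)      # True: both sit below the all-features union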

> Lostville here. Mixing model theoretic and, e.g., operational semantics
> is, of course, tricky (at best) but I thought they were all going to
> have model theoretic semantics.

This is the part I need to think a bit more about.  I need to look more 
closely at the list of candidate features.

> [snip]
> 
>> The greater problem is in trying to define the behaviours of a
>> conforming tool.  We think of a conforming tool as a reasoning engine
>> that implements a useful algorithm that covers and is consistent with
>> the standard semantics for a specified sublanguage.  But what of a
>> tool that simply reads a conforming document and outputs the ruleset
>> in the Jena form?
> 
> Document conformance seems to be enough for that, IMHO.

I can agree.  The tool does not conform, but what it produces does.

> Er...I'd strongly suggest being *very restrictive* in the sorts of tool
> behavior you strive to regulate.

You and me both.  That was what prompted my original email.
It is very hard to define the conformance requirements for a tool that reads 
RIF rulesets.  What exactly does one want to require it to do?

> OWL has species validation and consistency checking and I don't think
> you really needed more.  Document conformance is grand. Marking that a
> reasoner is sound, complete and/or, if possible, a decision procedure
> is also grand. If not implementing a decision procedure, it could be
> helpful to define certain equivalence classes for effective computation
> (see
> <http://www.cs.mu.oz.au/research/mercury/information/doc-release/mercury_ref/Semantics.html#Semantics>
> for an example of doing this).
> 
> I can see wanting to go even further and say something about the set of
> answers (e.g.) under certain general resource bounds, but that is an
> even trickier area to get into so I'd hold off.
> 
>> That is, the ability to perform reasoning from a RIF ruleset is a
>> "feature", and one I might add that we may find it difficult to define.
> 
> Er...I'd be minimal and do the standard thing. Dialect detection and
> entailment/answers/reasoning seem more than enough. (I.e., (document)
> conformance and implements in Michael's terms; producing and consuming
> (for reasoning) in my terms).
> 
> Those seem useful enough and done right not too burdensome. These don't
> really distinguish between mix and match and named points, afaict.

I need to think more about this as well.  My experience is that mix and match 
of features really does distinguish the abilities of existing tools to 
"consume for reasoning".

-Ed

-- 
Edward J. Barkmeyer                        Email: edbark@nist.gov
National Institute of Standards & Technology
Manufacturing Systems Integration Division
100 Bureau Drive, Stop 8263                Tel: +1 301-975-3528
Gaithersburg, MD 20899-8263                FAX: +1 301-975-4694

"The opinions expressed above do not reflect consensus of NIST,
  and have not been reviewed by any Government authority."
