State of the N-ary

Here's an analytic summary to help drive the discussion.

First, I'm working with two users (Sebastian Brandt and Robert
Stevens) to come up with some understandable but realistic examples of
(at first) linear equations used in TBoxes to drive subsumptions.
There has been doubt expressed as to whether n-ary is a "really  
needed" feature (first by HP (withdrawn) and then by Boris (at the  
first f2f)) or whether some sort of rule mechanism would be better  
(though, even a standard DL-Safe rule mechanism would most likely  
require constraint reasoning). I intend for these examples to be  
runnable on the current version of RacerPro, so we'll be able to  
examine them in detail. I hope to get these working in the next week  
or so.

Second, there are still open issues with the current proposals, some  
of them resting on the current datatype discussion. E.g., as with  
facets, we have to be careful either to restrict the types of arguments
to the equations we support to "sensible" types or, rather, to specify
how we handle nonsensical types. For example, we need to say something about
what happens if the possible value of an argument to a polynomial is  
a string (or a double! or an integer!) (Nonsense can be driven by  
violations of user intuitions, extreme difficulty of implementation,
or fundamental theoretical problems (e.g., undecidability)).

Third, expressivity:

Let's consider numbers (and, indeed, reals) only for the moment. We  
can see the proposal as containing a hierarchy of more and more  
comprehensive support:
	1) Syntactic hooks (what we have now) plus appropriate datatype  
(e.g., reals)
		Since we only have unary datatypes (e.g., >5), there's nothing we can
actually do with the hooks.
	2) Comparison operators only; that is, we can say (x>y).
		Now we can use the hooks, but we have no equations.
		There are use cases for this.
	3) Linear (in)equations; e.g., x > 2y
		This suffices for many applications; some version of this is  
supported by the current release of RacerPro (and has been so for  
years). We allow rational constants but (algebraic) real solutions  
(with no way to force integer solutions).
		There are existing solvers which I believe could be fruitfully  
adapted (based on experience using them for probabilistic reasoning).  
They are able to handle rather large and complex systems of equations.
	4) Polynomial (in)equations; e.g., x = y^2 + 3
		This is still decidable (given the right restrictions). I have a  
version of RacerPro which supports these, but it's not the released one.
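
As a rough illustration of the hierarchy above (not part of any proposal), the following Python sketch classifies a constraint of the form "p REL 0" by the lowest level that can express it. The dict-based encoding of polynomials and the names degree, variables and level are assumptions made up for this example, and it assumes all stored coefficients are non-zero.

```python
from fractions import Fraction

# Hypothetical encoding: a monomial is a sorted tuple of (variable, exponent)
# pairs, e.g. y^2 is (("y", 2),) and the constant term is ().
# A constraint "p REL 0" is represented by its polynomial p only:
# a dict from monomials to (non-zero) rational coefficients.

def degree(poly):
    """Total degree of the polynomial (0 for a constant)."""
    return max((sum(e for _, e in mono) for mono in poly), default=0)

def variables(poly):
    """Set of variable names occurring in the polynomial."""
    return {v for mono in poly for v, _ in mono}

def level(poly):
    """Lowest level (1-4) of the hierarchy that can express p REL 0."""
    if degree(poly) > 1:
        return 4                      # polynomial, e.g. x = y^2 + 3
    vars_ = variables(poly)
    if len(vars_) <= 1:
        return 1                      # unary restriction, e.g. x > 5
    coeffs = sorted(c for mono, c in poly.items() if mono != ())
    has_constant = poly.get((), 0) != 0
    if len(vars_) == 2 and coeffs == [-1, 1] and not has_constant:
        return 2                      # bare comparison, e.g. x > y
    return 3                          # linear, e.g. x > 2y

X, Y, Y2 = (("x", 1),), (("y", 1),), (("y", 2),)

unary      = {X: Fraction(1), (): Fraction(-5)}                  # x - 5 > 0
comparison = {X: Fraction(1), Y: Fraction(-1)}                   # x - y > 0
linear     = {X: Fraction(1), Y: Fraction(-2)}                   # x - 2y > 0
polynomial = {X: Fraction(1), Y2: Fraction(-1), (): Fraction(-3)}  # x - y^2 - 3 = 0

print([level(p) for p in (unary, comparison, linear, polynomial)])
```

The relation symbol itself (>, >=, =) plays no role in the classification, which is why only the polynomial is stored.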

The current proposal sketchily says that up through 3 is required and  
4 optional. I do believe that 3 is a reasonable burden to spec and
implement and will hammer out the various issues that arise. At least  
3 is currently supported by Racer and on the to-do list for Pellet and
FaCT++ (basically, they're waiting for a spec!). The intent for 3 is  
to allow for a fairly straightforward modular implementation where  
existing solvers can be plugged in.
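
To make the "plug in a solver" idea concrete, here is a minimal sketch, assuming a subsumption test has already been reduced to a set of linear (in)equations over the reals. It uses plain Fourier-Motzkin elimination as a stand-in for a real solver; that is a textbook technique chosen for brevity, not what RacerPro or any other reasoner is known to use. The tuple encoding and the names scale_add, satisfiable, negate and entails are hypothetical.

```python
from fractions import Fraction

# Hypothetical encoding: (coeffs, const, strict) stands for
#   sum(coeffs[v] * v) + const > 0    if strict
#   sum(coeffs[v] * v) + const >= 0   otherwise

def scale_add(c1, k1, c2, k2):
    """Return k1*c1 + k2*c2 (k1, k2 > 0); strict if either input is strict."""
    coeffs = {}
    for v in set(c1[0]) | set(c2[0]):
        val = k1 * c1[0].get(v, 0) + k2 * c2[0].get(v, 0)
        if val != 0:
            coeffs[v] = val
    return (coeffs, k1 * c1[1] + k2 * c2[1], c1[2] or c2[2])

def satisfiable(constraints):
    """Decide satisfiability over the reals by eliminating one variable at a time."""
    cs = list(constraints)
    for v in {v for coeffs, _, _ in cs for v in coeffs}:
        pos = [c for c in cs if c[0].get(v, 0) > 0]    # lower bounds on v
        neg = [c for c in cs if c[0].get(v, 0) < 0]    # upper bounds on v
        rest = [c for c in cs if c[0].get(v, 0) == 0]
        # Pair every lower bound with every upper bound so that v cancels.
        cs = rest + [scale_add(p, -n[0][v], n, p[0][v]) for p in pos for n in neg]
    # Only constant constraints remain.
    return all(const > 0 if strict else const >= 0 for _, const, strict in cs)

def negate(c):
    """Negation of 'e > 0' is '-e >= 0'; negation of 'e >= 0' is '-e > 0'."""
    coeffs, const, strict = c
    return ({v: -k for v, k in coeffs.items()}, -const, not strict)

def entails(premises, conclusion):
    """premises entail conclusion iff premises plus its negation are unsatisfiable."""
    return not satisfiable(list(premises) + [negate(conclusion)])

x_gt_2y = ({"x": 1, "y": -2}, 0, True)   # x > 2y
y_gt_0  = ({"y": 1}, 0, True)            # y > 0
x_gt_0  = ({"x": 1}, 0, True)            # x > 0

print(entails([x_gt_2y, y_gt_0], x_gt_0))   # True
print(entails([x_gt_2y], x_gt_0))           # False

# Rational constants, real solutions: 0 < x < 1/2 has no integer solution
# but is satisfiable here.
print(satisfiable([x_gt_0, ({"x": -1}, Fraction(1, 2), True)]))  # True
```

In this sketch the reasoner would only ever call satisfiable (or entails), so a faster or more expressive solver could be swapped in behind the same two functions. Fourier-Motzkin can blow up on large systems, so it is only meant to show the shape of the interface.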

Fourth, syntax:
	The XML/functional syntax is easy, though we could add a bit of  
sugar to make writing equations nicer. I don't see any reason not to  
use MathML.
	For RDF, I thought equations could use MathML too (as a literal or
data URI) for inline equations. We should also allow naming predicates.

(The situation is pretty similar for strings. But I think this gives  
enough of the flavor of the situation for fruitful discussion.)

Fifth, naming and conformance:
	Since datatypes and predicates are extensible, perhaps we should  
follow the DL conventions and have an extensible naming scheme. This
would help implementations that want to support more modest data
reasoning.

Cheers,
Bijan.

Received on Wednesday, 2 July 2008 08:44:16 UTC