Comments & bugs on the abstract syntax and its sourceforge implementation

I have been trying hard to use well-crafted, highly formalized
ontologies, but this has caused me a lot of pain and schedule problems
with tools that sometimes break for obscure reasons. Out of
exasperation, I decided to look deeper into the causes of these
annoyances and, in doing so, found some bugs in the OWL API as well as
discrepancies between the normative reference and the OWL API reference
parser and renderer for the abstract syntax.

1) The OWLAPI abstract parser uses non-compliant definitions of URI and 
URI references.

See: owlapi/abstractparser/grammar/{lexer,abstract_owl}.g
Here are the abbreviated ANTLR grammar & lexer rules:

  uriReference : URI_ID | qname;

  qname : REF_PREFIX LOCAL_NAME;

  URI_ID: LANGLE
      ("urn:"|"file:"|"http://")
          ( ~('\n'|'\r'|'\t'|' '|')'|'>') )*
      RANGLE;

  REF_PREFIX : ( LETTER )+ COLON ;

  PREFIX : ( LETTER )+;
   
  LOCAL_NAME : LETTER (LETTER | ARABIC_DIGIT | DIGIT_NON_ARABIC | UNDERSCORE )*;

These definitions differ from the normative definitions in RFC 2396:
http://www.ietf.org/rfc/rfc2396.txt
Appendix A of the RFC gives a collected grammar for parsing URIs and
URI references.

Here is an example where this breaks down:

 Annotation(owl:versionInfo "test"^^http://www.w3.org/2001/XMLSchema#string)

The current parser parses "test"^^... as a data literal; ok.
The data literal currently supports strings only; ok

datatypeString is defined as:
  datatypeString : QUOTED_STRING ( DOUBLE_CARET uriReference | AT LANGUAGE )?;

As you can see, the uriReference won't parse the "#string" fragment.
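
A rough way to check the quoted rules against the example is to
approximate them with regular expressions. The sketch below is not the
OWL API lexer: the class and method names are made up for illustration,
LETTER is assumed to mean any Unicode letter, both digit rules are
assumed to mean any decimal digit, and REF_PREFIX and LOCAL_NAME are
assumed to be adjacent. Under those assumptions the unbracketed datatype
URI from the example is matched by neither alternative of uriReference,
and a URI with any scheme other than the three listed is rejected even
when bracketed.

```java
import java.util.regex.Pattern;

class UriRuleCheck {
    // URI_ID: '<', one of the three listed prefixes, then any run of
    // characters other than newline, CR, tab, space, ')' or '>', then '>'.
    static final Pattern URI_ID =
        Pattern.compile("<(?:urn:|file:|http://)[^\\n\\r\\t )>]*>");

    // qname: REF_PREFIX LOCAL_NAME, i.e. letters and a colon, then a letter
    // followed by letters, digits or underscores.
    // Assumption: LETTER ~ \p{L}; ARABIC_DIGIT and DIGIT_NON_ARABIC ~ \p{Nd}.
    static final Pattern QNAME =
        Pattern.compile("\\p{L}+:\\p{L}[\\p{L}\\p{Nd}_]*");

    // uriReference : URI_ID | qname
    static boolean isUriReference(String text) {
        return URI_ID.matcher(text).matches() || QNAME.matcher(text).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "http://www.w3.org/2001/XMLSchema#string",   // as written in the example
            "<http://www.w3.org/2001/XMLSchema#string>", // bracketed form
            "xsd:string",                                // qname form
            "<ftp://example.org/a>"                      // scheme not in the lexer rule
        };
        for (String s : samples) {
            System.out.println(isUriReference(s) + "  " + s);
        }
    }
}
```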

2) There are 3 different definitions of the abstract syntax for an 
ObjectProperty.

a) In the normative definition of OWL abstract syntax  
http://www.w3.org/TR/owl-semantics/syntax.html#2.3.2.4
an object property has a required order for its constituent parts:


        'ObjectProperty(' individualvaluedPropertyID ['Deprecated'] { annotation } 
                { 'super(' individualvaluedPropertyID ')' }
                [ 'inverseOf(' individualvaluedPropertyID ')' ] [ 'Symmetric' ] 
                [ 'Functional' | 'InverseFunctional' | 'Functional' 'InverseFunctional' | 'Transitive' ]
                { 'domain(' description ')' } { 'range(' description ')' } ')'


b) The owlapi renderer has a different version
(see: owlapi/io/src/org/semanticweb/owl/io/abstract_syntax/Renderer.java) 
This version uses a different order for the constituent parts and can emit several inverseOf parts, where the normative definition allows at most one.

private void renderObjectProperty( OWLOntology ontology,
				       OWLObjectProperty prop ) throws OWLException {
	pw.print(" ObjectProperty(" + shortForm( prop.getURI() ));
	if ( prop.isTransitive( ontology ) ) {
	    pw.print(" Transitive");
	}
	if ( prop.isFunctional( ontology ) ) {
	    pw.print(" Functional");
	}
	if ( prop.isInverseFunctional( ontology ) ) {
	    pw.print(" InverseFunctional");
	}
	if ( prop.isSymmetric( ontology ) ) {
	    pw.print(" Symmetric");
	}
	for ( Iterator it = prop.getInverses( ontology ).iterator();
	      it.hasNext(); ) {
	    pw.println();
	    OWLObjectProperty inv = (OWLObjectProperty) it.next();
	    visitor.reset();
	    inv.accept( visitor );
	    pw.print( "  inverseOf(" + visitor.result() + ")");
	}	    
	for ( Iterator it = prop.getSuperProperties( ontology ).iterator();
	      it.hasNext(); ) {
	    pw.println();
	    OWLObjectProperty sup = (OWLObjectProperty) it.next();
	    visitor.reset();
	    sup.accept( visitor );
	    pw.print( "  super(" + visitor.result() + ")");
	}	    
	for ( Iterator it = prop.getDomains( ontology ).iterator();
	      it.hasNext(); ) {
	    pw.println();
	    OWLDescription dom = (OWLDescription) it.next();
	    visitor.reset();
	    dom.accept( visitor );
	    pw.print( "  domain(" + visitor.result() + ")");
// 	    if (it.hasNext()) {
// 		pw.println();
// 	    }
	}
	for ( Iterator it = prop.getRanges( ontology ).iterator();
	      it.hasNext(); ) {
	    pw.println();
	    OWLDescription ran = (OWLDescription) it.next();
	    visitor.reset();
	    ran.accept( visitor );
	    pw.print( "  range(" + visitor.result() + ")");
	}
	
	pw.println(")");
    }
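
To make the ordering difference concrete, the normative production can
be written as a regular expression over part names and applied to the
sequence of parts a renderer emits. This is only a sketch: the class
name, the method name and the encoding of parts as plain strings are
assumptions for illustration, and the property identifier and the
contents of each part are ignored.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class NormativeOrderCheck {
    // The normative ObjectProperty production over part names only.
    // Every name is followed by one space so names cannot run together.
    static final Pattern NORMATIVE = Pattern.compile(
        "(?:Deprecated )?(?:annotation )*(?:super )*(?:inverseOf )?(?:Symmetric )?"
        + "(?:Functional InverseFunctional |Functional |InverseFunctional |Transitive )?"
        + "(?:domain )*(?:range )*");

    static boolean followsNormativeOrder(List<String> parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part).append(' ');
        }
        return NORMATIVE.matcher(sb).matches();
    }

    public static void main(String[] args) {
        // Order produced by the renderer: characteristics first, then super.
        System.out.println(followsNormativeOrder(
            Arrays.asList("Transitive", "super", "domain")));
        // Same parts in the order required by the normative definition.
        System.out.println(followsNormativeOrder(
            Arrays.asList("super", "Transitive", "domain")));
        // Two inverseOf parts, which the renderer loop can emit.
        System.out.println(followsNormativeOrder(
            Arrays.asList("inverseOf", "inverseOf")));
    }
}
```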

c) The owlapi abstract parser implementation is the closest to the normative definition.
However, it does not allow the 'Functional' 'InverseFunctional' combination that the normative definition permits.
See: owlapi/abstractparser/grammar/abstract_owl.g

  objectProperty returns [OWLObjectProperty prop] throws OWLException
    {	
      URI id, superId, inverseId;
      OWLDescription descr;
      prop = null;
      OWLObjectProperty superProp;
    }
    : OBJECT_PROPERTY
  	  id = individualvaluedPropertyID
	  { idsTable.put(id.toString(), OBJ_PROP); 
            prop = builder.addObjectProperty(id); }
	  ( DEPRECATED { builder.setDeprecatedObject(prop); } )? 
          ( { OWLBuilder.AnnotationComponent annComponent; }
            annComponent = annotation
	    { builder.addAnnotationInstance(prop, annComponent); }
          )* 
          ( SUPER
                superId = individualvaluedPropertyID 
                { idsTable.put(superId.toString(), OBJ_PROP);
		  superProp = builder.addObjectProperty(superId); 
		  builder.addSuperProperty(prop, superProp); }
	      RPAREN
          )* 
          ( INVERSE_OF
	        inverseId = individualvaluedPropertyID
		{ idsTable.put(inverseId.toString(), OBJ_PROP);
		  builder.addInverse(prop, inverseId); }
              RPAREN )? 
  	  ( SYMMETRIC
	    { builder.setPropertyType(prop, OWLBuilder.SYMMETRIC); } 
	  )? 
          ( FUNCTIONAL
	    { builder.setPropertyType(prop, OWLBuilder.FUNCTIONAL); } 
	  | INVERSE_FUNCTIONAL
	    { builder.setPropertyType(prop, OWLBuilder.INVERSE_FUNCTIONAL); } 
	  | TRANSITIVE
	    { builder.setPropertyType(prop, OWLBuilder.TRANSITIVE); }
	  )?
          ( DOMAIN
	   		descr = description
		{ builder.addPropertyDomain(prop, descr); }
              RPAREN )* 
  	  ( RANGE
		descr = description
		{ builder.addObjectPropertyRange(prop, descr); }
              RPAREN )* 
  	RPAREN
    ;
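
The gap in the characteristic slot can be spelled out by listing the
sequences each definition accepts and taking the difference. The names
below are hypothetical and the two sets are transcribed by hand from the
normative production and from the quoted parser rule, so this is a
sketch of the comparison rather than anything derived from the grammar
files themselves.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CharacteristicSlot {
    // Builds a set of token sequences from space-separated alternatives;
    // the empty string stands for "slot left out".
    static Set<List<String>> slot(String... alternatives) {
        Set<List<String>> out = new HashSet<>();
        for (String alt : alternatives) {
            out.add(alt.isEmpty()
                ? Collections.<String>emptyList()
                : Arrays.asList(alt.split(" ")));
        }
        return out;
    }

    // [ 'Functional' | 'InverseFunctional' | 'Functional' 'InverseFunctional' | 'Transitive' ]
    static final Set<List<String>> NORMATIVE =
        slot("", "Functional", "InverseFunctional",
             "Functional InverseFunctional", "Transitive");

    // ( FUNCTIONAL | INVERSE_FUNCTIONAL | TRANSITIVE )?
    static final Set<List<String>> PARSER =
        slot("", "Functional", "InverseFunctional", "Transitive");

    // Sequences the normative definition allows but the parser rule does not.
    static Set<List<String>> missingFromParser() {
        Set<List<String>> diff = new HashSet<>(NORMATIVE);
        diff.removeAll(PARSER);
        return diff;
    }

    public static void main(String[] args) {
        System.out.println(missingFromParser());
    }
}
```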

These problems are difficult to avoid given the number of people involved,
the rapid pace at which the standard itself evolved, and the decentralized
nature of the process. I believe it would be useful to have a single
definition of the abstract syntax grammar in a form that can produce 
the 3 variants needed:

- documentation for the normative definition of the abstract syntax standard
- a parser 
- a renderer 
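
As a toy illustration of the idea (not a proposal for how either
toolkit works), the ObjectProperty production could be held in one data
structure from which the three variants are derived. Everything here is
hypothetical: the class, the Slot type and the method names are
invented, parts are reduced to their names, and the property identifier
and enclosing parentheses are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class SingleSourceGrammar {
    // One slot of the production: its alternatives (each a space-separated
    // sequence of part names) and whether the slot may repeat.
    static final class Slot {
        final boolean repeatable;
        final List<String> alternatives;
        Slot(boolean repeatable, String... alternatives) {
            this.repeatable = repeatable;
            this.alternatives = Arrays.asList(alternatives);
        }
    }

    // The single definition; slot order follows the normative production.
    static final List<Slot> OBJECT_PROPERTY = Arrays.asList(
        new Slot(false, "Deprecated"),
        new Slot(true, "annotation"),
        new Slot(true, "super"),
        new Slot(false, "inverseOf"),
        new Slot(false, "Symmetric"),
        new Slot(false, "Functional", "InverseFunctional",
                 "Functional InverseFunctional", "Transitive"),
        new Slot(true, "domain"),
        new Slot(true, "range"));

    // Variant 1: documentation, as an EBNF-like string.
    static String toEbnf() {
        StringBuilder sb = new StringBuilder();
        for (Slot s : OBJECT_PROPERTY) {
            List<String> alts = new ArrayList<>();
            for (String alt : s.alternatives) {
                alts.add("'" + alt.replace(" ", "' '") + "'");
            }
            sb.append(s.repeatable ? "{ " : "[ ")
              .append(String.join(" | ", alts))
              .append(s.repeatable ? " } " : " ] ");
        }
        return sb.toString().trim();
    }

    // Variant 2: a recognizer over part names, built as a regular expression.
    static boolean accepts(List<String> parts) {
        StringBuilder regex = new StringBuilder();
        for (Slot s : OBJECT_PROPERTY) {
            List<String> alts = new ArrayList<>();
            for (String alt : s.alternatives) {
                alts.add(Pattern.quote(alt + " "));
            }
            regex.append("(?:").append(String.join("|", alts)).append(")")
                 .append(s.repeatable ? "*" : "?");
        }
        StringBuilder input = new StringBuilder();
        for (String part : parts) {
            input.append(part).append(' ');
        }
        return Pattern.matches(regex.toString(), input);
    }

    // Variant 3: a renderer ordering; parts are sorted by where their name
    // first appears in the definition (unknown names sort to the front).
    static List<String> inRenderOrder(List<String> parts) {
        List<String> order = new ArrayList<>();
        for (Slot s : OBJECT_PROPERTY) {
            for (String alt : s.alternatives) {
                order.addAll(Arrays.asList(alt.split(" ")));
            }
        }
        List<String> sorted = new ArrayList<>(parts);
        sorted.sort((a, b) -> Integer.compare(order.indexOf(a), order.indexOf(b)));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList("Transitive", "super", "domain");
        System.out.println(toEbnf());
        System.out.println(accepts(parts));
        System.out.println(inRenderOrder(parts));
        System.out.println(accepts(inRenderOrder(parts)));
    }
}
```

Because the documentation string, the recognizer and the ordering all
read the same list, a change such as adding the missing
'Functional' 'InverseFunctional' alternative is made in exactly one
place.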

There are two toolkits I can think of for avoiding
the problems of maintaining 3 or more different descriptions
of essentially the same thing, i.e., the OWL abstract language itself.

http://www.cwi.nl/htbin/sen1/twiki/bin/view/SEN1/MetaEnvironment
http://www.semdesigns.com/Products/DMS/DMSToolkit.html 

Has the use of these tools been discussed for simplifying the maintenance
aspects of this important standard?

-- Nicolas.

Received on Wednesday, 25 May 2005 20:36:24 UTC