Forward-Compatibility (Extensibility) Requirement & Proposal

I didn't have a clear notion of the extensibility requirement until
just after the F2F ended.  Then it started to make sense to me, and
since then I've talked it over with a few of you, and it seems like I
might have it right.

The requirement might be phrased like this:

    RIF must be extensible, so that implementations can be forward
    compatible, continuing to operate well when given RIF dialects
    which use extensions unknown to the implementation.

The Wikipedia page on forward compatibility has some discussion of
this issue: http://en.wikipedia.org/wiki/Forward_compatibility (Their
page on extensibility refers only to extensible systems, not to
extensible formats.)

Forward compatibility is essential to allowing a format to grow in a
large, decentralized environment like the web.  Without it, the
decision to use an extension in some document is also a decision to
entirely exclude the base of users using software which does not
implement the extension.  

Document format evolution, without forward compatibility, proceeds in
ponderous steps where everyone has to install new versions of the
software.  In contrast, if RIF implementations are forward compatible,
the decision to use an extension can be based on the particular
characteristics of that extension, and an awareness (from the RIF
standard) of how implementations will handle your rules when they don't
implement the extension.  Forward compatibility means progress can be
incremental instead of revolutionary.

The simplest approach to forward compatibility, in general, is to mark
each extension as "must-understand" or "may-ignore".  For example, an
extension which introduces negation probably falls under
"must-understand", since if you were to ignore the syntactic elements
of a ruleset which used negation, the meaning of the ruleset would
probably be quite different from what was intended.  On the other
hand, an extension which annotates rules with the last date they were
modified would be a "may-ignore" one.  In general, the term
"annotation" is used to cover syntactic elements which can safely be
ignored.  (This matches RDF well, since RDF is generally processed
with the notion that any triples you don't understand can be safely
ignored.)
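
To make the two-way split concrete, here is a minimal sketch in
Python.  The names (Policy, triage) and the extension labels are
hypothetical, chosen only for illustration; nothing here is RIF
syntax.  It sorts the extensions a ruleset uses into those that block
processing and those that can be skipped.

```python
from enum import Enum


class Policy(Enum):
    """The two general-purpose markings discussed above."""
    MUST_UNDERSTAND = "must-understand"
    MAY_IGNORE = "may-ignore"


def triage(used, implemented):
    """Split unimplemented extensions into (blocking, ignorable).

    used: dict mapping an extension name to its Policy
    implemented: set of extension names this implementation supports
    """
    blocking, ignorable = [], []
    for name, policy in used.items():
        if name in implemented:
            continue  # supported, so nothing to decide
        if policy is Policy.MUST_UNDERSTAND:
            blocking.append(name)
        else:
            ignorable.append(name)
    return blocking, ignorable


# Hypothetical ruleset using the two example extensions from the text.
used = {
    "negation": Policy.MUST_UNDERSTAND,
    "last-modified": Policy.MAY_IGNORE,
}
blocking, ignorable = triage(used, implemented=set())
```

An implementation would refuse the ruleset if "blocking" is non-empty,
and otherwise proceed while skipping everything in "ignorable".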

I think RIF can do better than this general must-understand/may-ignore
approach.   Here are some intermediate categories:

   * an extension which affects the semantics, but ignoring it won't
     give you any incorrect results -- just fewer results.  (This is
     only possible if your dialect is monotonic, I think.)   Whether
     you ignore this or not is something to ask the user.

   * an extension which affects performance, so you'll get the same
     answers/actions, but the performance will be significantly worse.
     Again, you can probably ignore this, but you should warn the
     user.

   * an extension which affects the presentation, but not the
     semantics.   So a reasoner can ignore it, but a system which
     shows the ruleset to a user needs to warn the user.

   * an extension which offers syntactic sugar.  This
     overlaps the other categories -- ignoring it might affect the
     semantics, the performance, etc. -- but perhaps an XSLT script
     can be provided which rewrites out the sugar, so systems which
     don't implement the extension but do have XSLT handy can, in
     effect, download an implementation of the extension.
     (This suggests one could get extreme here and allow extensions to
     include, say, JVM code for a reference implementation.  I'm not
     proposing that, however.)
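
The categories above could be sketched as a small decision table.
This is only an illustration in Python: the names (Impact,
action_when_unimplemented) and the action strings are hypothetical,
and rejecting sugar when no rewriter is available is a conservative
assumption of the sketch, not something the list above settles.

```python
from enum import Enum


class Impact(Enum):
    """What is lost if an unimplemented extension is ignored."""
    FEWER_RESULTS = "fewer-results"   # sound but incomplete answers
    SLOWER = "slower"                 # same answers, worse performance
    PRESENTATION = "presentation"     # display only, semantics intact
    SUGAR = "sugar"                   # can be rewritten away


def action_when_unimplemented(impact, shows_rules_to_user=False,
                              has_rewriter=False):
    """Suggest what a system should do with an extension it lacks."""
    if impact is Impact.FEWER_RESULTS:
        return "ask-user"
    if impact is Impact.SLOWER:
        return "warn"
    if impact is Impact.PRESENTATION:
        # A reasoner can ignore it; a tool that displays rules cannot
        # do so silently.
        return "warn" if shows_rules_to_user else "ignore"
    if impact is Impact.SUGAR:
        # Assumption: without a rewriter (e.g. an XSLT script) the
        # effect of ignoring the sugar is unknown, so refuse.
        return "rewrite" if has_rewriter else "reject"
    raise ValueError(f"unknown impact: {impact!r}")
```

The point of the table is that the same extension can call for
different handling depending on what the consuming system does with
the ruleset.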

I think that covers the basic problem space.   

Here's a rough proposal for an extensibility mechanism:

   * when you're parsing RIF XML and see a syntactic element you don't
     recognize, you turn it into a URI (by concatenating the namespace
     part and the local part of the element tag) and dereference the
     URI to get information about the syntactic element.  The
     information includes, at least, the consequences of ignoring the
     element (as above).

   * as a work-around for dereference not being available, RIF
     documents could include that information as well.
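
A rough sketch of the two steps above, in Python with the standard
xml.etree.ElementTree module.  The example.org namespaces, the KNOWN
set, and the function names are all made up for illustration.  The
dereference step is passed in as a callable rather than doing a real
HTTP fetch, and falling back to "must-understand" when nothing is
known is an assumption of the sketch, not part of the proposal.

```python
import xml.etree.ElementTree as ET

# Hypothetical URIs of elements this implementation understands.
KNOWN = {
    "http://example.org/rif#Ruleset",
    "http://example.org/rif#Rule",
}


def element_uri(elem):
    """Concatenate namespace and local part of an element's tag.

    ElementTree spells qualified tags as "{namespace}local".
    """
    tag = elem.tag
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace + local
    return tag


def unknown_element_uris(xml_text, known):
    """Return URIs of unrecognized elements, in document order."""
    found = []
    for elem in ET.fromstring(xml_text).iter():
        uri = element_uri(elem)
        if uri not in known and uri not in found:
            found.append(uri)
    return found


def consequence_of_ignoring(uri, inline_info, dereference=None):
    """Look up what happens if the element is ignored.

    dereference: optional callable taking a URI and returning the
        information, or raising OSError if the fetch fails
    inline_info: dict of the same information carried in the document
    """
    if dereference is not None:
        try:
            return dereference(uri)
        except OSError:
            pass  # dereference unavailable; use the work-around
    # Assumption: with no information at all, be conservative.
    return inline_info.get(uri, "must-understand")


doc = (
    '<r:Ruleset xmlns:r="http://example.org/rif#" '
    'xmlns:x="http://example.org/ext#">'
    '<r:Rule><x:Naf/></r:Rule></r:Ruleset>'
)
unknown = unknown_element_uris(doc, KNOWN)
```

Here "unknown" holds the single URI built from the x:Naf element, and
consequence_of_ignoring would be called once per entry.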

Thoughts?

    -- Sandro

Received on Wednesday, 8 November 2006 19:20:53 UTC