Re: extensibility (Agenda for RIF telecon 26 June) from Dave Reynolds on 2007-06-26 (public-rif-wg@w3.org from June 2007)

From: Dave Reynolds <der@hplb.hpl.hp.com>
Date: Tue, 26 Jun 2007 08:32:34 +0100
To: Sandro Hawke <sandro@w3.org>
Cc: public-rif-wg@w3.org
Message-ID: <4680C112.2090807@hplb.hpl.hp.com>
Sandro Hawke wrote:
> Dave Reynolds <der@hplb.hpl.hp.com> writes:

>> One thing I would find valuable would be a set of test cases which 
>> illustrate some of the types of extension we would like the mechanism to 
>> cover. Then we could work through those to check out the details.
> 
> Absolutely.  We should do some brainstorming on the extensions people
> will some day want.   Is there a list I've missed somewhere, perhaps
> from the earlier requirements gathering?

I don't recall one.

>> For example. Consider an LP extension that adds "conjunction in rule 
>> conclusions" intending that earlier LP translators use Lloyd-Topor to 
>> transform that away (by duplicating rules). I'll call this component CRC 
>> for short.
>>
>> First, the Fallback in that case is a whole ruleset transformation not 
>> simply a replacement of a local syntactic element. So do we need an 
>> entire ruleset transformation language to describe Fallbacks? The 
>> earlier draft was perhaps going down that route with the 
>> MacroSubstitutions but other options would be an XSLT script or a 
>> RIFCore rule set acting over RIF-as-data. But that would be a lot of 
>> machinery to support and I'm not sure whether such syntactic 
>> transformations are going to give useful fallbacks in enough cases to be 
>> worth it.
> 
> Right.  I see three basic levels of fallback functionality we could do:
> 
>    0 -- the fallback procedure is that you just omit the unrecognized
>         element, pruning the parse tree as high as you need to, to get
>         something syntactically valid.  In many cases, that means the
>         rule which uses the thing will be omitted.  In other cases,
>         though, like when it's an argument to a function, you'd get even
>         more screwy results.
> 
>    1 -- the fallback procedure is a single pattern match replacement,
>         with wildcards but no variables.
> 
>    2 -- the fallback procedure is a RIF Core rule set, acting over RIF
>         as data, as you say.
> 
> I was aiming for level 1 at this point, but level 2 has the huge
> advantage of supporting syntactic sugar extensions, as you say. 
> 
> In a perfect world, we'd have implementations of each of them to try
> out.  If level 2 turns out to be easy enough to implement, then it's
> probably the way to go.  (It's conceivable I'll have time to try to
> implement this over the next two months, but I can't commit to it right
> now.)
> 
>> Second, even if such a transformation fallback is useful it will only be 
>> applicable in certain dialects. Suppose some production rule dialect 
>> picks up CRC and incorporates it. In that case the fallback 
>> transformation would likely be invalid if the dialect includes negation 
>> and conflict set resolution. Though if there is no non-monotonic 
>> negation used in the rules then presumably it would be safe. Thus 
>> fallback safety depends on the presence of other components.
> 
> My thinking was that a given component could have multiple fallbacks,
> each with different impact.  So, typically, I'd expect a minor-impact
> fallback to a nearby dialect to be defined, along with a major-impact
> fallback to Core.

The issue is not how far you need to fall back but whether the fallback 
is applicable at all in the presence of other components; i.e. the 
fallback (or at least its impact :-)) needs to be conditional.
This makes cases other than #0 potentially quite complex to express.
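
To make the conditionality concrete, here is a minimal Python sketch of
the CRC fallback as a whole-ruleset Lloyd-Topor split, guarded by a
check on which other components are in use. The Rule type, the
component names and the function names are all hypothetical, and the
choice of which components block the fallback is only an assumption
for illustration, not something taken from a RIF draft.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class Rule:
    """A rule 'h1 and h2 and ... :- body' over opaque atom strings (hypothetical model)."""
    heads: Tuple[str, ...]
    body: Tuple[str, ...]


# Assumption: components whose presence makes rule duplication unsafe.
UNSAFE_WITH_CRC_FALLBACK = frozenset({"nonmonotonic-negation", "conflict-resolution"})


def lloyd_topor_split(rules: List[Rule]) -> List[Rule]:
    """Replace each rule that has a conjunctive head by one rule per head atom."""
    return [Rule((h,), r.body) for r in rules for h in r.heads]


def crc_fallback(rules: List[Rule], components: FrozenSet[str]) -> Optional[List[Rule]]:
    """Apply the split only if no conflicting component is in use; otherwise return None."""
    if components & UNSAFE_WITH_CRC_FALLBACK:
        return None
    return lloyd_topor_split(rules)


ruleset = [Rule(("p(x)", "q(x)"), ("r(x)",))]
print(crc_fallback(ruleset, frozenset({"CRC"})))
print(crc_fallback(ruleset, frozenset({"CRC", "nonmonotonic-negation"})))
```

Even in this toy form the fallback needs two inputs, the ruleset and
the set of components in play, which is the extra machinery that a
level 0 "just prune it" fallback avoids.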

>> Third, it's not clear to me in what way CRC is a separable component in 
>> the first place. Certainly one could have a new conjunction operator 
>> intended for use only in rule conclusions but if we were putting that in 
>> the Core just now we'd just modify the Rule syntax and reuse the same 
>> Conjunction element as used in the condition language. Does this sort of 
>> extension (generalizing where certain constructs can be used) fall in 
>> the scope of the mechanism?
> 
> The approach that comes to my mind is to have a different kind of rule.
> Core has HornRule, and you're talking about a CRCRule, I think.

Yes, that seems right.

That does raise a related issue though. If my new extended production 
rule dialect has several such extensions about what can be said in the 
conclusions then I'd presumably have an EPRule clause which encapsulated 
all of those. In that case the different extensions wouldn't be separate 
components but one big component and the fallback transformations would 
be more complex and conditional.

>> So test cases would, I think, be a helpful way to clarify the scope of 
>> what the mechanism should and should not cover.
>>
>>
>> A couple more minor comments:
>>
>> o I have been assuming that a RIF ruleset will include metadata which 
>> identifies the intended dialect (including version information). The 
>> discussion under "Dialect Identification and Overlap" doesn't seem to 
>> reflect that. The extension mechanism is only needed when a processor 
>> doesn't recognize the dialect/dialect-version in order to determine 
>> whether, despite that, it could still proceed.
> 
> I understand.  My suspicion is that identifying components instead of
> dialects will end up being much more comfortable in the long run.  In
> the most obvious case, you might implement D1 (C1,C2,C3) and receive a
> document which uses only C1 and C2, but which would be labeled as being
> written in D2 (C1,C2,C3,C4).  The sender might not even know that D1
> exists, and so could not label it D1.   (But maybe the D2 author needs
> to know about D1 to ensure compatibility; maybe the content could be
> labeled as {D1, D2}.)

I wasn't objecting to also doing component-level analysis, but having 
the sender and receiver simply agree to use the same dialect seems like 
the common case, and that case should be supported by dialect-level 
metadata. That certainly doesn't preclude translators falling back on 
component-level analysis.
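
As a rough sketch of how the two checks could coexist, assuming a
hypothetical registry that maps dialect names to component sets (the
D1/D2 and C1..C4 names follow the example quoted above; can_process
and DIALECTS are made up for illustration):

```python
from typing import Dict, FrozenSet, Set

# Hypothetical dialect registry, following the D1/D2 example above.
DIALECTS: Dict[str, FrozenSet[str]] = {
    "D1": frozenset({"C1", "C2", "C3"}),
    "D2": frozenset({"C1", "C2", "C3", "C4"}),
}


def can_process(implemented: str, declared: str, used: Set[str]) -> bool:
    """Accept on a dialect-label match; otherwise fall back to component analysis."""
    if declared == implemented:
        return True  # common case: sender and receiver agreed on the dialect
    # Fallback: every component the document actually uses must be implemented.
    return used <= DIALECTS[implemented]


# A D1 processor receiving a document labeled D2 that only uses C1 and C2.
print(can_process("D1", "D2", {"C1", "C2"}))
# The same processor receiving a D2 document that really does use C4.
print(can_process("D1", "D2", {"C1", "C4"}))
```

The subset test only says the components are individually recognized;
it says nothing about whether they compose semantically.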

I guess part of my worry is that it's not clear to me how often the 
components are going to be neatly semantically composable, which is 
what the componentization needs in order to be useful.

Dave
-- 
Hewlett-Packard Limited
Registered Office: Cain Road, Bracknell, Berks RG12 1HN
Registered No: 690597 England
Received on Tuesday, 26 June 2007 07:32:47 UTC