Re: Approaching an XML syntax for RIF from Sandro Hawke on 2007-01-30 (public-rif-wg@w3.org from January 2007)

From: Sandro Hawke <sandro@w3.org>
Date: Mon, 29 Jan 2007 22:50:44 -0500
To: Dave Reynolds <der@hplb.hpl.hp.com>
Cc: public-rif-wg@w3.org
Message-Id: <20070130035117.C1D0E4EECB@homer.w3.org>
> First, it loses, or at least weakens, the extensibility. For example if 
> your class currently only has one property so you stripe skip and then 
> some extension dialect adds a subclass with a new property you can't 
> stripe skip in the extension. That breaks the forward compatibility.

I'm assuming, I think, that one will always use the most specific class
information available.  It would be incorrect to use the name of the
superclass as the tag when serializing the subclass.    Does that not
solve the problem?    (I might need to work through some more examples
here.) 
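
Here is a minimal sketch of the kind of example I have in mind, in Python.
The classes Var and TypedVar and the property "sort" are hypothetical,
made up for illustration; nothing here is proposed RIF vocabulary.  The
serializer always tags with the most specific class and stripe-skips only
when the instance carries exactly one property, so a reader that keys on
the tag name can tell which layout to expect.

```python
import xml.etree.ElementTree as ET

class Var:
    """Hypothetical base-dialect class with a single property."""
    def __init__(self, name):
        self.name = name

class TypedVar(Var):
    """Hypothetical extension-dialect subclass adding a second property."""
    def __init__(self, name, sort):
        super().__init__(name)
        self.sort = sort

def serialize(obj):
    # Tag with the most specific class, never the superclass name.
    elem = ET.Element(type(obj).__name__)
    props = vars(obj)
    if len(props) == 1:
        # Only one property: skip the property stripe.
        elem.text = next(iter(props.values()))
    else:
        # More than one property: keep every property stripe.
        for prop, value in props.items():
            ET.SubElement(elem, prop).text = value
    return elem

base_xml = ET.tostring(serialize(Var("Buyer")), encoding="unicode")
ext_xml = ET.tostring(serialize(TypedVar("Buyer", "Person")), encoding="unicode")
print(base_xml)
print(ext_xml)
```

The base dialect emits a skipped <Var>, and the extension emits a fully
striped <TypedVar>, so the two never share a tag.  Whether an old
consumer can do anything useful with <TypedVar> is a separate question.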

> Second, any sort of dynamic skipping (which would be a partial way round 
> the above issue) causes confusion and painful mismatch with XML tools.

I agree.   I don't think dynamic skipping is worth the hassle.

> Third, without stripe skipping the parse structure is largely 
> self-describing - you can parse an unknown dialect and then go lookup 
> semantic annotations for the new classes and properties to see if the 
> bits you don't understand are semantically significant. I say "largely" 
> because you don't need type annotations for the literals to achieve that 
> benefit.

I believe it's possible to make the fully-striped syntax entirely
self-describing, so you can parse to triples/objects without
asn/schema/ontology information. 

> [The first and last issue may be the same :-)]
> 
> The cost of not stripe skipping is a little more verboseness but to me 
> that's an acceptable price in the XML format.

*shrug* I'm pretty ambivalent myself.  I think stripe skipping is cool,
but it may not be warranted for RIF, where the XML is going to be almost
impossible for humans to read no matter what we do.  I'm curious to see
if there will be WG members who have strong needs in one direction or
the other.   It may be we just need to flip a coin.

> By the way, if neatness of the XML is an issue then there is a different 
> trivial simplification possible: use an attribute for any property which 
> takes a single literal value (name, ref etc).

Yes, that's another potential coin flip.  I have a slight preference for
the consistency of just using element names, but I might feel
differently when the moon changes phase.   :-)

> [Something we shouldn't worry about at this stage but will need to 
> eventually resolve is whether names should be text (things which could 
> have language tags) or strings as you have it there. If names should be 
> round tripped and appear in rule editor UIs then they will need i18n 
> support and so should probably be lang taggable.]

Yeah.  What about markup in variable names?  :-) No, seriously, I
remember how RDF Core got stuck on this for so long.  If I understand
the result right, it's that text which is intended for general human
consumption (even just a word or two) needs to have associated metadata
saying what language it's in.  

> Those Con's don't look much like xsd:anyURIs to me, especially $49.

Indeed.  Heh.  :-) I didn't get a chance to talk to Harold about that.

> Or fully striped with the attribute short-cut you would have:
> 
>    <And>
>      <formulas>   # formulae?
>        <Exists>
>          <declare><Var name="Buyer" /></declare>
>          <formula>
>            <Atom>
>              <parts>
>                <Con ref="Purchase" />
>                <Var name="Buyer" />

In laying out fully-striped versions, I think we can put tags on the
same line if they are the only child, to save screen space and show
they're not very important.  It's kind of stripe-skipping Lite.  :-)

    <And><formulas>
      <Exists>
        <declare><Var name="Buyer" /></declare>
        <formula>
          <Atom><parts>
            <Con ref="Purchase" />
            <Var name="Buyer" />

This is missing one more feature for full self-description -- it doesn't
indicate that the atom parts form a list.  You could infer that from the
fact that there are two child elements, but what about when there is
only one (as with declare)?  My solution for a fully-striped
self-describing syntax is to add a <List> and <item> pair of
pseudo-stripes (not in the RIF namespace).  You might be able to do with only
one pseudo-stripe, but I think you need <List> for the case of the empty
list, and <item> for when the items are strings.

So we get something like this:

    <And><formulas>
      <Exists>
        <declare><List><item><Var name="Buyer" /></item></List></declare>
        <formula>
          <Atom><parts><List>
            <item><Con ref="Purchase" /></item>
            <item><Var name="Buyer" /></item>

That's not too bad, either, maybe.
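
To check that this really parses without asn/schema/ontology information,
here is a rough Python sketch of a reader for it.  Several things are my
assumptions: I closed the tags the example above leaves open, I wrapped
the <formulas> content in <List>/<item> as well, I ignored namespaces
altogether, and the function names are made up.  It relies only on the
alternation of class and property stripes plus the two pseudo-stripes.

```python
import xml.etree.ElementTree as ET

def parse_node(elem):
    """Class stripe -> dict holding the class tag and its properties."""
    node = {"class": elem.tag}
    node.update(elem.attrib)  # attribute short-cut for literal properties
    for prop in elem:
        node[prop.tag] = parse_value(prop)
    return node

def parse_value(prop):
    """Property stripe -> list, nested node, or plain string."""
    children = list(prop)
    if not children:
        return prop.text or ""
    child = children[0]
    if child.tag == "List":
        # <List> marks a list even when it has zero or one items.
        return [parse_item(item) for item in child]
    return parse_node(child)

def parse_item(item):
    """<item> wraps either a nested node or a string."""
    children = list(item)
    return parse_node(children[0]) if children else (item.text or "")

DOC = """
<And><formulas><List>
  <item><Exists>
    <declare><List><item><Var name="Buyer" /></item></List></declare>
    <formula>
      <Atom><parts><List>
        <item><Con ref="Purchase" /></item>
        <item><Var name="Buyer" /></item>
      </List></parts></Atom>
    </formula>
  </Exists></item>
</List></formulas></And>
"""

tree = parse_node(ET.fromstring(DOC.strip()))
exists = tree["formulas"][0]
print(exists["declare"])           # a one-element list, not a bare node
print(exists["formula"]["parts"])  # a two-element list
```

The one-element declare comes back as a list because of the <List>
stripe, which is the case the plain fully-striped form could not
distinguish.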

    -- Sandro
Received on Tuesday, 30 January 2007 03:51:19 UTC