Re: XML Syntax Strawman (ACTION-309)

Dave Reynolds <der@hplb.hpl.hp.com> writes:
> Sandro Hawke wrote:
> > 
> > This is an attempt to progress on ACTION-309 ("Work on unified
> > strawman proposal for asn->xml system").
> 
> Personally I'm happy with the approach and most of the details.
> 
> I just have a couple of small comments ...

Excellent.  :-)

> (a) Text content and leaf labelling
> 
> In an example like:
> 
>      <Const>book</Const>
> 
> then that doesn't follow RDF striping (Const is a class node and there
> is no property relation to the string "book").

Oops!  Yes, I've pointed out this mistake a few times, and then missed
it here myself.

> Possible solutions retaining RDF/XML serialization are:
> 
>       <Const rdf:value="book" />
> 
> or
> 
>       <Const><name>book</name></Const>
> 
> the latter being equivalent in RDF/XML terms to:
> 
>       <Const rif:name="book" />

I'm proposing we stick with the all-elements form.  RDF/XML's option of
writing a property as an attribute seems to make things significantly
more complicated for XML tools.

> In most cases I hope Consts will be IRIs not strings in which case
> presumably they would be:
> 
>       <Const rdf:about="http://example.com/myrules#book" />

I'm kind of confused and undecided about what Consts are or should be.
I've been trying to keep that a separate issue.  In fact, I'm going to
reply to that question in a separate e-mail.

> Adopting RDF/XML style does mean we can't use qname/curie notation for
> shortening IRIs. We can only use relative IRIs so the above could be:
> 
>       <Const rdf:about="#book" />
> 
> in a document with xml:base of "http://example.com/myrules".
> 
> (b) Single serialization
> 
> > Also, I'm constraining objects to be
> > serialized in one place in a document -- the value of rdf:about is not
> > allowed to occur twice in a file.  (This makes de-serializing and other
> > kinds of XML processing easier and more efficient, I believe.) 
> 
> If that is really a goal then there is a problem with Const nodes. All
> IRI Const nodes are likely to be serialized multiple times.
>
> I don't understand why that might be a problem. If it really is a
> problem then there are a couple of solutions which preserve RDF/XML
> compatibility.
>
> (i) Serialize each Const once (e.g. at first occurrence) then all
> references to the same Const would use rdf:resource:
> 
>     ...
>          <Const rdf:about="#book" />
>     ...
>       <Uniterm>
>          <op rdf:resource="#book" />
>          <arg rdf:parseType="Collection">
>             <Var rif:name="Author" />
>             <Const rif:name="LeRif" />
>          </arg>
>       </Uniterm>
> 
> (ii) Serialize Consts as blank nodes with a property giving their IRI
> 
>      <Const rif:iri="#book" />
> 
> From an XML processor's point of view I would have thought the repeated
> rdf:about would be fine. I guess the issue is whether we permit metadata
> annotations on Const nodes.

Okay, yeah, I guess Consts force the parser output to be a graph rather
than a tree, so we do need rdf:resource and/or rdf:nodeID.  I had that
in a draft but couldn't remember why I needed it.

I'm inclined to just use nodeID, even if the thing has an IRI, for code
simplicity.  That is: any object which occurs multiple times in the
serialization is assigned a nodeID, is serialized at the first
occurrence, and is referred to via its nodeID at subsequent occurrences.
This does require two passes in the serializer, but I think that's okay.
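
To make the two-pass idea concrete, here is a minimal sketch in Python.
It is only an illustration under assumptions: the Node class, the element
names, and the "n1", "n2" labels are hypothetical and not part of any
agreed RIF syntax.  Pass 1 counts how often each object is reached; pass 2
writes a shared object in full at its first occurrence and as an
rdf:nodeID reference on the property element afterwards.

```python
import xml.etree.ElementTree as ET
from collections import Counter

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
NODE_ID = f"{{{RDF_NS}}}nodeID"
ET.register_namespace("rdf", RDF_NS)


class Node:
    """Hypothetical in-memory object: a class name plus (property, value) pairs."""

    def __init__(self, cls, props=()):
        self.cls = cls
        self.props = list(props)  # each value is a Node or a plain string


def count_occurrences(root):
    """Pass 1: count how many times each object is reached from the root."""
    counts = Counter()

    def visit(node):
        counts[id(node)] += 1
        if counts[id(node)] > 1:
            return  # its children were already counted on the first visit
        for _, value in node.props:
            if isinstance(value, Node):
                visit(value)

    visit(root)
    return counts


def serialize(root):
    """Pass 2: emit each shared object once, then refer to it by nodeID."""
    counts = count_occurrences(root)
    assigned = {}  # id(node) -> nodeID label

    def emit_node(node):
        elem = ET.Element(node.cls)
        if counts[id(node)] > 1:
            assigned[id(node)] = f"n{len(assigned) + 1}"
            elem.set(NODE_ID, assigned[id(node)])
        for prop, value in node.props:
            prop_elem = ET.SubElement(elem, prop)
            if not isinstance(value, Node):
                prop_elem.text = value  # literal value
            elif id(value) in assigned:
                prop_elem.set(NODE_ID, assigned[id(value)])  # later occurrence
            else:
                prop_elem.append(emit_node(value))  # first occurrence
        return elem

    return emit_node(root)


# Usage: the same Const object is referenced from two properties.
book = Node("Const", [("name", "book")])
term = Node("Uniterm", [("op", book), ("arg", book)])
print(ET.tostring(serialize(term), encoding="unicode"))
```

Objects that occur only once never get a nodeID, so documents without
sharing stay free of the extra attribute.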

> (c) Metadata and schema validation
> 
> A primary attraction to me of using the RDF data model at the top level
> of rulesets and rules is that it gives us extensible metadata for free.
> 
>      <Ruleset>
>        <Forall rdf:about="#rule1">
>          <rdfs:label>Rule1</rdfs:label>
>          <dc:description>This is a cool rule</dc:description>
>          <dc:creator>Dave Reynolds</dc:creator>
>          ...
> 
> If that is indeed the way that we allow metadata annotation then the XML
> Schema will need to permit unbounded xs:any elements on the classes
> where we permit metadata annotation. Unexpected element data at that
> level will be treated as metadata and ignored. This seems OK to me and
> preferable to having a closed set of metadata terms.

In the extensibility strawman, the idea is that metadata is extensible
but still needs to be declared (in the RIF document or on the web).
That is, you can add whatever metadata you want, but you have to declare
it.  The declaration allows a schema to be constructed which can be used
to check the instance.
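
As a rough illustration of "extensible but declared", the sketch below
uses a plain set lookup as a stand-in for real schema validation.  The
CORE_ELEMENTS set and the undeclared_elements function are hypothetical
names invented for this example; a real checker would build an XML Schema
from the declarations instead.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical core vocabulary; the actual RIF element names are not settled.
CORE_ELEMENTS = {"Ruleset", "Forall", "Uniterm", "Const", "Var", "op", "arg", "name"}


def undeclared_elements(doc, declared_metadata):
    """Return the tags that are neither core vocabulary nor declared metadata."""
    allowed = CORE_ELEMENTS | set(declared_metadata)
    return sorted({el.tag for el in doc.iter() if el.tag not in allowed})


doc = ET.fromstring(
    f'<Ruleset xmlns:dc="{DC}">'
    "<Forall>"
    "<dc:creator>Dave Reynolds</dc:creator>"
    "<dc:description>This is a cool rule</dc:description>"
    "</Forall>"
    "</Ruleset>"
)

# Only dc:creator has been declared, so dc:description is reported
# instead of being silently ignored.
declared = {f"{{{DC}}}creator"}
problems = undeclared_elements(doc, declared)
```

Declaring dc:description as well would make the list of problems empty,
which is the behaviour a strict validator would want.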

My thinking here is that people who want Schema validation don't want to
silently ignore things they don't understand.  Maybe I'm
misunderstanding their requirements.

      -- Sandro

Received on Friday, 13 July 2007 14:34:43 UTC