- From: Sandro Hawke <sandro@w3.org>
- Date: Fri, 13 Jul 2007 00:00:13 -0400
- To: public-rif-wg@w3.org
This is an attempt to make progress on ACTION-309 ("Work on unified strawman proposal for asn->xml system").

The guiding principle behind the strawman is to have the syntax be both:

  1. a basic XML object-serialization syntax, and
  2. a subset of RDF/XML.

(I've taken to calling this approach "Semantic XML".)

Part 1 means that XML tools will work on it fairly well, and the format will feel unsurprising to people comfortable with XML.

Part 2 means that RDF-reading tools will work on it, the logical data model of the syntax will be well-defined (good for writing rules about RIF documents), and we get off-the-shelf solutions to some of the confusing "coin-flip" issues. It's also more self-describing than is typical for XML: it can be de-serialized into frame (generic object) structures without knowing the schema.

The cost of Part 1 is that RDF/XML output tools won't work unless modified; the cost of Part 2 is that the XML document has a few bits of RDF syntax in it, making it a little bigger and a little odd-looking.

In informal XML terms, here are the details:

1. It's fully-striped object serialization (as Gary and Harold have shown already). The XML elements alternate (as you go deeper into the tree) between being the name of a class and the name of a property.

2. We wrap it all in an rdf:RDF element, which mostly serves to allow multiple rulesets (or other top-level objects) to be serialized in the same XML file (since XML only allows one top-level element).

3. When serializing a data value (except text strings), we use the rdf:datatype attribute to provide the datatype, like this:

     <Animal>
       <age rdf:datatype="&xsd;int">12</age>
       <born rdf:datatype="&xsd;date">1995-05-28</born>
     </Animal>

   (In this example, I'm using a defined XML entity for "xsd" to make the string more readable.)

4. For text strings, we just give the value, with an optional xml:lang:

     <Animal>
       <name>Taiko</name>
       <name xml:lang="ja">太鼓</name>
     </Animal>

5. If a property has multiple unordered values, just repeat the tag as often as needed (as immediately above, with two values for the "name" property).

6. If the value of a property is a sequence (if the order matters), then we have to tell the reader software this, using a special RDF attribute, like this:

   [ Harold, note this difference from the Core draft [1] ]

     <Uniterm>
       <op><Const>purchase</Const></op>
       <arg rdf:parseType="Collection">
         <Var>Buyer</Var>
         <Var>Seller</Var>
         <Uniterm>
           <op><Const>book</Const></op>
           <arg rdf:parseType="Collection">
             <Var>Author</Var>
             <Const>LeRif</Const>
           </arg>
         </Uniterm>
         <Const>$49</Const>
       </arg>
     </Uniterm>

7. If an object being serialized has a URI, specify it with the "rdf:about" attribute, like this:

     <Ruleset rdf:about="http://example.com/myrules#set1">
       ...
     </Ruleset>

And I think that's it.

For people familiar with RDF/XML, the subset I'm proposing is obviously very small. It's just what you see above. If the RIF abstract syntax tree ends up being really a lattice or graph, then we'll add in rdf:resource and rdf:nodeID.

Also, I'm constraining objects to be serialized in one place in a document: a given rdf:about value is not allowed to occur twice in a file. (This makes de-serializing and other kinds of XML processing easier and more efficient, I believe.)

In general, I'm pretty sure this style will allow schema validation of the document and processing via XSLT and XQuery.

So, that's the basic idea. I've been playing with an implementation, and a more precise specification, but my deadline for this action has arrived, and this level of detail is probably sufficient to see how we're doing. (Or is it? Does this make sense? What kind of text or examples or software would be helpful? It's been a long time since we left off in Innsbruck, Gary and Hassan, and I don't remember exactly where we were on all the issues.)

    -- Sandro

[1] http://www.w3.org/2005/rules/wg/wiki/Core/Positive_Conditions?action=recall&rev=205
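
To illustrate the claim that this style can be de-serialized into frame (generic object) structures without knowing the schema, here is a minimal Python sketch. It is not the implementation mentioned above; the function names (read_object, read_value), the dict layout of a frame, and the sample document are all hypothetical. It assumes the sequence attribute is spelled rdf:parseType as in RDF/XML, writes the datatype URI out in full instead of using an entity, and assumes that leaf objects such as <Var>Buyer</Var> carry their text directly.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML = "http://www.w3.org/XML/1998/namespace"


def read_object(elem):
    """Read a class-level element into a frame (a plain dict)."""
    frame = {"class": elem.tag, "props": {}}
    about = elem.get(f"{{{RDF}}}about")
    if about is not None:
        frame["uri"] = about
    if len(elem) == 0 and (elem.text or "").strip():
        # Assumption: a leaf object such as <Var>Buyer</Var> holds its text directly.
        frame["value"] = elem.text.strip()
    for prop in elem:
        # Repeated property tags accumulate as multiple (unordered) values.
        frame["props"].setdefault(prop.tag, []).append(read_value(prop))
    return frame


def read_value(prop):
    """Read a property-level element: a sequence, a nested object, or a literal."""
    if prop.get(f"{{{RDF}}}parseType") == "Collection":
        # Order matters here, so keep the children as a list.
        return [read_object(child) for child in prop]
    if len(prop) == 1:
        return read_object(prop[0])
    return {
        "text": prop.text or "",
        "datatype": prop.get(f"{{{RDF}}}datatype"),
        "lang": prop.get(f"{{{XML}}}lang"),
    }


# Hypothetical sample document in the striped style described above.
DOC = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <Animal rdf:about="http://example.com/animals#taiko">
    <age rdf:datatype="http://www.w3.org/2001/XMLSchema#int">12</age>
    <name xml:lang="en">Taiko</name>
  </Animal>
  <Uniterm>
    <op><Const>purchase</Const></op>
    <arg rdf:parseType="Collection">
      <Var>Buyer</Var>
      <Var>Seller</Var>
    </arg>
  </Uniterm>
</rdf:RDF>
"""

# Each child of rdf:RDF is one top-level object.
frames = [read_object(e) for e in ET.fromstring(DOC)]
```

The reader never consults a schema: the alternation between class and property elements, plus the three RDF attributes, is enough to decide what each element means. A real reader would also need to handle namespaces on element names and reject a repeated rdf:about value, which this sketch does not attempt.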
Received on Friday, 13 July 2007 04:01:14 UTC