Re: Rework on SOAP 1.2 Part 2: Section 2 and 3 from Dan Brickley on 2002-03-16 (xml-dist-app@w3.org from March 2002)

From: Dan Brickley <danbri@w3.org>
Date: Sat, 16 Mar 2002 07:04:05 -0500 (EST)
To: Martin Gudgin <marting@develop.com>
cc: "Williams, Stuart" <skw@hplb.hpl.hp.com>, Noah Mendelsohn <noah_mendelsohn@us.ibm.com>, XML Protocol Discussion <xml-dist-app@w3.org>, <em@w3.org>
Message-ID: <Pine.LNX.4.30.0203160615010.18236-100000@tux.w3.org>
On Fri, 15 Mar 2002, Martin Gudgin wrote:


Hi

I'm not responding to the substance of this message, only its form. It
reminds me of the way we used to conduct discussions in the old RDF
groups, spending many happy hours creating ASCII-art representations of
intricate graph structures. More recently we've adopted a different
approach (see below).

Also I just wanted to say that the proposed reworking of the data model
text is (to my eyes) a big improvement. (I didn't previously realise that
edge order was always significant, for example. More on which briefly below.)

Anyway, re the ASCII art...

> Per the encoding rules your second example ( relabeled );
>
> <structA>
>   <edgeB>terminalB</edgeB>
>   <edgeC>terminalC</edgeC>
>   <edgeD>
>     <structE>
>       <edgeF>terminalF</edgeF>
>       <edgeG>terminalG</edgeG>
>     </structE>
>   </edgeD>
> </structA>
>
> the resulting graph would look like this ( per the encoding rules
> described );
>
>
>        edgeB +-----------+
>      +------>+"terminalB"|
>      |       +-----------+
>      |
> +----+-----+ edgeC +-----------+
> |"nontermA"+------>+"terminalC"|
> +----+-----+       +-----------+
>      |
>      | edgeD +----------+ structE +----------+ edgeF +-----------+
>      +------>+"nontermD"+-------->+"nontermE"+------>+"terminalF"|
>              +----------+         +----+-----+       +-----------+
>                                        |
>                                        | edgeF +-----------+
>                                        +------>+"terminalE"|
>                                                +-----------+

(I don't quite follow the detail of the example, since I don't see where
terminalE and nontermD come from, but going from the XML above...)

If we wanted to avoid having to exchange loads of ASCII art, the basic
(source node, edge, target node) triples from the example above might
also be represented with something like:

nontermA    edgeB    "terminalB"
nontermA    edgeC    "terminalC"
nontermA    edgeD    structE
structE     edgeF    "terminalF"
structE     edgeG    "terminalG"

...i.e. we flatten the graph into its constituent node/edge/node triples.
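
As a rough illustration of that flattening, here is a minimal Python sketch
that walks the XML quoted above and emits one triple per edge. It assumes
the alternating node/edge/node nesting used in that example and labels
nodes by their element names (so "structA" rather than "nontermA"); the
function name flatten is hypothetical, and none of this is taken from the
SOAP encoding rules themselves.

```python
import xml.etree.ElementTree as ET

XML = """
<structA>
  <edgeB>terminalB</edgeB>
  <edgeC>terminalC</edgeC>
  <edgeD>
    <structE>
      <edgeF>terminalF</edgeF>
      <edgeG>terminalG</edgeG>
    </structE>
  </edgeD>
</structA>
"""

def flatten(node):
    """Return (source, edge, target) triples for a node element, in document order."""
    triples = []
    for edge in node:                # children of a node element are edges
        children = list(edge)
        if children:                 # the edge points at a nested node element
            target = children[0]
            triples.append((node.tag, edge.tag, target.tag))
            triples.extend(flatten(target))
        else:                        # the edge points at a terminal (text) value
            text = (edge.text or "").strip()
            triples.append((node.tag, edge.tag, '"%s"' % text))
    return triples

triples = flatten(ET.fromstring(XML.strip()))
for triple in triples:
    print("\t".join(triple))
```

The list comes out in document order, which is only a side effect of the
traversal; nothing in the triples themselves records it.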

This is the way we've done things lately in the RDF Core WG: we atomise
the graph into a collection of independent 3-tuples, each composed of node
and edge ids/content. In RDF, edge order is insignificant; we use edges
named rdf:_1, rdf:_2 or other data structures to be explicit about order in
the cases where it does matter. If the SOAP data model is more complex
than RDF's and requires edge-ordering to always be preserved, this simple
format might not entirely work (but then neither might ASCII-art).
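
To make the ordering point concrete, here is a small Python sketch of the
rdf:_1, rdf:_2 idea: order is carried by the edge names, so the triples can
be kept in an unordered set and the sequence recovered later. The helper
names with_ordinals and recover_order, and the "seq" node id, are
hypothetical and purely for illustration.

```python
def with_ordinals(source, targets):
    """Encode an ordered list of targets as an unordered set of triples."""
    return {(source, "rdf:_%d" % i, target)
            for i, target in enumerate(targets, start=1)}

def recover_order(triples, source):
    """Rebuild the ordered list of targets from the rdf:_n edge names."""
    prefix = "rdf:_"
    members = []
    for s, edge, target in triples:
        suffix = edge[len(prefix):]
        if s == source and edge.startswith(prefix) and suffix.isdigit():
            members.append((int(suffix), target))
    return [target for _, target in sorted(members)]

graph = with_ordinals("seq", ["first", "second", "third"])
print(recover_order(graph, "seq"))
```

Whether this kind of explicit numbering would be acceptable for a data
model in which every edge is ordered is exactly the open question.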

You might find it worthwhile to look at
http://www.w3.org/TR/rdf-testcases/ for syntax test cases couched in terms
of a text format based on this approach ("NTriples"), and
http://www.w3.org/TR/rdf-mt/ for a formalisation of the RDF model that
accompanies this work.

The message I'm replying to echoes the structure of the RDF Test Cases
work. We see an XML message (or RDF document) accompanied by an account of
the underlying edge-labelled graph structure that the angle brackets are
encoding. In http://www.w3.org/TR/rdf-testcases/ we used pairs such as the
following:

[eg3.4]
An RDF document, aka an edge-labeled graph serialized according to the
RDF Syntax encoding rules detailed in http://www.w3.org/TR/rdf-syntax-grammar/

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://www.w3.org/2001/08/rdf-test/">
    <dc:creator>Art Barstow</dc:creator>
    <dc:creator>Dave Beckett</dc:creator>
    <dc:publisher>
      <rdf:Description>
        <dc:title>World Wide Web Consortium</dc:title>
        <dc:source rdf:resource="http://www.w3.org/"/>
      </rdf:Description>
    </dc:publisher>
  </rdf:Description>
</rdf:RDF>

N-Triples version:

<http://www.w3.org/2001/08/rdf-test/>
<http://purl.org/dc/elements/1.1/creator>    "Dave Beckett" .
<http://www.w3.org/2001/08/rdf-test/>
<http://purl.org/dc/elements/1.1/creator>    "Art Barstow" .
<http://www.w3.org/2001/08/rdf-test/>
<http://purl.org/dc/elements/1.1/publisher>  _:a .
_:a
<http://purl.org/dc/elements/1.1/title>      "World Wide Web Consortium" .
_:a
<http://purl.org/dc/elements/1.1/source>     <http://www.w3.org/> .
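
For anyone who wants to play with the format, here is a minimal Python
sketch that writes the same five triples out one per line. It handles only
the three kinds of term that appear in this example (URI references, blank
nodes and plain literals) and escapes just backslash, double quote and
newline; the names term and to_ntriples are hypothetical, and this is not
a complete N-Triples serializer.

```python
DC = "http://purl.org/dc/elements/1.1/"
DOC = ("uri", "http://www.w3.org/2001/08/rdf-test/")
ORG = ("bnode", "a")

GRAPH = [
    (DOC, ("uri", DC + "creator"), ("literal", "Dave Beckett")),
    (DOC, ("uri", DC + "creator"), ("literal", "Art Barstow")),
    (DOC, ("uri", DC + "publisher"), ORG),
    (ORG, ("uri", DC + "title"), ("literal", "World Wide Web Consortium")),
    (ORG, ("uri", DC + "source"), ("uri", "http://www.w3.org/")),
]

def term(kind, value):
    """Render one node or edge label in N-Triples style."""
    if kind == "uri":
        return "<%s>" % value
    if kind == "bnode":
        return "_:%s" % value
    if kind == "literal":
        escaped = (value.replace("\\", "\\\\")
                        .replace('"', '\\"')
                        .replace("\n", "\\n"))
        return '"%s"' % escaped
    raise ValueError("unknown term kind: %r" % kind)

def to_ntriples(graph):
    """Return one 'subject predicate object .' line per triple."""
    return ["%s %s %s ." % tuple(term(*t) for t in triple) for triple in graph]

lines = to_ntriples(GRAPH)
print("\n".join(lines))
```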



In these SOAP/XMLP discussions, I'm seeing ASCII art which encodes a
similar structure. Hopefully I don't sound too smug here about any alleged
superiority of our approach! The fact of the matter is that in the earlier
RDF working groups we *weren't* so careful to use a clear test case
format (e.g. we used ASCII art a lot), and as a result the original
specification of the XML encoding for RDF graphs was somewhat
underspecified. The RDF Core WG is now using N-Triples (ie. the textual
test case graph format) as a way of clarifying the underspecified 1999 RDF
Model and Syntax spec.

Am I making any sense? Would an agreed text format for discussing the
Encoding and Data Model graph structures be useful to folk here? (I know
I've found it useful in an RDF context, but RDF folk are a funny lot... ;-)

cheers,

Dan



ps. as an aside, I might mention that I've got SOAP Encoding messages
stored and queryable in an SQL-backed RDF database, by virtue of mapping
SOAP Encoding messages to RDF's graph structure. I don't know of any
database tools specifically designed for storage, aggregation and query
of the SOAP Encoding Data Model, but there are a few for RDF so the
similarity of the graph models is rather intriguing.
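
To give a flavour of what "SQL-backed" can mean, here is a minimal Python
sketch using an in-memory SQLite database. The single three-column table
and the sample query are assumptions made for illustration only; they do
not describe the schema of any particular RDF database, and the data is
just the Dublin Core example from earlier in this message.

```python
import sqlite3

DC = "http://purl.org/dc/elements/1.1/"
DOC = "http://www.w3.org/2001/08/rdf-test/"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        (DOC, DC + "creator", "Dave Beckett"),
        (DOC, DC + "creator", "Art Barstow"),
        (DOC, DC + "publisher", "_:a"),
        ("_:a", DC + "title", "World Wide Web Consortium"),
        ("_:a", DC + "source", "http://www.w3.org/"),
    ],
)

# Follow two edges: document --publisher--> node --title--> literal.
rows = conn.execute(
    """
    SELECT t2.object
    FROM triples AS t1
    JOIN triples AS t2 ON t2.subject = t1.object
    WHERE t1.subject = ? AND t1.predicate = ? AND t2.predicate = ?
    """,
    (DOC, DC + "publisher", DC + "title"),
).fetchall()
print(rows)
```

Each hop along an edge becomes one self-join on the table, which is why a
flat triple representation maps onto SQL so directly.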


pps. Stuart mentioned the 'striped' nature of RDF's graph encoding syntax.
Sometimes elements stand for a node (with all element names except
rdf:Description giving a type for the node); sometimes they stand for an
edge label. This can easily cause confusion, so a while back I wrote a
short document outlining the main principles of RDF syntax. Maybe relevant
here?
	http://www.w3.org/2001/10/stripes/
	RDF: Understanding the Striped RDF/XML Syntax


-- 
mailto:danbri@w3.org
http://www.w3.org/People/DanBri/
Received on Saturday, 16 March 2002 07:05:10 UTC