Re: Transforming RDF into (non-binary!) trees from Victor Porton on 2014-07-06 (semantic-web@w3.org from July 2014)

From: Victor Porton <porton@narod.ru>
Date: Sun, 06 Jul 2014 23:03:06 +0300
To: Paul Tyson <phtyson@sbcglobal.net>
Cc: Tim Berners-Lee <timbl@w3.org>, SW-forum Web <semantic-web@w3.org>
Message-Id: <5673441404676986@web23j.yandex.ru>
06.07.2014, 22:51, "Paul Tyson" <phtyson@sbcglobal.net>:
> On Sun, 2014-07-06 at 22:32 +0300, Victor Porton wrote:
>>  The thing I want is to fill a tree (represented as an Ada data structure) with data from RDF (validating it by the way).
>>
>>  The tree in Ada is a skeleton, it should fill with RDF data like meat.
>
> Then I did indeed entirely misinterpret your question.
>
> I don't know ADA, nor its tree data structure, so I should stay out of
> this discussion.
>
> But as TimBL pointed out, common serialization formats for RDF are
> already "tree-like". Why do those not work for you?

It is irrelevant that serialization formats are tree-like.

I am planning to use librdf (for which I am going to create Ada bindings) and it provides me data in plain triples (or quadruples).

I need to construct a tree from triples.

Ada probably does not provide a ready tree structure. I am just going to link together some records (as in a linked list but with a more complex structure).

That you don't know Ada, does not make you unable to understand what I mean. Just replace "Ada tree structure" with your favorite language tree structure (Pascal tree structure, C++ tree structure, etc.)

> Regards,
> --Paul
>>  I am not inclined to study SPARQL and use it (or any similar language).
>>   I want the data structure (and validator) to be represented as a
>>   "native" Ada tree data structure. One reason for this is to make it
>>   fast.
>>
>>  06.07.2014, 22:27, "Paul Tyson" <phtyson@sbcglobal.net>:
>>>  On Sun, 2014-07-06 at 17:14 +0100, Tim Berners-Lee wrote:
>>>>   On 2014-07 -06, at 16:46, Paul Tyson <phtyson@sbcglobal.net> wrote:
>>>>>   On Sat, 2014-07-05 at 22:35 +0300, Victor Porton wrote:
>>>>>>   I think we should write some code which would transform RDF into a tree
>>>>>>   (not necessarily binary! utilize nameless nodes as nodes with N
>>>>>>   childs) and also check the number of branches of a certain kind
>>>>>>   (usually 0..1 or 1..1).
>>>>>>
>>>>>>   Has anyone done a similar job?
>>>>>   I have not done that in RDF, but recently I had to generate optimal
>>>>>   spanning trees [1] from a directed acyclic graph (DAG). It occurred to
>>>>>   me that a similar technique could be applied to RDF if you first omitted
>>>>>   cycles from the RDF graph (perhaps by introducing blank nodes).
>>>>>
>>>>>   One approach would be to put the spanning tree (however you choose to
>>>>>   define it) in one named graph, and all the other "non-tree" triples in
>>>>>   another named graph.
>>>>>
>>>>>   This would make it easier to apply conventional block-and-line layout
>>>>>   styles (using XSL or CSS) to the spanning tree, and use the non-tree
>>>>>   links to "decorate" the format (e.g. using hyperlinks or other
>>>>>   interactive behavior).
>>>>>
>>>>>   Your use case might be quite different than mine. I am motivated by the
>>>>>   problem of applying formatting style to RDF graphs. Since conventional
>>>>>   layout techniques for screen and paper have a tree-based target model
>>>>>   (pages/screens,blocks,lines,characters), somewhere in the process you
>>>>>   must find or make a tree from your graph-based data. By specifying how
>>>>>   to construct one or more useful (i.e., "meaningful for formatting")
>>>>>   spanning trees from a given RDF graph, you achieve greater flexibility
>>>>>   and transparency in the process.
>>>>   Any serializer to turtle, etc, produces a tree in the process.
>>>  I assumed the original poster wanted a spanning tree of the RDF graph,
>>>  not just a tree-like serialization of the RDF graph. This would require
>>>  omitting all but one triple from each set of triples that have the same
>>>  object.
>>>>   For example, the serializer in rdflib.js uses the same algorithm for
>>>>    serializing turtle/N3, rdf/xml and also a form of graphical HTML
>>>>    layout the tabulator project uses for a "data view" of rdf resource.
>>>>    This latter also represents quoted graphs of N3 as rounded-corner
>>>>    bubbles around the graph, and is useful for vizualising at rule files.
>>>>   https://github.com/linkeddata/rdflib.js and specifically
>>>>    https://github.com/linkeddata/rdflib.js/blob/master/serialize.js for
>>>>    the serializer and
>>>>    https://github.com/linkeddata/tabulator/blob/master/js/panes/dataContentPane.js
>>>>   for the code which generates the graphical view.
>>>  Since it is trivial to construe any DAG as a tree, I did not think that
>>>  is what the original question was about. Rather, I took the question as:
>>>  "of all the possible spanning trees implicit in an RDF graph, are some
>>>  more useful (e.g., more 'meaningful') than others, and if so how best to
>>>  specify and construct them?". (It is quite likely I did not get the
>>>  question right.)
>>>
>>>  I interpreted the question thus because a problem that is looming in my
>>>  work is how to tame the "great blooming, buzzing confusion" that comes
>>>  at you from any nontrivial RDF query. Solutions such as Tabulator tame
>>>  the confusion by presenting the graph as linked hierarchical views of
>>>  property lists, which is fine for data geeks but not attractive or
>>>  optimal for many business uses. Custom queries and transformations can
>>>  provide effective interfaces but are tedious to build and maintain, and
>>>  can limit users' interaction with the data. By introducing the ability
>>>  to specify a meaningful spanning tree into the query-transform process
>>>  we get another control point with which to enrich and style the raw RDF
>>>  data for particular business purposes. We will also have provided a
>>>  declarative bridging mechanism between the web of data and the web of
>>>  documents (to the extent that our specified spanning trees are
>>>  "document-like").
>>>
>>>  Regards,
>>>  --Paul
>>>>   In general, a graph may have disconnected parts and so may have to be serialized to more than one tree.
>>>>
>>>>   (Note that if you allow N3's  reverse arc syntax  (   <#a> is :child of
>>>>     <#b> ) the you can serialize any acyclic graph to turtle without
>>>>    having to generate arbitrary identifiers for blank nodes, just using
>>>>    the turtle [ ]  syntax.   That is one reason why it was a shame that
>>>>    the reverse syntax was omitted from Turtle.    The serializer above
>>>>    does not use the reverse link syntax in its output, so it generates a
>>>>    tree of forward links.  This goes against a  maxim of mine that
>>>>    forward links are not treated special over backward links in RDF...
>>>>    but I digress.)
>>>>>   I suppose such a system could be implemented with SPARQL, but it would
>>>>>   be nice to have a non-SPARQL declarative syntax for specifying the
>>>>>   spanning tree. RIF might work.
>>>>>
>>>>>   Regards,
>>>>>   --Paul
>>>>>
>>>>>   [1] http://en.wikipedia.org/wiki/Spanning_tree
>>>>>>   I am working for bindings librdf for Ada2012. I could write such code
>>>>>>   directly in Ada (so it may be easier), but better would be to make C
>>>>>>   interface for this. I may write in Ada and leave TODO note "port it to
>>>>>>   C".
>>>>>>
>>>>>>   Any response?
>>>>>>
>>>>>>   --
>>>>>>   Victor Porton - http://portonvictor.org
>>  --
>>  Victor Porton - http://portonvictor.org

--
Victor Porton - http://portonvictor.org
Received on Sunday, 6 July 2014 20:03:44 UTC