
Re: Transforming RDF into (non-binary!) trees

From: Victor Porton <porton@narod.ru>
Date: Sun, 06 Jul 2014 23:03:06 +0300
To: Paul Tyson <phtyson@sbcglobal.net>
Cc: Tim Berners-Lee <timbl@w3.org>, SW-forum Web <semantic-web@w3.org>
Message-Id: <5673441404676986@web23j.yandex.ru>
06.07.2014, 22:51, "Paul Tyson" <phtyson@sbcglobal.net>:
> On Sun, 2014-07-06 at 22:32 +0300, Victor Porton wrote:
>> What I want is to fill a tree (represented as an Ada data structure) with data from RDF (validating it along the way).
>>
>> The tree in Ada is a skeleton; it should be filled with RDF data like meat on bones.
>
> Then I did indeed entirely misinterpret your question.
>
> I don't know Ada, nor its tree data structure, so I should stay out of
> this discussion.
>
> But as TimBL pointed out, common serialization formats for RDF are
> already "tree-like". Why do those not work for you?

It is irrelevant that serialization formats are tree-like.

I am planning to use librdf (for which I am going to create Ada bindings), and it gives me the data as plain triples (or quads).

I need to construct a tree from triples.

Ada probably does not provide a ready-made tree structure. I am just going to link some records together (as in a linked list, but with a more complex structure).

That you don't know Ada does not prevent you from understanding what I mean. Just replace "Ada tree structure" with your favorite language's tree structure (Pascal tree structure, C++ tree structure, etc.).
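
To make the idea concrete, here is a minimal sketch in Python rather than Ada. It is only an illustration under assumptions: the names Branch, Skeleton, Node and fill are hypothetical (they are not part of librdf), triples are assumed to arrive as plain (subject, predicate, object) string tuples, and the skeleton is assumed to be a finite tree, so the recursion always terminates.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Branch:
    """One kind of child edge in the skeleton, with its allowed count."""
    predicate: str
    min_count: int = 0
    max_count: Optional[int] = 1          # None means "no upper bound"
    child: Optional["Skeleton"] = None    # None means the object is a leaf


@dataclass
class Skeleton:
    """The expected shape of a node: the branches it may or must have."""
    branches: list = field(default_factory=list)


@dataclass
class Node:
    """A filled tree node: an RDF term plus its children per predicate."""
    value: str
    children: dict = field(default_factory=dict)


def fill(skeleton, root, triples):
    """Fill the skeleton starting at `root`; raise ValueError if invalid."""
    index = defaultdict(list)             # (subject, predicate) -> objects
    for s, p, o in triples:
        index[(s, p)].append(o)
    return _fill(skeleton, root, index)


def _fill(skeleton, subject, index):
    node = Node(subject)
    if skeleton is None:                  # leaf: nothing below to fill
        return node
    for b in skeleton.branches:
        objects = index.get((subject, b.predicate), [])
        n = len(objects)
        too_many = b.max_count is not None and n > b.max_count
        if n < b.min_count or too_many:
            raise ValueError(
                f"{subject}: {n} x {b.predicate}, "
                f"expected {b.min_count}..{b.max_count}"
            )
        node.children[b.predicate] = [
            _fill(b.child, o, index) for o in objects
        ]
    return node


# Hypothetical data: a document with one title and any number of authors.
triples = [
    ("_:doc", "ex:title", "Report"),
    ("_:doc", "ex:author", "_:a1"),
    ("_:a1", "ex:name", "Alice"),
]
person = Skeleton([Branch("ex:name", 1, 1)])
doc = Skeleton([
    Branch("ex:title", 1, 1),
    Branch("ex:author", 0, None, person),
])
tree = fill(doc, "_:doc", triples)
```

Triples whose predicates are not named in the skeleton are simply ignored in this sketch; a stricter validator could reject them instead.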

> Regards,
> --Paul
>> I am not inclined to study SPARQL and use it (or any similar language).
>> I want the data structure (and validator) to be represented as a
>> "native" Ada tree data structure. One reason for this is to make it
>> fast.
>>
>> 06.07.2014, 22:27, "Paul Tyson" <phtyson@sbcglobal.net>:
>>> On Sun, 2014-07-06 at 17:14 +0100, Tim Berners-Lee wrote:
>>>> On 2014-07-06, at 16:46, Paul Tyson <phtyson@sbcglobal.net> wrote:
>>>>> On Sat, 2014-07-05 at 22:35 +0300, Victor Porton wrote:
>>>>>> I think we should write some code which would transform RDF into a tree
>>>>>> (not necessarily binary! utilize nameless nodes as nodes with N
>>>>>> children) and also check the number of branches of a certain kind
>>>>>> (usually 0..1 or 1..1).
>>>>>>
>>>>>> Has anyone done a similar job?
>>>>> I have not done that in RDF, but recently I had to generate optimal
>>>>> spanning trees [1] from a directed acyclic graph (DAG). It occurred to
>>>>> me that a similar technique could be applied to RDF if you first omitted
>>>>> cycles from the RDF graph (perhaps by introducing blank nodes).
>>>>>
>>>>> One approach would be to put the spanning tree (however you choose to
>>>>> define it) in one named graph, and all the other "non-tree" triples in
>>>>> another named graph.
>>>>>
>>>>> This would make it easier to apply conventional block-and-line layout
>>>>> styles (using XSL or CSS) to the spanning tree, and use the non-tree
>>>>> links to "decorate" the format (e.g. using hyperlinks or other
>>>>> interactive behavior).
>>>>>
>>>>> Your use case might be quite different than mine. I am motivated by the
>>>>> problem of applying formatting style to RDF graphs. Since conventional
>>>>> layout techniques for screen and paper have a tree-based target model
>>>>> (pages/screens,blocks,lines,characters), somewhere in the process you
>>>>> must find or make a tree from your graph-based data. By specifying how
>>>>> to construct one or more useful (i.e., "meaningful for formatting")
>>>>> spanning trees from a given RDF graph, you achieve greater flexibility
>>>>> and transparency in the process.
>>>> Any serializer to turtle, etc, produces a tree in the process.
>>> I assumed the original poster wanted a spanning tree of the RDF graph,
>>> not just a tree-like serialization of the RDF graph. This would require
>>> omitting all but one triple from each set of triples that have the same
>>> object.
>>>> For example, the serializer in rdflib.js uses the same algorithm for
>>>> serializing turtle/N3, rdf/xml and also a form of graphical HTML
>>>> layout the tabulator project uses for a "data view" of an RDF resource.
>>>> This latter also represents quoted graphs of N3 as rounded-corner
>>>> bubbles around the graph, and is useful for visualizing rule files.
>>>> https://github.com/linkeddata/rdflib.js and specifically
>>>> https://github.com/linkeddata/rdflib.js/blob/master/serialize.js for
>>>> the serializer and
>>>> https://github.com/linkeddata/tabulator/blob/master/js/panes/dataContentPane.js
>>>> for the code which generates the graphical view.
>>> Since it is trivial to construe any DAG as a tree, I did not think that
>>> is what the original question was about. Rather, I took the question as:
>>> "of all the possible spanning trees implicit in an RDF graph, are some
>>> more useful (e.g., more 'meaningful') than others, and if so how best to
>>> specify and construct them?". (It is quite likely I did not get the
>>> question right.)
>>>
>>> I interpreted the question thus because a problem that is looming in my
>>> work is how to tame the "great blooming, buzzing confusion" that comes
>>> at you from any nontrivial RDF query. Solutions such as Tabulator tame
>>> the confusion by presenting the graph as linked hierarchical views of
>>> property lists, which is fine for data geeks but not attractive or
>>> optimal for many business uses. Custom queries and transformations can
>>> provide effective interfaces but are tedious to build and maintain, and
>>> can limit users' interaction with the data. By introducing the ability
>>> to specify a meaningful spanning tree into the query-transform process
>>> we get another control point with which to enrich and style the raw RDF
>>> data for particular business purposes. We will also have provided a
>>> declarative bridging mechanism between the web of data and the web of
>>> documents (to the extent that our specified spanning trees are
>>> "document-like").
>>>
>>> Regards,
>>> --Paul
>>>> In general, a graph may have disconnected parts and so may have to be serialized to more than one tree.
>>>>
>>>> (Note that if you allow N3's reverse arc syntax ( <#a> is :child of
>>>> <#b> ) then you can serialize any acyclic graph to turtle without
>>>> having to generate arbitrary identifiers for blank nodes, just using
>>>> the turtle [ ] syntax. That is one reason why it was a shame that
>>>> the reverse syntax was omitted from Turtle. The serializer above
>>>> does not use the reverse link syntax in its output, so it generates a
>>>> tree of forward links. This goes against a maxim of mine that
>>>> forward links should not be treated as special over backward links in RDF...
>>>> but I digress.)
>>>>> I suppose such a system could be implemented with SPARQL, but it would
>>>>> be nice to have a non-SPARQL declarative syntax for specifying the
>>>>> spanning tree. RIF might work.
>>>>>
>>>>> Regards,
>>>>> --Paul
>>>>>
>>>>> [1] http://en.wikipedia.org/wiki/Spanning_tree
>>>>>> I am working on librdf bindings for Ada 2012. I could write such code
>>>>>> directly in Ada (so it may be easier), but it would be better to make a C
>>>>>> interface for this. I may write it in Ada and leave a TODO note "port it to
>>>>>> C".
>>>>>>
>>>>>> Any response?
>>>>>>
>>>>>> --
>>>>>> Victor Porton - http://portonvictor.org
>> --
>> Victor Porton - http://portonvictor.org

--
Victor Porton - http://portonvictor.org
Received on Sunday, 6 July 2014 20:03:44 UTC
