Re: Blank Nodes Re: Toward easier RDF: a proposal

On Mon, Dec 3, 2018 at 8:47 PM Pat Hayes wrote:

> On Nov 25, 2018, at 11:14 AM, thomas lörtsch wrote:
> On 22. Nov 2018, at 13:02, Tim Berners-Lee wrote:
> David
> I agree with your resolution to make RDF easier to use for real
>  developers, whatever they are.  But I do not despair at the level that you
> do, I am more hopeful.
> Let me pick just one of your points (with a new subject as suggested).
> On 2018-11-21, at 22:40, David Booth wrote:
> 3. Blank nodes.  They are an important convenience for RDF
> authors,
> Yes, here I agree.  The default data language for developers at the moment
> is JSON, and that is full of blank nodes.  Every {} in JSON is equivalent
> to a blank node [] in Turtle.
> Where in JSON you write
> { "name": "Fred Bloggs",
> "address": {
>  "number":  123,
>  "street": "Acacia Avenue" }
> }
> in turtle you write
> [ :name "Fred Bloggs";
> :address [
>    :number  123;
>    :street  "Acacia Avenue" ]
> ]
> Which is just as simple as the JSON.  When you look at Turtle as a language
> to write and to generate it is I think nice.
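
To make the correspondence above concrete, here is a minimal Python sketch that writes a nested JSON-style object as Turtle, emitting one [] per {}. The function name to_turtle is hypothetical, and the sketch assumes every key is a valid Turtle local name under an empty prefix (:) that is declared elsewhere; JSON arrays and null are deliberately left out.

```python
import json


def to_turtle(value, indent=0):
    """Serialize a JSON-like value as Turtle, using [] for every object.

    Assumption: each key is usable as a local name after the ':' prefix.
    """
    pad = "  " * indent
    if isinstance(value, dict):
        if not value:
            return "[]"
        inner = "  " * (indent + 1)
        parts = [
            f"{inner}:{key} {to_turtle(val, indent + 1)}"
            for key, val in value.items()
        ]
        return "[\n" + " ;\n".join(parts) + "\n" + pad + "]"
    if isinstance(value, bool):  # checked before int: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, str):
        # JSON string escaping is compatible with Turtle's short string form.
        return json.dumps(value, ensure_ascii=False)
    raise TypeError(f"unsupported value in this sketch: {value!r}")


person = {
    "name": "Fred Bloggs",
    "address": {"number": 123, "street": "Acacia Avenue"},
}
print(to_turtle(person))
```

Each nested dict becomes a nested [ ... ], so no blank node label ever has to be invented by the author.
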
> IMO this is a good example that bnodes actually are foremost: structure.
> I used to think of them as plastic bags: you put things in them to
> transport them or keep them together but they carry no meaning in
> themselves (not counting the advertisements usually printed on them as
> "meaning", of course).
> Bnodes allow graphs to encode nested lists (trees). That is useful because
> although graphs are very flexible, in real life we often prefer less
> flexible data structures like lists, nested lists, tables. At least I do
> when I write things down. Those structures are very useful. They add some,
> well, structure, to what we want to express. Do they carry "meaning"? I’d
> say yes but normally I don’t refer to the structure itself. On the contrary,
> it’s so useful because I don’t have to explicate it - it’s just there, as
> bullet points, indentation, columns and rows.
> Sometimes I do want to address a specific location in that structure. Then
> it’s useful to be able to give that bnode an identifier (and the ability to
> do so is a plus for RDF). However, a triple with a bnode separated from the
> other triples containing that same bnode can only ever be so useful. It’s
> like taking two cells out of a bigger table, without headings or the full
> row. How far can that possibly get you? I think that some of the complaints
> voiced in this thread are based on unreasonable expectations and on a lack
> of understanding what bnodes are and can be.
> Maybe unreasonable expectations at a deeper level are the core of the
> problem: the usefulness of graphs as data structures is limited, maybe more
> limited than RDF likes to admit. They are not always the most appropriate
> solution. We often use much more structured approaches to information
> modelling like trees and tables, and for good reasons.
> RDF might be much more useful if it had a way to integrate those
> structures instead of trying to mimic them - and integrate itself better
> into other data structures. Then maybe we would need fewer blank nodes.
> Nested lists as first class citizens in RDF would be a good thing. Also
> tables. There were discussions about "dark triples" before the 2004 spec but I
> couldn’t find much in the mailing list archives on the thinking behind it.
> But putting more emphasis on linking into existing data structures - like
> into certain cells in an RDBMS table or subtrees in a JSON document - might
> be helpful as well.
> My main problem with bnodes is that it’s so hard to see where one
> structure ends and the next one begins, and what that structure actually
> is: a list? nested? how deep? a table even? an n-ary relation? where does
> that end? which node represents its main role?
> A relational table or a nested list make that much easier. In a graph it
> takes extra effort to mark and characterize boundaries and substructures.
> RDF tries to do all that with just the bnodes and they are overloaded.
> That’s why it can be much harder to figure out what’s going on in an RDF-based
> system than in an RDBMS-based application - despite all the self-describing
> properties etc.
> I think this is a very basic and important point. It is what I meant,
> expressed differently, by saying that RDF has no way to indicate scope.
> Bnodes in RDF are, logically, existentially quantified variables, but RDF
> has no way to indicate, and therefore no way for anyone to know, where the
> quantifiers are which bind those variables. So, for example, if we assume
> they are just outside each RDF document, then we should standardize
> bnodeIDs apart when merging; but if we assume they have larger scope, then
> maybe we shouldn’t. Bnodes introduced to encode structures like n-ary
> relational assertions, or lists, or some complicated piece of OWL syntax,
> should have a very narrow scope corresponding to the exact boundaries of
> those structures, and hence should be ‘invisible’ from outside (which is
> why it is fine to make them vanish in a higher-level syntax using [ ] or (
> ).)
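
The two merging policies mentioned above can be contrasted in a few lines of Python. This is only a sketch under stated assumptions: triples are plain tuples of strings, a term counts as a blank node if it starts with "_:", and both function names are made up for illustration.

```python
from itertools import count


def is_bnode(term):
    """Assumption of this sketch: blank node labels are strings like '_:x'."""
    return term.startswith("_:")


def merge_standardizing_apart(*graphs):
    """Merge graphs as if each had its quantifiers just outside the document:
    blank node labels are renamed so that no label is shared across graphs."""
    fresh = count()
    merged = set()
    for graph in graphs:
        renaming = {}  # per-graph map from old label to fresh label
        for triple in graph:
            renamed = []
            for term in triple:
                if is_bnode(term):
                    if term not in renaming:
                        renaming[term] = f"_:b{next(fresh)}"
                    term = renaming[term]
                renamed.append(term)
            merged.add(tuple(renamed))
    return merged


def union_sharing_bnodes(*graphs):
    """Plain set union: equal labels in different graphs denote the same
    blank node, i.e. the scope is assumed to be larger than one graph."""
    return set().union(*graphs)


g1 = [("_:x", ":name", '"Fred Bloggs"')]
g2 = [("_:x", ":street", '"Acacia Avenue"')]

apart = merge_standardizing_apart(g1, g2)
shared = union_sharing_bnodes(g1, g2)
print(len({s for s, _, _ in apart}), len({s for s, _, _ in shared}))
```

With standardizing apart the two graphs talk about two different things; with the plain union they talk about one. Nothing in the triples themselves says which reading was intended, which is the scoping gap described above.
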
> Ideally, RDF2 should provide for these structures directly, but maybe we
> can get the benefit with a relatively tiny step, just by having a syntax
> for RDF which has explicit scoping brackets. Off the cuff, imagine a
> variant of NTriples in which a subset of triples can be enclosed in
> brackets, say [  ] (or something else if these are already taken) to
> indicate that any bnode ID in a triple inside the bracket is local to those
> triples, i.e. is ‘bound’. Current RDF engines which do not make use of this
> information can simply ignore them, since they do not change the RDF
> meaning of the graph, but they may provide useful information to newer
> engines. For example, they might make it a lot easier to parse OWL syntax
> (‘Manchester’ syntax) from OWL/RDF.
> Putting brackets around an entire graph says, in effect, that all bnodeIDs
> in this graph are local to the graph: omitting them allows the possibility
> of sharing a bnode with some other graph (as in RDF datasets).
> A better system, which would allow for more elaborate structures, would be
> to have a convention of labelled scope brackets of the form [ID ], where ID
> is any alphanumeric string, which is understood to ‘bind’ only bnodes with
> ids of the form _:string where ID is an initial substring of string. So for
> example [A  ] binds _:A1 and _:A17 but not _:B1. This would allow the full
> expressiveness of nested quantification without very much extra work at
> all, and again it could be simply ignored by current RDF engines without
> harm, although they might be missing out on some of the meaning being
> expressed by this more elaborate notation. And if you leave out the ID,
> then this defaults to the simpler notation in the previous paragraph, so
> backward compatibility is automatic.
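
The prefix rule for labelled scope brackets is easy to state in code. The following Python sketch is illustrative only: binds and innermost_binder are hypothetical names, and resolving nested scopes to the innermost matching bracket is an assumption borrowed from ordinary nested quantification, not something the proposal spells out.

```python
def binds(scope_id, bnode):
    """True if a scope bracket labelled scope_id binds the blank node label.

    An empty scope_id models the unlabelled bracket [ ], which binds every
    blank node inside it.
    """
    if not bnode.startswith("_:"):
        raise ValueError(f"not a blank node label: {bnode!r}")
    return bnode[2:].startswith(scope_id)


def innermost_binder(enclosing_scopes, bnode):
    """Return the index of the innermost enclosing scope that binds bnode.

    enclosing_scopes lists scope IDs from outermost to innermost.
    Returns None if the blank node is free, i.e. visible from outside.
    """
    for depth in range(len(enclosing_scopes) - 1, -1, -1):
        if binds(enclosing_scopes[depth], bnode):
            return depth
    return None


print(binds("A", "_:A1"), binds("A", "_:A17"), binds("A", "_:B1"))
print(innermost_binder(["A", "B"], "_:B1"))
```

An engine that ignores the brackets simply never calls these functions and sees ordinary blank node labels.
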
> The scope identifier should only be attached to one bracket, to make this
> kind of silliness
> [A ,,,,[B,,,,,]A….]B
> impossible.
> This could be used to hide the internal structure of RDF lists:
> [L
> _:a rdf:first x:A .
> _:a rdf:rest _:Lb .
> _:Lb rdf:first x:B .
> _:Lb rdf:rest rdf:nil .
> ]
> could be abbreviated as something like
> {x:A,x:B}
> and this treated like a new kind of RDF name, which of course becomes the
> first bnodeID (_:a) when compiled into RDF triples (which is why that
> bnodeID is not included in the scope, so it can act as the ‘name’ of the
> list elsewhere in the graph.)
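
A rough Python sketch of how the {x:A,x:B} abbreviation could be compiled back into triples. Everything here is an assumption made for illustration: the function name expand_list, the tuple representation of triples, and the generated labels _:L1, _:L2, ... (the example above uses _:Lb). The point it shows is that only the head label stays outside the scope L, while the internal cells carry the scope prefix.

```python
def expand_list(items, head="_:a", scope_id="L"):
    """Expand a list abbreviation into rdf:first / rdf:rest triples.

    Returns (name, triples). The head label is the list's 'name' and must not
    start with scope_id, so it remains usable elsewhere in the graph; the
    other cells get labels prefixed with scope_id and are bound by [scope_id ].
    """
    if head[2:].startswith(scope_id):
        raise ValueError("head label must stay outside the scope")
    if not items:
        return "rdf:nil", []
    nodes = [head] + [f"_:{scope_id}{i}" for i in range(1, len(items))]
    triples = []
    for i, item in enumerate(items):
        rest = nodes[i + 1] if i + 1 < len(nodes) else "rdf:nil"
        triples.append((nodes[i], "rdf:first", item))
        triples.append((nodes[i], "rdf:rest", rest))
    return head, triples


name, triples = expand_list(["x:A", "x:B"])
for s, p, o in triples:
    print(s, p, o, ".")
```
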

Pat, to me it looks like you're describing an RDF Dataset in which blank nodes
CANNOT be shared between the RDF Graphs. That would achieve the same, no?

Open question: why can the scope of quantification not be the edge of the
RDF Graph? What is the use case / requirement for blank nodes to be shared
between graphs?

Received on Monday, 3 December 2018 21:13:54 UTC