- From: Nathan Rixham <nathan@webr3.org>
- Date: Mon, 3 Dec 2018 21:13:20 +0000
- To: phayes@ihmc.us
- Cc: tl@rat.io, Tim Berners-Lee <timbl@w3.org>, W3C Semantic Web IG <semantic-web@w3.org>
- Message-ID: <CANiy74zSxLMmMeW=BYxuQOugNN9cDyQ7gPt7abY9s9hLes2aHw@mail.gmail.com>
On Mon, Dec 3, 2018 at 8:47 PM PatHayes <phayes@ihmc.us> wrote:
> On Nov 25, 2018, at 11:14 AM, thomas lörtsch <tl@rat.io> wrote:
> > On 22. Nov 2018, at 13:02, Tim Berners-Lee <timbl@w3.org> wrote:
> > > David
> > >
> > > I agree with your resolution to make RDF easier to use for real
> > > developers, whatever they are. But I do not despair at the level
> > > that you do, I am more hopeful.
> > > Let me pick just one of your points (with a new subject as
> > > suggested).
> > >
> > > > On 2018-11-21, at 22:40, David Booth <david@dbooth.org> wrote:
> > > > 3. Blank nodes. They are an important convenience for RDF
> > > > authors,
> > >
> > > Yes, here I agree. The default data language for developers at the
> > > moment is JSON, and that is full of blank nodes. Every {} in JSON
> > > is equivalent to a blank node [] in turtle
> > >
> > > Where in JSON you write
> > >
> > > { "name": "Fred Bloggs",
> > >   "address": {
> > >     "number": 123,
> > >     "street": "Acacia Avenue" }
> > > }
> > >
> > > in turtle you write
> > >
> > > [ :name "Fred Bloggs";
> > >   :address [
> > >     :number 123;
> > >     :street "Acacia Avenue" ]
> > > ]
> > >
> > > Which is just as simple as the JSON. When you look at Turtle as a
> > > language to write and to generate it is I think nice.
> >
> > IMO this is a good example that bnodes actually are foremost:
> > structure.
> >
> > I used to think of them as plastic bags: you put things in them to
> > transport them or keep them together but they carry no meaning in
> > themselves (not counting the advertisements usually printed on them
> > as "meaning", of course).
> >
> > Bnodes allow graphs to encode nested lists (trees). That is useful
> > because although graphs are very flexible, in real life we often
> > prefer less flexible data structures like lists, nested lists,
> > tables. At least I do when I write things down. Those structures are
> > very useful. They add some, well, structure, to what we want to
> > express. Do they carry "meaning"? I’d say yes but normally I don’t
> > refer to the structure itself.
> > On the contrary, it’s so useful because I don’t have to explicate
> > it - it’s just there, as bullet points, indentation, columns and
> > rows.
> >
> > Sometimes I do want to address a specific location in that
> > structure. Then it’s useful to be able to give that bnode an
> > identifier (and the ability to do so is a plus for RDF). However, a
> > triple with a bnode separated from the other triples containing
> > that same bnode can only ever be so useful. It’s like taking two
> > cells out of a bigger table, without headings or the full row. How
> > far can that possibly get you? I think that some of the complaints
> > voiced in this thread are based on unreasonable expectations and on
> > a lack of understanding of what bnodes are and can be.
> >
> > Maybe unreasonable expectations at a deeper level are the core of
> > the problem: the usefulness of graphs as data structures is limited,
> > maybe more limited than RDF likes to admit. They are not always the
> > most appropriate solution. We often use much more structured
> > approaches to information modelling like trees and tables, and for
> > good reasons.
> > RDF might be much more useful if it had a way to integrate those
> > structures instead of trying to mimic them - and integrate itself
> > better into other data structures. Then maybe we would need fewer
> > blank nodes. Nested lists as first class citizens in RDF would be a
> > good thing. Also tables. There were discussions about "dark triples"
> > pre the 2004 spec but I couldn’t find much in the mailing list
> > archives on the thinking behind it. But putting more emphasis on
> > linking into existing data structures - like into certain cells in
> > an RDBMS table or subtrees in a JSON document - might be helpful as
> > well.
> >
> > My main problem with bnodes is that it’s so hard to see where one
> > structure ends and the next one begins, and what that structure
> > actually is: a list? nested? how deep? a table even? an n-ary
> > relation? where does that end? which node represents its main role?
> > A relational table or a nested list make that much easier. In a
> > graph it takes extra effort to mark and characterize boundaries and
> > substructures. RDF tries to do all that with just the bnodes and
> > they are overloaded. That’s why it can be much harder to figure out
> > what’s going on in an RDF based system than in an RDBMS based
> > application - despite all the self describing properties etc.
>
> I think this is a very basic and important point. It is what I meant,
> expressed differently, by saying that RDF has no way to indicate
> scope. Bnodes in RDF are, logically, existentially quantified
> variables, but RDF has no way to indicate, and therefore no way for
> anyone to know, where the quantifiers are which bind those variables.
> So, for example, if we assume they are just outside each RDF document,
> then we should standardize bnodeIDs apart when merging; but if we
> assume they have larger scope, then maybe we shouldn’t. Bnodes
> introduced to encode structures like n-ary relational assertions, or
> lists, or some complicated piece of OWL syntax, should have a very
> narrow scope corresponding to the exact boundaries of those
> structures, and hence should be ‘invisible’ from outside (which is why
> it is fine to make them vanish in a higher-level syntax using [ ] or
> ( ).)
>
> Ideally, RDF2 should provide for these structures directly, but maybe
> we can get the benefit with a relatively tiny step, just by having a
> syntax for RDF which has explicit scoping brackets. Off the cuff,
> imagine a variant of NTriples in which a subset of triples can be
> enclosed in brackets, say [ ] (or something else if these are already
> taken) to indicate that any bnode ID in a triple inside the bracket is
> local to those triples, i.e. is ‘bound’. Current RDF engines which do
> not make use of this information can simply ignore them, since they do
> not change the RDF meaning of the graph, but they may provide useful
> information to newer engines.
> For example, they might make it a lot easier to parse OWL syntax
> (‘Manchester’ syntax) from OWL/RDF.
>
> Putting brackets around an entire graph says, in effect, that all
> bnodeIDs in this graph are local to the graph: omitting them allows
> the possibility of sharing a bnode with some other graph (as in RDF
> datasets).
>
> A better system, which would allow for more elaborate structures,
> would be to have a convention of labelled scope brackets of the form
> [ID ], where ID is any alphanumeric string, which is understood to
> ‘bind’ only bnodes with ids of the form _:string where ID is an
> initial substring of string. So for example [A ] binds _:A1 and _:A17
> but not _:B1. This would allow the full expressiveness of nested
> quantification without very much extra work at all, and again it could
> be simply ignored by current RDF engines without harm, although they
> might be missing out on some of the meaning being expressed by this
> more elaborate notation. And if you leave out the ID, then this
> defaults to the simpler notation in the previous paragraph, so
> backward compatibility is automatic.
>
> The scope identifier should only be attached to one bracket, to make
> this kind of silliness
>
> [A ,,,,[B,,,,,]A….]B
>
> impossible.
>
> This could be used to hide the internal structure of RDF lists:
>
> [L
> _:a rdf:first x:A .
> _:a rdf:rest _:Lb .
> _:Lb rdf:first x:B .
> _:Lb rdf:rest rdf:nil .
> ]
> could be abbreviated as something like
> {x:A,x:B}
> and this treated like a new kind of RDF name, which of course becomes
> the first bnodeID (_:a) when compiled into RDF triples (which is why
> that bnodeID is not included in the scope, so it can act as the
> ‘name’ of the list elsewhere in the graph.)

Pat, to me it looks like you're describing an RDF Dataset where blank nodes CANNOT be shared between the RDF Graphs; it would achieve the same, no?
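
For anyone trying to picture the labelled-bracket rule quoted above, here is a rough Python sketch of the binding test only. The function names (`bound_by`, `bound_bnodes`) and the representation of triples as tuples of strings are my own assumptions for illustration; nothing like this exists in any RDF spec or library, and the bracket syntax itself is only a proposal in this thread.

```python
import re

# Assumption: a bnode label looks like "_:" followed by an alphanumeric string.
BNODE = re.compile(r"_:([A-Za-z0-9]+)")


def bound_by(scope_id: str, term: str) -> bool:
    """True if a bracket labelled scope_id would bind this term.

    The rule from the thread: [ID ] binds _:string when ID is an
    initial substring of string. An empty ID binds every bnode,
    which matches the unlabelled-bracket case.
    """
    match = BNODE.fullmatch(term)
    return match is not None and match.group(1).startswith(scope_id)


def bound_bnodes(scope_id, triples):
    """Collect the bnode labels inside a bracket that the bracket binds."""
    return {term for triple in triples for term in triple
            if bound_by(scope_id, term)}


# The list example from the thread, written as plain tuples of strings.
list_triples = [
    ("_:a", "rdf:first", "x:A"),
    ("_:a", "rdf:rest", "_:Lb"),
    ("_:Lb", "rdf:first", "x:B"),
    ("_:Lb", "rdf:rest", "rdf:nil"),
]

print(bound_bnodes("L", list_triples))  # only _:Lb is bound by [L ... ]
```

With the label `L`, `_:a` stays free (so it can name the list from elsewhere) while `_:Lb` is local to the bracket, which is the behaviour described in the quoted example.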
Open question: why can the scope of quantification not be the edge of the RDF Graph? What is the use case / requirement for blank nodes to be shared between graphs?
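
To make the "scope at the edge of the graph" reading concrete, here is a minimal Python sketch of what merging would look like under that assumption: every graph's bnode labels are renamed ("standardized apart") before the union is taken. The helper names (`rename`, `merge_graph_scoped`) and the `g<index>_` label scheme are hypothetical, and graphs are modelled simply as sets of string tuples rather than with any real RDF library.

```python
def rename(term: str, graph_index: int) -> str:
    """Give a bnode label a per-graph prefix; leave other terms alone."""
    if term.startswith("_:"):
        return f"_:g{graph_index}_{term[2:]}"
    return term


def merge_graph_scoped(graphs):
    """Union of graphs where each bnode is local to its own graph.

    Assumption: the existential quantifier sits at the graph boundary,
    so the same label in two graphs never denotes the same node.
    """
    merged = set()
    for index, graph in enumerate(graphs):
        for s, p, o in graph:
            merged.add((rename(s, index), p, rename(o, index)))
    return merged


g1 = {("_:b", ":name", '"Fred Bloggs"')}
g2 = {("_:b", ":name", '"Wilma Bloggs"')}

merged = merge_graph_scoped([g1, g2])
naive = g1 | g2

# Distinct subjects after a graph-scoped merge, one shared subject otherwise.
print(len({s for s, _, _ in merged}), len({s for s, _, _ in naive}))
```

The naive union ends up claiming that one thing has both names, which is exactly the case where it matters whether a bnode may be shared between graphs.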
Received on Monday, 3 December 2018 21:13:54 UTC