Re: RDF-ISSUE-5 (Graph Literals) nesting and belief from William Waites on 2011-04-05 (public-rdf-wg@w3.org from April 2011)

From: William Waites <ww@styx.org>
Date: Tue, 5 Apr 2011 16:06:29 +0200
To: RDF WG <public-rdf-wg@w3.org>
Message-ID: <20110405140629.GI21404@styx.org>
Wanting to put this out for discussion in advance of the
teleconference.

As I see it the main reason for having graphs as a first-class
datatype in RDF is to be able to make statements about them and to
answer questions about them. Simple questions are things like, what
are the matches for this pattern in this graph? Or, where do triples
that match this pattern come from and what can be said about their
sources?

The second question bears on provenance and trust, as the famous
paper put it, or more generally on reasoning about beliefs about
assertions. As soon as it is put in those terms the structure
becomes obviously nested. It is natural to talk about assertions
(beliefs) about statements (graphs), and beliefs about beliefs, etc.

I would argue that before named graphs we actually had this construct
implicitly. People would discuss fragments of an unnamed graph,
cutting and pasting into email, these fragments also being unnamed
sub-graphs. This type of unnamedness is exactly what blank nodes
are meant to capture. So for example this graph fragment,

  s1 p1 o1.
  s2 p2 o2.

is actually equivalent to writing in pseudo-trig,

  _:g1 { s1 p1 o1. s2 p2 o2 }.

where somewhat recursively, _:g1 means "this graph fragment".
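
As a rough illustration of that equivalence, here is a minimal Python
sketch. It assumes a graph is just a set of (s, p, o) tuples and a
dataset is a mapping from graph name to graph. The helper names
fresh_bnode and wrap_fragment are made up for this example and do not
come from any RDF library.

```python
import itertools

# Counter used only to mint illustrative blank node labels (_:g1, _:g2, ...).
_counter = itertools.count(1)

def fresh_bnode():
    """Return a new blank node label to stand in for an unnamed graph."""
    return f"_:g{next(_counter)}"

def wrap_fragment(triples):
    """Turn a bare fragment into a one-entry dataset keyed by a blank node."""
    return {fresh_bnode(): frozenset(triples)}

# The bare fragment: s1 p1 o1. s2 p2 o2.
fragment = [("s1", "p1", "o1"), ("s2", "p2", "o2")]

# The same fragment written as: _:gN { s1 p1 o1. s2 p2 o2 }.
dataset = wrap_fragment(fragment)
name, graph = next(iter(dataset.items()))
```

The triples are unchanged; the only thing added is a label that lets
other statements refer to "this graph fragment".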

Then some graphs achieved a more significant status where they 
were given names, typically the URL that could be dereferenced to
get a representation of them. This is as natural as simply saying,

  _:g1 owl:sameAs G1.

If we allow graphs to be unnamed we can write the sentence, "Mary
thinks Alice knows Bob" in a form like this:

  { Mary thinks { { Alice knows Bob } confidence 0.8 } }.

or perhaps

  { Alice knows Bob } confidence 0.8; source Mary.

and these will actually be parsed into the form you would expect
by some N3 parsers, the { generating a blank node as a graph name
in the same way that [ generates a blank node subject.
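
To make the nesting concrete, here is a small Python sketch of the
second form above. It is not the output of any real N3 parser. It
assumes a graph is a frozenset of triples, which makes it hashable so
that it can itself be the subject of a triple. The function
sources_of is a hypothetical helper for the "where do these triples
come from" question.

```python
# A graph is a frozenset of triples, so it is hashable and can itself
# appear as the subject (or object) of another triple.
inner = frozenset({("Alice", "knows", "Bob")})

# { Alice knows Bob } confidence 0.8; source Mary.
outer = frozenset({
    (inner, "confidence", 0.8),
    (inner, "source", "Mary"),
})

def sources_of(triple, graph):
    """Return every 'source' value attached to a nested graph containing `triple`."""
    return {
        o
        for (s, p, o) in graph
        if p == "source" and isinstance(s, frozenset) and triple in s
    }

who = sources_of(("Alice", "knows", "Bob"), outer)
```

Here the unnamed inner graph plays the role the blank node graph name
would play in the parsed form.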

This is all pretty straightforward except when it comes to evaluating
the scope of a blank node. Likely we want scoping rules similar to
those of the programming languages we are all familiar with, where
explicitly written blank nodes have a scope that covers all graphs
enclosed by the graph they appear in. Does this make sense?
Should we simply define it to be so?
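
One possible reading of that scoping rule, sketched in Python purely
for discussion: nested graphs are assumed to be written as lists of
triples, blank node labels as strings starting with "_:", and
resolve_bnodes is a hypothetical function that rewrites labels to
unique ids. A label written directly in a graph is visible in every
graph it encloses; a label that first appears inside a nested graph
stays local to that graph.

```python
import itertools

def resolve_bnodes(graph, env=None, ids=None):
    """Rewrite blank node labels to unique ids under a lexical scoping rule."""
    env = dict(env or {})  # copy, so bindings never leak to sibling graphs
    ids = ids if ids is not None else itertools.count(1)
    # Bind labels written directly in this graph, unless an enclosing
    # graph has already bound them.
    for triple in graph:
        for term in triple:
            if isinstance(term, str) and term.startswith("_:") and term not in env:
                env[term] = f"b{next(ids)}"
    # Rewrite the triples, recursing into nested graphs with the extended scope.
    return [
        tuple(
            resolve_bnodes(t, env, ids) if isinstance(t, list) else env.get(t, t)
            for t in triple
        )
        for triple in graph
    ]

doc = [
    ("_:x", "knows", "Bob"),
    ("Mary", "thinks", [("_:x", "knows", "Carol")]),  # same _:x as the outer graph
    ("Mary", "thinks", [("_:y", "knows", "Bob")]),    # _:y is local to this graph
    ("Sue", "thinks", [("_:y", "knows", "Bob")]),     # a different _:y
]
resolved = resolve_bnodes(doc)
```

Under this reading the two _:x occurrences denote the same node, while
the two _:y occurrences in sibling graphs do not. Whether that is the
behaviour we want is exactly the question.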

How does this scoping rule change when names for graphs are thrown
into the mix? If I write, 

  Mary thinks <G1>.
  <G1> owl:sameAs { _:alice knows Bob }.
  _:alice owl:sameAs Alice.

have I broken the enclosure by introducing an explicit name for G1?
What if, instead of being one document, the first and last line are
in a different document from the second?

In any event, the asymmetry is glaring: as it stands now, graphs can
have names, and they have always been able to lack names, but we have
no way to write down or refer to a graph that lacks a name.

-w
-- 
William Waites                <mailto:ww@styx.org>
http://river.styx.org/ww/        <sip:ww@styx.org>
F4B3 39BF E775 CF42 0BAB  3DF0 BE40 A6DF B06F FD45
Received on Tuesday, 5 April 2011 14:07:00 UTC