Re: Web Semantics of Datasets (v0.2)

On 2011-10-10, at 12:30, Sandro Hawke wrote:

> Here's some revised wording for the proposal, getting a bit closer to
> spec text.   It's still somewhat informal, and mixing normative and
> non-normative bits, and best-practice.   And it's not as clear as it
> should be about handling change over time.
> 
>    -- Sandro
> ===
>  A dataset D is true iff (1) its default graph is true and (2) for
>  every pair of <N,G> in D, N names something (a "resource", sometimes
>  called a "g-box") which, at every time T in R, has G as its current
>  state.

[ Apologies in advance for anywhere I've confused a term in logic with an English-language term; logic is really not my area of expertise. ]

I'm not very comfortable with "its default graph is true". As previously mentioned, many systems default to having the default graph be the union of all named graphs (in our experience at least, this turns out to be the most practical way to query SPARQL stores), and I doubt you can often determine truthfulness for all your named graphs, depending on what that implies.
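
To make the union case concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: graphs are modelled as plain sets of (s, p, o) tuples, blank nodes are ignored, and "true" is reduced to a toy notion (every triple appears in a hypothetical `world` set), which is not the model-theoretic meaning.

```python
# Toy model, for illustration only: a graph is a set of (s, p, o) tuples.

def union_default_graph(named_graphs):
    """Build the default graph as the union of all named graphs,
    as many stores do by default (blank nodes ignored here)."""
    merged = set()
    for triples in named_graphs.values():
        merged |= set(triples)
    return frozenset(merged)

def graph_is_true(graph, world):
    """Hypothetical stand-in for 'truth': every triple holds in `world`."""
    return set(graph) <= set(world)

# Hypothetical data: g1 agrees with the world, g2 does not.
world = {(":alice", ":knows", ":bob")}
named_graphs = {
    "http://example.org/g1": {(":alice", ":knows", ":bob")},
    "http://example.org/g2": {(":alice", ":knows", ":carol")},
}

default_graph = union_default_graph(named_graphs)

# One untrue named graph is enough to make the union default graph untrue.
print(graph_is_true(named_graphs["http://example.org/g1"], world))
print(graph_is_true(default_graph, world))
```

Under this toy reading, requiring the default graph to be true amounts to requiring every named graph to be true, which is the part I doubt is often achievable.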

Also, in general, I'm not that comfortable with anything that privileges the default graph in terms of "truth", especially as I don't really know what that means. It suggests rather a naïve view of trust, if that's the intent, and if not I'm not sure what the intent is.

It also raises the possibility of a "true" dataset becoming untrue through the use of SPARQL protocol parameters like default-graph-uri, or the FROM keyword, both of which change which graph serves as the default graph for a given query. Cf. http://www.w3.org/TR/rdf-sparql-protocol/ §2.1.2.
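
As a rough sketch of that concern, again in Python: the endpoint URL, graph names and triples are hypothetical, and resolving default-graph-uri by looking the IRI up among the store's own named graphs is a simplifying assumption, not something the protocol requires.

```python
from urllib.parse import urlencode

ENDPOINT = "http://example.org/sparql"  # hypothetical endpoint

def build_query_url(query, default_graph_uris=()):
    """Build a SPARQL protocol GET URL; default-graph-uri may repeat."""
    params = [("query", query)]
    params += [("default-graph-uri", uri) for uri in default_graph_uris]
    return ENDPOINT + "?" + urlencode(params)

def effective_default_graph(store_default, named_graphs, default_graph_uris=()):
    """Default graph the query actually sees (toy model, no blank nodes).

    With no default-graph-uri, the store's own default graph is used;
    otherwise it is replaced by the merge of the requested graphs.
    """
    if not default_graph_uris:
        return frozenset(store_default)
    merged = set()
    for uri in default_graph_uris:
        merged |= set(named_graphs.get(uri, ()))
    return frozenset(merged)

# Hypothetical store contents.
store_default = {(":d", ":p", ":o")}
named_graphs = {"http://example.org/g1": {(":alice", ":knows", ":carol")}}

url = build_query_url("ASK { ?s ?p ?o }", ["http://example.org/g1"])
seen = effective_default_graph(store_default, named_graphs,
                               ["http://example.org/g1"])
print(url)
print(seen == frozenset(store_default))
```

The stored data is identical in both cases; only the request differs, yet the graph whose "truth" the definition depends on has changed.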

- Steve

>  It follows from AWWW that if N is an IRI which can be dereferenced,
>  a successful, correct dereference of N at any time T in R must yield
>  a serialization ("representation") of G.
> 
>  In order to know whether a dereference occurs at a time in R, it is
>  useful to have R declared in the default graph of D, or in another
>  nearby, easy-to-find data source.  Where possible, it is helpful to
>  have R be All Time; that is, having N name a resource whose state,
>  by definition, never changes.
> 
>  In RDF data, N may be used (1) directly, to name the g-box,
>  expressing things like the license that applies to its state, or who
>  controls it; and (2) indirectly, to refer to G as the current state
>  of the g-box.  Indirect reference can be used to express things
>  about an RDF Graph (a "g-snap"), like that it was the graph some
>  entity asserted at some time.  Indirection is done in the semantics
>  of the predicates with which N is used.
> 
>  When N is used indirectly, the reference to G only holds inside time
>  range R, of course.  Care must be taken not to use N as if it
>  necessarily referred to G, outside of R.  Since R is defined to be
>  the same for all elements of D, indirect reference is safe in the
>  default graph.   
> 

-- 
Steve Harris, CTO, Garlik Limited
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203  http://www.garlik.com/
Registered in England and Wales 535 7233 VAT # 849 0517 11
Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD

Received on Monday, 10 October 2011 13:35:45 UTC