dataset semantics

On Fri, 2011-12-16 at 22:47 -0600, Pat Hayes wrote:
> On Dec 16, 2011, at 10:21 PM, Sandro Hawke wrote:
> 
> > ... maybe I can figure out some TriG
> > entailment tests....    Like, does this TriG document / dataset:
> > 
> >        { <a> <b> <c> }
> > 
> > entail this RDF graph:
> > 
> >    <a> <b> <c>.
> > 
> > I think it should, so we can have metadata in TriG, but other people
> > have disagreed.   How should we be gathering test cases like this?
> 
> 
> FWIW, 'entailment' has a fairly precise meaning. A entails B when B is true whenever A is, or more precisely if, for every possible interpretation I, if A is true in I then B is true in I. So it only makes sense to speak of entailment when there is some notion of truth-in-an-interpretation to base it on. 

Yes, I know.

> So, what are the truth conditions for datasets? 

We haven't quite figured that out yet.   I'm proposing that one part of
it is that a dataset being true implies its default graph is true.
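
To make that first part concrete, here is a minimal Python sketch.  It
assumes ground graphs only (no blank nodes), so simple entailment
reduces to a subset check.  The Dataset class and the function name are
hypothetical, made up for illustration; nothing here is settled
semantics.

```python
from dataclasses import dataclass, field

@dataclass
class Dataset:
    # Triples are (subject, predicate, object) tuples of strings.
    default_graph: frozenset = frozenset()
    # Maps a label URI to the graph it labels.
    named_graphs: dict = field(default_factory=dict)

def dataset_entails_graph(ds, graph):
    """Proposed condition: if the dataset is true, its default graph is true.

    Assumption: all graphs are ground, so the default graph entails
    `graph` exactly when `graph` is a subset of it.
    """
    return set(graph) <= set(ds.default_graph)

# The TriG document { <a> <b> <c> } read as a dataset with only a default graph.
ds = Dataset(default_graph=frozenset({("a", "b", "c")}))
```

Under this reading, dataset_entails_graph(ds, {("a", "b", "c")}) holds,
which is the answer I'm arguing for in the test case above.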

The other part of the truth conditions has to do with the relationship
between the things named by the label URIs and the graphs they label.   

Unfortunately, I think we need to allow for several possible
relationships there, MAYBE even in the same dataset, which makes things
rather complicated.

One example of the relationship is what I called graphState in a
different thread.  In that case, the dataset being true would imply that
for each <U,G> in the dataset, the state of the resource U is the graph
G.   (Here, I mean "state" and "resource" in exactly the REST sense.)

Another example is an out-of-date version of graphState, maybe call it
graphStateWas.  In this case, the dataset being true would imply that
for each <U,G> in the dataset, the state of the resource U is, or used
to be, graph G.
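
The difference between those two conditions can be sketched in Python.
This is only an illustration: the `history` mapping (label URI to a list
of graphs, oldest first) is a hypothetical stand-in for "the state of
the resource over time", and the function names are mine.

```python
def graph_state_holds(history, u, g):
    """graphState: the current state of resource u is the graph g."""
    states = history.get(u, [])
    return bool(states) and states[-1] == g

def graph_state_was_holds(history, u, g):
    """graphStateWas: the state of u is, or used to be, the graph g."""
    return g in history.get(u, [])

old = frozenset({("a", "b", "c")})
new = frozenset({("a", "b", "d")})
# Hypothetical resource whose state changed once.
history = {"http://example.org/doc": [old, new]}
```

A dataset pairing the resource with `old` would be true under
graphStateWas but false under graphState; pairing it with `new` would be
true under both.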

Another example of the relationship is something I gather Cambridge
Semantics uses, which I'll call subjectOf.   (In one of their deployment
modes, triples are divided into two types, which I'll call A and B, based
on which predicate they use.  The dataset is constructed such that for
each <U, G> in the dataset, every type-A triple in G is of the form
{ <U> ?P ?O }.  The type-B triples are a little more complicated.)  In
this case, the dataset being true would imply the dataset being
segmented in this complicated but useful way.   
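
A rough Python sketch of the type-A half of that condition follows.  The
set of type-A predicates is an assumption passed in by the caller (I
don't have the actual classification), and the type-B rule is left
unchecked because I haven't spelled it out here.

```python
def subject_of_holds(named_graphs, type_a_predicates):
    """Check that every type-A triple in G has the label U as its subject.

    named_graphs: dict mapping label URI U to a set of (s, p, o) triples.
    type_a_predicates: hypothetical set of predicates that mark type-A triples.
    Type-B triples are ignored by this sketch.
    """
    for u, graph in named_graphs.items():
        for s, p, o in graph:
            if p in type_a_predicates and s != u:
                return False
    return True

type_a = {"http://example.org/name"}
good = {"http://example.org/alice": {("http://example.org/alice",
                                      "http://example.org/name", "Alice")}}
bad = {"http://example.org/alice": {("http://example.org/bob",
                                     "http://example.org/name", "Bob")}}
```

Here `good` satisfies the check and `bad` does not, since its type-A
triple has a subject other than the graph's label.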

It's *rather* tempting to just use triples for this, making graphState,
graphStateWas, subjectOf, etc, be predicates.   That way the semantics
of datasets would be much simpler, with the complications bundled into
the semantics of those particular predicates. 

I guess I'm suggesting extending the definition of dataset to be a
default graph plus, rather than a set of pairs <U,G>, a set of triples
<U, R, G>, where R is optional.  If R is omitted, you have the kind of
dataset we're used to now, where we have no idea what that relation is
supposed to be (unless the author tells us humans).
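
As a data structure, that extension might look something like the Python
sketch below.  The Entry type and the helper are hypothetical names, and
using plain strings such as "graphState" for R is just for illustration.

```python
from typing import NamedTuple, Optional

class Entry(NamedTuple):
    label: str                      # U, the label URI
    graph: frozenset                # G, a set of (s, p, o) triples
    relation: Optional[str] = None  # R, None when omitted

def labels_without_relation(entries):
    """Labels whose relation to their graph is unspecified (today's datasets)."""
    return sorted(e.label for e in entries if e.relation is None)

g = frozenset({("a", "b", "c")})
entries = {
    Entry("http://example.org/g1", g, "graphState"),
    Entry("http://example.org/g2", g),  # R omitted: relation unknown
}
```

With R omitted everywhere, this is the same set of <U,G> pairs we have
now, so existing datasets would not need to change.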

> Can one assert a dataset (ie claim it to be true)? 

Yes.

> How does one do that? 

The same way you do with RDF.  It kind of depends on your application.
Maybe you publish it on the web; maybe you send it to some agent; maybe
you publish it and send the URL somewhere, etc.

   -- Sandro

Received on Saturday, 17 December 2011 05:43:55 UTC