Re: How to model valid time of resource properties? from John Walker on 2014-10-16 (public-lod@w3.org from October 2014)

From: John Walker <john.walker@semaku.com>
Date: Thu, 16 Oct 2014 13:05:44 +0200 (CEST)
To: Hugh Glaser <hugh@glasers.org>
Cc: W3C LOD Mailing List <public-lod@w3.org>
Message-ID: <1840484748.357398.1413457544742.open-xchange@oxweb05.eigbox.net>

Hi Hugh,


> On October 16, 2014 at 12:45 PM Hugh Glaser <hugh@glasers.org> wrote:
>
>
> Hi,
> > On 15 Oct 2014, at 23:02, John Walker <john.walker@semaku.com> wrote:
> >
> > Hi
> >
> >> On October 15, 2014 at 2:59 PM Kingsley Idehen <kidehen@openlinksw.com>
> >> wrote:
> >>
> >> On 10/15/14 8:36 AM, Frans Knibbe | Geodan wrote:
> ...
> > Personally I would not use this approach for foaf:age and foaf:based_near as
> > these capture a certain snapshot/state of (the information about) a
> > resource. Having some representation where the foaf:age triple could be
> > entailed could lead to having multiple conflicting statements with no easy
> > way to find the truth.
> >
> > Having a clear understanding of the questions you want to ask of your
> > knowledge base should help steer modelling choices.
> This is undoubtedly true, and very important - is the modelling fit for purpose?
> Proper engineering.
>
> >>> In the cases known to me that require the recording of history of
> >>> resources, all resource properties (except for the identifier) are things
> >>> that can change in time. If this pattern would be applied, it would have
> >>> to be applied to all properties, leading to vocabularies exploding and
> >>> becoming unwieldy, as described in the Discussion paragraph.
> >>>
> >>> I think that the desire to annotate statements with things like valid time
> >>> is very common. Wouldn't it be funny if the best solution to such a
> >>> common and relatively straightforward requirement is to create large
> >>> custom vocabularies?
> > If you want to be able to capture historical states of a resource, using
> > named graphs to provide that context would be my first thought.
> However, there is a downside to this.
> If all that is happening is that Frans is gathering his own data into a store,
> and then using that data for some understood application of his, then this
> will be fine.
> Then he knows exactly the structure to impose on his RDF using named Graphs.
>
> But this is Linked Open Data, right?
> So what happens about use by other people?
> Or if Frans wants to build other queries over the same data?
> If he hasn’t foreseen the other structure, and therefore ensured that the
> required Named Graphs exist, then it won’t be possible to make the statements
> required about the RDF.
>
> The problem is that in choosing the Named Graph structure, the data publisher
> makes very deep assumptions and even decisions about how the dataset will be
> used.
> This is not really good practice in an Open world - in fact, one of the
> claimed advantages of Semantic Web technologies is that such assumptions (such
> as the choice of tables in a typical database) are no longer required!
>
> I’m not saying that Named Graphs aren’t useful and often appropriate, but
> choosing to use Named Graphs can really make the data hard to consume.
> And if they are used, the choice of how really needs to be considered very
> much with the modelling.
> (This is particularly important in the absence of any ability to nest Named
> Graphs.)
>
> Cheers

To be clearer: when publishing information on the web, that information is
always contained in a document. It is already common practice to provide
immutable historical versions of these documents in wikis and in code
repositories such as GitHub. Using such an approach when publishing LOD should
be feasible (maybe even best practice). By this I mean that each document would
be an RDF graph that can be versioned like any other document.
Approaches like time-based conneg might be used to time travel.
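
As a rough illustration of time-based conneg, here is a minimal Python sketch
that builds the request headers a client might send to ask for a document as it
was at a given moment. It assumes the server supports the Accept-Datetime
header from the Memento framework (RFC 7089), which is one way of doing this
and not something the thread has settled on. The function name and the choice
of Turtle as the media type are mine, purely for illustration.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def time_travel_headers(when: datetime) -> dict:
    """Build request headers asking for the version of a document at `when`."""
    # Accept-Datetime carries an HTTP-date, which is always expressed in GMT,
    # so convert the (timezone-aware) datetime to UTC first.
    when_utc = when.astimezone(timezone.utc)
    return {
        "Accept": "text/turtle",  # assumed RDF serialization
        "Accept-Datetime": format_datetime(when_utc, usegmt=True),
    }


headers = time_travel_headers(
    datetime(2014, 10, 16, 11, 5, 44, tzinfo=timezone.utc)
)
print(headers["Accept-Datetime"])  # Thu, 16 Oct 2014 11:05:44 GMT
```

These headers could then be sent with any HTTP client; the server, if it
supports this, would redirect to or return the appropriate historical version.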

For a given query, Frans is then free to decide which of these graphs to use to
form the RDF dataset for that query. SPARQL 1.1 provides the FROM and
FROM NAMED clauses for this purpose.
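
To make that concrete, here is a small Python sketch that assembles a SPARQL
query whose dataset is chosen per query with FROM and FROM NAMED. The graph
IRIs under example.org are hypothetical stand-ins for versioned documents, and
the helper name build_query is my own; only the FROM and FROM NAMED clauses
themselves come from SPARQL 1.1.

```python
def build_query(default_graphs, named_graphs, pattern):
    """Return a SPARQL SELECT query whose dataset is built from graph IRIs."""
    lines = ["SELECT *"]
    # Graphs listed with FROM are merged into the default graph.
    lines += [f"FROM <{iri}>" for iri in default_graphs]
    # Graphs listed with FROM NAMED stay addressable via GRAPH <iri> { ... }.
    lines += [f"FROM NAMED <{iri}>" for iri in named_graphs]
    lines.append(f"WHERE {{ {pattern} }}")
    return "\n".join(lines)


query = build_query(
    default_graphs=["http://example.org/doc/frans/2014-10-01"],  # hypothetical
    named_graphs=["http://example.org/doc/frans/2013-10-01"],    # hypothetical
    pattern="?person <http://xmlns.com/foaf/0.1/age> ?age",
)
print(query)
```

The point is that the choice of graphs is made by whoever writes the query, not
fixed in advance by the publisher.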

The question of named graphs only really comes into play once the data is loaded
into a graph store. Other factors, such as how the data is managed, can also
influence the decision on how named graphs are used within a store.

> >
> > If that resource consists of just one triple, then RDF reification of that
> > statement would also work as Kingsley mentions.
> >
> >>>
> >>> Regards,
> >>> Frans
> >>
> >> Frans,
> >>
> >> How about reified RDF statements?
> >>
> >> I think discounting RDF reification vocabulary is yet another act of
> >> premature optimization, in regards to the Semantic Web meme :)
> >>
> >> Some examples:
> >>
> >> [1] http://bit.ly/utterances-since-sept-11-2014 -- List of statements made
> >> from a point in time.
> >> [2] http://linkeddata.uriburner.com/c/8EPG33 -- About Connotation
> >>
> >> --
> >> Regards,
> >>
> >> Kingsley Idehen
> >> Founder & CEO
> >> OpenLink Software
> >> Company Web:
> >> http://www.openlinksw.com
> >>
> >> Personal Weblog 1:
> >> http://kidehen.blogspot.com
> >>
> >> Personal Weblog 2:
> >> http://www.openlinksw.com/blog/~kidehen
> >>
> >> Twitter Profile:
> >> https://twitter.com/kidehen
> >>
> >> Google+ Profile:
> >> https://plus.google.com/+KingsleyIdehen/about
> >>
> >> LinkedIn Profile:
> >> http://www.linkedin.com/in/kidehen
> >>
> >> Personal WebID:
> >> http://kingsley.idehen.net/dataspace/person/kidehen#this
> >
> >
>
> --
> Hugh Glaser
> 20 Portchester Rise
> Eastleigh
> SO50 4QS
> Mobile: +44 75 9533 4155, Home: +44 23 8061 5652
>
>
Received on Thursday, 16 October 2014 11:26:11 UTC