Islands (ACTION-148)

In the telecon, I mentioned the idea of "islands".  This is not a 
technical design - it's a way of thinking about the theory and practice 
of graphs on the web.

An island is a collection of graphs where all the RDF semantics 
(specifically for merge and for entailment relationships) work out as 
defined in the RDF 2004 specs.

That requires, for example, that the application trusts the information 
in all the graphs it's working with.

In practice, not all data is perfect.  An application will assemble a 
set of graphs it is going to work with - that may be some mixture of 
reading a number of places on the web, picking graphs out of a local 
graph store, and creating its own data.  (from Yvres) RDF data about 
the Dr Who universe [1] is perfectly reasonable when working within that 
universe, but may be a bit suspect when considered in the real world.

The criterion is more "fit for purpose" - an application goes through 
two steps: first collecting together the graphs it wants to work with, 
then actually working with those graphs.

Islands aren't an absolute viewpoint: data may become available, or 
an application may determine it trusts some new data, or even a new 
island, and, for its purpose, links them together.

Another application, with different goals, may take a different view as 
to whether two graphs can be considered to be compatible (an application 
specific term).  FOAF files declaring people's names may be good enough 
for a social network application, but not good enough for legal purposes.

For our named graphs discussions, the key technical requirement is to 
not combine data which shouldn't be combined: keep data apart by default 
and let the application decide when to allow it to merge or entail.

[2] does that.  Within one TriG file, all the triples with the same 4th 
slot are in the same graph, and being one graph, all RDF semantics must 
be valid.  Triples with different 4th slots may or may not be combinable. 
The basic machinery does not decide - it just means that two triples with 
two different 4th slots have no defined relationship.
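
As a rough illustration of the paragraph above (not part of the proposal 
in [2]), here is a minimal Python sketch.  The quad tuples, the ex: 
names and both function names are made up for the example, and it 
assumes there are no blank nodes, so that a merge is a plain set union.

```python
from collections import defaultdict

# Hypothetical quads: (subject, predicate, object, graph_label).
# The ex: names are invented for this sketch.
QUADS = [
    ("ex:doctor", "ex:species", "ex:TimeLord", "ex:whoniverse"),
    ("ex:doctor", "ex:travelsIn", "ex:tardis", "ex:whoniverse"),
    ("ex:london", "ex:capitalOf", "ex:uk", "ex:realWorld"),
]


def group_by_label(quads):
    """Put triples that share a 4th slot into the same graph (a set)."""
    graphs = defaultdict(set)
    for s, p, o, label in quads:
        graphs[label].add((s, p, o))
    return dict(graphs)


def merge(graphs, labels):
    """Union only the graphs the application explicitly chose to combine.

    Assumption: no blank nodes, so set union stands in for an RDF merge.
    """
    merged = set()
    for label in labels:
        merged |= graphs[label]
    return merged
```

Nothing is merged unless the application calls merge() with the labels 
it has decided are usable together; graphs it does not name stay apart.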

The use of a URI for a graph label in two different trig documents 
should mean the same thing, but combining two datasets, like combining 
two graphs, will involve an application deciding that it can be done.
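
One possible way to picture that decision, again only as a sketch: a 
dataset is modelled here as a Python dict from graph label to a set of 
triples, and combine_datasets and its accept callback are hypothetical 
names, not anything defined in [2].  Blank nodes are ignored.

```python
def combine_datasets(a, b, accept):
    """Combine dataset b into a copy of dataset a.

    Each dataset is a dict: graph label -> set of (s, p, o) triples.
    accept(label) is the application's own decision on whether the graph
    with that label in b may be taken in.  Graphs it rejects are left
    out, and neither input dataset is modified.
    """
    combined = {label: set(triples) for label, triples in a.items()}
    for label, triples in b.items():
        if accept(label):
            # Same label in both documents: treated as the same graph.
            combined.setdefault(label, set()).update(triples)
    return combined
```

A social network application might pass an accept function that says 
yes to most labels; one with legal requirements might accept very few.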

Islands aren't named or formally recognized - and one app's view of 
"usable together" may not be the same as another app's.

 Andy

[1] http://www.bbc.co.uk/doctorwho/dw
[2] http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/RDF-Datasets-Proposal

Received on Monday, 27 February 2012 15:53:25 UTC