
Re: Islands (ACTION-148)

From: Antoine Zimmermann <antoine.zimmermann@emse.fr>
Date: Wed, 29 Feb 2012 10:38:24 +0100
Message-ID: <4F4DF210.3070706@emse.fr>
To: public-rdf-wg@w3.org
On 28/02/2012 23:13, Sandro Hawke wrote:
> Antoine, I think what you're saying makes sense to me, about the
> proposed dataset semantics (in [2]) being very weak, so stronger
> semantics can be added with additional constraints.   But how does it
> constrain the graph labels, eg :g1?   Can you show me some entailments
> and non-entailments involving the use of the graph labels inside
> triples?  Also, is there any relationship between the default graph and
> the truth?  (And if not, then how do you transmit metadata in TriG in a
> standard way?)

The semantics does not constrain the graph labels. This is in order to 
accommodate the most liberal uses of the labels, such as "labelling 
the graph with its primary topic's URI".

Of course, there is a need to set constraints in some cases. Any 
implementation can set its own constraints when minting graph IRIs, so 
the liberality is not really a problem for the internal behaviour of an 
isolated system. It becomes a real problem when the dataset is subject 
to move from one place to the other, possibly changing its owner or 
being distributed online.

For this, we would need some extra design, which I think would be along 
the lines of what you proposed already. The idea that I'm thinking about 
is to exchange datasets *together with* a metadata file (or metadata 
header inside the same file) that specifies which "scheme" you are using. 
In the absence of such a header, systems are asked to rely on the most 
liberal semantics, which allows different interpretations of 
differently-labelled graphs.
I'll send an email to show examples based on use cases.
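
To make the header idea a little more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not part of the proposal in [2]: the header format, the scheme name "shared-vocabulary" and the function reasoning_units are all hypothetical. The one behaviour taken from the text above is the fallback: with no header (or an unrecognised scheme), each labelled graph is kept apart and interpreted on its own.

```python
# Hypothetical scheme names; nothing here is defined by any RDF spec.
LIBERAL_SCHEME = "liberal"            # each named graph interpreted separately
SHARED_SCHEME = "shared-vocabulary"   # assumed scheme: IRIs mean the same everywhere


def reasoning_units(named_graphs, header=None):
    """Return the list of triple sets that may each be reasoned over as one graph.

    named_graphs: dict mapping a graph label to a set of (s, p, o) string triples.
    header: optional dict such as {"scheme": "shared-vocabulary"}.
    """
    scheme = (header or {}).get("scheme", LIBERAL_SCHEME)
    if scheme == SHARED_SCHEME:
        # Merge all graphs, standardising blank nodes apart per source graph.
        merged = set()
        for label, triples in named_graphs.items():
            for triple in triples:
                merged.add(tuple(
                    f"{term}@{label}" if term.startswith("_:") else term
                    for term in triple
                ))
        return [merged]
    # No header, or an unknown scheme: fall back to the most liberal reading
    # and keep differently-labelled graphs apart.
    return [set(triples) for triples in named_graphs.values()]


dataset = {
    ":g1": {(":s", ":p", "_:b")},
    ":g2": {(":s", ":q", "_:b")},
}
print(len(reasoning_units(dataset)))                             # graphs kept apart
print(len(reasoning_units(dataset, {"scheme": SHARED_SCHEME})))  # graphs merged
```

The point of the sketch is only that the receiver never has to guess: the liberal reading is the default, and anything stronger has to be declared by whoever ships the dataset.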


>       -- Sandro
> On Tue, 2012-02-28 at 18:27 +0100, Antoine Zimmermann wrote:
>> On 27/02/2012 22:42, Pat Hayes wrote:
>>> On Feb 27, 2012, at 9:52 AM, Andy Seaborne wrote:
>>>> In the telecon, I mentioned the idea of "islands".  This is not a
>>>> technical design - it's a way of thinking about the theory and
>>>> practice of graphs on the web.
>>>> An island is a collection of graphs where all the RDF semantics
>>>> (specifically for merge and for entailment relationships) work out
>>>> as defined in the RDF 2004 specs.
>>>> That requires, for example, that the application trusts the
>>>> information in all the graphs it's working with.
>>> No. It does not require that. There are two distinct issues here:
>>> what the truth conditions on a graph are, and whether or not you
>>> should trust the RDF (or more correctly, whether or not you should
>>> trust whoever is publishing it and claiming it to be true.) The RDF
>>> semantic specs address the first of these but say nothing at all
>>> about the second, other than that when you do accept some RDF, you
>>> are kind of obliged to also accept its valid consequences (so
>>> checking those is one way to determine, in fact, whether or not you
>>> should trust the RDF.)
>> I agree. Trust is an orthogonal issue.
>>>> In practice, not all data is perfect.  An application will assemble
>>>> a set of graphs it is going to work with - that may be some mixture
>>>> of reading a number of places on the web, picking graphs out of a
>>>> local graph store, and creating its own data.  (from Yvres) RDF
>>>> data about the Dr Who universe [1] is perfectly reasonable when
>>>> working within that universe, but may be a bit suspect when
>>>> considered in the real world.
>>> Quite. And it would be great if we had a way to publish RDF 'in a
>>> context' which made such relationships clearer. But this is an
>>> aside.
>> Ok.
>>>> The criterion is more "fit for purpose" - an application is going
>>>> through two steps, one to collect the graphs it wants to work
>>>> with together, the second to actually work with those graphs.
>>>> Islands aren't an absolute viewpoint and data may become
>>>> available, or an application may determine it trusts some new data,
>>>> or even a new island, and, for its purpose, links them together.
>>>> Another application, with different goals, may take a different
>>>> view as to whether two graphs can be considered to be compatible
>>>> (an application specific term).  Foaf files declaring people's
>>>> names may be good enough for a social network application, but not
>>>> good enough for legal purposes.
>>>> For our named graphs discussions, the key technical requirement is
>>>> to not combine data which shouldn't be.
>>> OK for that (who can disagree?) but...
>>>> Keeping data apart by default
>>> ... not with that. That seems ridiculously strong.
>> I don't think so. Defaults don't prevent doing otherwise. If you have a
>> relational schema which states that column xyz is NULL by default, it
>> does not forbid anyone from putting data in that field.
>> Similarly, no one is forbidding anybody from using the merge operation,
>> just because we have Datasets.
>>>> and letting the application decide when to allow it to merge or
>>>> entail.
>>>> [2] does that.
>>> No, it does something even stronger. What [2] says is that *the same*
>>> URI when used in one graph can mean something completely different
>>> when used in another graph, and that *this is perfectly correct* and
>>> even in fact *consistent*.
>> Yes, and from my point of view, it is fine. Obviously we can discuss it,
>> but please base the discussion on technical advantages or
>> drawbacks, not on philosophical considerations. If it matches the use
>> cases, then the group will have to admit it is pertinent. Certainly, it
>> won't match the use cases perfectly, but what alternative do we have?
>> You began to propose something, please formalise it completely on the
>> wiki and we'll be able to decide what to do with a well informed eye.
>> We (or at least I) are willing to consider other proposals as well.
>>> What this means is that every URI in every
>>> graph is interpreted locally to that graph, which in effect makes
>>> every URI into a blank node (since this is how blank nodes are
>>> interpreted.) This is dissolving the entire Web in a kind of
>>> universal solvent.
>> No, URIs are not interpreted as existentials, their interpretation is
>> simply parameterized. See the difference:
>>    :g1 { :s :p _:b }
>> entails
>>    :g1 { :s :p _:c }
>> but
>>    :g1 { :s :p :b }
>> does not entail
>>    :g1 { :s :p :c }
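
The difference between the two cases above can be checked mechanically. The following Python sketch is an illustration only: it tests simple entailment between two graphs carrying the same label, using the interpolation lemma of RDF 2004 (G simply entails H if some mapping of H's blank nodes to terms of G turns H into a subgraph of G). The string encoding of terms and the names is_bnode and simple_entails are assumptions made for this example, and it does not implement the dataset semantics of [2].

```python
def is_bnode(term):
    """Blank nodes are written "_:x" in this sketch; everything else is a name."""
    return term.startswith("_:")


def simple_entails(g, h):
    """Return True if graph g simply entails graph h.

    Graphs are sets of (s, p, o) string triples. The search tries to map each
    blank node of h to a term of g so that every triple of h lands in g.
    IRIs in h must match g exactly; only blank nodes act as existentials.
    """
    g = set(g)
    h = list(h)

    def search(index, binding):
        if index == len(h):
            return True
        for candidate in g:
            extended = dict(binding)
            matches = True
            for h_term, g_term in zip(h[index], candidate):
                if is_bnode(h_term):
                    # A blank node must be mapped consistently across triples.
                    if extended.setdefault(h_term, g_term) != g_term:
                        matches = False
                        break
                elif h_term != g_term:
                    matches = False
                    break
            if matches and search(index + 1, extended):
                return True
        return False

    return search(0, {})


# Contents of :g1 in the two cases discussed above.
with_bnode = {(":s", ":p", "_:b")}
with_iri = {(":s", ":p", ":b")}

print(simple_entails(with_bnode, {(":s", ":p", "_:c")}))  # True
print(simple_entails(with_iri, {(":s", ":p", ":c")}))     # False
```

Renaming a blank node preserves entailment, whereas replacing one IRI by another does not, even though the interpretation of IRIs is parameterized by the graph.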
>>>> Within one trig files, all the triples with the same 4th slot are
>>>> in the same graph, and being one graph, all RDF semantics must be
>>>> valid.
>>> The RDF semantics does not refer to graphs, but to vocabularies. An
>>> interpretation is a mapping FROM A VOCABULARY to a universe. Graphs
>>> are mentioned only as conjunctions of triples.
>> The RDF semantics does refer to graphs.
>> Especially, if E is an RDF graph and I is an RDF-interpretation, RDF
>> 2004 defines I(E), which is commonly named "the interpretation of E":
>> see Section 1.4 (http://www.w3.org/TR/rdf-mt/#gddenot).
>>> The 2004 semantics
>>> does not allow a given triple to mean different things depending upon
>>> which graph it occurs in.
>> Really? Can you show where it says so precisely?
>>>> Triples with different 4th slot may or may not be combinable.  The
>>>> basic machinery does not decide - it just means that two triples with
>>>> two different 4th slots have no defined relationship.
>>> Even if they are, for example, the same triple. Really, is this what
>>> you want? Because we might as well just declare that RDF has no
>>> semantics at all, seems to me. It no longer serves any purpose.
>> According to the RDF semantics
>> :s owl:differentFrom :s .
>> is consistent. Is it really what people want? Simple entailment is very
>> weak and makes *everything* consistent, but it is standard. Its weakness
>> allows one to define constraints on vocabularies on top of it in various
>> ways, where you can detect inconsistencies.
>> The semantics in [2] allows everything to be consistent, provided
>> that each "named" graph inside datasets is itself consistent. That is
>> not a problem, as extensions can be defined that specify how knowledge
>> from one graph influences knowledge from another graph (e.g., we could
>> have things like rdf:imports, or whatever).
>>>> The use of a URI for a graph label in two different trig documents
>>>> should mean the same thing but combining two datasets, like
>>>> combining two graphs, will involve an application deciding that it
>>>> can be done.
>>> But how will it? ANY two graphs are semantically consistent, on this
>>> account, and two graphs (with different labels) NEVER entail any
>>> graph larger than either of them (such as their merge, for example),
>>> according to the semantics in [2].
>> If you refer to what graphs entail, you are certainly talking about the
>> RDF semantics, which is completely defined in RDF 2004. The proposal in
>> [2] does not say anything about what graphs entail. It talks about what
>> datasets entail and mean. And it is not true that any two datasets are
>> always mutually consistent.
>>> So all semantic relationships are
>>> reduced to triviality, so there can be no criteria available to check
>>> for acceptability on any semantic grounds.
>> Semantic relationships are as complex in datasets as they are in the
>> logic used for individual graphs. If you want to simulate RDF reasoning
>> with datasets, you simply deal with datasets that have empty "named"
>> graphs and a default graph.
>>> Remember, *every* URI
>>> might mean something completely different in another graph, so you
>>> can't say things like one graph says that x:joe is age 10 and the
>>> other says he is age 12: that URI might refer to Joe in one graph and
>>> Susan in the other, and the URI for the age property might mean age
>>> in one graph and being-a-handle-of in the other. Graphs become black
>>> holes of meaning, without any way for anything inside to influence or
>>> connect with anything outside.
>> RDF does not offer any means to define unambiguously what a URI denotes.
>> It's all a question of personal interpretation. I provided an example a
>> while ago using your personal web page's URL, and you said "this clearly
>> denotes me", then I said, maybe it denotes your web page, and you said
>> "oh, yes, of course, it denotes my web page". But frankly, there is
>> absolutely nothing in the RDF spec that can tell you for sure whether
>> it's your web page or yourself or anything else. And this is true for
>> *any* URI. So, the fact that they can be interpreted differently in
>> different graphs seems to me quite natural.
>> AZ
>>>> Islands aren't named or formally recognized - and one app's view of
>>>> "usable together" may not be the same as another app's.
>>> Oh what a tangled Web we weave.... (Sorry, couldn't resist :-)
>>> Pat
>>>> Andy
>>>> [1] http://www.bbc.co.uk/doctorwho/dw
>>>> [2] http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/RDF-Datasets-Proposal
>>> ------------------------------------------------------------
>>> IHMC                      (850)434 8903 or (650)494 3973
>>> 40 South Alcaniz St.      (850)202 4416   office
>>> Pensacola                 (850)202 4440   fax
>>> FL 32502                  (850)291 0667   mobile
>>> phayesAT-SIGNihmc.us      http://www.ihmc.us/users/phayes

Antoine Zimmermann
ISCOD / LSTI - Institut Henri Fayol
École Nationale Supérieure des Mines de Saint-Étienne
158 cours Fauriel
42023 Saint-Étienne Cedex 2
Tél:+33(0)4 77 42 83 36
Fax:+33(0)4 77 42 66 66
Received on Wednesday, 29 February 2012 09:38:50 UTC
