Re: Are you planning to use the Dataset Semantics?

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Wed, 26 Sep 2012 15:51:00 -0400
Message-ID: <50635CA4.6050602@openlinksw.com>
To: public-rdf-wg@w3.org

On 9/26/12 3:19 PM, Pat Hayes wrote:
> On Sep 26, 2012, at 6:43 AM, Guus Schreiber wrote:
>
>>
>> On 26-09-2012 08:07, Pat Hayes wrote:
>>> On Sep 26, 2012, at 12:55 AM, Antoine Zimmermann wrote:
>>>
>>>> Le 26/09/2012 03:54, Pat Hayes a écrit :
>>>>> On Sep 25, 2012, at 5:16 PM, Sandro Hawke wrote:
>>>>>
>>>>>> For myself, at this point I'm 70% convinced that I can implement
>>>>>> all the dataset use cases I understand (the ones I enumerated in
>>>>>> the Federated Phonebook examples, plus SPARQL dump/restore) without
>>>>>> any standard dataset semantics beyond having a standard place for
>>>>>> metadata (eg the default graph in trig and the service description
>>>>>> graph in SPARQL).
>>>>> Sandro, how can you use metadata *at all* without some way to force a
>>>>> URI to denote a graph? When you use the URI in the metadata RDF, what
>>>>> (semantic or even pragmatic) constraint ensures that what it denotes
>>>>> there is the graph that you have in mind? Or indeed, that it is a
>>>>> graph at all?
>>>> Most metadata in the world are provided without any formal semantics that enforces denotation. You know it's metadata about something because it follows the specs.
>>> Most metadata (in fact, most data) is provided in a context where no semantic claims are made. The semantic web claims to be more than this, and to have a globally coherent, formal, semantics. Without that, the semantic web is just a random muddle of local ad-hoc conventions which may or may not be compatible with one another, just like the world had always been.
>>>
>>> We can give up on this global vision, and rejoice in the fact that what we are doing is pretty much just continuing the global mess that existed before the Web was even invented, in which case I see no point in even continuing to argue about semantics at all; or, we can take the semantics seriously, and try to make it fit what people want to use the formalisms for.
>>>
>>> And just to rub the point home, the RDF specs *do* talk about semantics, so "following from the specs" might well be understood to mean "following from the semantics".
>> Pat,
>>
>> Just trying to establish where this puts us.
>>
>> So,
>>   if (1) we can't force the graph name to denote a graph container, and
>>   if (2) we can't define useful semantics without stating in which cases it does
>>   then we need some (standardized) way to do so (e.g. some kind of "<n> a Graph" triple).
> Right, that is my take.  Sandro's  "<n> a Graph"  trick is the simplest idea I have seen so far, but notice that what it does is to state that the name <n> actually does directly refer to the graph. This allows other names to be indirect names of graphs, but it still doesn't allow the metadata to use an indirect graph name (that is, a name which denotes something other than the graph) to refer to the graph. So Sandroean metadata is still blocked in those cases, I think.
>
> Let me try to sum up what I feel to be the basic issue here. We keep dancing around this but failing to grasp the nettle, which is that SPARQL dataset usage seems to have forced us into a situation where RDF is required to treat URIs as being ambiguous. There is no coherent way out of this situation other than either by rejecting this usage, or by modifying RDF so that it can disambiguate ambiguous URIs. So far we have always finished up deciding that we are unwilling to take either step, so we are still left with this unsolved problem. The 'minimal semantics' currently on the table does not deal with this issue.
>
> -----------
>
> The basic tension we have is between two (maybe three, but let's leave http aside for now) different notions of what it means for a URI to be a 'name'. The semantic idea is that naming is denotation, and RDF content is normatively required to respect this. Existing dataset use-pragmatics, however, often distinguishes graph/name pairing in a dataset from denotation, which makes everything more complicated, as we now have two different notions of what a URI might be understood to refer to, so URIs have become systematically ambiguous. We need some way to resolve this ambiguity: when I see a URI in some RDF, and that URI denotes (say) a time-interval directly, but is linked to a graph in a dataset, is this occurrence of the URI supposed to refer to the time or to the graph? How can I tell? There are several possible ways we might answer this, and it seems to me that we still have not decided on an answer to this question, which is the most fundamental of them all.
>
> 1. If the containing RDF is labelled "metadata" in some way, or is located in a special place reserved for metadata (eg the default graph?), then the URI refers to the graph; otherwise it refers to the time. This could be done purely pragmatically (Sandro's recent emails) or more formally, eg based on the graph-extension construction in the "minimal" semantics.
>
> 2. If the triple in which the URI occurs has a property which is from a "metadata" namespace, then it refers to the graph, otherwise it refers to the time.
>
> (A potential problem with both of these is that the disambiguation applies to a whole RDF graph, so it rules out using the same URI to refer to the graph in some triples but to the time (or whatever) in other triples in the same graph. I can see why this might be an issue.)
>
> 3. If the URI is asserted to be a graph, then it refers to the graph and only the graph, i.e. it is not ambiguous. (Sandro's "<n> a Graph".) This does not really disambiguate, but it allows one to assert that a given use is unambiguous.
>
> 4. If the URI occurs in a context which defines graph URIs as denoting the graphs they "name", then it refers to the graph, but in other contexts it refers to the time. (This requires extending RDF with contexts which can affect the denotations of URIs, which is easy to do, let me emphasize.)
>
> 5. We simply reject the ambiguity, say that the naming URI must always refer to the graph (so metatheoretic RDF is just vanilla RDF) and introduce some other convention to handle the ambiguity-creating pragmatic cases, such as having an extra triple to relate the graph to the time-interval or other 'context'. For example
>
> :graph22 {  < > rdf:trueIn :interval26.  #some triples }
>
> or
>
> {... :graph22 rdf:trueIn :interval26 ....}
> :graph22 { #some triples }
>
> rather than
>
> :interval26 { #some triples }
>
> This has the problem that it deprecates some current use styles, but it has the advantage that it means that datastore syntax will provide a way to actually name a graph, which is likely to be of wider utility (for example, this is what the Provenance WG seem to need).
>
> The meaning of rdf:trueIn will need some extensions to the RDF semantics, but nothing too serious. Basically it says that the object of the property is an extra parameter for all triples in any graph denoted by the subject. I would be happy to revise the 2004 semantics to handle this. Or we could just punt on this and leave all notions of "context" outside the model theory. We should provide an RDF vocabulary term for this, though.
>
> Pat

Yes!

An rdf:trueIn predicate (or similar) with clearly defined semantics is
what we seek. Such a predicate would enable us to drop our current use of
pragmas (in our SPARQL implementation) for reasoner invocation. I
provided an example of how we handle this situation currently in
response to Antoine's question in an earlier thread. Such a predicate
enables our query processor to make decisions about when to reason, on
what basis, and through which "context lenses", etc.
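
As a rough illustration only, and not a description of any existing
implementation, the reading Pat gives for rdf:trueIn (the object is an
extra parameter for every triple in the graph denoted by the subject)
can be sketched in Python. Everything here is an assumption made for
the example: rdf:trueIn is only the predicate proposed in this thread,
and :alice, :bob, :worksFor, :acme and :graph23 are made-up names. The
dataset is modeled as plain sets of tuples rather than with any RDF
library.

```python
# Hypothetical sketch: rdf:trueIn is the predicate proposed in this thread,
# not a term in any published RDF vocabulary.
TRUE_IN = "rdf:trueIn"

# The default graph carries the metadata; named graphs are keyed by the
# URI that names (and, under option 5, denotes) them.
dataset = {
    "default": {(":graph22", TRUE_IN, ":interval26")},
    "named": {
        ":graph22": {(":alice", ":worksFor", ":acme")},
        ":graph23": {(":bob", ":worksFor", ":acme")},
    },
}


def contexts_of(dataset, graph_name):
    """Return every context the metadata attaches to graph_name."""
    return {
        o
        for (s, p, o) in dataset["default"]
        if s == graph_name and p == TRUE_IN
    }


def contextualized_triples(dataset):
    """Yield (s, p, o, context) for each triple in each named graph.

    A graph with no rdf:trueIn statement yields context None, meaning
    that no context was asserted for it.
    """
    for name, triples in dataset["named"].items():
        contexts = contexts_of(dataset, name) or {None}
        for (s, p, o) in sorted(triples):
            for ctx in sorted(contexts, key=str):
                yield (s, p, o, ctx)


if __name__ == "__main__":
    for quad in contextualized_triples(dataset):
        print(quad)
```

A query processor could consult the fourth element to decide whether a
triple applies in the context it is reasoning about, instead of relying
on an out-of-band pragma. The graph name itself keeps denoting the
graph throughout.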

Kingsley
>
>
>> Correct?
>> Guus
>>
>>
>>
>>> Pat
>>>
>>>>
>>>>> Pat
>>>>>
>>>>>> -- Sandro
>>>>>>
>>>>>> [1] http://www.w3.org/2005/10/Process-20051014/tr#cfr
>>>>>>
>>>>>>
>>>>> ------------------------------------------------------------
>>>>> IHMC                                     (850)434 8903 or (650)494 3973
>>>>> 40 South Alcaniz St.           (850)202 4416   office
>>>>> Pensacola                            (850)202 4440   fax
>>>>> FL 32502                              (850)291 0667   mobile
>>>>> phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
>>>>>
>>>>
>>>> --
>>>> Antoine Zimmermann
>>>> ISCOD / LSTI - Institut Henri Fayol
>>>> École Nationale Supérieure des Mines de Saint-Étienne
>>>> 158 cours Fauriel
>>>> 42023 Saint-Étienne Cedex 2
>>>> France
>>>> Tél:+33(0)4 77 42 83 36
>>>> Fax:+33(0)4 77 42 66 66
>>>> http://zimmer.aprilfoolsreview.com/
>>>>
>>>>
>>


-- 

Regards,

Kingsley Idehen	
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen

Received on Wednesday, 26 September 2012 19:51:27 GMT
