Re: Problem with auto-generated fragment IDs for graph names

From: Pat Hayes <phayes@ihmc.us>
Date: Fri, 15 Feb 2013 10:59:53 -0600
Cc: Manu Sporny <msporny@digitalbazaar.com>, RDF WG <public-rdf-wg@w3.org>, Linked JSON <public-linked-json@w3.org>
Message-Id: <576C2092-C605-4756-BD83-E54BF519F120@ihmc.us>
To: Eric Prud'hommeaux <eric@w3.org>

On Feb 14, 2013, at 4:36 PM, Eric Prud'hommeaux wrote:

> * Pat Hayes <phayes@ihmc.us> [2013-02-14 11:13-0600]
>> 
>> On Feb 14, 2013, at 8:02 AM, Eric Prud'hommeaux wrote:
>> 
>>> * Pat Hayes <phayes@ihmc.us> [2013-02-13 23:16-0600]
>>>> Manu, let me try to put the other case, in terms that approximate your self-confidence that you must be right. Obviously I am speaking here as an individual, not on behalf of the WG.
>>>> 
>>>> On Feb 13, 2013, at 9:24 PM, Manu Sporny wrote:
>>>> 
>>>>> On 02/13/2013 05:11 PM, Richard Cyganiak wrote:
>>>>>> PROPOSAL: Put @id on all graphs.
>>>>>> 
>>>>>> Why the aversion against simple and obvious solutions?
>>>>> 
>>>>> The simple and obvious solution you propose is wrong for developers.
>>>> 
>>>> For all developers? That seems like a rather strong claim. 
>>>> 
>>>>> 
>>>>> It attempts to side-step an arbitrary constraint imposed on developers
>>>>> by RDF Concepts by making developers' lives harder. Worse, it ignores the
>>>>> reality of transient messages, including transient RDF Datasets that
>>>>> must be identified with document-local identifiers if the digital
>>>>> signatures are going to work out.
>>>> 
>>>> Well, this is the first time I have heard of "transient RDF". RDF, as far as I have always understood, was never intended to be transient. It is intended for publishing data on the Web. So it sounds as though you are simply using it for a purpose for which it was not designed, and never intended to be used. Perhaps your problems may arise from this mismatch between the intentions of the designers and your planned use.
>>> 
>>> I'd characterize this more as "quoting RDF", which we've been wrestling with since the beginning.
>> 
>> We have? News to me. It has come up *very* occasionally, but nobody has argued for it very strongly in any WG activity. And in spite of TimBL's early interest in it, I have never seen anyone cite an actual use case. It would break (or seriously complicate) SPARQL.
> 
> If I want to know who says the moon is made of what, I can supply
> data:
>  @prefix : <x:/>.
>  { _:doc1 :author "Bob" }        # default graph
>  _:doc1 { :TheMoon :madeOf :greenCheese }
> query:
>  PREFIX : <x:/>
>  SELECT ?who ?what {
>    ?doc :author ?who
>    GRAPH ?doc { :TheMoon :madeOf ?what }
>  }
> results:
>  ┌───────┬──────────────────┐
>  │ ?who  │ ?what            │
>  │ "Bob" │ <x:/greenCheese> │
>  └───────┴──────────────────┘
> 
> The system that did this passes all of the SPARQL CR tests.

OK, I am impressed. I wasn't aware that SPARQL allowed variables in graph name position.
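
For anyone following along, the join that query performs can be sketched in a few lines of Python. This is only an illustration, assuming a dataset is modeled as a default graph plus a mapping from graph label to graph, with terms as plain strings. The names `default_graph`, `named_graphs` and `who_says_what` are made up for this sketch and are not part of SPARQL or of any library.

```python
# Terms are plain strings; "_:doc1" stands for a blank node label.
default_graph = {("_:doc1", ":author", '"Bob"')}
named_graphs = {
    "_:doc1": {(":TheMoon", ":madeOf", ":greenCheese")},
}


def who_says_what(default_graph, named_graphs):
    """Mimic: SELECT ?who ?what { ?doc :author ?who
                                  GRAPH ?doc { :TheMoon :madeOf ?what } }"""
    results = []
    for doc, predicate, who in default_graph:
        if predicate != ":author":
            continue
        # GRAPH ?doc: ?doc is already bound, so look up the graph
        # carrying that label (hypothetical dict-based lookup).
        for s, p, what in named_graphs.get(doc, set()):
            if s == ":TheMoon" and p == ":madeOf":
                results.append((who, what))
    return results


print(who_says_what(default_graph, named_graphs))
```

The point is just that the variable in graph name position is joined against the same term used as a subject in the default graph.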

But let me ask you about this example. You are assuming here that the _:doc1 in the triple in the default graph, and the _:doc1 used as a graph label, refer to the same thing, which is the moon-green-cheese graph, right? What is interesting here is that this assumption seems inevitable when we have a bnode involved, as here, but (the WG has decided) it cannot be assumed when an IRI is used. So this data:

{ex:doc1 :author "Bob" }
ex:doc1 {:TheMoon :madeOf :greenCheese }

does *not* entail that Bob is the author of the graph (since 'ex:doc1' might denote something else, in which case the default graph would be about that other thing and not about the graph). So this actually gives us a new, Manu-independent, reason to allow bnodes as graph labels in datasets: they provide exactly the missing expressivity that is needed to have the default graph act as genuine metadata.
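
A rough Python sketch of that distinction, assuming the reading described above: a blank node label is taken to denote the graph it labels, while an IRI label is not. The function name `default_graph_metadata` and the string encoding of terms are invented for illustration, and none of this reflects what any spec currently says.

```python
def default_graph_metadata(default_graph, named_graphs):
    """Return default-graph triples that may be read as being about a labeled graph.

    Assumption for this sketch: only blank node labels ("_:" prefix) are
    taken to denote their graph. An IRI label might denote something else.
    """
    return {
        (s, p, o)
        for (s, p, o) in default_graph
        if s.startswith("_:") and s in named_graphs
    }


moon = {(":TheMoon", ":madeOf", ":greenCheese")}

# Bnode label: the :author triple is treated as metadata about the graph.
bnode_case = default_graph_metadata(
    {("_:doc1", ":author", '"Bob"')}, {"_:doc1": moon}
)

# IRI label: nothing is concluded about the graph.
iri_case = default_graph_metadata(
    {("ex:doc1", ":author", '"Bob"')}, {"ex:doc1": moon}
)

print(bnode_case)
print(iri_case)
```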

Hmm, I am now feeling like we should re-think our decision here. David, Guus, are you following this? Do I hear a groaning noise yet?

>> And in any case, this isn't what Manu is talking about, as far as I can see. He hasn't mentioned quoting, and it all seems to be about transience and digital signing. 
> 
> He has to construct a graph, canonicalize it, and hash it. By "quoting", I mean the construction of a graph for the purposes of discussion. Perhaps a term closer to KR would be "reification".
> 
> 
>>> I'm motivated to fix this not because of an interest in JSON-LD or Web Payments, but because quoting is a universal need:
>>> Bob says "the moon is made of green cheese".
>> 
>> It's more complicated than it seems. Do you want that quotation to be de dicto or de re? Does this quotation permit OWL equality reasoning, or is it referentially opaque? You are not allowed to say that you don't care, because the spec has to choose one way or the other. If you want both, you probably have to have two kinds of quotation. Reification is defined (non-normatively) to be de re, allowing equality reasoning, so it's not really traditional quotation. (It's more like, Bob says *that* the moon is made of green cheese, without quote marks.)
>> 
>>> In the old days, the party line was that one uses reification for signing:
>>> _:statement1 dc:author "Bob" ;
>>>              rdf:subject :TheMoon ;
>>>              rdf:predicate :madeOf ;
>>>              rdf:object :greenCheese .
>>> 
>>> The analog in named graphs would be a bnode-labeled graph:
>> 
>> Only if you used a bnode in the reification, but why would you have done that? A reification with a bnode subject says that the described graph exists, that is all. It doesn't *identify* it, and it doesn't say anything about any actual graph in a document somewhere.
> 
> I think that's exactly the desired effect.

Yes, I have (finally) understood that point now. 

> Under graph entailment (probably 95% of the SPARQL-using world), it has the same meaning as if one used an IRI. Under strict RDF entailment, which I've only rarely seen used in CWM, I'd think I'd be entitled to reduce
>  _:statement1 dc:author "Bob" ; rdf:subject :TheMoon ; rdf:predicate :madeOf ; rdf:object :greenCheese .
>  _:statement2 dc:author "Bob" ; rdf:subject :TheMoon ; rdf:predicate :madeOf ; rdf:object :greenCheese ; dc:date "2013-02-13" .
> to assertions to a single bnode. I could do the same under OWL, but then OWL wouldn't even treat <statement1> and <statement2> as distinct without a differentFrom assertion.
> 
> So why would it be odd to use a bnode for an rdf:Statement?
> 
> 
>>> _:statement1 dc:author "Bob" . _:statement1 { :TheMoon :madeOf :greenCheese } .
>>> 
>>> Except we've recently decided not to allow bnodes as graph labels, so:
>>> <statement1> dc:author "Bob" . <statement1> { :TheMoon :madeOf :greenCheese } .
>>> 
>>> 
>>> Normally, we shake a finger at someone who invents URLs that they don't intend to honor.
>> 
>> We are honoring it, it's being treated as a graph name. That's what graph names DO, they name graphs.
> 
> Can I dereference it?

No, but then you never could dereference a graph name (and get the graph, that is). An IRI used as a graph name actually has three distinct ways it can be thought of as referring to something. It can dereference to something (probably via http), or it can denote something (when used in an RDF triple), or it can label the graph (in a dataset). And it can do all of these at the same time, and all these can be different.

> If I see it uttered on two different transactions, can I confidently unify them?

Nope, but IRIs have been in this state ever since the dawn of RDF (in spite of httpRange-14).

>>> Why is this case different?
>> 
>> Why do you think it is different? 
>> 
>> But OK, aside from scoring debate points, I will admit that using bnodes as graph labels does make semantic sense, if this is what they are supposed to mean. Is this what Manu wants them to mean? That is, a bnode used as a graph label means that this same bnode used inside some RDF (presumably in the default graph of the dataset?) must refer to that labeled graph? I would be cool with this if we could make that a genuine semantic constraint on datasets. It amounts to treating the labelling pairing as an equation: <name> = [ <graph> ], which makes very good sense, but clashes with some of the other decisions we have taken (or carefully refused to take) about the meaning of graph labelling, since it means that graph labels must actually refer to their graphs. BTW, we would also have to make it clear exactly what sense of 'graph' is being referred to. I will hazard a guess that what Manu wants is for the bnode to identify a graph document rather than an actual graph (e.g. any bnodeIDs used inside it must not be allowed to change, or else the digital signature will get screwed up?)
> 
> I have the impression that our current weasel wording delivers us handily where you need not dereference a graph label and get that graph. I don't think that blank node labels make that harder.

True. I guess they make it kind of sharply obvious that if you don't have the dataset to hand, you are never going to find out what graph the IRI is supposed to be a graph label of. 

Pat


> From an implementer's perspective, the impact was limited to the TriG parser. SPARQL already has syntax which could construct datasets with bnode labels:
> 
>  PREFIX : <x:/>
>  INSERT { GRAPH ?g { :s :p :o } }
>   WHERE { BIND (BNODE("a") AS ?g) }
> The syntax constraints do keep you from creating a bnode graph directly, e.g.
>  PREFIX : <x:/>
>  INSERT { GRAPH _:a { :s :p :o } } # error, bnode not allowed as graph label.
>   WHERE {  }
> 
> 
> Incorporated above:
> * Pat Hayes <phayes@ihmc.us> [2013-02-14 11:36-0600]
>> 
>> On Feb 14, 2013, at 11:13 AM, Pat Hayes wrote:
>> 
>>> ... It amounts to treating the labelling pairing as an equation: <name> = <graph>,
>> 
>> actually <name> = [ <graph> ] 
>> 
>> where [   ] indicates some form of quotation, which makes better sense.
> 
> -- 
> -ericP
> 
> 

------------------------------------------------------------
IHMC                                     (850)434 8903 or (650)494 3973   
40 South Alcaniz St.           (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Friday, 15 February 2013 17:00:33 GMT