Re: Problem with auto-generated fragment IDs for graph names

* Pat Hayes <phayes@ihmc.us> [2013-02-14 11:13-0600]
> 
> On Feb 14, 2013, at 8:02 AM, Eric Prud'hommeaux wrote:
> 
> > * Pat Hayes <phayes@ihmc.us> [2013-02-13 23:16-0600]
> >> Manu, let me try to put the other case, in terms that approximate your self-confidence that you must be right. Obviously I am speaking here as an individual, not on behalf of the WG.
> >> 
> >> On Feb 13, 2013, at 9:24 PM, Manu Sporny wrote:
> >> 
> >>> On 02/13/2013 05:11 PM, Richard Cyganiak wrote:
> >>>> PROPOSAL: Put @id on all graphs.
> >>>> 
> >>>> Why the aversion against simple and obvious solutions?
> >>> 
> >>> The simple and obvious solution you propose is wrong for developers.
> >> 
> >> For all developers? That seems like a rather strong claim. 
> >> 
> >>> 
> >>> It attempts to side-step an arbitrary constraint imposed on developers
> >>> by RDF Concepts by making developers lives harder. Worse, it ignores the
> >>> reality of transient messages, including transient RDF Datasets that
> >>> must be identified with document-local identifiers if the digital
> >>> signatures are going to work out.
> >> 
> >> Well, this is the first time I have heard of "transient RDF". RDF, as far as I have always understood, was never intended to be transient. It is intended for publishing data on the Web. So it sounds as though you are simply using it for a purpose for which it was not designed, and never intended to be used. Perhaps your problems may arise from this mismatch between the intentions of the designers and your planned use.
> > 
> > I'd characterize this more as "quoting RDF", which we've been wrestling with since the beginning.
> 
> We have? News to me. It has come up *very* occasionally, but nobody has argued for it very strongly in any WG activity. And in spite of TimBL's early interest in it, I have never seen anyone cite an actual use case. It would break (or seriously complicate) SPARQL.

If I want to know who says the moon is made of what, I can supply data and run a query over it:
data:
  @prefix : <x:/>.
  { _:doc1 :author "Bob" }        # default graph
  _:doc1 { :TheMoon :madeOf :greenCheese }
query:
  PREFIX : <x:/>
  SELECT ?who ?what {
    ?doc :author ?who
    GRAPH ?doc { :TheMoon :madeOf ?what }
  }
results:
  ┌───────┬──────────────────┐
  │ ?who  │ ?what            │
  │ "Bob" │ <x:/greenCheese> │
  └───────┴──────────────────┘

The system that did this passes all of the SPARQL CR tests.
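
To make the join explicit, here is a minimal Python sketch of what that query does over that data. It is not the system mentioned above; the tuple encoding, the "_:" prefix convention for bnodes, and the function name who_says_what are assumptions made purely for illustration. The point is that the bnode in the default graph and the bnode used as the graph label are treated as the same node.

```python
# Hypothetical in-memory dataset mirroring the data above.
# Blank nodes are modelled as strings starting with "_:" (an assumption
# of this sketch, not any particular store's representation).
default_graph = {("_:doc1", "x:/author", '"Bob"')}
named_graphs = {
    "_:doc1": {("x:/TheMoon", "x:/madeOf", "x:/greenCheese")},
}


def who_says_what(default_graph, named_graphs):
    """Join  ?doc :author ?who  with  GRAPH ?doc { :TheMoon :madeOf ?what }."""
    results = []
    for doc, pred, who in sorted(default_graph):
        if pred != "x:/author":
            continue
        # The graph label is looked up with the same term bound to ?doc.
        for s, p, what in sorted(named_graphs.get(doc, ())):
            if s == "x:/TheMoon" and p == "x:/madeOf":
                results.append((who, what))
    return results


rows = who_says_what(default_graph, named_graphs)
print(rows)
```

If the graph label were a different term from the subject of the :author triple, the lookup would find nothing and the result would be empty.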


> And in any case, this isn't what Manu is talking about, as far as I can see. He hasn't mentioned quoting, and it all seems to be about transience and digital signing. 

He has to construct a graph, canonicalize it, and hash it. By "quoting", I mean the construction of a graph for the purposes of discussion. Perhaps a term closer to KR would be "reification".
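
As a rough sketch of "construct, canonicalize, hash" (not Manu's actual algorithm, which I am not reproducing here), one can serialize each triple to a line, sort the lines, and hash the result. The function names and the :orbits triple are hypothetical, and this naive approach only works for graphs with no blank nodes inside them; canonical labelling of bnodes is the hard part and is deliberately left out.

```python
import hashlib


def canonicalize(triples):
    """Naive canonical form: one N-Triples-like line per triple, sorted.

    Assumption: every term is an IRI and the graph is ground (no bnodes),
    so sorting alone gives a serialization independent of input order.
    """
    lines = sorted(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)
    return "\n".join(lines) + "\n"


def graph_digest(triples):
    """SHA-256 over the canonical serialization, as a hex string."""
    return hashlib.sha256(canonicalize(triples).encode("utf-8")).hexdigest()


graph_a = [
    ("x:/TheMoon", "x:/madeOf", "x:/greenCheese"),
    ("x:/TheMoon", "x:/orbits", "x:/Earth"),  # hypothetical extra triple
]
graph_b = list(reversed(graph_a))  # same triples, different order

digest_a = graph_digest(graph_a)
digest_b = graph_digest(graph_b)
print(digest_a == digest_b)
```

The two digests agree because the serialization does not depend on the order in which the triples were supplied. Whatever is signed has to be pinned down in some such way, which is why the identity of the graph being discussed matters.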


> > I'm motivated to fix this not because of an interest in JSON-LD or Web Payments, but because quoting is a universal need:
> >  Bob says "the moon is made of green cheese".
> 
> It's more complicated than it seems. Do you want that quotation to be de dicto or de re? Does this quotation permit OWL equality reasoning, or is it referentially opaque? You are not allowed to say that you don't care, because the spec has to choose one way or the other. If you want both, you probably have to have two kinds of quotation. Reification is defined (non-normatively) to be de re, allowing equality reasoning, so it's not really traditional quotation. (It's more like, Bob says *that* the moon is made of green cheese, without quote marks.)
> 
> > In the old days, the party line was that one uses reification for signing:
> >  _:statement1 dc:author "Bob" ;
> >               rdf:subject :TheMoon ;
> >               rdf:predicate :madeOf ;
> >               rdf:object :greenCheese .
> > 
> > The analog in named graphs would be a bnode-labeled graph:
> 
> Only if you used a bnode in the reification, but why would you have done that? A reification with a bnode subject says that the described graph exists, that is all. It doesn't *identify* it, and it doesn't say anything about any actual graph in a document somewhere.

I think that's exactly the desired effect. Under graph entailment (probably 95% of the SPARQL-using world), it has the same meaning as if one used an IRI. Under strict RDF entailment, which I've only rarely seen used in CWM, I'd think I'd be entitled to reduce
  _:statement1 dc:author "Bob" ; rdf:subject :TheMoon ; rdf:predicate :madeOf ; rdf:object :greenCheese .
  _:statement2 dc:author "Bob" ; rdf:subject :TheMoon ; rdf:predicate :madeOf ; rdf:object :greenCheese ; dc:date "2013-02-13" .
to assertions about a single bnode. I could do the same under OWL, but then OWL wouldn't even treat <statement1> and <statement2> as distinct without a differentFrom assertion.
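
A small Python sketch of the reduction I have in mind, under the assumption that it is enough to check whether every triple about one bnode still holds when another bnode is substituted for it. This is a simplified instance-mapping check, not a full leanness algorithm, and the tuple encoding and the name is_redundant are hypothetical.

```python
def is_redundant(graph, bnode, other):
    """True if every triple mentioning `bnode` is also in `graph`
    once `other` is substituted for `bnode`."""
    def subst(term):
        return other if term == bnode else term

    mentioned = [t for t in graph if bnode in (t[0], t[2])]
    return all(tuple(subst(x) for x in t) in graph for t in mentioned)


def describe(node, extra=()):
    """Build the reification triples used above for a given subject node."""
    base = {
        (node, "dc:author", '"Bob"'),
        (node, "rdf:subject", ":TheMoon"),
        (node, "rdf:predicate", ":madeOf"),
        (node, "rdf:object", ":greenCheese"),
    }
    return base | {(node, p, o) for p, o in extra}


graph = describe("_:statement1") | describe(
    "_:statement2", extra=[("dc:date", '"2013-02-13"')]
)

print(is_redundant(graph, "_:statement1", "_:statement2"))
print(is_redundant(graph, "_:statement2", "_:statement1"))
```

Everything said about _:statement1 is also said about _:statement2, so the first check succeeds; the dc:date triple is only asserted of _:statement2, so the reverse check fails and _:statement2 is the one that survives.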

So why would it be odd to use a bnode for an rdf:Statement?


> >  _:statement1 dc:author "Bob" . _:statement1 { :TheMoon :madeOf :greenCheese } .
> > 
> > Except we've recently decided not to allow bnodes as graph labels, so:
> >  <statement1> dc:author "Bob" . <statement1> { :TheMoon :madeOf :greenCheese } .
> > 
> > 
> > Normally, we shake a finger at someone who invents URLs that they don't intend to honor.
> 
> We are honoring it, it's being treated as a graph name. That's what graph names DO, they name graphs.

Can I dereference it? If I see it uttered on two different transactions, can I confidently unify them?


> > Why is this case different?
> 
> Why do you think it is different? 
> 
> But OK, aside from scoring debate points, I will admit that using bnodes as graph labels does make semantic sense, if this is what they are supposed to mean. Is this what Manu wants them to mean? That is, a bnode used as a graph label means that this same bnode used inside some RDF (presumably in the default graph of the dataset?) must refer to that labeled graph? I would be cool with this if we could make that a genuine semantic constraint on datasets. It amounts to treating the labelling pairing as an equation: <name> = [ <graph> ], which makes very good sense, but clashes with some of the other decisions we have taken (or carefully refused to take) about the meaning of graph labelling, since it means that graph labels must actually refer to their graphs. BTW, we would also have to make it clear exactly what sense of 'graph' is being referred to. I will hazard a guess that what Manu wants is for the bnode to identify a graph document rather than an actual graph (e.g., any bnodeIDs used inside it must not be allowed to change, or else the digital signature will get screwed up?)

I have the impression that our current weasel wording leaves us comfortably in a position where dereferencing a graph label need not return that graph. I don't think that blank node labels make that harder.
From an implementer's perspective, the impact was limited to the TriG parser. SPARQL already has syntax which could construct datasets with bnode labels:

  PREFIX : <x:/>
  INSERT { GRAPH ?g { :s :p :o } }
   WHERE { BIND (BNODE("a") AS ?g) }
The syntax constraints do keep you from creating a bnode graph directly, e.g.
  PREFIX : <x:/>
  INSERT { GRAPH _:a { :s :p :o } } # error, bnode not allowed as graph label.
   WHERE {  }


Incorporated above:
* Pat Hayes <phayes@ihmc.us> [2013-02-14 11:36-0600]
> 
> On Feb 14, 2013, at 11:13 AM, Pat Hayes wrote:
> 
> > ... It amounts to treating the labelling pairing as an equation: <name> = <graph>,
> 
> actually <name> = [ <graph> ] 
> 
> where [ ] indicates some form of quotation, which makes better sense.

-- 
-ericP

Received on Thursday, 14 February 2013 22:36:47 UTC