Re: Using bnode identifiers for predicates, graph names from Gavin Carothers on 2013-02-02 (public-rdf-wg@w3.org from February 2013)

From: Gavin Carothers <gavin@carothers.name>
Date: Sat, 2 Feb 2013 14:06:04 -0800
To: Andy Seaborne <andy.seaborne@epimorphics.com>
Cc: RDF-WG WG <public-rdf-wg@w3.org>
Message-ID: <CAPqY83ySX27ZAOuy0KwcAso10D84zaRu3e-ugRCBALb3S_JDnw@mail.gmail.com>
On Sat, Feb 2, 2013 at 11:44 AM, Andy Seaborne <
andy.seaborne@epimorphics.com> wrote:

>
>
> On 02/02/13 18:31, Manu Sporny wrote:
>
>> On 01/31/2013 06:41 AM, Sandro Hawke wrote:
>>
>>> How is using bnodes to identify graphs any more absurd than using
>>> them to identify people (the canonical example)?    Blank nodes make
>>> perfect logical sense as local (file scope) identifiers.   They are
>>> clearly useful.
>>>
>>
> bnodes don't identify people : bnode + IFP or bnode + some properties
> identify.
>
> [] foaf:name  "Andy" .
>
> is not uniquely me.  It can be anyone called "Andy"
>
>
>
>>> I'll agree with Andy's point, however, that this ship has already
>>> sailed.   While blank nodes are fine in any position in generalized
>>> rdf, they are not okay for predicates or graph names in standard
>>> rdf.
>>>
>>
>> Alright, so the direction from the group is clear.
>>
>> That creates a problem with normalization of JSON-LD (and really, any
>> normalization of any RDF graph that doesn't specify an IRI as a name).
>>
>> Assume that we have the following JSON-LD document, containing two
>> "anonymous" graphs:
>>
>> [
>>    {
>>      "@context": "http://example.org/mycontext.jsonld",
>>      "@graph": {
>>         "name": "Sandro"
>>      }
>>    },
>>    {
>>      "@context": "http://example.org/mycontext.jsonld",
>>      "@graph": {
>>         "name": "Pat"
>>      }
>>    }
>> ]
>>
>
> Tangent: how do you know that is 2 graphs, and not 2 fragments of one graph?
>
>
>  We need to digitally sign the document via the RDF Graph Normalization
>> algorithm and generate something like this to digitally sign:
>>
>> _:bnode1 <http://schema.org/name> "Pat" _:graph1 .
>> _:bnode2 <http://schema.org/name> "Sandro" _:graph2 .
>>
>
> How do you sign the document with bnode subjects and objects without the
> same issue? (presumably by label and a deterministic allocation).
>
> (the Normalization document isn't REC track is it?)
>
>
>  However, now we can't name it _:graph1, or anything else like that,
>> right?
>>
>
> Internally (within a JSON-LD processor), you can call them "graph1" and
> "graph2" if you want; the name does not have to be a bnode, bnode label,
> string or even an RDF thing of any kind.
>
> It will round-trip with signing no more and no less than bnode
> subjects do.
>
> Do you need to name them at all for signing?
> Why not sign the two graphs, and combine the signings?
>

If you are going to be signing a graph, you're going to need to create a byte
stream of that graph. Once you've created a byte stream you can take a hash
of it (in fact you're going to anyway with most signing systems), and at that
point you can create a URI for the graph that is GLOBALLY unique to that
set of bytes. I've mentioned this before, Manu, in fact just about every
time graph signing comes up. Blank nodes for graph names are not needed for
this use case.
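
A minimal sketch of that idea in Python, assuming the graph has already been
normalized into N-Triples-style lines with deterministic blank node labels
(the normalization itself is not shown). The function names, the `_:c14n0`
labels, and the choice of SHA-256 with an RFC 6920 style `ni:` URI are
illustrative assumptions, not something this thread prescribes:

```python
import base64
import hashlib


def graph_bytes(normalized_lines):
    """Build a stable byte stream from already-normalized triple lines.

    Assumes blank node labels were assigned deterministically beforehand;
    sorting and de-duplicating makes the output independent of input order.
    """
    unique_sorted = sorted(set(normalized_lines))
    return "".join(line + "\n" for line in unique_sorted).encode("utf-8")


def graph_name(normalized_lines):
    """Derive a graph name from the hash of the graph's byte stream."""
    digest = hashlib.sha256(graph_bytes(normalized_lines)).digest()
    # base64url without padding, as used in ni: URIs
    token = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return "ni:///sha-256;" + token


if __name__ == "__main__":
    pat = ['_:c14n0 <http://schema.org/name> "Pat" .']
    sandro = ['_:c14n0 <http://schema.org/name> "Sandro" .']
    print(graph_name(pat))
    print(graph_name(sandro))
```

The same bytes always give the same name, and the two graphs in the example
get different names without any blank node appearing in the graph name slot.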


>
> if order matters (and then it's not an RDF Dataset anyway), add a counter
> to the combine step
>
> Is it different to the same doc with the graphs reversed in the JSON
> array?  use case?
>
>
>  So we need to come up with another naming scheme that is
>> deterministic and it needs to match an IRI. It seems kind of strange to
>> introduce a mechanism for something that is already basically there.
>>
>
> This seems to be the heart of it : bnodes don't match an IRI (see above)
>
> You seem to want more than document scoped labels - you want labels that
> are stable across multiple parses of the document.


Yep, and since Manu needs this for signing, the use case already has to
solve the stable-byte-stream-for-a-graph issue.


>
>
>  Even stranger, the IRI that will be generated will inevitably conflict
>> with some other normalized graph IRI because it isn't scoped to the
>> document. These identifiers need to be scoped to the document if the RDF
>> graph normalization algorithm is going to work fairly cleanly.
>>
>> Could we introduce the concept of a 'blank graph identifier'?
>>
>
> Sure - (1) create your own URI scheme or (2) a systematic way to generate
> UUIDs based on doc and position of the graph in the doc


An MD5 URI or a hash-based UUID URI will work just fine.
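
For illustration, both flavours can be derived from the same bytes with the
Python standard library. The `urn:md5:` prefix below is an informal
convention used here as an assumption (it is not a registered URN
namespace), and picking `uuid.NAMESPACE_URL` as the UUID namespace is
likewise an arbitrary choice for this sketch:

```python
import hashlib
import uuid


def md5_urn(data: bytes) -> str:
    """Name a byte stream by its MD5 digest (hypothetical urn:md5: form)."""
    return "urn:md5:" + hashlib.md5(data).hexdigest()


def uuid_urn(data: bytes) -> str:
    """Name a byte stream with a hash-based (version 5, SHA-1) UUID.

    The UUID is derived from the MD5 URN of the bytes, so identical
    bytes always map to the same urn:uuid: value.
    """
    return uuid.uuid5(uuid.NAMESPACE_URL, md5_urn(data)).urn


if __name__ == "__main__":
    stream = b'_:c14n0 <http://schema.org/name> "Pat" .\n'
    print(md5_urn(stream))
    print(uuid_urn(stream))
```

Either value depends only on the bytes being signed, so it is stable across
parses of the same document.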


>
>
>  This is time critical for us. We are in the process of launching a
>> financial product that uses 'blank graph identifier's for graph IDs when
>> normalizing to perform a digital signature. This has the potential for
>> delaying that launch. We really need to iron out this issue pretty soon.
>>
>
> It is time critical for RDF-WG as well :-)
>
>
>> -- manu
>>
>>
>         Andy
>
>
>
Received on Saturday, 2 February 2013 22:06:32 UTC