Re: Problem with auto-generated fragment IDs for graph names

On 02/13/2013 05:11 PM, Richard Cyganiak wrote:
> PROPOSAL: Put @id on all graphs.
> 
> Why the aversion against simple and obvious solutions?

The simple and obvious solution you propose is wrong for developers.

It attempts to side-step an arbitrary constraint imposed on developers
by RDF Concepts by making developers' lives harder. Worse, it ignores the
reality of transient messages, including transient RDF Datasets that
must be identified with document-local identifiers if the digital
signatures are going to work out.

Look at this from the standpoint of a Web Payments message. Something
that is completely transient, but needs to be digitally signed:

[{
  "@graph": {
    "source": "http://mybank.com/accounts/manu",
    "destination": "http://yourbank.com/accounts/richard",
    "amount": "5.00",
    "currency": "USD"
  }
},{
  "@graph": {
    "source": "http://mybank.com/accounts/manu",
    "destination": "http://yourbank.com/accounts/kingsley",
    "amount": "5.00",
    "currency": "USD"
  }
}]

You are stating that, instead of doing the above, we now have to
require all developers to generate identifiers for that dataset by
specifying an IRI for each graph:

[{
  "@id": "http://payswarm.com/transients#graph-38234jlkfsj9834u",
  "@graph": {
    "source": "http://mybank.com/accounts/manu",
    "destination": "http://yourbank.com/accounts/richard",
    "amount": "5.00",
    "currency": "USD"
  }
},{
  "@id": "http://payswarm.com/transients#graph-38234jlkfsj9834v",
  "@graph": {
    "source": "http://mybank.com/accounts/manu",
    "destination": "http://yourbank.com/accounts/kingsley",
    "amount": "5.00",
    "currency": "USD"
  }
}]

Why make developers jump through hoops because of some deficiency in
RDF? They don't have to do this for JSON. What we're proposing is that
we can auto-generate the IDs to get around RDF's deficiency by using
"graph:" IRIs, but only when we HAVE to serialize down to another RDF
serialization format (like N-Quads, which we have to do when doing the
RDF Graph Normalization stuff). So, JSON-LD developers can happily use
the first bit of markup and can remain completely unaware that graph
name identifiers are automatically created for them when they normalize
to the N-Quads serialization format:

_:c14n1
  <http://example.com/vocab#source>
    <http://mybank.com/accounts/manu>
      <graph:1> .
_:c14n1
  <http://example.com/vocab#destination>
    <http://yourbank.com/accounts/richard>
      <graph:1> .
_:c14n1
  <http://example.com/vocab#amount>
    "5.00"
      <graph:1> .
_:c14n1
  <http://example.com/vocab#currency>
    "USD"
      <graph:1> .
_:c14n2
  <http://example.com/vocab#source>
    <http://mybank.com/accounts/manu>
      <graph:2> .
_:c14n2
  <http://example.com/vocab#destination>
    <http://yourbank.com/accounts/kingsley>
      <graph:2> .
_:c14n2
  <http://example.com/vocab#amount>
    "5.00"
      <graph:2> .
_:c14n2
  <http://example.com/vocab#currency>
    "USD"
      <graph:2> .

> You seem to consistently choose the path of greatest resistance.

I consistently reject solutions that are anti-developer or anti-author. :)

I want people to look at RDF and say "Oh, that makes sense." instead of
"WTF? Why do I have to explicitly name graphs in certain cases when that
requirement doesn't exist at all for blank nodes?!"

This WG is punting on trying to solve the problem of document-local
identifiers. I get that. There are, however, repercussions for doing so.
I was asked to go back and think about using fragment identifiers as
auto-generated graph names. After discussing it with our CTO, it became
clear that fragment identifiers for graph names expose a particularly
problematic serialization issue when serializing without a document
base. That is, it isn't clear whether this will be viewed as valid in a
quad-store:

_:foo <http://example.org/bar> _:baz <#_graph:1> .

The quad above is digitally signed in the Web Payments work without a
base IRI. It is important that all processors that process it DO NOT
add a base IRI, otherwise the signatures will no longer match the
data in the quad store. However, <#_graph:1> isn't an absolute IRI and
is thus invalid in the RDF model. So, the only solution that we can see
is to use an absolute IRI that is meant to be interpreted as a
document-local identifier:

_:foo <http://example.org/bar> _:baz <graph:1> .

The above works, but has the downside of needing a new IRI scheme, which
none of us want, but hey, that's the best option we have right now
besides this one:

_:foo <http://example.org/bar> _:baz <_:graph1> .

... which is what we had been using for the past two years before
realizing that RDF Concepts forbids that sort of thing. This would be
the ideal solution if it weren't for the limitation imposed by the set
of RDF documents that assign special meaning to "_:" and restrict its
usage to be only for blank node identifiers and not also for
document-local identifiers.
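
For what it's worth, the three candidate graph names above can be
compared with a rough syntactic check. This is only a sketch: it tests
whether a string starts with an RFC 3986 scheme, which is a necessary
condition for an absolute IRI, not a full IRI validator. The helper name
has_scheme is made up for this example.

```python
import re

# RFC 3986 scheme: a letter followed by letters, digits, "+", "-", or ".".
SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")

def has_scheme(iri):
    """Return True if the IRI reference starts with a scheme."""
    return SCHEME.match(iri) is not None

for candidate in ("#_graph:1", "graph:1", "_:graph1"):
    print(candidate, has_scheme(candidate))
```

Only "graph:1" passes. "#_graph:1" is a relative reference that needs a
base, and "_:graph1" cannot be read as an IRI with a scheme because a
scheme has to start with a letter.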

-- manu

-- 
Manu Sporny (skype: msporny, twitter: manusporny, G+: +Manu Sporny)
Founder/CEO - Digital Bazaar, Inc.
blog: Aaron Swartz, PaySwarm, and Academic Journals
http://manu.sporny.org/2013/payswarm-journals/

Received on Thursday, 14 February 2013 03:25:05 UTC