Re: Patch Issue: Blank Nodes and Tractability from Andy Seaborne on 2013-09-08 (public-ldp-patch@w3.org from September 2013)

From: Andy Seaborne <andy@apache.org>
Date: Sun, 08 Sep 2013 11:33:15 +0100
To: public-ldp-patch@w3.org
Message-ID: <522C526B.6090409@apache.org>
On 08/09/13 04:17, Sandro Hawke wrote:
> In theory, patching a large branching subgraph of blank nodes is
> intractable.   It's the graph isomorphism problem, which has complexity NP.
>
> I think rdf lists don't have this problem, at least with a clever
> implementation, because the subgraph of blank nodes doesn't
> significantly branch.
>
> I wonder if there are many RDF graphs that do have this problem.   I can
> easily imagine paths which are 2-4 blank nodes deep, coming out of
> record structures that are mapped to RDF, but even an exponential search
> is okay when the exponent is only 4.  The question is: are there, in
> practice, subgraphs where you have to traverse 20 or 30 blank nodes,
> with branching, to get to the part you need to alter? If we don't know
> of any, how confident are we there won't be any real need for them?
>
> Also, do SPARQL systems handle rdf:lists efficiently enough?     Can I
> append to an rdf:List like this:
>
> DELETE { ?last rdf:rest rdf:nil }
> INSERT { ?last rdf:rest ( :NewLastElement ) }
> WHERE  { :myList :myProp ?n1.  ?n1 rdf:rest ?n2.  ?n2 rdf:rest ?n3.
> .....    rdf:rest ?last }

rdf:rest*
  Arbitrary length path.

DELETE { ?last rdf:rest rdf:nil }
INSERT { ?last rdf:rest ( :NewLastElement ) }
WHERE  { :myList :myProp ?list .
          ?list rdf:rest* ?last .
          ?last rdf:rest rdf:nil }

> where the chain is 20 or 200 or 2000 or even 20,000 elements long and
> have the the system handle it comfortably?

It has to run down the list if all you give it is the head, but it's a 
loop, not storing and growing internal state as it goes.

> If the answers to the above questions are not good,

I think it is also hard to understand.  The app developer has been 
dropped into the triple representation of a list when the date may have 
shown ( ... ).  It's data assembler.

> then we might want a
> patch format that sidesteps the whole problem, for instance by having a
> limited way to refer to blank nodes in the protocol. For example,
> patches are often written relative to a particular resource
> representation, pointed to by an ETag.   I don't see any technical
> reason the patch couldn't say "I'm using the same blank-node-names as
> that representation does", and if the server cares enough to retain that
> mapping, it can do patch processing in linear time.

I agree that dropping down to the label is a good idea and necessary if 
we don't restrict the styles of RDF that can be effectively managed with 
a patch.

It has to be the system's choice of internal label:

1/ you can't use the label as given by syntax because it may be repeated 
in unrelated updates.

Update 1: add triples:

<#peter> foaf:knows _:a .
_:a foaf:name "Paul" .

Update 2: add triples:

<#peter> foaf:knows _:a .
_:a foaf:name "Patricia" .

I can imagine template-driven updates doing that.

2/ In Turtle the system generates them for [] and ()


The question is how to find the internal label.

SPARQL's STR(?blank) could be used -- in the base spec says it's undef 
(=error, and so extensible) to call it on a blank node.  Or IRI(?bnode) 
could be overloaded for skolemization (several systems already do 
variations on this).

	Andy

>
>          -- Sandro
>
>
Received on Sunday, 8 September 2013 10:33:43 UTC