Re: Last Call comment on SPARQL 1.1 Update

Michael,

The draft already lists all the severe issues, namely #26, #18, and #19
(#19 relates to this because the cost of errors/attacks scales linearly
or faster with the size of the store, so security becomes more and more
important). I see no reason to lengthen that list.

The body of the document does not contain the word "transaction". Even
worse, I see no possibility of reaching consensus about it, due to the
variety of implementations. It is not really important for me, because
"my" SPARUL is a preprocessor on top of a full-scale SQL engine; I can
implement any standardized semantics in hours. So let others choose.

Meanwhile, we offer numerous implementation-specific pragmas, some of
which control the transaction log and the like. We also offer
graph-level security for both SPARQL read-only access and SPARUL
read-write access. We can also run SPARQL queries with side effects,
such as loading missing resources on demand, and there is security for
those side effects as well. We also offer various non-SPARUL tools for
massive data loading, because one LOAD at a time (see the example
below) is definitely not the best way to keep a hundred CPU cores busy,
and different single-server/cluster configurations require different
loading policies. We also make parsers configurable, because real data
are not always perfect and we need to recover selectively from
different sorts of errors.
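
For comparison, this is what a plain, standards-only loading request
looks like; the URIs below are placeholders, and each LOAD is one
sequential step in the request:

  LOAD <http://example.org/dump-part-1.rdf> INTO GRAPH <http://example.org/g> ;
  LOAD <http://example.org/dump-part-2.rdf> INTO GRAPH <http://example.org/g>

The update language itself does not say whether an engine may run such
steps in parallel or batch them, which is one more reason our bulk
loaders live outside SPARUL.
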
All these features cannot fit into any general-purpose spec, due to the
prohibitive cost of the "smallest valid implementation". What could be
in the spec, however, is a common syntax for implementation-specific
pragmas, as in XQuery, but that idea was rejected back in SPARQL 1.0
times.
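
Just as an illustration of what such a common syntax could look like
(purely hypothetical: neither the "(# ... #)" notation applied to
SPARQL nor the ex:log-mode / ex:error-recovery names exist in any spec
or product; only the pragma shape is borrowed from XQuery):

  PREFIX ex: <http://example.org/pragmas#>
  # hypothetical vendor hints, carried in a standardized pragma wrapper
  (# ex:log-mode "autocommit" #)
  (# ex:error-recovery "skip-bad-triple" #)
  LOAD <http://example.org/dump.rdf> INTO GRAPH <http://example.org/g>

The point would be that a processor which does not recognize a pragma
can skip or reject it in a well-defined way, much as XQuery defines the
behaviour for unrecognized extension pragmas.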

Best Regards,

Ivan Mikhailov (with all hats off)
OpenLink Software
http://virtuoso.openlinksw.com

On Sun, 2011-05-29 at 17:29 +0100, Michael Hausenblas wrote:
> All,
> 
> This is a comment concerning the Last Call Working Draft 'SPARQL 1.1  
> Update' [1]. It is clearly written and, AFAICT, sound. However, I have  
> an issue with it, more on the conceptual level. I tried to express my  
> concerns in a blog post [2] and will do my best to summarise them in  
> the following.
> 
> While the proposed update language is, without any doubt, perfectly  
> suitable for small- to medium-sized setups, I fear that we will run  
> into trouble in large-scale deployments concerning the cost of  
> updating and deleting huge volumes of triples. Now, I wish I had  
> experimental evidence myself to prove this (and I have to admit I  
> don't), but I would like the WG to consider either including a  
> section discussing the issue, or setting up a (non-REC Track)  
> document that discusses it (which could be titled  
> 'implementation/usage advice for large-scale deployments' or the  
> like). I do feel strongly about this and would offer to contribute  
> to such a document, if desired.
> 
> I'd very much appreciate it if WG members could point me to their own  
> experiences in this field (experiments or real-world deployments  
> alike).
> 
> Cheers,
>  Michael (with my DERI AC Rep and RDB2RDF WG co-chair hat off)
> 
> [1] http://www.w3.org/TR/2011/WD-sparql11-update-20110512/
> [2] http://webofdata.wordpress.com/2011/05/29/ye-shall-not-delete-data/
> 
> --
> Dr. Michael Hausenblas, Research Fellow
> LiDRC - Linked Data Research Centre
> DERI - Digital Enterprise Research Institute
> NUIG - National University of Ireland, Galway
> Ireland, Europe
> Tel. +353 91 495730
> http://linkeddata.deri.ie/
> http://sw-app.org/about.html
> 
> 
