- From: Michael Hausenblas <michael.hausenblas@deri.org>
- Date: Mon, 30 May 2011 09:44:09 +0100
- To: Ivan Mikhailov <imikhailov@openlinksw.com>
- Cc: W3C SPARQL WG comments <public-rdf-dawg-comments@w3.org>
Ivan,

I assume this is not an official answer, based on the 'with all hats off'
next to your name.

> The draft lists all severe issues already, namely #26, #18 and #19
> (the #19 relates to this because cost of errors/attacks scales linearly
> or faster with the scale of the storage and security becomes more and
> more important). I see no reason to lengthen that list.

I am not suggesting adding anything to the list. Let's see where we are
re the issues:

+ ISSUE-18 'Concurrency in SPARQL/update'
+ ISSUE-19 'Security issues on SPARQL/UPdate'
+ ISSUE-26 'Conjunction of operation vs atomocity, transactions'

Hm. Sounds more like ISSUE-18/26 to me, but without knowing the history
of all discussions it's hard to tell ...

> [...] All these features can not be fit in any common-purpose spec,
> due to prohibitive cost of the "smallest valid implementation". What
> could be in the spec, however, is a common syntax for
> implementation-specific pragmas, like in XQuery, but this idea is
> rejected in SPARQL 1.0 times.

Let me phrase my proposal a bit more concretely:

[[
To future-proof the SPARQL Update specification, add a non-normative
appendix titled ‘large-scale deployment considerations’ (from a
system-level POV). This section should discuss performance and
scalability issues concerning large-scale deployments (hundreds of
nodes/tera-triples scale) and offer implementation advice on how to
handle update language operations, such as DELETE, in this context.
]]

Cheers,
Michael

--
Dr. Michael Hausenblas, Research Fellow
LiDRC - Linked Data Research Centre
DERI - Digital Enterprise Research Institute
NUIG - National University of Ireland, Galway
Ireland, Europe
Tel. +353 91 495730
http://linkeddata.deri.ie/
http://sw-app.org/about.html

On 29 May 2011, at 19:38, Ivan Mikhailov wrote:

> Michael,
>
> The draft lists all severe issues already, namely #26, #18 and #19
> (the #19 relates to this because cost of errors/attacks scales linearly
> or faster with the scale of the storage and security becomes more and
> more important). I see no reason to lengthen that list.
>
> The body of the document does not contain a word "transaction". Even
> worse, I see no possibility to reach some consensus about it, due to
> variety of implementations. It is not really important for me because
> "my" SPARUL is a preprocessor on full-scale SQL engine; I can implement
> any standardized semantics in hours. So let others choose.
>
> Meanwhile we offer numerous implementation-specific pragmas, some of
> them control transaction log and the like. We also offer graph-level
> security for both SPARQL read-only access and SPARUL read-write. We
> also can make SPARQL queries with side effects such as loading missing
> resources on demand, and there's security for that side effects as
> well.
> We also offer different non-SPARUL tools for massive data loading,
> because one LOAD at time is definitely not the best way of keeping a
> hundred of CPU cores busy and different single/cluster configurations
> require different loading policies. We also configure parsers, because
> real data are not always perfect and we should selectively recover
> from different sorts of errors.
> All these features can not be fit in any common-purpose spec, due to
> prohibitive cost of the "smallest valid implementation". What could be
> in the spec, however, is a common syntax for implementation-specific
> pragmas, like in XQuery, but this idea is rejected in SPARQL 1.0
> times.
>
> Best Regards,
>
> Ivan Mikhailov (with all hats off)
> OpenLink Software
> http://virtuoso.openlinksw.com
>
> On Sun, 2011-05-29 at 17:29 +0100, Michael Hausenblas wrote:
>> All,
>>
>> This is a comment concerning the Last Call Working Draft 'SPARQL 1.1
>> Update' [1]. It is clearly written and, AFAICT sound. However, I have
>> an issue with it - more on the conceptual level. I tried to express my
>> concerns in a blog post [2] and will do my best to summarise in the
>> following.
>>
>> While the proposed update language - without any doubt - is perfectly
>> suitable for 'small to medium'-sized setups, I fear that we will run
>> into troubles in large-scale deployments concerning the costs for
>> updating and deleting huge volumes of triples. Now, I wish I had
>> experimental evidence myself to proof this (and I have to admit I
>> don't have), but I would like the WG to consider to either include a
>> section discussing the issue, or setting up a (non-REC Track) document
>> that discusses this (which could be titled 'implementation/usage
>> advices for large-scale deployments' or the like). I do feel strongly
>> about this and would offer to contribute to such a document, if
>> desired.
>>
>> I'd very much appreciate it if WG members would be able to point me to
>> own experiences in this field (experiments or real-world deployments
>> alike).
>>
>> Cheers,
>> Michael (with my DERI AC Rep and RDB2RDF WG co-chair hat off)
>>
>> [1] http://www.w3.org/TR/2011/WD-sparql11-update-20110512/
>> [2] http://webofdata.wordpress.com/2011/05/29/ye-shall-not-delete-data/
>>
>> --
>> Dr. Michael Hausenblas, Research Fellow
>> LiDRC - Linked Data Research Centre
>> DERI - Digital Enterprise Research Institute
>> NUIG - National University of Ireland, Galway
>> Ireland, Europe
>> Tel. +353 91 495730
>> http://linkeddata.deri.ie/
>> http://sw-app.org/about.html
Received on Monday, 30 May 2011 08:44:39 UTC