- From: Orri Erling <erling@xs4all.nl>
- Date: Tue, 14 Jul 2009 19:06:01 +0200
- To: <public-rdf-dawg@w3.org>
- Message-Id: <200907141706.n6EH6uRk044720@smtp-vbr5.xs4all.nl>
All,

Concerning transactions and SPARUL, I would make the following comments:

In most of our use of SPARUL we are dealing with bulk load and bulk update situations that have no concurrency or transactional requirements.

Virtuoso supports full transactions up to serializable isolation with RDF as well as relational data. With SPARUL, we seldom use this. The reason is that one can easily run out of memory for rollback data when updating millions of rows, which is not uncommon with SPARUL, for example when it is used for materializing entailment. If there is a limit to transaction size, as there will be in systems which must keep rollback state, this limit will easily be hit, and it is very difficult to split large insert-select combinations into smaller chunks.

Thus we have made a row autocommit mode which commits every now and then on its own initiative. Internally, serializable isolation is needed in order not to insert the same thing twice from two threads, but such things are not visible to the user.

If the SPARQL protocol should say anything about transactions, we would suggest that it contain a switch for disabling any atomicity. This would explicitly state that rollback information need not be kept, i.e. the system can commit as often as it wants, and that no repeatability of read applies.

A reasonable default would be for the update to be atomic, saying nothing of read repeatability. A transaction of course would not encompass anything except the content of a single POST request. And it should be possible to disable even this for purposes of bulk operations.

In our experience, bulk copying of data is much more common than any resource-committing transaction such as an update of a balance on an account. In fact I do not know that we have ever done the latter in RDF.

I suggest issues of transactions be relegated to implementations and to connection-based APIs.
In such situations, connection options can be used for isolation, exclusive read, and similar things which are needed in transactional applications.

Orri
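
To make the trade-off between an atomic update and periodic commits concrete, here is a minimal sketch in Python. It is not Virtuoso's implementation: it uses the standard sqlite3 module as a stand-in store, and the table name triples, the helper names, and the commit_every parameter are all assumptions invented for this illustration. With commit_every left unset, the whole load is one transaction and a failure rolls everything back; with it set, rollback state is bounded, but rows committed before a failure stay in the store.

```python
import sqlite3


def make_store():
    """Create an in-memory stand-in for a triple store (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    # The primary key keeps the same triple from being stored twice.
    conn.execute(
        "CREATE TABLE triples (s TEXT, p TEXT, o TEXT, PRIMARY KEY (s, p, o))"
    )
    conn.commit()
    return conn


def bulk_insert(conn, rows, commit_every=None):
    """Insert (s, p, o) rows.

    commit_every=None: one transaction for the whole load (atomic).
    commit_every=N:    commit after every N rows, so at most N rows of
                       rollback state are pending at any time (not atomic).
    """
    pending = 0
    try:
        for row in rows:
            conn.execute("INSERT OR IGNORE INTO triples VALUES (?, ?, ?)", row)
            pending += 1
            if commit_every is not None and pending >= commit_every:
                conn.commit()
                pending = 0
        conn.commit()
    except Exception:
        # Only the rows since the last commit can be undone.
        conn.rollback()
        raise


def count(conn):
    """Return the number of stored triples."""
    return conn.execute("SELECT COUNT(*) FROM triples").fetchone()[0]


def failing_rows(n, fail_at):
    """Yield n rows, raising before row number fail_at to simulate a failed load."""
    for i in range(n):
        if i == fail_at:
            raise RuntimeError("simulated failure mid-load")
        yield ("s%d" % i, "p", "o")
```

Running a load that fails before its seventh row shows the difference: the atomic variant leaves the store empty, while a load with commit_every=4 keeps the first four rows, which is the behaviour a switch for disabling atomicity would explicitly permit.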
Received on Tuesday, 14 July 2009 17:07:33 UTC