- From: Keith Bostic <keith.bostic@oracle.com>
- Date: Thu, 2 Nov 2006 09:17:08 -0500
- To: dturi@cs.manchester.ac.uk
- Cc: public-semweb-lifesci@w3.org
On Nov 2, 2006, at 5:48 AM, Daniele Turi wrote:

>> To be absolutely clear -- the problems with Subversion were NOT
>> problems or bugs in Berkeley DB, they were the result of
>> incompatible interfaces between two software components.
>>
>> I don't want to turn this into a marketing presentation, but given
>> how this conversation started, I think it's fair for me to give
>> you a couple of examples: Berkeley DB is the database engine
>> behind Sun Microsystems LDAP directory server, Google's
>> replicated Single Sign On service, Openwave's Email Mx product and
>> the Amazon web site.
>>
>> Yes, that's right: when you log into Amazon, that customized page
>> you see is built by roughly 1,000 accesses to Berkeley DB
>> databases. And when you log into Google's gmail, your account
>> information is stored in Berkeley DB.
>>
>> And, I can promise you two things: first, that every one of those
>> products has a lot more than 1 thread or process accessing data at
>> a time, and second, that every one of these companies wouldn't be
>> using my technology if there was better or more reliable
>> technology available!
>
>
> I am surprised by the fact that Google uses BDB. In the following
> recent article
>
> http://lwn.net/Articles/194667/
>
> Google's Greg Stein says that they use their own system, called
> Bigtable:

I guess I wasn't absolutely clear, after all! :-)

Google doesn't use Berkeley DB behind Subversion, they use it as the
transactional, highly available data server behind their Single Sign On
service. There's a paper on Google's use in the upcoming Worlds 2006
conference (Worlds is the USENIX Workshop on Real, Large Distributed
Systems). The paper is entitled:

    Data Management for Internet-Scale Single-Sign-On
    Sharon E. Perl, Google Inc.; Margo Seltzer, Harvard University
    and Oracle Corporation

and I'm sure it will be available on-line, shortly.

I had never heard of Google using Subversion with their own back-end
engine before, so I can't say why they made that decision or what
scaling issues they found when using Berkeley DB behind Subversion. The
obvious guess would be that Subversion doesn't use Berkeley DB's
replication support, and so Subversion installations are limited to a
single machine -- it may have been their choice to write their own
Subversion repository that distributed their data instead of changing
Subversion itself to use Berkeley DB's replication?

Regards,
--keith

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Keith Bostic
+1-781-259-3139
keithbosticim (ymsgid)
keith.bostic@oracle.com

Received on Thursday, 2 November 2006 14:20:06 UTC