- From: Brian McBride <bwm@hplb.hpl.hp.com>
- Date: Wed, 04 Sep 2002 12:26:10 +0100
- To: "Dennis Quan" <dquan@mit.edu>, "'Dave Reynolds'" <der@hplb.hpl.hp.com>
- Cc: <www-rdf-dspace@w3.org>
Dennis,

I'm a bit surprised, but yes it does. It uses the writeUTF call to a
datastream, which is restricted to 64 Kbyte strings :(

The fix is both hard and easy. The code change is trivial, but the trivial
change will be incompatible with existing databases. Is this a big deal
for you?

The BDB implementation isn't well set up for such large literals; it will
store each one three times.

Brian

At 15:39 03/09/2002 -0400, Dennis Quan wrote:
>Hi Dave,
>
>I have not investigated this too deeply, but it appears that there is a
>64 kilobyte restriction on the length of literals in the Berkeley
>DB-backed Jena implementation. I have observed that the code is throwing
>a java.io.UTFDataFormatException, which is thrown for this reason. If this
>is a limitation, are there any plans to remove it?
>
>Thanks,
>Dennis
>
> > -----Original Message-----
> > From: www-rdf-dspace-request@w3.org [mailto:www-rdf-dspace-request@w3.org]
> > On Behalf Of Dave Reynolds
> > Sent: Friday, August 09, 2002 9:54 AM
> > To: karger@theory.lcs.mit.edu
> > Cc: www-rdf-dspace@w3.org; Nick_Wainwright@HPLB.HPL.HP.COM;
> > dquan@theory.lcs.mit.edu
> > Subject: Re: Jena database performance
> >
> > Hi David,
> >
> > > My intuition tells me that the right cache for our application is a
> > > "graph cache"---namely, a set of resources and the relations incident
> > > on those resources.
> > >
> > > Also could you provide more details on how those queries are
> > > generated and then sent to the store?
> > >
> > > This intuition follows from the idea that most of
> > > the queries being issued are of the form "now that I have object X,
> > > give me the resource at the other end of predicate P from X". For
> > > example, "now that I am holding object X and want to display it,
> > > lookup X.type. Now that I have T=X.type, find an element that can be
> > > used to display T by finding T.viewers. etc." In the presence of an
> > > LRU cache, this would naturally over time cache all the data types
> > > (not very many) and all the viewer elements for those types (also not
> > > very many).
> >
> > Understood. That seems like a good intuition. What would be the easiest
> > way to get statistics or example data to check it out?
> >
> > FYI In our eperson work the application does analogous things; in our
> > case we put the pointer chasing into a single query, for example:
> >
> >    X rdf:type [ex:viewer []]; * [].
> >
> > brings back all the properties of X, including its rdf:type, and for its
> > rdf:type brings back the viewer object. This is one query, over the
> > network, which brings back a whole bunch of RDF statements which the
> > client app can then pull apart. Though in fact in our case the
> > type-to-viewer mapping is done using a display-policy expressed as an
> > RDF graph that we can retrieve all of in one query at client startup.
> >
> > The cost of this is that the client application has to be written so as
> > to exploit these batch queries; essentially we are doing app-specific
> > caching. The advantage is that the store has explicit information on the
> > access patterns which could be used for cache management. A generic
> > cache that worked well enough with just implicit inferred access
> > patterns would simplify some of the client code and would be of general
> > use.
> >
> > I'll be out of email contact for the next two weeks but would like to
> > follow this up more after I return.
> >
> > Dave
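
For readers who want to reproduce the literal-length limit discussed at the top of this thread: java.io.DataOutputStream.writeUTF stores the encoded length in two bytes, so any string whose encoded form exceeds 65535 bytes throws java.io.UTFDataFormatException. The sketch below shows that failure, plus one possible workaround (a 4-byte length prefix followed by raw UTF-8 bytes). The class and helper names (LongLiteralDemo, writeLongString, readLongString, writeUtfFails, repeat) are hypothetical and are not taken from the Jena source; this is not necessarily the change Brian has in mind. It does illustrate why such a change would be incompatible with existing databases: the bytes on disk are laid out differently.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class LongLiteralDemo {

    // Hypothetical helper (not Jena code): 4-byte length, then UTF-8 bytes.
    static void writeLongString(DataOutputStream out, String s) throws IOException {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    // Hypothetical counterpart that reads what writeLongString wrote.
    static String readLongString(DataInputStream in) throws IOException {
        int length = in.readInt();
        byte[] bytes = new byte[length];
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Returns true if writeUTF rejects the string as too long.
    static boolean writeUtfFails(String s) throws IOException {
        DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream());
        try {
            out.writeUTF(s);
            return false;
        } catch (UTFDataFormatException e) {
            return true;
        }
    }

    // Builds a string of n copies of c.
    static String repeat(char c, int n) {
        char[] chars = new char[n];
        Arrays.fill(chars, c);
        return new String(chars);
    }

    public static void main(String[] args) throws IOException {
        String literal = repeat('x', 70000);
        System.out.println("writeUTF fails: " + writeUtfFails(literal));

        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        writeLongString(new DataOutputStream(buffer), literal);
        String back = readLongString(
                new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
        System.out.println("round trip ok: " + literal.equals(back));
    }
}
```

Note that the limit applies to encoded bytes, not characters, so a literal containing non-ASCII text can hit it well before 65535 characters.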
Received on Wednesday, 4 September 2002 07:33:00 UTC