- From: Bass, Mick <mick.bass@hp.com>
- Date: Thu, 15 Apr 2004 12:25:30 -0700
- To: "Merry, Martin" <martin.merry@hp.com>, "'Mackenzie Smith'" <kenzie@MIT.EDU>, "'Ryan Lee'" <ryanlee@w3.org>, "'www-rdf-dspace@w3.org'" <www-rdf-dspace@w3.org>
Nicely put, Martin - thanks for clarifying my attempt to summarize your
_real_ thoughts. Looks like Steve Garland has taken it up to work on this
this week and present a proposal on next week's call. Is that right, Steve?

- Mick

> -----Original Message-----
> From: Merry, Martin
> Sent: Thursday, April 15, 2004 8:07 AM
> To: Bass, Mick; Mackenzie Smith; Ryan Lee; www-rdf-dspace@w3.org
> Subject: RE: SIMILE PI phone conference, 15-Apr-04 1100 EDT/1600 BST
>
> Dear All
>
> Let me briefly try and clarify _my_ concerns about scale
> (Nick may have other issues).
>
> Scalability covers a multitude of sins: in particular there
> is scalability wrt instance data, and scalability wrt
> ontologies. There is also the issue of whether datasets are
> linked dynamically.
>
> Thus:
>
> 1) Small number of large corpora; no dynamic linking.
>
> Here one is putting a query to a known, small collection of
> datasets. Links between the different ontologies/schemata are
> known in advance; hence one is essentially querying a single
> large collection of triples. Lots of precomputation can be
> done. Issues are to do with working out what precomputation
> is necessary, but provided the precomputation can be done
> offline, this probably gives you something tractable, and enables
> some sort of faceted browser-based approach to work.
>
> 2) Large number of large corpora; no dynamic linking.
>
> I've not thought of this as a dimension that Simile would
> put a lot of effort into in terms of building prototypes,
> simply because it's unlikely that there'd be enough data out
> there. In this case, though, the amount of precomputation
> necessary may become prohibitive.
>
> 3) Small number of large corpora; dynamic linking
>
> This is where one - as part of the querying/browsing process
> - assembles a collection of corpora and says "I want to
> browse these" - identifying links between them on the fly.
>
> It's not obvious to me that precomputation is possible in
> this case: it's certainly a lot harder. If you can't do any
> precomputation, run-time performance is likely to be extremely
> problematic. Faceted browsing may well not be the way to go.
>
> 4) Large number of large corpora; dynamic linking
>
> Yuk.
>
> -------------------
>
> I was initially asking for clarification of which of these
> situations the project is in, i.e. when the scalability report
> comes out, can it be grounded in the use cases that the
> project is going to be addressing, so we know which of these
> issues we're facing. Once this has been done I was also
> advocating a collection of milestones to address the issues
> the report raised, so that there is some visible progress on
> this prior to the "demonstration at scale" milestone in June
> 05. In particular, I felt it would be a Good Thing to have
> some sort of proof of concept of scalability before launching
> into a prototype integrating Simile with DSpace.
>
> I hope all this makes sense - please yell if it doesn't.
>
> As I said at the beginning, these were my concerns, rather
> than Nick's - Nick can add any clarification he feels is
> necessary. They're also my concerns rather than the Jena
> team's, which I believe were captured during the discussions.
>
> Martin
>
> > -----Original Message-----
> > From: www-rdf-dspace-request@w3.org
> > [mailto:www-rdf-dspace-request@w3.org]On Behalf Of Bass, Mick
> > Sent: 15 April 2004 14:24
> > To: Bass, Mick; Mackenzie Smith; Ryan Lee; www-rdf-dspace@w3.org
> > Subject: RE: SIMILE PI phone conference, 15-Apr-04 1100 EDT/1600 BST
> >
> > Feedback on the milestones doc from Martin and Nick:
> >
> > 1. There is a full year between the paper describing issues
> > of scaling up and the milestone ("Navigation and mapping
> > demonstration at increased scale" - milestone 4, June 05)
> > where scale is finally demonstrated.
> > Suggest identifying useful and measurable intermediate
> > milestones to be included with the scale whitepaper deliverable.
> >
> > 2. Concern that the work regarding scale remain consistent
> > with the functional integration overall architecture to be
> > defined in "Proposal for functional integration of SIMILE
> > with DSpace" deliverable (milestone 3, December 04) and
> > "Prototype of SIMILE working with DSpace - Architecture and
> > implementation that incorporates tools developed for
> > Milestones 1-3." (milestone 4, June 05). Desire to avoid
> > forks in the effort that join late or not at all. Question
> > was - "how can the milestones reflect the need for concerns
> > regarding scale to be reflected in the milestones regarding
> > SIMILE architecture and functional integration of SIMILE
> > with DSpace?"
> >
> > 3. Observation and concern that some navigate/browse
> > paradigms could be fundamentally computationally intensive
> > and intractable with extremely large and dynamically changing
> > datasets. So architecture needs to consider both concerns of
> > scale from the perspective of database and query capacity, as
> > well as appropriate constraints on the interaction paradigm
> > so that implementation is tenable regardless of the
> > underlying DB technologies. That is, there are two sets of
> > concerns: 1 - scaling Jena and RDQL so that client browsers
> > can implement useful browse/navigate paradigms and 2 -
> > designing useful browse/navigate/search paradigms including
> > constraints (on support for dynamic data, for the types of
> > queries that need to be issued) that can be implemented using
> > available database and query technologies (of which Jena is
> > one alternative).
> >
> > 4. Martin and the Jena team would like further guidance with
> > respect to the interactions and support that SIMILE will
> > require in addressing the issues of scale and/or client
> > interface design described in the above 3 items.
> >
> > ====
> > Mick Bass
> >
> > 970.898.6788 office 408.216.0584 fax
> > 303.667.1227 mobile 303.494.5202 residence
> > bass@alum.mit.edu mick_bass@hp.com
> > ====
> >
> > > -----Original Message-----
> > > From: www-rdf-dspace-request@w3.org
> > > [mailto:www-rdf-dspace-request@w3.org] On Behalf Of Bass, Mick
> > > Sent: Thursday, April 15, 2004 12:35 AM
> > > To: Mackenzie Smith; Ryan Lee
> > > Cc: www-rdf-dspace@w3.org
> > > Subject: RE: SIMILE PI phone conference, 15-Apr-04 1100 EDT/1600 BST
> > >
> > > Boo, it is late breaking but I will be on a plane to LA
> > > tomorrow at 11a EDT and so must send my regrets as well.
> > >
> > > I'd like to see the milestones converge - I have some
> > > feedback from Martin Merry and Nick Wainwright on the
> > > milestones that I will send in a separate note before the call.
> > >
> > > - Mick
> > >
> > > > -----Original Message-----
> > > > From: www-rdf-dspace-request@w3.org
> > > > [mailto:www-rdf-dspace-request@w3.org] On Behalf Of Mackenzie Smith
> > > > Sent: Wednesday, April 14, 2004 9:03 PM
> > > > To: Ryan Lee
> > > > Cc: www-rdf-dspace@w3.org
> > > > Subject: Re: SIMILE PI phone conference, 15-Apr-04 1100 EDT/1600 BST
> > > >
> > > > I too must send regrets for tomorrow -- I'll be on a plane to DC
> > > > at the appointed hour. Stefano knows what's happening, and we've
> > > > got a contact at Archnet to ask about their data. I do need to
> > > > know what's happening with the milestones document pretty soon, so
> > > > I asked Stefano to make sure that gets discussed and let me know
> > > > when I'm back.
> > > >
> > > > Thanks,
> > > >
> > > > MacKenzie
> > > >
> > > > Quoting Ryan Lee <ryanlee@w3.org>:
> > > >
> > > > > SIMILE PI phone conference, 15-Apr-04 1100 EDT/1600 BST
> > > > >
> > > > > +1.617.761.6200, code 7464 ("SIMI")
> > > > > irc://irc.w3.org:6665/simile
> > > > >
> > > > > You may want to look at the W3C Teleconferencing IRC Agent
> > > > > (Zakim) page for useful instructions:
> > > > > http://www.w3.org/2001/12/zakim-irc-bot
> > > > >
> > > > > Agenda
> > > > >
> > > > > 1. WWW2004 attendance (Eric)
> > > > >
> > > > > 2. Additional datasets (MacKenzie, Eric)?
> > > > >
> > > > > 3. Milestones document (MacKenzie)?
> > > > >
> > > > > 4. Infrastructure status (Stefano)
> > > > >
> > > > > 5. Development status (Ryan for Mark)
> > > > >
> > > > > Regrets: Mark
> > > > >
> > > > > Agenda at http://simile.mit.edu/wiki/MeetingAgenda15April2004
> > > > >
> > > > > Please edit and add on the wiki.
> > > > >
> > > > > --
> > > > > Ryan Lee ryanlee@w3.org
> > > > > W3C Research Engineer +1.617.253.5327
> > > > > http://web.mit.edu/simile/www/
Received on Thursday, 15 April 2004 15:25:37 UTC