Re: SIMILE PI phone conference, 01-August-2003 1200 EDT/1700 BST

Lynx dump of IRC Log: (html is attached)

   [INFO] Channel view for "[1]#simile" opened.
   === Highest connection count: 57 (56 clients)
   -->| YOU have joined [2]#simile
   =-= Topic for [3]#simile is "simile pi teleconf - em to be 10 min late
   :("
   =-=  Topic  for  [4]#simile  was set by em on Fri Aug 01 2003 08:55:45
   GMT-0700 (PDT)
   -->| marbut_ ([5]marbut@192.6.19.190) has joined [6]#simile
   |<-- marbut has left irc.w3.org (Connection reset by peer)
   [7]em dialing
   [8]marbut_ [9]http://www.oclc.org/research/projects/rdf_interop/index.shtm
   =-= em has changed the topic to "simile pi teleconf"
   [10]marbut_ [11]http://wip.dublincore.org/source.html
   [12]marbut_ [13]http://wip.dublincore.org:8080/interop/searchServlet
   [14]marbut_   KS:   working   on  two  things.  One is a memorandum of
   understanding  for support for the project, the other is reading about
   DOI to talk to John Ericsson about genesis
   [15]em [16]http://www.w3.org/2002/04/12-amico/
   [17]marbut_  em:  I'm  still  progressing  on  the  sample  data - see
   previous URL
   [18]marbut_  I've not heard back from the edutella folks nor the CIDOC
   folks
   [19]marbut_ There's a small collection of AMICO data available though
   [20]em [21]http://sh.webhire.com/servlet/av/jd?ai=631&ji=1274969&sn=I
   [22]marbut_  As  regards  the  hire,  we  are  online  and off the w3c
   homesite with a pointer to the W3C position
   [23]marbut_ Mark: Next item - staged demonstrators - any feedback?
   [24]marbut_  KS:  The  result  we'll  want  to follow on with the demo
   develop.
   [25]marbut_  Mark:  So  what's the best way to do the persistent store
   bit?
   [26]marbut_  KS: You can start with Jena and add stuff on it, or start
   with genesis, which has a slightly different api.
   [27]marbut_  There  are some limits on the complexity of the graphs in
   genesis, we need to do some more work on a higher level
   [28]marbut_ object api. We are working on this, but you need to figure
   if the higher level objects here are satisfactory.
   [29]marbut_  But  what  I anticipate you'll want to do is to start using
   Jena as a back end. That's how I anticipate it going.
   [30]marbut_  But in between, we'll try to make things compatible, this
   helps with the APIs
   [31]marbut_  we  have an alpha level implementation of the first level
   of  genesis abstraction, how distribution is done, differences between
   local and remote
   [32]marbut_  searches,  but  as I understood it distribution is not so
   important to the first demo
   [33]marbut_  so  I  was  planning  on  reserving  the  ability  to  do
   distribution, not implement distribution right now, although I have
   [34]marbut_ an implementation,
   [35]marbut_  em:  I think it's a good idea, it's a small, accomplishable
   demonstrator, we can use it to tease out the team interaction,
   [36]marbut_  it  gives  us some idea to compare Jena and Genesis. If I
   understand your diagrams, then some of the query / inference layers
   [37]marbut_ could be in the persistent store.
   [38]marbut_  In  the  OCLC project we did this by emacs, doing it with
   editors  might be interesting, but this seems scoped so we can have an
   early end date
   [39]marbut_ but I was hoping before Christmas.
   [40]marbut_ mark: I'm hoping to do this before the hires are in place.
   [41]marbut_  em:  let  me  offer  some lessons learnt from the OCLC
   project
   [42]marbut_  when  we  asked  for  the data, we didn't ask if we could
   publish it, or make it available to others
   [43]marbut_  we  need  to  make it clear that we want to make the data
   available, for other implementations,
   [44]marbut_ also there was a tremendous amount of data management that
   had to go on
   [45]marbut_  e.g.  xml  was invalid, we tried to get diverse datasets,
   but  we  still  had to do data cleanup, so we need to think about this
   also
   [46]marbut_  the  other  thing was picking your data, the focus was on
   diversity  of  datasets, since the datasets were so small the specific
   overlaps
   [47]marbut_  were  quite hard to tease out, so while the theory is good
   trying to integrate small collections of diverse data was hard
   [48]marbut_  because in practice no-one is going to search that stuff.
   We need to get complementary collections that
   [49]marbut_  do  have some overlap. I think the type of collections we
   are looking at are going to be better.
   [50]marbut_  The other thing we got burned on was performance. The way
   we did inference was more along the lines of OR-ing,
   [51]marbut_  but  the  performance  was very poor. For example imagine
   that rss.title is a subproperty of dc.title
   [52]marbut_ so say you want to search for dc.title="computers" then you
   search  for  all  the  resources  where  dc.title="computers"  or
   rss.title="computers"
   [53]marbut_ so it was done at the query level, not below, e.g. forward
   vs backward chaining.
   [54]marbut_  The  problem  was  that  with  1000  records  and  4  or  5
   subproperty  relations,  the  performance  became very slow, so
   responses were taking 6 or 7 secs
   [55]marbut_  so  the  last  thing we learned was this was a compelling
   example,  that  even with the delays, even with subproperty / equality
   relationships
   [56]marbut_ it was compelling for groups trying to integrate data from
   lots of collections.
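
The query-level expansion em describes (searching dc.title by OR-ing in
every subproperty such as rss.title) can be sketched as follows. This is
a minimal illustration, not the OCLC code: the in-memory triple list,
the SUBPROPERTY_OF table and the function names are all assumptions made
for the example.

```python
# Hypothetical sample data; identifiers are illustrative only.
TRIPLES = [
    ("doc1", "dc:title", "computers"),
    ("doc2", "rss:title", "computers"),
    ("doc3", "rss:title", "gardening"),
]

# child property -> parent property (rss:title is a subproperty of dc:title)
SUBPROPERTY_OF = {"rss:title": "dc:title"}


def expand_property(prop):
    """Return prop plus every direct or transitive subproperty of it."""
    found = {prop}
    changed = True
    while changed:
        changed = False
        for child, parent in SUBPROPERTY_OF.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found


def search(prop, value):
    """Match triples whose predicate is prop OR any of its subproperties."""
    props = expand_property(prop)
    return sorted(s for s, p, o in TRIPLES if p in props and o == value)


print(search("dc:title", "computers"))  # ['doc1', 'doc2']
```

Each extra subproperty adds one more alternative to every query, which
is consistent with the slowdown reported above as relations were added.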
   [57]marbut_ mark: does it use a specific query tool in Jena?
   [58]marbut_  em:  no,  it doesn't use rdql, before OCLC started to use
   Jena, it had a toolkit called EOR that was similar
   [59]marbut_  we  had  some  fancy  backend  table  representations for
   managing large scale triple stores
   [60]marbut_  e.g. s-p-o, the later one took Sergey Melnik's work, so we
   had routines that could work with a model or with a backend relational
   [61]marbut_  data store, and created an API that worked with database,
   that created SQL queries to run those over the database
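
As a rough illustration of an API that generates SQL over an s-p-o
table, the sketch below uses Python's sqlite3 module. It does not
reproduce EOR's schema or API, which are not shown in this log: the
table name, column names and build_query helper are assumptions.

```python
import sqlite3


def build_query(properties):
    """Generate a parameterised query with one OR branch per property."""
    clause = " OR ".join("p = ?" for _ in properties)
    return f"SELECT s FROM triples WHERE ({clause}) AND o = ? ORDER BY s"


# Hypothetical single-table subject/predicate/object layout.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("doc1", "dc:title", "computers"),
        ("doc2", "rss:title", "computers"),
        ("doc3", "rss:title", "gardening"),
    ],
)

props = ["dc:title", "rss:title"]  # dc:title plus its subproperty
sql = build_query(props)
rows = [row[0] for row in conn.execute(sql, props + ["computers"])]
print(sql)
print(rows)  # ['doc1', 'doc2']
```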
   [62]marbut_ em: i think lots of things were slowing this down,
   [63]marbut_ ks: I'm not sure how we can avoid doing ors
   [64]marbut_  em: I have some suggestions, but the project was focussed
   on getting something up
   [65]marbut_  it  got  a lot of interest, but it didn't move forward at
   OCLC
   [66]marbut_  one other lesson learnt, that gets back to genesis, there
   are  2 ways of viewing this - one of the areas we were exploring after
   that
   [67]marbut_  was  at  data ingestion time to add the inference, so you
   cache the inferences
   [68]marbut_ ks: that's the approach that haystack uses
   [69]marbut_  but  it  makes  it  harder  to  make on-the-fly changes to
   equivalence
   [70]marbut_  doing  it even adenine style means you have to do a batch
   update
   [71]marbut_  em: yes, tradeoffs either way - for the applications that
   oclc was dealing with, not seeing realtime results for
   [72]marbut_  changing  the mapping wasn't important, but of course you
   create a lot more data
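
The alternative discussed here, adding the inferred triples at ingestion
time so that queries need no OR expansion, might look like the sketch
below. It is an assumption-laden illustration rather than how Haystack,
Adenine or the OCLC project implement it: the store is a plain Python
set and SUBPROPERTY_OF is a hypothetical mapping.

```python
# child property -> parent property (hypothetical mapping)
SUBPROPERTY_OF = {"rss:title": "dc:title"}


def ingest(store, triple):
    """Add the triple plus every triple implied by its subproperty chain."""
    s, p, o = triple
    seen = set()  # guards against a cyclic mapping
    while p is not None and p not in seen:
        seen.add(p)
        store.add((s, p, o))
        p = SUBPROPERTY_OF.get(p)


def search(store, prop, value):
    """Plain lookup: no expansion needed because inferences are cached."""
    return sorted(s for s, p, o in store if p == prop and o == value)


store = set()
for t in [
    ("doc1", "dc:title", "computers"),
    ("doc2", "rss:title", "computers"),
    ("doc3", "rss:title", "gardening"),
]:
    ingest(store, t)

print(len(store))  # 5 stored triples for 3 ingested
print(search(store, "dc:title", "computers"))  # ['doc1', 'doc2']
```

The trade-offs raised in the discussion are visible even at this scale:
the store holds more triples than were ingested, and a change to
SUBPROPERTY_OF only takes effect after the data is re-ingested.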
   [73]marbut_  in this 3 month pilot, the majority of the time was spent
   data massaging
   [74]marbut_  ks: I think best way to do this would be to have built in
   support for contains
   [75]marbut_ ks: keyword search has been done though, it's the inference
   that causes the problem, but I'm not sure if I can think of a good way
   to do inference
   [76]marbut_ em: yes, but that's why it may be important. when we see it
   working, we may think of optimizations. It will tease out how
   [77]marbut_   to  merge  controlled  vocabularies  and  how  to  merge
   indices. So this is a useful scoped project to do this.
   |<-- marbut_ has left irc.w3.org (Client exited)

References

   1. irc://irc.w3.org:6665/%23simile
   2. irc://irc.w3.org:6665/%23simile
   3. irc://irc.w3.org:6665/%23simile
   4. irc://irc.w3.org:6665/%23simile
   5. mailto:marbut@192.6.19.190
   6. irc://irc.w3.org:6665/%23simile
   7. irc://irc.w3.org:6665/em,isnick
   8. irc://irc.w3.org:6665/marbut_,isnick
   9. http://www.oclc.org/research/projects/rdf_interop/index.shtm
  10. irc://irc.w3.org:6665/marbut_,isnick
  11. http://wip.dublincore.org/source.html
  12. irc://irc.w3.org:6665/marbut_,isnick
  13. http://wip.dublincore.org:8080/interop/searchServlet
  14. irc://irc.w3.org:6665/marbut_,isnick
  15. irc://irc.w3.org:6665/em,isnick
  16. http://www.w3.org/2002/04/12-amico/
  17. irc://irc.w3.org:6665/marbut_,isnick
  18. irc://irc.w3.org:6665/marbut_,isnick
  19. irc://irc.w3.org:6665/marbut_,isnick
  20. irc://irc.w3.org:6665/em,isnick
  21. http://sh.webhire.com/servlet/av/jd?ai=631&ji=1274969&sn=I
  22. irc://irc.w3.org:6665/marbut_,isnick
  23. irc://irc.w3.org:6665/marbut_,isnick
  24. irc://irc.w3.org:6665/marbut_,isnick
  25. irc://irc.w3.org:6665/marbut_,isnick
  26. irc://irc.w3.org:6665/marbut_,isnick
  27. irc://irc.w3.org:6665/marbut_,isnick
  28. irc://irc.w3.org:6665/marbut_,isnick
  29. irc://irc.w3.org:6665/marbut_,isnick
  30. irc://irc.w3.org:6665/marbut_,isnick
  31. irc://irc.w3.org:6665/marbut_,isnick
  32. irc://irc.w3.org:6665/marbut_,isnick
  33. irc://irc.w3.org:6665/marbut_,isnick
  34. irc://irc.w3.org:6665/marbut_,isnick
  35. irc://irc.w3.org:6665/marbut_,isnick
  36. irc://irc.w3.org:6665/marbut_,isnick
  37. irc://irc.w3.org:6665/marbut_,isnick
  38. irc://irc.w3.org:6665/marbut_,isnick
  39. irc://irc.w3.org:6665/marbut_,isnick
  40. irc://irc.w3.org:6665/marbut_,isnick
  41. irc://irc.w3.org:6665/marbut_,isnick
  42. irc://irc.w3.org:6665/marbut_,isnick
  43. irc://irc.w3.org:6665/marbut_,isnick
  44. irc://irc.w3.org:6665/marbut_,isnick
  45. irc://irc.w3.org:6665/marbut_,isnick
  46. irc://irc.w3.org:6665/marbut_,isnick
  47. irc://irc.w3.org:6665/marbut_,isnick
  48. irc://irc.w3.org:6665/marbut_,isnick
  49. irc://irc.w3.org:6665/marbut_,isnick
  50. irc://irc.w3.org:6665/marbut_,isnick
  51. irc://irc.w3.org:6665/marbut_,isnick
  52. irc://irc.w3.org:6665/marbut_,isnick
  53. irc://irc.w3.org:6665/marbut_,isnick
  54. irc://irc.w3.org:6665/marbut_,isnick
  55. irc://irc.w3.org:6665/marbut_,isnick
  56. irc://irc.w3.org:6665/marbut_,isnick
  57. irc://irc.w3.org:6665/marbut_,isnick
  58. irc://irc.w3.org:6665/marbut_,isnick
  59. irc://irc.w3.org:6665/marbut_,isnick
  60. irc://irc.w3.org:6665/marbut_,isnick
  61. irc://irc.w3.org:6665/marbut_,isnick
  62. irc://irc.w3.org:6665/marbut_,isnick
  63. irc://irc.w3.org:6665/marbut_,isnick
  64. irc://irc.w3.org:6665/marbut_,isnick
  65. irc://irc.w3.org:6665/marbut_,isnick
  66. irc://irc.w3.org:6665/marbut_,isnick
  67. irc://irc.w3.org:6665/marbut_,isnick
  68. irc://irc.w3.org:6665/marbut_,isnick
  69. irc://irc.w3.org:6665/marbut_,isnick
  70. irc://irc.w3.org:6665/marbut_,isnick
  71. irc://irc.w3.org:6665/marbut_,isnick
  72. irc://irc.w3.org:6665/marbut_,isnick
  73. irc://irc.w3.org:6665/marbut_,isnick
  74. irc://irc.w3.org:6665/marbut_,isnick
  75. irc://irc.w3.org:6665/marbut_,isnick
  76. irc://irc.w3.org:6665/marbut_,isnick
  77. irc://irc.w3.org:6665/marbut_,isnick



Butler, Mark wrote:

>Hi Team
>
>I made a mistake, the participant pin is 733650
>
>Toll Free Access Number:
>    866 276 8920
>UK FreeCall Access Number:
>   0800 073 8926
>
>Mark
>
>  
>
>>-----Original Message-----
>>From: Butler, Mark [mailto:Mark_Butler@hplb.hpl.hp.com]
>>Sent: 01 August 2003 11:34
>>To: www-rdf-dspace@w3.org
>>Subject: SIMILE PI phone conference, 01-August-2003 1200 EDT/1700 BST
>>
>>
>>SIMILE PI phone conference, 01-August-03 1200 EDT/1700 BST
>>
>>Toll Free Access Number:
>>   866 276 8920
>>UK FreeCall Access Number:
>>   0800 073 8926
>>Participant PIN:
>>   2536617
>>
>>Please join irc channel:
>>irc://irc.w3.org:6665/simile
>> 
>>Agenda:
>> 
>>1/ update, status, & next steps
>>
>>2/ Discussion: Proposal for staged demonstrators - background
>>
>>OCLC RDF-DC Interop Project
>>http://www.oclc.org/research/projects/rdf_interop/index.shtm
>>OCLC RDF-DC Interop CVS Repository
>>http://wip.dublincore.org/source.html
>>Proposal for staged development of demonstrator
>>http://lists.w3.org/Archives/Public/www-rdf-dspace/2003Jul/0039.html
>>Task Assignments for Demonstrator
>>(See enclosed document)
>>
>>3/ Any other business
>>
>>Dr Mark H. Butler
>>Research Scientist                HP Labs Bristol
>>mark-h_butler@hp.com
>>Internet: http://www-uk.hpl.hp.com/people/marbut/
>>


-- 
========================================================
   Kevin Smathers                kevin.smathers@hp.com    
   Hewlett-Packard               kevin@ank.com            
   Palo Alto Research Lab                                 
   1501 Page Mill Rd.            650-857-4477 work        
   M/S 1135                      650-852-8186 fax         
   Palo Alto, CA 94304           510-247-1031 home        
========================================================
use "Standard::Disclaimer";
carp("This message was printed on 100% recycled bits.");

Received on Friday, 1 August 2003 12:50:06 UTC