IRC Log / SIMILE Investigators Call Content 05.23.2003

We did a good job again this week of capturing the content of the call.

Thanks to Rob Tansley for agreeing to Chair, and to all the participants.

Content Follows.

- Mick

=======

<mickBass> attending: John Erickson, Mark Butler, Mick Bass, Rob
Tansley, Kevin Smathers

<mickBass> attending: MacKenzie Smith

<mickBass> MS: history system use cases

<mickBass> MS: tracking actions, all the examples that MS has thought
of would need

<mickBass> Jason Kinner joins

[12:05] <mickBass> MS: thinking about typical queries against history
system

[12:06] <mickBass> MS: descriptive note 2.3 talks about actions, and
need to track them

[12:06] <mickBass> MS: examples for possible queries would require
that

[12:06] <mickBass> MJB: request send sample queries to me for incorp
into History sys use cases

[12:06] <mickBass> MJB: in research drivers document

[12:07] <mickBass> MS: need things like "for every item that contains
mimetype X, when was it last modified"

[12:07] <mickBass> JK: a few comments on the list regarding this issue

[12:07] <mickBass> MS: concerned that hist system includes required
info

[12:08] <mickBass> MS: descriptive note doesn't talk much about what
data is collected at each action

[12:08] <mickBass> JK: some of that info maintained via harmony ABC.

[12:08] <mickBass> JK: that type of query would be difficult not
because of missing data, but because of current lack of type info

[12:08] <mickBass> JK: really need to be able to select a typed
"Modification Event"

[12:08] <mickBass> MS: other examples

[12:09] <mickBass> MS: show all transformations of bitstream of type X
since it was originally created

[12:09] <mickBass> MS: has a transformation of a certain kind been
done on a particular bitstream, for any item in the system

[12:10] <mickBass> JK: restriction of current hist sys is support for
only 3 kinds of events.  Should probably be base classes for more
descriptive event types.  For example, bitstream transform as a
specialization of the modify action

[12:10] <mickBass> JK: these specializations could be introduced to
make queries easier
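
A minimal sketch of the specialization JK describes, using MS's sample query from 12:07 ("for every item that contains mimetype X, when was it last modified"). The class names, fields, and the last_modified helper are assumptions made for illustration; they are not taken from the history system or the Harmony ABC model.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Event:
    """Generic history event (hypothetical model, not the DSpace schema)."""
    item_id: str
    mimetype: str
    timestamp: datetime


class ModificationEvent(Event):
    """A typed modification event that a query can select on."""


class BitstreamTransformEvent(ModificationEvent):
    """Specialization of a modification: a bitstream was transformed."""


def last_modified(events, mimetype):
    """Map each item with the given mimetype to its latest modification time."""
    latest = {}
    for ev in events:
        # Subclasses such as BitstreamTransformEvent also count as modifications.
        if isinstance(ev, ModificationEvent) and ev.mimetype == mimetype:
            if ev.item_id not in latest or ev.timestamp > latest[ev.item_id]:
                latest[ev.item_id] = ev.timestamp
    return latest


events = [
    Event("item-1", "application/pdf", datetime(2003, 1, 10)),
    ModificationEvent("item-1", "application/pdf", datetime(2003, 2, 1)),
    BitstreamTransformEvent("item-1", "application/pdf", datetime(2003, 3, 5)),
    ModificationEvent("item-2", "text/html", datetime(2003, 4, 1)),
]

result = last_modified(events, "application/pdf")
print(result)
```

Because the transform event is a subclass of the modification event, the same query picks it up without listing every specialized type, which is the sense in which specializations could make queries easier.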


[12:11] <mickBass> MS: most existing examples in use case run against
the metadata

[12:11] <mickBass> MS: this may be rare - we wouldn't typically do
that

[12:12] <mickBass> RT: issue of how the change of state of one object
affects the state of other related objects, orthogonal to this
discussion

[12:12] <mickBass> RT: Do we have an open issues list for the hist sys
descriptive note, or hist sys in general

[12:13] <mickBass> MJB: Jason and I discussing this AM

[12:13] <mickBass> JK: appropriate.  Need to distinguish between general
open, "open-ended" issues, and those that directly impact current SOW

[12:13] <mickBass> JK: as we review design documentation, should
clarify scope for SOW, but interested in raising higher level issues
as well

[12:14] <mickBass> RT: which issues affect current work e.g. "last
call"

[12:14] <mickBass> JK: 1. URI model for encoding resources within the
history system data

[12:14] <mickBass> this is biggest issue because next doc proposes
schema.  will have time to talk about it, so other issues wrt RDF
modelling will come out then.

[12:15] <mickBass> JK: first doc - current state, what's wrong, what
can be improved, basis for moving forward

[12:15] <mickBass> JK: next doc - proposal

[12:15] <mickBass> RT: can we include sample graphs

[12:15] <mickBass> JK: yes, in plan

[12:15] <mickBass> RT: URI model as main outstanding issue

[12:16] <mickBass> MB: URI resolution

[12:16] <mickBass> RT: any others?

[12:16] <mickBass> MB: perhaps resolution

[12:16] <mickBass> JK: assert, for Hist sys purposes, URIs need not
be resolvable

[12:16] <mickBass> JK: other systems may find it convenient if Hist
Sys URIs are resolvable

[12:17] *** MacKenzie (chatzilla@18.51.1.92) has joined channel
#simile

[12:17] <mickBass> JK: But Hist Sys requirement is to merge data such
that useful queries can be formed and executed

[12:17] <mickBass> RT: assent

[12:17] <mickBass> JK: heard: cool if URIs resolve to something

[12:18] <mickBass> JK: heard: cool if schema URIs resolve to a schema
defn

[12:18] <mickBass> RT: some overall SemWeb arch questions.  In scope?

[12:19] <mickBass> JK: plan.  Commit to handle sys URI scheme to
produce globally unique URIs, but not to support resolution of those
URIs in any way within the history system.

[12:19] <mickBass> RT: understand there won't be many handles.

[12:19] <mickBass> JK: harmony situation is state of being of a
resource

[12:20] <mickBass> JK: may want to be able to resolve the situation
URI to the object as it existed at that time

[12:20] <mickBass> JK: one resolution path may be a metadata (RDF)
reference that returns subgraph

[12:20] <mickBass> MB: clarify how many handles?

[12:21] <mickBass> JK: where handles are assigned is a big question

[12:21] <mickBass> JK: haven't completely determined

[12:22] <mickBass> JK: try to review decision criteria to consider
during implementation

[12:22] <mickBass> JK: tend to use handles first, especially for items
that may be resolvable in the future

[12:22] <mickBass> JK: use URN for objects that have no hope of ever
being resolvable (e.g. Database ID)

[12:22] <mickBass> JK: otherwise would tend towards handles over URNs
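
A minimal sketch of the rule of thumb JK outlines above: prefer handles, and fall back to a URN only for objects with no hope of ever being resolvable. The "hdl:" prefix, the "urn:x-history:" namespace, and the function and argument names are placeholders chosen for illustration; the actual URI format was still an open issue on this call.

```python
def choose_uri(kind, local_id, never_resolvable):
    """Pick an identifier scheme for a history-system resource.

    The prefixes below are hypothetical placeholders, not the
    history system's actual URI scheme.
    """
    if never_resolvable:
        # e.g. a database ID that will never be dereferenced
        return f"urn:x-history:{kind}:{local_id}"
    # Default: a handle used purely as a name, possibly resolvable later
    return f"hdl:{local_id}"


item_uri = choose_uri("item", "1721.1/1234", never_resolvable=False)
row_uri = choose_uri("dbid", "42", never_resolvable=True)
print(item_uri)
print(row_uri)
```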

[12:23] <mickBass> JK: URLs... this issue compounded.  People expect
URLs to resolve to something.  OK people may expect handles to resolve
to something as well.  But opportunity to take advantage of the
current ambiguity in handle usage.

[12:24] <mickBass> JK: can confer with CNRI.  But feel using handles
solely for naming wouldn't violate any axioms of handle system.

[12:24] <mickBass> RT: summarizing 1) URI Format 2) Resolution

[12:24] <mickBass> RT: 1) to be resolved (no pun intended), 2) defer

[12:24] <mickBass> RT: others?

[12:25] <mickBass> JK: I don't think so.  One reason for last call.
Do others see big issues?

[12:25] <mickBass> RT: given iterative design and discussions on next
doc, I'm happy with that

[12:25] <mickBass> MS: little things.  Question about resources.  List
of namespaces in 2.1 for different resources

[12:25] <mickBass> MS: itemized and described in 2.2

[12:25] <mickBass> MS: bundles in description but not in namespace

[12:26] <mickBass> JK: in current implementation not used at all.

[12:26] <mickBass> MS: isn't there an eperson group resource?

[12:27] <mickBass> RT: epeople in the history system are agents.
Groups can't be an agent directly.

[12:27] <mickBass> RT: so this state doesn't need to be there?

[12:27] <mickBass> MS: clarify.. Use handle uri scheme for handles
generated by history system (3.2.1)

[12:28] <mickBass> MB: should say "for URIs generated by"

[12:28] <mickBass> JK: will correct

[12:28] <mickBass> MS: only other question about the three actions,
probably need more specialization.

[12:28] <mickBass> MS: maybe an example would help

[12:29] <mickBass> JK: back to eperson group for a moment

[12:29] <mickBass> JK: on reviewers and submitters

[12:29] <mickBass> JK: in designing new schema, trying to figure out
how to address reviewers, submitters

[12:29] <mickBass> JK: considering a typed collection approach

[12:30] <mickBass> JK: considering eperson collection subclass of RDF
sequence but whose range is an eperson

[12:30] <mickBass> RT: need an example

[12:30] <mickBass> JK: reviewers and submitters

[12:30] <mickBass> RT: but where does it fit, as metadata about a
collection

[12:30] <mickBass> JK: yes

[12:30] <mickBass> RT: might be good to use the name as used in DSpace

[12:31] <mickBass> JK: confused by semantics of reviewers.  [lost]

[12:31] <mickBass> RT: if RDF has a standard means of expressing a
collection, it might be good to use that

[12:31] <mickBass> RT: ... collection of agents...

[12:32] <mickBass> JK: if we use RDF collections, need to establish
more.  are they sequential, prioritized, etc.  need more attributes
perhaps.

[12:32] <mickBass> JK: rob, please think about it.

[12:32] <mickBass> RT: dspace object model orthogonal to harmony stuff

[12:33] <mickBass> JK: yes but I will have to model changes to the
metadata, so need to consider it.

[12:33] <mickBass> RT: others?

[12:33] <mickBass> MJB: what about bitstream format

[12:34] <mickBass> MS: have property called UserTypeDescription

[12:34] <mickBass> RT: 2.4.3 bitstream.  Properties bitstream_type_id

[12:34] <mickBass> RT: that's the ID in the DSpace bitstream format
registry

[12:34] <mickBass> MS: what is usertypedesc

[12:34] <mickBass> RT: if format is unrecognized, we just gather free
text

[12:35] <mickBass> RT: description is a logical description (main
article, front page)

[12:36] <mickBass> JK: approach is to take id and rename to URI

[12:36] <mickBass> JK: bitstream type ID -> bitstream type, then could
be annotated

[12:36] <mickBass> RT: bitstream format is a first class object,
deserves to be modelled

[12:37] <mickBass> RT: should be captured by the history system, added
to your section 2.2

[12:37] <mickBass> RT: it's analogous to bundles, should be added

[12:37] <mickBass> MS: if speaking of bitstream formats as first class
objects

[12:37] <mickBass> MS: missing - way of expressing policies about
resources

[12:37] <mickBass> MS: preservation systems need a way of recording
policies about lots of properties

[12:37] <mickBass> MS: ..

[12:38] <mickBass> MS: we are actively working with many players to
define global format registry, which will assign globally unique IDs
to formats.  Hope that can be canonical.

[12:38] <mickBass> MS: so we can hook up to global format registry

[12:38] <mickBass> MS: this should happen in next year or so

[12:39] <mickBass> JK: would be useful if these could be encoded as
URIs, along same lines as DC defines formats like MESH for
identifiers.  i.e. provide a controlled vocab that you can use in RDF
models.

[12:39] <mickBass> MS: will put forward.  Do what I can.

[12:39] <mickBass> KS: creating URIs less interesting than creating
URIs that resolve to something.

[12:40] <mickBass> KS: may be better to have a mime-type, not a URI,
where URI resolves to mime-type

[12:40] <mickBass> RT: process check, shall we move along?

[12:40] <mickBass> RT: other issues on desc note?

[12:40] <mickBass> JK: mick can you distribute the URL with the notes
for this call

[12:41] <mickBass> MB: help with log

[12:41] <mickBass> RT: I can help

[12:41] <mickBass> RT: moving on.

[12:41] <mickBass> RT: feedback on prototype demonstrators strawman
document

[12:41] <mickBass> RT: comments on existing demonstrators?

[12:41] <mickBass> MS: trivial question

[12:42] <mickBass> MS: under in scope demonstrators, we discuss which
use cases are relevant

[12:42] <mickBass> MS: none of them mention the learning object model
use case

[12:42] <mickBass> RT: most of work in descs, may need to update that
field

[12:43] <mickBass> MB: renamed from OCW to Learning Objects, doc
doesn't reflect

[12:43] *** kevins2 (chatzilla@192.6.19.104) has joined channel
#simile

[12:43] <mickBass> MS: 2.4, not filled in

[12:43] <mickBass> MS: what does it mean

[12:44] <Rob> MJB: One of the key dimensions is how to interoperate
metadata from different sources


[12:44] *** JasonKinner (jason_kinn@66.100.194.29) has joined channel
#simile

[12:44] <Rob> MJB: Talking to Kevin about how to move metadata around
to achieve this


[12:45] <Rob> MJB: OAI harvesting protocol is one approach


[12:45] <Rob> MJB: This is intended to provoke thought about this


[12:45] <Rob> MS: This is just about the metadata?


[12:46] <Rob> MS: There could be overlap between this and
dissemination architecture


[12:46] <Rob> MS: Distribution in this case means distributed data
between several parties, not 'me distributing to you'


[12:47] <mickBass> KS: it's the CS variant of "distribution"


[12:48] <mickBass> MB: enough definition for hiring?


[12:48] <mickBass> MS: have learned about what to look for wrt skills,
but still need roles/resp.  It's not a project plan.

[12:49] <mickBass> MB: agree, but may need to move in parallel?

[12:49] <mickBass> MB: definitely a tension to manage

[12:49] <mickBass> JSE: question - nothing in set of demonstrators
that deals with issues surrounding policies.

[12:50] <mickBass> JSE: would it be appropriate to articulate that.

[12:50] <mickBass> JSE: haven't brought it up because not sure in
scope or people cared

[12:50] <mickBass> MS: couldn't quite hear?

[12:50] <mickBass> JSE: do we want to have a demonstrator about policy
expression

[12:51] <mickBass> MS: is coming up at all my meetings

[12:51] <mickBass> MS: including rights but also preservation policies

[12:51] <mickBass> MS: no existing policy expression language, some
rights expr languages but don't do what we need

[12:51] <mickBass> JSE: some candidates do exist, but mostly agree

[12:52] <mickBass> JSE: take offline, between MS, JSE, MJB.  Flesh out
how to roll in.

[12:52] <mickBass> MS: not sure if it's in scope

[12:52] <kevins2> rob, actually distribution in the programming sense
rather than in the publisher sense of a warehousing and bulk order
fulfillment.

[12:53] <mickBass> MS: action - email prep, then connect at CNRI meet

[12:53] <mickBass> JSE: best resource is policy expression language
called PONDER

[12:53] <mickBass> MS: OK, offline

[12:54] <mickBass> MB: good body of work that is in scope

[12:54] <mickBass> RT: other comments

[12:54] <mickBass> RT: prioritizing / ordering / dependencies?

[12:55] <mickBass> RT: e.g. 2.1 gather schemas / instances before 2.2
interoperate

[12:55] <mickBass> MB: need input from PIs on these prototypes.
Basically Rob, Mark, Mick have sourced them.

[12:55] <mickBass> MB: need to stay brainstorming before prioritizing.

[12:56] <mickBass> MB: preference to delay prioritizing.

[12:57] <mickBass> MS: what about haystack demo

[12:57] <mickBass> MB: see 2.9

[12:57] <mickBass> oops MJB: see 2.9

[12:57] <mickBass> RT: see 2.5

[12:57] <mickBass> MB: should add to issue list - proposed
demonstrators from EM, Karger.

[12:59] <mickBass> MB: ok to post? need to get individuals in place

[13:00] <mickBass> MS: if I post now, need EM and Karger's individuals
basically at same time.

[13:00] <mickBass> RT: is that a wrap?

[13:01] <mickBass> MB: need full court press on hiring

[13:01] <mickBass> MS: on history system use case

[13:01] <mickBass> MS: don't see curators wanting to ask these
questions

[13:01] <mickBass> MS: really want it in OAIS-oriented use of history
data

[13:01] <mickBass> MS: more needs in provenance, how objects
transformed over time.

[13:01] <mickBass> MS: think about 4.5.4.1.

[13:02] <mickBass> MS: may want to take some of those out

[13:02] <mickBass> MS: I'll send my examples, let's think together
about how to refine

[13:02] <mickBass> MS: out next week.

[13:03] <mickBass> RT: retain call with those available

[13:03] <mickBass> KS: 2 questions on old business

[13:03] <mickBass> KS: plenary dates fixed

[13:03] <mickBass> MS: 23, 24 held

[13:04] <mickBass> MS: ^^June

[13:04] <mickBass> KS: CNRI meeting, next week?

[13:04] <mickBass> MS: on June 12

[13:04] <mickBass> JSE: note to confirm sent yesterday

[13:05] <mickBass> KS: preference to participate by phone

[13:05] <mickBass> RT: anything else?

[13:05] <mickBass> MS: Jason coming on 12th?

[13:06] <mickBass> MB: still resolving with Karger

=============================================
Mick Bass


Manager
Research and Business Development
HP Laboratories
Hewlett-Packard Company
1 Cambridge Center
Cambridge, MA 02142


617.551.7634 office    617.551.7650 fax
617.899.3938 mobile    617.627.9694 residence
bass@alum.mit.edu      mick_bass@hp.com
=============================================
 

Received on Friday, 23 May 2003 13:25:43 UTC