Account on Cologne informal meeting - 2010-12-01 from Antoine Isaac on 2010-12-01 (public-xg-lld@w3.org from December 2010)

From: Antoine Isaac <aisaac@few.vu.nl>
Date: Wed, 1 Dec 2010 22:25:11 +0100
To: public-xg-lld <public-xg-lld@w3.org>
Message-ID: <4CF6BD37.6020609@few.vu.nl>

Hi everyone,

Here is an account of the informal discussion we had in Cologne, following the SWIB conference. I have edited it quite a bit, thinking that parts of this account could be re-used as they are in the ongoing discussions on use case curation and terminology. I hope I have not misrepresented the thoughts of any of the participants!
The raw IRC text is at the end.

Huge thanks to Anette and Joachim for organizing all this!

Antoine

========================

Present: Alexander, Anette, Antoine, Guenther, Joachim, Kai

Topics:
1. Terminology
2. Use cases and clusters
3. Outcomes of the group - content of report

==== 1. Terminology
Based on Mark's mail: http://lists.w3.org/Archives/Public/public-lld/2010Nov/0121.html

Terminology choices should be guided by (in that order)
1. Ease of understanding for librarians
2. Compatibility with LD technical terminology
Also, we can't do without a definition and example of resources that are included in the groups.

We found some terms easier to rule out for various reasons: model, list, codes, SKOS, schema, terms.

For group 1, among the proposed expressions "value vocabulary" is the (relative) best.
Group 1 should also include person lists (VIAF), places, events. Certainly also reference sets of works (like Getty's CONA). In fact group 1 can contain pretty much anything, except what falls in Group 2 :-) It's hard to think of a dataset that couldn't possibly be used in group 1 one day. Belonging to that group appears to be based on usage, not on essential characteristics of what's described. In fact we are very tempted to use the word "dataset". We could not really choose between:
- "dataset" alone--on the condition that we explain the specific role that can be played by certain library datasets such as thesauri, authority lists, etc.
- "reference dataset"
- "organized dataset"

For group 2, everyone agrees that this is the set of resources that in the RDF world will be expressed as "RDF vocabularies" (as defined in the RDF schema spec), namely, classes and properties. However, using "classes and properties" straight away may be hard as a first term for librarians. Further, choosing that may result in ruling out those vocabularies/models/schemes that are not (yet) implemented as RDF vocabularies.
A proposal that did not meet strong objections is to use "metadata element set" and say in the definition that this will correspond in RDF to classes and properties.
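
To make the two groups concrete, here is a minimal sketch (an illustration added to this account, not something produced at the meeting). It assumes a toy description of one book as subject/predicate/object triples. The identifiers ex:book1, ex-person:p1 and ex-thes:economics are made up; dcterms:creator, dcterms:subject and bibo:Book are only used as familiar examples of properties and classes. The helper split_roles is hypothetical, and the sketch assumes every object is a URI (no literals).

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


RDF_TYPE = "rdf:type"


def split_roles(triples):
    """Split the terms used in a description by the role they play.

    Group 2 ("metadata element set"): properties, plus classes used with rdf:type.
    Group 1 ("value vocabulary"): resources used as values of the other properties.
    The split depends only on how a term is used, not on what it describes.
    """
    element_set = set()
    values = set()
    for t in triples:
        element_set.add(t.predicate)      # every predicate is a property
        if t.predicate == RDF_TYPE:
            element_set.add(t.obj)        # the object of rdf:type is a class
        else:
            values.add(t.obj)             # any other object plays the "value" role
    return element_set, values


# Hypothetical description of one book (all ex* identifiers are invented).
description = [
    Triple("ex:book1", RDF_TYPE, "bibo:Book"),
    Triple("ex:book1", "dcterms:creator", "ex-person:p1"),
    Triple("ex:book1", "dcterms:subject", "ex-thes:economics"),
]

element_set, values = split_roles(description)
```

Here the person and the thesaurus concept both land in group 1 simply because they are used as values, which matches the observation above that membership in group 1 is a matter of usage.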


==== 2. Use cases and clusters

*Citations cluster*
Kai presented the current state of curation work on that cluster.
We discussed the content of the cases and the more abstract use cases extracted for them.
Interestingly the cluster curators added a new case to the contributed case studies: they felt something was missing.

Extracting "real" use cases is crucial for our final report. These are the real application scenarios, and could be used to organize the XG report section on UC (the clusters help distribute the work, but they might be shaky and/or overlapping).
The scenarios are what we should derive application requirements from. And from these application requirements we can derive recommendations for future work.

*Authority cluster*
Alexander presented the current state of the Authority cluster. They opted for a different approach: abstracting from the concrete case studies, coming up with ideas on paper first and relating them to the cases afterwards.
They came up with two main categories of scenarios for library authority data:
- enhance processes within libraries
- serve applications outside the library context
The 1st scenario requires highly detailed data; the 2nd would require simpler metadata element sets.

Alex and Jeff feel that they should create new cases--or ask other clusters if they have appropriate cases--to come up with a better idea of the kind of metadata element set required.
We agreed, however, that there was no need to define an entire element set in this cluster (and in the XG more generally). Having only examples of elements would already be useful, to show what kind of data is needed.
Joachim has volunteered to help the Authority cluster.

We remarked that there needs to be interaction between vocabulary alignment and authority files, so that "subjects" come back into the picture and we do not have only authority lists.

*Goals*
http://www.w3.org/2005/Incubator/lld/wiki/Goals
There are doubts about the usefulness of general goals like PUBLISH. Re. publication of data, the priority should be to identify applications (why to do something), which influence the "how" of publication. Publishing alone is not a goal.
Perhaps "goals" is just not the right word for some of the items there. Some current "goals" correspond to what is done in a case.
This could still be useful: we should consider "howtos" in our outcome, with cases being related to these howtos, as illustrations of them. PUBLISH could define a category for some of these best practices.


==== 3. Outcomes

The discussion was focused on what we'd like to have as outcomes of the group, and the structure of the final report (http://www.w3.org/2005/Incubator/lld/wiki/index.php?title=FinalReportOutline&oldid=2239) considering these.
In particular, the "education and outreach" section seems too neglected in the current outline.
We should pay more attention to the benefits of LLD, emphasize the opportunities for stakeholders, both for libraries and the rest of the world. For example, from the perspective of the library community, the benefits of the collaboration aspect are crucial (collaborative cataloguing, for instance).
So there should be a "benefits" section, rather at the front of the report.
And one recommendation could be to do more strategic work, targeting managers.

Also, there should be more stuff on potential curricula in "education and outreach". Guenther will browse through the cases and compare them with what is in his own education material--trying to identify what is required in terms of qualifications for a "next generation librarian". Possibly recommending future collaboration on this education topic. He will create a specific wiki page on this: we already have presentation material (http://www.w3.org/2005/Incubator/lld/wiki/Presentations), but the focus is quite different.


========================


[09:20] == antoine [qw3birc@128.30.52.28] has joined #lld
[09:21] == anette [qw3birc@128.30.52.28] has joined #lld
[09:21] <antoine> Hi everyone!
[09:21] == gneher [chatzilla@194.95.250.30] has joined #lld
[09:21] <anette> Hi Antoine
[09:21] <antoine> So today will be informal...
[09:22] <antoine> We'll drop some lines trying to forward the main points and questions
[09:22] <antoine> but there won't be a full scribing
[09:22] <jneubert> Here in cologne are present Anette, Guenter, Antoine, Alexander, Kai and Joachim
[09:22] <antoine> Of course for the people around you can use the IRC to send questions
[09:24] <jneubert> two points on the informal agenda
[09:24] <jneubert> 1) use cases and clusters
[09:24] <jneubert> 2) outcomes of the group
[09:26] <jneubert> Alexander suggests to discuss terminology again and come to a descisioin of the group
[09:26] <jneubert> ... in order get the use cases in one common terminology
[09:29] <antoine> Guenther: teaching material?
[09:30] <jneubert> Guenter suggests to discuss outreach and how LLD stuff goes into curricula
[09:35] == danbri [danbri@87.210.48.176] has quit [Client exited]
[09:36] == kai [qw3birc@128.30.52.28] has quit [Ping timeout]
[09:37] <antoine> discussing http://lists.w3.org/Archives/Public/public-lld/2010Nov/0121.html
[09:38] <antoine> Group 1 and 2
[09:39] <antoine> Alexander: should we discuss instance data too?
[09:42] <anette> Group 2: Alexander: "model" is something that is not implemented. The vocabularies are implemented --> don't use model
[09:50] <emma> Hi everyone, I'll be around this morning, so thanks for informal scribing ;-)
[09:54] <antoine> hi!
[09:56] <anette> None of the terms in List 1 really describe what we mean. "Value vocabulary" ?
[09:59] <emma> +1 "value voc" : we may want to avoid terms that already have a (slightly different) meaning in our community ?
[10:05] <antoine> problem is that on linked data the role "group 1" may be played by any instance data set
[10:07] <anette> Joachim: KOS dataset - it is a dataset, but also used for organising knowledge
[10:09] <anette> Persons can be part of a dataset
[10:11] <anette> Alexander: a person cannot be part of a KOS - there cannot be a broader or narrower persons.
[10:14] <anette> Alexander: value vocabularies for lists, etc. but not authority data
[10:15] <anette> Kai: organized instance data?
[10:21] <anette> Antoine: Reference data - in French the understanding of the term is about authority
[10:24] <antoine> Organized instance data or reference instance data?
[10:25] <antoine> 1st step would be to extend the definitions of group 1 and group 2
[10:26] <antoine> group 1 should include also person lists (as in VIAF), places, events, events. Perhaps also reference sets of works (like Getty's CONA)
[10:28] <antoine> in fact we can't think of a dataset that could not *potentially* belong to that group
[10:28] <antoine> it is really connected to the use of the dataset, not what it is made of.
[10:28] <antoine> In fact we could use dataset...
[10:31] <antoine> s/dataset/"dataset"
[10:32] <jneubert> maybe "referenced dataset" could add the context in which it is used
[10:33] <antoine> kai: raise concern that if we go too broad we'd lose that distinction between values and the  rest
[10:34] <emma> What is the difference between a KOS and any dataset ? It's "organised" ?
[10:35] == danbri [danbri@130.37.28.132] has joined #lld
[10:37] <antoine> not so much difference in fact! It's rather in the way it's used...
[10:37] == kai [qw3birc@128.30.52.28] has joined #lld
[10:43] <antoine> group 2 is the stuff we would represent in RDF as classes and properties
[10:43] <emma> Maybe we could go for "datasets" as a terminology, and have a paragraph in the report where we explain the specific role played by KOSs in the LD environment
[10:43] <emma> Isn't group 2 list missing "medatadata terms" ?
[10:45] <antoine> @emma: re. group 1, this is indeed one of the options on the table (with "reference" and "organized" dataset)
[10:48] <antoine> Re. group 2: we don't like "term"
[10:49] <anette> "metadata" is too confusing
[10:53] <antoine> but it is understandable by librarians!
[10:54] <antoine> Consensus is to use "metadata XYZ" and say in the definition that this will correspond in RDF to classes and proprties
[10:55] <antoine> then what is XYZ?
[10:56] <emma> That was my next question ;-)
[10:57] <antoine> in RDF this is "RDF vocabulary" but that's really hard to get for librarians
[11:00] <antoine> At least there is no strong objection for "metadata element set" :-)
[11:02] <emma> I still don't like it, but if everyone else is ok...
[11:05] <antoine> <break>
[11:16] <anette> Discussion on use cases
[11:19] <antoine> Citations cluster
[11:19] <antoine> Definition of "citation"
[11:19] <antoine> Scope: what's in, what's out
[11:19] <antoine> E.g. description of a scientific dataset is out
[11:23] <antoine> A "cite" is different of a "link"
[11:31] <ww> maybe a counterexample for "person cannot be part of a KOS" -> personne juridique, organisational hierarchies, subsidiaries, etc. of course i agree if person is restricted to natural person
[11:32] <ww> ... even then it's still a stretch
[11:40] <antoine> <discussion on what the citations cases feature>
[11:43] <antoine> One case was added
[11:44] <antoine> from another cluster
[11:44] <antoine> -> which is a right thing to do
[11:46] <antoine> There are other related use cases: P20, chronicling america...
[11:46] <antoine> Then we try to extract a "real" use case
[11:50] <antoine> Are extracted use cases different or are they one variation of one unique use case?
[11:50] <antoine> -> kai: they are quite different
[11:51] <antoine> @ww: we're still discussing UCs now...
[12:02] == danbri [danbri@130.37.28.132] has quit [Client exited]
[12:07] <emma> I have to go, bye & thanks to scribes ;-)
[12:09] == emma [qw3birc@128.30.52.28] has quit [Quit: Page closed]
[12:10] <antoine> process of use cases
[12:10] <antoine> use case extraction is crucial
[12:11] <antoine> abstract use case should give the general organization of the report
[12:11] <antoine> we should derive application requirements from these scenarios
[12:12] <antoine> from these application requirements we can derive recommendation for future work
[12:12] <antoine> <authority file cluster>
[12:13] <antoine> trying to distinguish scenarios
[12:13] <antoine> library data to enhance processes within libraries
[12:13] <antoine> library data to serve applications outside the library contect
[12:14] <antoine> 1st scenario requires highly detailed data
[12:14] <antoine> 2nd scenario would require simpler metadata element sets
[12:16] <antoine> Jeff and I decided to free ourselves from the concrete studies
[12:16] <antoine> come with ideas on paper first
[12:17] <antoine> and relate to the cases after
[12:17] <antoine> perhaps create new cases --or ask other clusters if they have appropriate cases
[12:17] <antoine> Joachim has volunteered to help the Authority cluster
[12:23] <antoine> No need to define an entire element set in this XG
[12:23] <antoine> Exemples of elements alone would already be useful
[12:23] <antoine> (what kind of data element is needed)
[12:25] <antoine> There need to be interaction between vocabulary alignment and authority files
[12:26] <antoine> -> "subjects" are missing here!
[12:28] <antoine> <goals>
[12:28] <antoine> doubts about the usefulness of the thing
[12:33] <antoine> s/the thing/general goals like "publish"
[12:36] <antoine> next step will be applications
[12:36] <antoine> they will/should inflence the "how" of the publication
[12:38] <antoine> gneher: we need "publish" as a category for best practices
[12:39] <antoine> kei: perhaps "goals" is just not good
[12:39] <antoine> s/kei/kai
[12:41] <antoine> many current "goals" is the practice that is done in the case
[12:41] <antoine> we should consider "howtos" in our outcome
[12:42] <antoine> cases are relate to "howtos"
[12:43] <antoine> Anette: one best practice should be to ask first why to do something
[12:45] <antoine> <outcomes>
[12:45] <antoine> Alexander: we distinguish
[12:46] <antoine> 1: internal library community
[12:46] <antoine> 2: outreach to other communities
[12:50] <antoine> antoine: could we have this as specific sub-section in the rather loose "outreach" part?
[12:50] <antoine> emphasizing the opportunites for LLD, withing library community and beyond
[12:50] <antoine> reocmmending to do more strategic work (manager-level)
[12:50] <antoine> kai: two-way process: how about W3C?
[12:51] <antoine> alexander: for withing the community, the collaboration aspect is crucial (collaborative cataloguing, for instance)
[12:52] <antoine> s/withing/within
[12:53] <antoine> kai: we have to summarize the benefits for the stakeholders
[12:53] <antoine> antoine: like in the JISC report?
[12:54] <antoine> kai: benefits for the libraries, benefits for the world
[12:56] <antoine> jneubert: the education and outreach section does not capture that
[12:57] <antoine> there should be a "benefit" part
[12:59] <antoine> kai: put it in front!
[13:02] <antoine> gneher
[13:02] <antoine> : there should be more stuff on potential curricula in "education and outreach"
[13:02] <antoine> I will browse through the use cases
[13:03] <antoine> and compare them what is in my own education material
[13:03] == gneher [chatzilla@194.95.250.30] has quit [Ping timeout]
[13:03] <antoine>
[13:04] <antoine> for the "next generation librarian"
[13:04] <antoine> recommending future collaboration on the topics
[13:05] <antoine> s/topics/education
[13:05] <antoine> different kind of targets would be possible (computer science, material)
[13:08] <antoine> There will be a new wiki page on this
Received on Wednesday, 1 December 2010 21:24:38 UTC