- From: Mark van Assem <mark@cs.vu.nl>
- Date: Fri, 07 Jan 2011 11:05:02 +0100
- To: "ZENG, MARCIA" <mzeng@kent.edu>
- CC: Karen Coyle <kcoyle@kcoyle.net>, Emmanuelle Bermes <emmanuelle.bermes@bnf.fr>, "public-xg-lld@w3.org" <public-xg-lld@w3.org>
Thanks all for the feedback! I've tried to address all your points in the value vocab description:

- "A dataset is a collection of structured metadata records"
- added some more "similar terms", including KOS, gazetteer, authority file, concept scheme
- "They are "building blocks" with which metadata records can be built."

Re Marcia's point [["For example, in digital gazetteers not only the place names are controlled but also the place features, type, coordinates, and even maps are included."]] I'm not sure I get what you mean by "also controlled", but I think indeed that this is the same as the VIAF situation: the values in a value vocabulary can be described with elements and values themselves, which would make them "datasets" also. However, we can still see VIAF as a value vocab and not a dataset, as its main role is to be a building block for metadata records.

Mark

On 6-1-2011 18:15, ZENG, MARCIA wrote:
> I like the way Karen used in terms of building block or not... Also
> agree with Jeff’s use of SKOS ‘concept scheme’ to define VIAF.
>
> * Regarding ‘data sets’: To me, the ‘data sets’ we are talking about
>   are structured data. Outside in other places ‘data sets’ could be
>   un-structured or semi-structured data (e.g., data.gov’s raw data
>   sets).
> * Regarding ‘value vocabularies’: In the conventional way we have
>   used “knowledge organization systems (KOS)” for concept schemes
>   (broader than “controlled vocabularies”). Most of the vocabulary
>   types are clear such as pick lists, taxonomies, thesauri, subject
>   headings. But there is a group of ‘metadata-like’ KOS such as
>   authority files and digital gazetteers. They are/can be
>   constructed as thesauri (e.g., The Getty Thesaurus of Geographic
>   Names (TGN) and Union List of Artist Names (ULAN)). Or, they can
>   be in other structures. It is the contents they include that made
>   them also be referred to ‘data sets’.
>   For example, in digital gazetteers not only the place names are
>   controlled but also the place features, type, coordinates, and even
>   maps are included. Digital gazetteers can be used alone as data sets
>   or be the value vocabularies used in structured data sets. This
>   might be like the VIAF situation, depending on how it is constructed
>   or on how it is used.
>
> My 2 cents.
> Marcia
>
> On 1/6/11 11:37 AM, "Karen Coyle" <kcoyle@kcoyle.net> wrote:
>
> Quoting Emmanuelle Bermes <emmanuelle.bermes@bnf.fr>:
>
> > As for myself, I do have a few more comments :
> > - I think the emphasis on value vocabs is too important in the current
> >   definition of dataset. It's actually creating confusion, in my view.
> > - I'm wondering if we could use the term "instance" (a dataset is a
> >   collection of instance descriptions) or is it too implementation
> >   oriented ?
>
> I'm not sure that the term "instance" will work -- even a value in a
> list could be considered an instance, no?
>
> Somehow, the concept for a dataset is that it consists of the
> descriptions of entities that you need for an application or function,
> rather than the building blocks for creating such a description.
> (Which gets back to Mark's statement about "A record for Derrida's
> book in dataset X ...")
>
> Essentially, one person's dataset could be another person's building
> block. But I think the key is that a dataset is complete for an
> application, while a value vocabulary needs to be combined with other
> data to be useful.
>
> No, I'm not satisfied with that explanation... I'll ruminate on this
> and see if I can find better words.
>
> kc
>
> > Emmanuelle
> >
> > On Thu, Jan 6, 2011 at 5:13 PM, Mark van Assem <mark@cs.vu.nl> wrote:
> >
> > > Hi Emma,
> > >
> > > I saw you had already followed up on our action to clarify "value
> > > vocabularies".
> > >
> > > I saw that you think we should clarify how value vocabularies actually
> > > appear in metadata records (as literals, codes, identifiers). While I kinda
> > > feel we should try to stay agnostic to that I kept it in, but rewrote it
> > > slightly:
> > >
> > > "In actual metadata records, the values used can be literals, codes, or
> > > identifiers (including URIs), as long as these refer to a specific concept
> > > in a value vocabulary. "
> > >
> > > I also moved your point re "closed list" up to the initial definition; this
> > > is indeed central to what a value vocab is.
> > >
> > > Mark.
> > >
> > >
> > > On 06/01/2011 16:34, Mark van Assem wrote:
> > >
> > >> Hi Jodi,
> > >>
> > >> X and Y would be two collections ("datasets") from two different
> > >> libraries. It could also be two subcollections or within one collection,
> > >> but I think making them separate ones will make it more illustrative.
> > >>
> > >> Do you have a suggestion on how to clarify or replace X and Y with
> > >> specific existing collections/libraries as examples?
> > >>
> > >> Mark
> > >>
> > >>
> > >> On 06/01/2011 16:21, Jodi Schneider wrote:
> > >>
> > >>> Thanks for this, Mark! I especially like the 'confusions' area -- that
> > >>> will make this quite useful.
> > >>>
> > >>> In this, it would be helpful if you'd explain what datasets X and Y
> > >>> might be. Particular collections? Subcollections of a larger whole?
> > >>> "in some cases records in a dataset are themselves used as values in
> > >>> other datasets. For example, Derrida wrote a book that comments on
> > >>> Heidegger's book "Sein und Zeit". A record for Derrida's book in dataset
> > >>> X can state this by relating it to a record for Heidegger's book in
> > >>> dataset Y. This statement in the Derrida record could consist of the
> > >>> Dublin Core Subject with as value a reference to the Heidegger record.
> > >>> In this case we would still term X and Y datasets, not a value
> > >>> vocabularies."
> > >>>
> > >>> -Jodi
> > >>>
> > >>> On 6 Jan 2011, at 08:00, Mark van Assem wrote:
> > >>>
> > >>>
> > >>>> Hi all,
> > >>>>
> > >>>> As per my action I have written some text [1] to explain the terms
> > >>>> "dataset, metadata element set, value vocabulary" with feedback from
> > >>>> Karen and Antoine to address the things that don't fit very nicely.
> > >>>>
> > >>>> Please let me know what you think, after I've had your input we'll put
> > >>>> it on the public list to get shot at.
> > >>>>
> > >>>> Mark.
> > >>>>
> > >>>> [1]
> > >>>> http://www.w3.org/2001/sw/wiki/Library_terminology_informally_explained#Vocabularies.2C_Element_sets.2C_Datasets
> > >>>>
> > >>>>
> > >>>> On 28/12/2010 18:40, Karen Coyle wrote:
> > >>>>
> > >>>>> I have been organizing the vocabularies and technologies on the
> > >>>>> archives cluster page [1] and it was a very interesting exercise
> > >>>>> trying to determine what category some of the "things" fit into.
> > >>>>> This could turn out to be a starting place for our upcoming
> > >>>>> discussion of our definitions since it has real examples. The hard
> > >>>>> part seems to be value vocabularies v. datasets, and I have a
> > >>>>> feeling that there will not be a clear line between them.
> > >>>>>
> > >>>>> kc
> > >>>>> [1]
> > >>>>> http://www.w3.org/2005/Incubator/lld/wiki/Cluster_Archives#Vocabularies_and_Technologies
> >
> > --
> > =====
> > Emmanuelle Bermès - http://www.bnf.fr
> > Manue - http://www.figoblog.org
>
> --
> Karen Coyle
> kcoyle@kcoyle.net http://kcoyle.net
> ph: 1-510-540-7596
> m: 1-510-435-8234
> skype: kcoylenet
Received on Friday, 7 January 2011 10:06:03 UTC