- From: Mark van Assem <mark@cs.vu.nl>
- Date: Fri, 07 Jan 2011 11:05:02 +0100
- To: "ZENG, MARCIA" <mzeng@kent.edu>
- CC: Karen Coyle <kcoyle@kcoyle.net>, Emmanuelle Bermes <emmanuelle.bermes@bnf.fr>, "public-xg-lld@w3.org" <public-xg-lld@w3.org>
Thanks all for the feedback!
I've tried to address all your points in the value vocab description:
- "A dataset is a collection of structured metadata records"
- added some more "similar terms", including KOS, gazetteer, authority
file, concept scheme
- "They are "building blocks" with which metadata records can be built."
Re Marcia's point [["For example, in digital gazetteers not only the
place names are controlled but also the place features, type,
coordinates, and even maps are included."]]
I'm not sure I get what you mean by "also controlled", but I do think
this is the same as the VIAF situation: the values in a value
vocabulary can themselves be described with elements and values,
which would make them "datasets" as well. However, we can still see VIAF
as a value vocab and not a dataset, as its main role is to be a building
block for metadata records.
Mark
On 6-1-2011 18:15, ZENG, MARCIA wrote:
> I like the way Karen used in terms of building block or not... Also
> agree with Jeff’s use of SKOS ‘concept scheme’ to define VIAF.
>
> * Regarding ‘data sets’: To me, the ‘data sets’ we are talking about
> are structured data. Outside in other places ‘data sets’ could be
> un-structured or semi-structured data (e.g., data.gov’s raw data
> sets).
> * Regarding ‘value vocabularies’: In the conventional way we have
> used “knowledge organization systems (KOS)” for concept schemes
> (broader than “controlled vocabularies”). Most of the vocabulary
> types are clear such as pick lists, taxonomies, thesauri, subject
> headings. But there is a group of ‘metadata-like’ KOS such as
> authority files and digital gazetteers. They are/can be
> constructed as thesauri (e.g., The Getty Thesaurus of Geographic
> Names (TGN) and Union List of Artist Names (ULAN)). Or, they can
> be in other structures. It is the contents they include that made
> them also be referred to ‘data sets’. For example, in digital
> gazetteers not only the place names are controlled but also the
> place features, type, coordinates, and even maps are included.
> Digital gazetteers can be used alone as data sets or be the value
> vocabularies used in structured data sets. This might be like the
> VIAF situation, depending on how it is constructed or on how it is
> used.
>
> My 2 cents.
> Marcia
>
> On 1/6/11 11:37 AM, "Karen Coyle" <kcoyle@kcoyle.net> wrote:
>
> Quoting Emmanuelle Bermes <emmanuelle.bermes@bnf.fr>:
>
>
> > As for myself, I do have a few more comments :
> > - I think the emphasis on value vocabs is too important in the current
> > definition of dataset. It's actually creating confusion, in my view.
> > - I'm wondering if we could use the term "instance" (a dataset is a
> > collection of instance descriptions) or is it too implementation
> > oriented?
> >
>
>
> I'm not sure that the term "instance" will work -- even a value in a
> list could be considered an instance, no?
>
> Somehow, the concept for a dataset is that it consists of the
> descriptions of entities that you need for an application or function,
> rather than the building blocks for creating such a description.
> (Which gets back to Mark's statement about "A record for Derrida's
> book in dataset X ...")
>
> Essentially, one person's dataset could be another person's building
> block. But I think the key is that a dataset is complete for an
> application, while a value vocabulary needs to be combined with other
> data to be useful.
>
> No, I'm not satisfied with that explanation... I'll ruminate on this
> and see if I can find better words.
>
> kc
>
> > Emmanuelle
> >
> > On Thu, Jan 6, 2011 at 5:13 PM, Mark van Assem <mark@cs.vu.nl> wrote:
> >
> > > Hi Emma,
> > >
> > > I saw you had already followed up on our action to clarify "value
> > > vocabularies".
> > >
> > > I saw that you think we should clarify how value vocabularies actually
> > > appear in metadata records (as literals, codes, identifiers). While I kinda
> > > feel we should try to stay agnostic to that I kept it in, but rewrote it
> > > slightly:
> > >
> > > "In actual metadata records, the values used can be literals, codes, or
> > > identifiers (including URIs), as long as these refer to a specific concept
> > > in a value vocabulary."
> > >
> > > I also moved your point re "closed list" up to the initial definition; this
> > > is indeed central to what a value vocab is.
> > >
> > > Mark.
> > >
> > >
> > > On 06/01/2011 16:34, Mark van Assem wrote:
> > >
> > >> Hi Jodi,
> > >>
> > >> X and Y would be two collections ("datasets") from two different
> > >> libraries. It could also be two subcollections or within one collection,
> > >> but I think making them separate ones will make it more illustrative.
> > >>
> > >> Do you have a suggestion on how to clarify or replace X and Y with
> > >> specific existing collections/libraries as examples?
> > >>
> > >> Mark
> > >>
> > >>
> > >> On 06/01/2011 16:21, Jodi Schneider wrote:
> > >>
> > >>> Thanks for this, Mark! I especially like the 'confusions' area -- that
> > >>> will make this quite useful.
> > >>>
> > >>> In this, it would be helpful if you'd explain what datasets X and Y
> > >>> might be. Particular collections? Subcollections of a larger whole?
> > >>> "in some cases records in a dataset are themselves used as values in
> > >>> other datasets. For example, Derrida wrote a book that comments on
> > >>> Heidegger's book "Sein und Zeit". A record for Derrida's book in dataset
> > >>> X can state this by relating it to a record for Heidegger's book in
> > >>> dataset Y. This statement in the Derrida record could consist of the
> > >>> Dublin Core Subject with as value a reference to the Heidegger record.
> > >>> In this case we would still term X and Y datasets, not value
> > >>> vocabularies."
> > >>>
> > >>> -Jodi
> > >>>
> > >>> On 6 Jan 2011, at 08:00, Mark van Assem wrote:
> > >>>
> > >>>
> > >>>> Hi all,
> > >>>>
> > >>>> As per my action I have written some text [1] to explain the terms
> > >>>> "dataset, metadata element set, value vocabulary" with feedback from
> > >>>> Karen and Antoine to address the things that don't fit very nicely.
> > >>>>
> > >>>> Please let me know what you think, after I've had your input we'll put
> > >>>> it on the public list to get shot at.
> > >>>>
> > >>>> Mark.
> > >>>>
> > >>>> [1] http://www.w3.org/2001/sw/wiki/Library_terminology_informally_explained#Vocabularies.2C_Element_sets.2C_Datasets
> > >>>>
> > >>>>
> > >>>> On 28/12/2010 18:40, Karen Coyle wrote:
> > >>>>
> > >>>>> I have been organizing the vocabularies and technologies on the
> > >>>>> archives cluster page [1] and it was a very interesting exercise trying
> > >>>>> to determine what category some of the "things" fit into. This could turn
> > >>>>> out to be a starting place for our upcoming discussion of our
> > >>>>> definitions since it has real examples. The hard part seems to be value
> > >>>>> vocabularies v. datasets, and I have a feeling that there will not be a
> > >>>>> clear line between them.
> > >>>>>
> > >>>>> kc
> > >>>>> [1] http://www.w3.org/2005/Incubator/lld/wiki/Cluster_Archives#Vocabularies_and_Technologies
> > >>>>>
> > >>>>
> > >>>
> > >>
> >
> >
> > --
> > =====
> > Emmanuelle Bermès - http://www.bnf.fr
> > Manue - http://www.figoblog.org
> >
>
>
>
> --
> Karen Coyle
> kcoyle@kcoyle.net http://kcoyle.net
> ph: 1-510-540-7596
> m: 1-510-435-8234
> skype: kcoylenet
>
>
>
Received on Friday, 7 January 2011 10:06:03 UTC