Re: Intrinsic vs extrinsic metadata (my action #54)

Hi Bernadette,

The metadata classification scheme is very interesting.

I have a question about why you defined distribution differently from,
for example, license. In the same way that a data/dataset has a
distribution that has metadata, a data/dataset has a license that has
metadata. Why is distribution not simply a metadata type?

Another thing that I think could be represented in the diagram is the
set of relationships that can exist among the different pieces of
data/dataset metadata. That is, metadata can be related to other metadata.
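
The two points above can be sketched in a few lines of Python. This is
only an illustration under assumptions, not something taken from the
diagram: the class names MetadataItem and Dataset, the kind labels and the
relation name "governs" are all hypothetical. The sketch treats
distribution and license uniformly as metadata types, lets each carry its
own properties, and records a relationship between two metadata items.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataItem:
    """One piece of metadata; 'kind' is a free-text type label (hypothetical)."""
    kind: str
    properties: dict = field(default_factory=dict)
    # Relations to other metadata items, stored as (relation name, target item).
    relations: list = field(default_factory=list)

    def relate(self, name, target):
        """Record a named relation from this metadata item to another one."""
        self.relations.append((name, target))


@dataclass
class Dataset:
    title: str
    metadata: list = field(default_factory=list)

    def of_kind(self, kind):
        """Return all metadata items attached to the dataset with this kind."""
        return [m for m in self.metadata if m.kind == kind]


# Distribution and license are modelled the same way: both are metadata
# items attached to the dataset, and both carry their own properties.
license_md = MetadataItem("license", {"name": "example open licence"})
distribution_md = MetadataItem("distribution", {"format": "CSV"})

# A relationship between two metadata items (the name "governs" is made up).
license_md.relate("governs", distribution_md)

dataset = Dataset("example dataset", [license_md, distribution_md])
```

In this reading, nothing structural distinguishes a distribution from a
license; whether the diagram should keep them distinct is exactly the
open question.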

Thank you.

Best regards,
Laufer


2014-07-01 20:51 GMT-03:00 Bernadette Farias Lóscio <bfl@cin.ufpe.br>:

> Hi Mark,
>
> Thanks again for the explanation! These examples are really helpful for
> the understanding of the role of the different types of metadata.
> I think that examples like these will be very useful to illustrate the
> best practices. Once we have some feedback from the group, it would be nice
> to update the wiki page with the diagrams together with a brief explanation
> and an example for each type of metadata. What do you think?
>
> I'm attaching a PDF version of the updated diagram. Since I am
> using PowerPoint to create the diagrams, I am including the PPT version as
> well. If you have suggestions for other tools that may help the
> collaborative work, please let me know.
>
> It has been a great discussion! Thanks!
>
> kind regards,
> Bernadette
>
>
>
>
> 2014-07-01 20:09 GMT-03:00 Mark Harrison <mark.harrison@gs1.org>:
>
>>  Hi Bernadette,
>>
>>  Thanks for the further discussion and updates to your diagram.
>>
>>  I also like the vertical continuum in the other diagram to express how
>> intrinsic / extrinsic these different kinds of metadata are.
>>
>>  I'd say that scope and granularity are distinct and not
>> interchangeable.
>> Scope defines the dimensions and location of the 'bounding box' or
>> 'envelope' in time and space, whereas granularity is a measure of how many
>> sample points there are *within* that bounding box or envelope.
>>
>>  A simple example could be weather observation data, where the scope
>> defines that the dataset has a coverage of the United Kingdom for the month
>> of June 2014 and the granularity is dependent on how closely spaced the
>> weather observation stations are and how frequently a new data point is
>> recorded for wind speed, rainfall, barometric pressure etc. - e.g. is it
>> per day, per hour, per minute or per second?  They both have temporal and
>> geospatial dimensions, but I'd redraw that part of the diagram like this.
>>
>>  By the way - just a suggestion:  can we try to export any diagrams like
>> this as vector graphics, either in SVG or PDF?  That makes it much easier
>> for us all to make modifications, rather than having to
>> kludge bitmap modifications in Photoshop or GIMP.
>>
>>  Best wishes,
>>
>>  - Mark
>>
>>
>>
>> On 1 Jul 2014, at 22:52, Bernadette Farias Lóscio <bfl@cin.ufpe.br>
>> wrote:
>>
>> Hi Mark,
>>
>> Thank you very much for your explanation!
>>
>> After reading your examples, I agree with you that scope is an intrinsic
>> property, since it provides a better understanding of the meaning of
>> the data itself (this was my initial idea about intrinsic metadata).  In
>> the Data on the Web context, structural information is not enough to
>> provide the semantics of the data; we need more information, like the scope
>> of the data.
>>
>> Instead of removing the classification, I suggest having two categories
>> of intrinsic metadata: scope/granularity and structural. Do you think that
>> scope and granularity can be considered together as a single category?
>>
>> I also agree that "these characteristics really fit on a sliding scale
>> between Very Intrinsic and Very Extrinsic, with some middle ground in
>> between". I created a figure that tries to illustrate this idea. The figure
>> is attached.
>>
>> I'm also attaching another version of the diagram with the idea of a
>> new classification.
>>
>> Yes, this discussion is very interesting and it is really important for
>> best practices identification and definition :)
>>
>> Thanks again!
>>
>> Kind regards,
>> Bernadette
>>
>>
>>
>>
>> 2014-07-01 17:51 GMT-03:00 Mark Harrison <mark.harrison@gs1.org>:
>> Hello Bernadette,
>>
>> Thanks for your updated diagram.
>>
>> I don't mind if we have slightly different opinions about where to draw
>> the boundary between 'intrinsic' and 'extrinsic'.
>>
>> We both agree that structural metadata (what kind of data is it?) is
>> intrinsic.
>>
>> I think the scope metadata is perhaps on the boundary between intrinsic
>> and extrinsic, in the sense that even if you transform the data into
>> another format or provide it through a different access method, the scope
>> remains invariant.
>>
>> For example, consider local government spending data.
>>
>> At one level, you need intrinsic structural metadata that says 'this is
>> spending per year on this expenditure category in this region', and we use
>> classes and predicates from controlled vocabularies to express that so
>> that anyone looking for that kind of data can find it, no matter which
>> local government authority published it.  There may be domain-specific data
>> publishing guidelines that recommend specific vocabularies to use.  Some
>> will be core W3C vocabularies.  Others may be more domain-specific but
>> ideally globally defined and multi-lingual.
>>
>> At another level, you want to be able to identify a particular dataset by
>> its temporal and spatial scope.  I consider this to be intrinsic to the
>> dataset, even though it's not a structural description.  If a dataset of
>> local government spending data is published for a particular city and a
>> particular fiscal year, the data contained within that dataset has that
>> scope.  We can transform that set of data into different formats and
>> provide additional methods to access it - and that temporal+spatial scope
>> remains invariant under those changes.  We can't transform the spending
>> data for London in 1999 into the spending data for Paris in 2013.  These are
>> characteristics of the data itself that distinguish one
>> set of data from another set of data, even when they share the same
>> structural semantics.  That's why I think of temporal/spatial scope as
>> being intrinsic to the dataset and its data, because they are (in my
>> opinion) equally important to the meaning of the data - they're
>> effectively expressing what the data is about (i.e. its subject or scope),
>> whereas the intrinsic structural metadata says 'this is government spending
>> for a particular city or region and a particular time interval'.  You
>> actually need both.
>>
>> At another level, you want to explain which formats are available, how
>> you can access it, and which licence applies for usage of the data.  Those
>> things feel much more extrinsic, because they can change over time -
>> e.g. additional formats and access methods can be provided, other formats
>> or access methods might be deprecated or withdrawn.  A licence might be
>> changed to a more liberal licence - or a more restrictive licence.
>>
>> However, we can agree to differ about the boundary between intrinsic and
>> extrinsic - and as I wrote, it's probably something of a continuum or
>> sliding scale, rather than consisting of only two possibilities with a
>> very clearly defined boundary between them.
>>
>> The main issue is to use this exercise as a way to explore all the useful
>> dimensions of metadata and identify the best practice ways of expressing
>> those - and it seems that this discussion is helping to make some
>> additional progress in that direction.
>>
>> I like your updated diagram.  Maybe it's easier for everyone to agree on
>> it if we remove the words 'intrinsic' and 'extrinsic' from the diagram but
>> just use them internally for the thought processes that try to make it as
>> complete as possible.
>>
>> Best wishes,
>>
>> - Mark
>>
>>
>>
>>
>> On 1 Jul 2014, at 20:51, Bernadette Farias Lóscio <bfl@cin.ufpe.br>
>>  wrote:
>>
>> > Hello Mark,
>> >
>> > Thank you very much for sharing your thoughts about metadata definition.
>> >
>> > I read your notes on the wiki page and I have some comments:
>> >
>> > - I agree with you that a dataset may be described by two types of
>> metadata: the metadata that describes the data itself (intrinsic metadata) and
>> the metadata that describes the dataset (extrinsic metadata). In the
>> diagram that I showed in the last meeting, I called them structural and
>> descriptive metadata.
>> >
>> > - I believe that intrinsic properties are the ones that describe the
>> meaning of the data itself, like concepts, classes and properties.
>> Intrinsic metadata has a role similar to that of a database schema and should be
>> described by a domain vocabulary.
>> >
>> > - In this case, Scope (temporal and geographic) and Granularity
>> (temporal and spatial) should be considered extrinsic properties, since they
>> describe the dataset instead of the meaning of the data. Extrinsic properties
>> should be described by standard vocabularies like DCAT, PROV and the
>> Quality and Data Usage vocabularies.
>> >
>> > Maybe I'm being too strict with this classification, but on the other
>> hand I think this may help the understanding of the different types of
>> metadata and their roles in describing a dataset.
>> >
>> > I'm attaching a new version of the diagram that I showed at our
>> last meeting. In this new version, I included more subclasses (access,
>> granularity and scope) for the extrinsic metadata. I believe that now it
>> is possible to define the properties (intrinsic and extrinsic) described in
>> your notes.
>> >
>> > It would be great if you could take a look at the diagram and tell me
>> if these ideas make sense to you.
>> >
>> > Thanks again!
>> >
>> > kind regards,
>> > Bernadette
>> >
>> >
>> >
>> > 2014-07-01 10:29 GMT-03:00 Mark Harrison <mark.harrison@gs1.org>:
>> > Dear DWBP colleagues,
>> >
>> > I've added a section to the DWBP wiki with some thoughts about
>> intrinsic vs extrinsic metadata, in response to my action #54 from last
>> Friday's call and the initial discussion there.
>> >
>> > I've now added that section at
>> >
>> >
>> https://www.w3.org/2013/dwbp/wiki/Guidance_on_the_Provision_of_Metadata#Intrinsic_vs_Extrinsic_Metadata
>> >
>> > Maybe it's not the best place for it - in which case, I'm happy for the
>> editors to move it to a better location in the Wiki.
>> >
>> > It's not definitive either - more of a discussion about the kinds of
>> metadata that are intrinsic to the data itself (irrespective of format or
>> access mechanism) and other kinds of metadata that are extrinsic (e.g.
>> dependent on a particular format, access mechanism or licence).
>> >
>> > Please feel free to modify this and extend it.
>> >
>> > I hope that it's useful for the discussions that Bernadette and I were
>> having last week, as well as the work Hadley is writing about alternative
>> approaches to data catalogues.
>> >
>> > At least it might help us to ensure that we explore the various
>> 'dimensions' of metadata that might be used by data consumers when
>> searching for datasets or discovering related datasets.  I have also
>> included some ideas about capturing feedback about data usage (e.g. in
>> applications, websites, mash-ups), including links to related datasets that
>> add some valuable context.
>> >
>> > Feel free to develop this further if you think it is useful.
>> >
>> > Best wishes,
>> >
>> > - Mark
>> >
>> >
>> > CONFIDENTIALITY / DISCLAIMER: The contents of this e-mail are
>> confidential and are not to be regarded as a contractual offer or
>> acceptance from GS1 (registered in Belgium). If you are not the addressee,
>> or if this has been copied or sent to you in error, you must not use data
>> herein for any purpose, you must delete it, and should inform the sender.
>> GS1 disclaims liability for accuracy or completeness, and opinions
>> expressed are those of the author alone. GS1 may monitor communications.
>> Third party rights acknowledged. (c) 2013.
>> >
>> >
>> >
>> > --
>> > Bernadette Farias Lóscio
>> > Centro de Informática
>> > Universidade Federal de Pernambuco - UFPE, Brazil
>> >
>> ----------------------------------------------------------------------------
>> > <DWBP_metadata.jpg>
>>
>>
>>
>>
>>
>> --
>> Bernadette Farias Lóscio
>> Centro de Informática
>> Universidade Federal de Pernambuco - UFPE, Brazil
>>
>> ----------------------------------------------------------------------------
>>
>>
>> <DWBP_metadata_v02.jpg><Extrinsic x Intrinsec.jpg>
>>
>>
>>
>>
>>
>
>
>
> --
> Bernadette Farias Lóscio
> Centro de Informática
> Universidade Federal de Pernambuco - UFPE, Brazil
> ----------------------------------------------------------------------------
>
>



-- 
.  .  .  .. .  .
.        .   . ..
.     ..       .

Received on Thursday, 3 July 2014 16:31:38 UTC