Re: Intrinsic vs extrinsic metadata (my action #54)

Hi Bernadette,

I think that a DCAT description (extended by the DWBP WG) would have
pointers to different metadata types of a specific Dataset, and
distribution would be one of them. Maybe distribution could have a special
status but I can't see why.

>> I don't see how we could relate the different types of metadata. Could
>> you please give an example?
Intrinsic metadata related to specific distributions (for example, a CSV
file);
Different types of licenses/credentials could allow access to different
subsets of the Dataset, implying different intrinsic metadata.
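
To make the two relations above concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the class names, the field names and the example columns are hypothetical and are not DCAT terms or anything agreed by the WG.

```python
from dataclasses import dataclass, field


@dataclass
class Distribution:
    # One way of publishing the data, e.g. a CSV file or a JSON API.
    media_type: str
    # Hypothetical intrinsic metadata that only makes sense for this form.
    intrinsic: dict = field(default_factory=dict)


@dataclass
class License:
    name: str
    # Hypothetical: the subset of columns this license/credential exposes.
    visible_columns: frozenset


@dataclass
class Dataset:
    title: str
    columns: tuple
    distributions: list = field(default_factory=list)
    licenses: list = field(default_factory=list)

    def intrinsic_for(self, license_name: str) -> dict:
        """Return the structural view of the data visible under a license."""
        for lic in self.licenses:
            if lic.name == license_name:
                visible = tuple(c for c in self.columns if c in lic.visible_columns)
                return {"columns": visible}
        raise KeyError(license_name)


spending = Dataset(
    title="Local government spending",
    columns=("region", "year", "category", "amount", "supplier"),
    distributions=[
        # Relation 1: intrinsic metadata attached to a specific distribution.
        Distribution("text/csv", {"delimiter": ",", "header_row": True}),
        Distribution("application/json", {"root_key": "records"}),
    ],
    licenses=[
        # Relation 2: the license decides which subset, and so which
        # intrinsic (structural) metadata, a consumer gets.
        License("open", frozenset({"region", "year", "category", "amount"})),
        License("restricted", frozenset({"region", "year", "category", "amount", "supplier"})),
    ],
)
```

In this sketch, spending.intrinsic_for("open") describes four columns, while the "restricted" license describes all five, so the license metadata and the intrinsic metadata are related.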

Thank you.

Kind regards,
Laufer


2014-07-07 10:27 GMT-03:00 Bernadette Farias Lóscio <bfl@cin.ufpe.br>:

> Hi Laufer,
>
> Thanks for your comments! I'm gonna try to answer below:
>
>> I wonder why you defined distribution differently from, for example,
>> license. In the same way that a data/dataset has a distribution that has
>> metadata, a data/dataset has a license that has metadata. Why is
>> distribution not simply a metadata type?
>>
>
> As proposed by DCAT [1], my initial idea was to describe a dataset
> independently from its distributions. In the diagram, a dataset has a
> collection of data and it is described by different types of metadata (the
> ones illustrated in the diagram). Following the DCAT description of a
> dataset, I also consider that a dataset may have one or more distributions,
> where a distribution is a possible way of publishing the collection of data
> of a given dataset, for example a file or an API. In this context, I don't
> see distribution as a type of metadata.
>
> On the other hand, in the diagram, a distribution is also described by
> metadata. I am not sure if a distribution will have the same metadata
> that a dataset has. I am also not sure if access metadata should be related
> to a dataset or to a specific distribution.
>
>
>> Another thing that I think could be represented in the diagram is the set
>> of relationships that could exist among the diverse data/dataset metadata.
>> So, metadata has a relation with metadata.
>>
>
> I don't see how we could relate the different types of metadata. Could you
> please give an example?
>
>
> Thanks again!
>
> kind regards,
> Bernadette
>
> [1] http://www.w3.org/TR/vocab-dcat/
>
>
>
>
>>
>> Thank you.
>>
>> Best regards,
>> Laufer
>>
>>
>> 2014-07-01 20:51 GMT-03:00 Bernadette Farias Lóscio <bfl@cin.ufpe.br>:
>>
>>> Hi Mark,
>>>
>>> Thanks again for the explanation! These examples are really helpful for
>>> the understanding of the role of the different types of metadata.
>>> I think that examples like these will be very useful to illustrate the
>>> best practices. After having some feedback from the group, it could be nice
>>> to update the wiki page with the diagrams together with a brief explanation
>>> and an example for each type of metadata. What do you think?
>>>
>>> I'm sending attached a pdf version of the updated diagram. Since I am
>>> using PowerPoint to create the diagrams, I am including the ppt version as
>>> well. If you have suggestions for other tools that may help the
>>> collaborative work, please let me know.
>>>
>>> It has been a great discussion! Thanks!
>>>
>>> kind regards,
>>> Bernadette
>>>
>>>
>>>
>>>
>>> 2014-07-01 20:09 GMT-03:00 Mark Harrison <mark.harrison@gs1.org>:
>>>
>>>  Hi Bernadette,
>>>>
>>>>  Thanks for the further discussion and updates to your diagram.
>>>>
>>>>  I also like the vertical continuum in the other diagram to express
>>>> how intrinsic / extrinsic these different kinds of metadata are.
>>>>
>>>>  I'd say that scope and granularity are distinct and not
>>>> interchangeable.
>>>> Scope defines the dimensions and location of the 'bounding box' or
>>>> 'envelope' in time and space, whereas granularity is a measure of how many
>>>> sample points there are *within* that bounding box or envelope.
>>>>
>>>>  A simple example could be weather observation data, where the scope
>>>> defines that the dataset has a coverage of the United Kingdom for the month
>>>> of June 2014 and the granularity is dependent on how closely spaced the
>>>> weather observation stations are and how frequently a new data point is
>>>> recorded for wind speed, rainfall, barometric pressure etc. - e.g. is it
>>>> per day, per hour, per minute or per second?  They both have temporal and
>>>> geospatial dimensions, but I'd redraw that part of the diagram like this.
>>>>
>>>>  By the way - just a suggestion:  can we try to export any diagrams
>>>> like this as vector graphics, either in SVG or PDF?  That makes it much
>>>> easier for us all to make modifications fairly easily, rather than having
>>>> to kludge bitmap modifications in Photoshop or Gimp.
>>>>
>>>>  Best wishes,
>>>>
>>>>  - Mark
>>>>
>>>>
>>>>
>>>> On 1 Jul 2014, at 22:52, Bernadette Farias Lóscio <bfl@cin.ufpe.br>
>>>> wrote:
>>>>
>>>> Hi Mark,
>>>>
>>>> Thank you very much for your explanation!
>>>>
>>>> After reading your examples, I agree with you that scope is an intrinsic
>>>> property, since it provides a better understanding of the meaning of
>>>> the data itself (this was my initial idea about intrinsic metadata).  In
>>>> the Data on the Web context, structural information is not enough to
>>>> provide the semantics of the data, we need more information, like the scope
>>>> of the data.
>>>>
>>>> Instead of removing the classification, I suggest having two
>>>> categories of intrinsic metadata: scope/granularity and structural. Do you
>>>> think that scope and granularity can be considered together as a single
>>>> category?
>>>>
>>>> I also agree that "these characteristics really fit on a sliding scale
>>>> between Very Intrinsic and Very Extrinsic, with some middle ground in
>>>> between". I created a figure that tries to illustrate this idea. This figure
>>>> is attached.
>>>>
>>>> I'm sending attached another version of the diagram with the idea of a
>>>> new classification.
>>>>
>>>> Yes, this discussion is very interesting and it is really important for
>>>> best practices identification and definition :)
>>>>
>>>> Thanks again!
>>>>
>>>> Kind regards,
>>>> Bernadette
>>>>
>>>>
>>>>
>>>>
>>>> 2014-07-01 17:51 GMT-03:00 Mark Harrison <mark.harrison@gs1.org>:
>>>> Hello Bernadette,
>>>>
>>>> Thanks for your updated diagram.
>>>>
>>>> I don't mind if we have slightly different opinions about where to draw
>>>> the boundary between 'intrinsic' and 'extrinsic'.
>>>>
>>>> We both agree that structural metadata (what kind of data is it?) is
>>>> intrinsic.
>>>>
>>>> I think the scope metadata is perhaps on the boundary between intrinsic
>>>> and extrinsic, in the sense that even if you transform the data into
>>>> another format or provide it through a different access method, the scope
>>>> remains invariant.
>>>>
>>>> For example, consider local government spending data.
>>>>
>>>> At one level, you need intrinsic structural metadata that says 'this is
>>>> spending per year on this expenditure category in this region', and we use
>>>> classes and predicates from controlled vocabularies to express that so
>>>> that anyone looking for that kind of data can find it, no matter which
>>>> local government authority published it.  There may be domain-specific data
>>>> publishing guidelines that recommend specific vocabularies to use.  Some
>>>> will be core W3C vocabularies.  Others may be more domain-specific but
>>>> ideally globally defined and multi-lingual.
>>>>
>>>> At another level, you want to be able to identify a particular dataset
>>>> by its temporal and spatial scope.  I consider this to be intrinsic to the
>>>> dataset, even though it's not a structural description.  If a dataset of
>>>> local government spending data is published for a particular city and a
>>>> particular fiscal year, the data contained within that dataset has that
>>>> scope.  We can transform that set of data into different formats and
>>>> provide additional methods to access it - and that temporal+spatial scope
>>>> remains invariant under those changes.  We can't transform the spending
>>>> data for London in 1999 into the spending data for Paris in 2013.  They are
>>>> distinguishing characteristics of the data itself that distinguish one
>>>> set of data from another set of data, even when they share the same
>>>> structural semantics.  That's why I think of temporal/spatial scope as
>>>> being intrinsic to the dataset and its data, because they are (in my
>>>> opinion) equally important to the meaning of the data - they're
>>>> effectively expressing what the data is about (i.e. its subject or scope),
>>>> whereas the intrinsic structural metadata says 'this is government spending
>>>> for a particular city or region and a particular time interval'.  You
>>>> actually need both.
>>>>
>>>> At another level, you want to explain which formats are available, how
>>>> you can access it, which licence applies for usage of the data.  Those
>>>> things feel much more extrinsic, because they can change over time -
>>>> e.g. additional formats and access methods can be provided, other formats
>>>> or access methods might be deprecated or withdrawn.  A licence might be
>>>> changed to a more liberal licence - or a more restrictive licence.
>>>>
>>>> However, we can agree to differ about the boundary between intrinsic
>>>> and extrinsic - and as I wrote, it's probably something of a continuum or
>>>> sliding scale, rather than consisting of only two possibilities with a
>>>> very clearly defined boundary between them.
>>>>
>>>> The main issue is to use this exercise as a way to explore all the
>>>> useful dimensions of metadata and identify the best practice ways of
>>>> expressing those - and it seems that this discussion is helping to make
>>>> some additional progress in that direction.
>>>>
>>>> I like your updated diagram.  Maybe it's easier for everyone to agree
>>>> on it if we remove the words 'intrinsic' and 'extrinsic' from the diagram
>>>> but just use them internally for the thought processes that try to make it
>>>> as complete as possible.
>>>>
>>>> Best wishes,
>>>>
>>>> - Mark
>>>>
>>>>
>>>>
>>>>
>>>> On 1 Jul 2014, at 20:51, Bernadette Farias Lóscio <bfl@cin.ufpe.br>
>>>>  wrote:
>>>>
>>>> > Hello Mark,
>>>> >
>>>> > Thank you very much for sharing your thoughts about metadata
>>>> definition.
>>>> >
>>>> > I read your notes on the wiki page and I have some comments:
>>>> >
>>>> > - I agree with you that a dataset may be described by two types of
>>>> metadata: the metadata that describes the data itself (intrinsic
>>>> metadata) and the metadata that describes the dataset (extrinsic
>>>> metadata). In the
>>>> diagram that I showed in the last meeting, I called them structural and
>>>> descriptive metadata.
>>>> >
>>>> > - I believe that intrinsic properties are the ones that describe the
>>>> meaning of the data itself, like concepts, classes and properties.
>>>> Intrinsic metadata has a role similar to that of a database schema and should be
>>>> described by a domain vocabulary.
>>>> >
>>>> > - In this case,  Scope (temporal and geographic) and Granularity
>>>> (temporal and spatial) should be considered extrinsic properties, since they
>>>> describe the dataset instead of the meaning of data. Extrinsic properties
>>>> should be described by standard vocabularies like DCAT, PROV and the
>>>> Quality and Data Usage vocabularies.
>>>> >
>>>> > Maybe I'm being too strict with this classification, but on the other
>>>> hand I think this may help the understanding of the different types of
>>>> metadata and their roles in describing a dataset.
>>>> >
>>>> > I'm sending attached a new version of the diagram that I showed on
>>>> our last meeting. In this new version, I included more subclasses (access,
>>>> granularity and scope) for the extrinsic metadata. I believe that now it
>>>> is possible to define the properties (intrinsic and extrinsic) described in
>>>> your notes.
>>>> >
>>>> > It would be great if you could take a look at the diagram and tell me
>>>> if these ideas make sense to you.
>>>> >
>>>> > Thanks again!
>>>> >
>>>> > kind regards,
>>>> > Bernadette
>>>> >
>>>> >
>>>> >
>>>> > 2014-07-01 10:29 GMT-03:00 Mark Harrison <mark.harrison@gs1.org>:
>>>> > Dear DWBP colleagues,
>>>> >
>>>> > I've added a section to the DWBP wiki with some thoughts about
>>>> intrinsic vs extrinsic metadata, in response to my action #54 from last
>>>> Friday's call and the initial discussion there.
>>>> >
>>>> > I've now added that section at
>>>> >
>>>> >
>>>> https://www.w3.org/2013/dwbp/wiki/Guidance_on_the_Provision_of_Metadata#Intrinsic_vs_Extrinsic_Metadata
>>>> >
>>>> > Maybe it's not the best place for it - in which case, I'm happy for
>>>> the editors to move it to a better location in the Wiki.
>>>> >
>>>> > It's not definitive either - more of a discussion about the kinds of
>>>> metadata that are intrinsic to the data itself (irrespective of format or
>>>> access mechanism) and other kinds of metadata that are extrinsic (e.g.
>>>> depend on a particular format, access mechanism or licence).
>>>> >
>>>> > Please feel free to modify this and extend it.
>>>> >
>>>> > I hope that it's useful for the discussions that Bernadette and I
>>>> were having last week, as well as the work Hadley is writing about
>>>> alternative approaches to data catalogues.
>>>> >
>>>> > At least it might help us to ensure that we explore the various
>>>> 'dimensions' of metadata that might be used by data consumers when
>>>> searching for datasets or discovering related datasets.  I have also
>>>> included some ideas about capturing feedback about data usage (e.g. in
>>>> applications, websites, mash-ups), including links to related datasets that
>>>> add some valuable context.
>>>> >
>>>> > Feel free to develop this further if you think it is useful.
>>>> >
>>>> > Best wishes,
>>>> >
>>>> > - Mark
>>>> >
>>>> >
>>>> > CONFIDENTIALITY / DISCLAIMER: The contents of this e-mail are
>>>> confidential and are not to be regarded as a contractual offer or
>>>> acceptance from GS1 (registered in Belgium). If you are not the addressee,
>>>> or if this has been copied or sent to you in error, you must not use data
>>>> herein for any purpose, you must delete it, and should inform the sender.
>>>> GS1 disclaims liability for accuracy or completeness, and opinions
>>>> expressed are those of the author alone. GS1 may monitor communications.
>>>> Third party rights acknowledged. (c) 2013.
>>>> >
>>>> >
>>>> >
>>>> > --
>>>> > Bernadette Farias Lóscio
>>>> > Centro de Informática
>>>> > Universidade Federal de Pernambuco - UFPE, Brazil
>>>> >
>>>> ----------------------------------------------------------------------------
>>>> > <DWBP_metadata.jpg>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> <DWBP_metadata_v02.jpg><Extrinsic x Intrinsec.jpg>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>> --
>> .  .  .  .. .  .
>> .        .   . ..
>> .     ..       .
>>
>
>
>
>
>



-- 
.  .  .  .. .  .
.        .   . ..
.     ..       .

Received on Monday, 7 July 2014 16:58:00 UTC