- From: Laufer <laufer@globo.com>
- Date: Mon, 7 Jul 2014 13:57:27 -0300
- To: Bernadette Farias Lóscio <bfl@cin.ufpe.br>
- Cc: Mark Harrison <mark.harrison@gs1.org>, public-dwbp-wg <public-dwbp-wg@w3.org>
- Message-ID: <CA+pXJijncv0Ow1GxJ4yYzG2Lu7K0vhAD_8C9afdi8dw95qYDZQ@mail.gmail.com>
Hi Bernadette,

I think that a DCAT description (extended by the DWBP WG) would have pointers to the different metadata types of a specific Dataset, and distribution would be one of them. Maybe distribution could have a special status, but I can't see why.

>> I don't see how we could relate the different types of metadata. Could you please give an example?

Intrinsic metadata related to specific distributions (for example, a CSV file); different types of licenses/credentials could allow access to different subsets of the Dataset, implying different intrinsic metadata.

Thank you.

Kind regards,
Laufer

2014-07-07 10:27 GMT-03:00 Bernadette Farias Lóscio <bfl@cin.ufpe.br>:

> Hi Laufer,
>
> Thanks for your comments! I'm going to try to answer below:
>
>> I have a question about why you defined distribution differently from, for example, license. In the same way that a data/dataset has a distribution that has metadata, a data/dataset has a license that has metadata. Why is distribution not simply a metadata type?
>
> As proposed by DCAT [1], my initial idea was to describe a dataset independently from its distributions. In the diagram, a dataset has a collection of data and it is described by different types of metadata (the ones illustrated in the diagram). Following the DCAT description of a dataset, I also consider that a dataset may have one or more distributions, where a distribution is a possible way of publishing the collection of data of a given dataset, for example a file or an API. In this context, I don't see distribution as a type of metadata.
>
> On the other hand, in the diagram, a distribution is also described by metadata. I am not sure if a distribution will have the same metadata that a dataset has. I am also not sure if access metadata should be related to a dataset or to a specific distribution.
>
>> Another thing that I think could be represented in the diagram is the relationships that could exist among the diverse data/dataset metadata. So, metadata has a relation with metadata.
>
> I don't see how we could relate the different types of metadata. Could you please give an example?
>
> Thanks again!
>
> kind regards,
> Bernadette
>
> [1] http://www.w3.org/TR/vocab-dcat/
>
>> Thank you.
>>
>> Best regards,
>> Laufer
>>
>> 2014-07-01 20:51 GMT-03:00 Bernadette Farias Lóscio <bfl@cin.ufpe.br>:
>>
>>> Hi Mark,
>>>
>>> Thanks again for the explanation! These examples are really helpful for understanding the role of the different types of metadata. I think that examples like these will be very useful to illustrate the best practices. After having some feedback from the group, it could be nice to update the wiki page with the diagrams together with a brief explanation and an example for each type of metadata. What do you think?
>>>
>>> I'm sending attached a pdf version of the updated diagram. Since I am using PowerPoint to create the diagrams, I am including the ppt version as well. If you have suggestions for other tools that may help the collaborative work, please let me know.
>>>
>>> It has been a great discussion! Thanks!
>>>
>>> kind regards,
>>> Bernadette
>>>
>>> 2014-07-01 20:09 GMT-03:00 Mark Harrison <mark.harrison@gs1.org>:
>>>
>>>> Hi Bernadette,
>>>>
>>>> Thanks for the further discussion and updates to your diagram.
>>>>
>>>> I also like the vertical continuum in the other diagram to express how intrinsic / extrinsic these different kinds of metadata are.
>>>>
>>>> I'd say that scope and granularity are distinct and not interchangeable. Scope defines the dimensions and location of the 'bounding box' or 'envelope' in time and space, whereas granularity is a measure of how many sample points there are *within* that bounding box or envelope.
>>>>
>>>> A simple example could be weather observation data, where the scope defines that the dataset has a coverage of the United Kingdom for the month of June 2014 and the granularity depends on how closely spaced the weather observation stations are and how frequently a new data point is recorded for wind speed, rainfall, barometric pressure etc. - e.g. is it per day, per hour, per minute or per second? They both have temporal and geospatial dimensions, but I'd redraw that part of the diagram like this.
>>>>
>>>> By the way - just a suggestion: can we try to export any diagrams like this as vector graphics, either in SVG or PDF? That makes it much easier for us all to make modifications, rather than having to kludge bitmap modifications in Photoshop or Gimp.
>>>>
>>>> Best wishes,
>>>>
>>>> - Mark
>>>>
>>>> On 1 Jul 2014, at 22:52, Bernadette Farias Lóscio <bfl@cin.ufpe.br> wrote:
>>>>
>>>> Hi Mark,
>>>>
>>>> Thank you very much for your explanation!
>>>>
>>>> After reading your examples, I agree with you that scope is an intrinsic property, since it provides a better understanding of the meaning of the data itself (this was my initial idea about intrinsic metadata). In the Data on the Web context, structural information is not enough to provide the semantics of the data; we need more information, like the scope of the data.
>>>>
>>>> Instead of removing the classification, I suggest having two categories of intrinsic metadata: scope/granularity and structural. Do you think that scope and granularity can be considered together as a single category?
>>>>
>>>> I also agree that "these characteristics really fit on a sliding scale between Very Intrinsic and Very Extrinsic, with some middle ground in between". I created a figure that tries to illustrate this idea. This figure is attached.
>>>>
>>>> I'm sending attached another version of the diagram with the idea of a new classification.
>>>>
>>>> Yes, this discussion is very interesting and it is really important for best practices identification and definition :)
>>>>
>>>> Thanks again!
>>>>
>>>> Kind regards,
>>>> Bernadette
>>>>
>>>> 2014-07-01 17:51 GMT-03:00 Mark Harrison <mark.harrison@gs1.org>:
>>>> Hello Bernadette,
>>>>
>>>> Thanks for your updated diagram.
>>>>
>>>> I don't mind if we have slightly different opinions about where to draw the boundary between 'intrinsic' and 'extrinsic'.
>>>>
>>>> We both agree that structural metadata (what kind of data is it?) is intrinsic.
>>>>
>>>> I think the scope metadata is perhaps on the boundary between intrinsic and extrinsic, in the sense that even if you transform the data into another format or provide it through a different access method, the scope remains invariant.
>>>>
>>>> For example, consider local government spending data.
>>>>
>>>> At one level, you need intrinsic structural metadata that says 'this is spending per year on this expenditure category in this region', and we use classes and predicates from controlled vocabularies to express that so that anyone looking for that kind of data can find it, no matter which local government authority published it. There may be domain-specific data publishing guidelines that recommend specific vocabularies to use. Some will be core W3C vocabularies. Others may be more domain-specific but ideally globally defined and multi-lingual.
>>>>
>>>> At another level, you want to be able to identify a particular dataset by its temporal and spatial scope. I consider this to be intrinsic to the dataset, even though it's not a structural description.
>>>> If a dataset of local government spending data is published for a particular city and a particular fiscal year, the data contained within that dataset has that scope. We can transform that set of data into different formats and provide additional methods to access it - and that temporal+spatial scope remains invariant under those changes. We can't transform the spending data for London in 1999 into the spending data for Paris in 2013. They are characteristics of the data itself that distinguish one set of data from another set of data, even when they share the same structural semantics. That's why I think of temporal/spatial scope as being intrinsic to the dataset and its data, because they are (in my opinion) equally important to the meaning of the data - they're effectively expressing what the data is about (i.e. its subject or scope), whereas the intrinsic structural metadata says 'this is government spending for a particular city or region and a particular time interval'. You actually need both.
>>>>
>>>> At another level, you want to explain which formats are available, how you can access it, which licence applies for usage of the data. Those things feel much more extrinsic, because they can change over time - e.g. additional formats and access methods can be provided, other formats or access methods might be deprecated or withdrawn. A licence might be changed to a more liberal licence - or a more restrictive licence.
>>>>
>>>> However, we can agree to differ about the boundary between intrinsic and extrinsic - and as I wrote, it's probably something of a continuum or sliding scale, rather than consisting of only two possibilities with a very clearly defined boundary between them.
>>>>
>>>> The main issue is to use this exercise as a way to explore all the useful dimensions of metadata and identify the best practice ways of expressing those - and it seems that this discussion is helping to make some additional progress in that direction.
>>>>
>>>> I like your updated diagram. Maybe it's easier for everyone to agree on it if we remove the words 'intrinsic' and 'extrinsic' from the diagram but just use them internally for the thought processes that try to make it as complete as possible.
>>>>
>>>> Best wishes,
>>>>
>>>> - Mark
>>>>
>>>> On 1 Jul 2014, at 20:51, Bernadette Farias Lóscio <bfl@cin.ufpe.br> wrote:
>>>>
>>>> > Hello Mark,
>>>> >
>>>> > Thank you very much for sharing your thoughts about metadata definition.
>>>> >
>>>> > I read your notes on the wiki page and I have some comments:
>>>> >
>>>> > - I agree with you that a dataset may be described by two types of metadata: the metadata that describes the data itself (intrinsic metadata) and the metadata that describes the dataset (extrinsic metadata). In the diagram that I showed in the last meeting, I called them structural and descriptive metadata.
>>>> >
>>>> > - I believe that intrinsic properties are the ones that describe the meaning of the data itself, like concepts, classes and properties. Intrinsic metadata has a role similar to that of a database schema and should be described by a domain vocabulary.
>>>> >
>>>> > - In this case, Scope (temporal and geographic) and Granularity (temporal and spatial) should be considered extrinsic properties, since they describe the dataset instead of the meaning of the data. Extrinsic properties should be described by standard vocabularies like DCAT, PROV and the Quality and Data Usage vocabularies.
>>>> >
>>>> > Maybe I'm being too strict with this classification, but on the other hand I think this may help the understanding of the different types of metadata and their roles in describing a dataset.
>>>> >
>>>> > I'm sending attached a new version of the diagram that I showed at our last meeting. In this new version, I included more subclasses (access, granularity and scope) for the extrinsic metadata. I believe that now it is possible to define the properties (intrinsic and extrinsic) described in your notes.
>>>> >
>>>> > It would be great if you could take a look at the diagram and tell me if these ideas make sense to you.
>>>> >
>>>> > Thanks again!
>>>> >
>>>> > kind regards,
>>>> > Bernadette
>>>> >
>>>> > 2014-07-01 10:29 GMT-03:00 Mark Harrison <mark.harrison@gs1.org>:
>>>> > Dear DWBP colleagues,
>>>> >
>>>> > I've added a section to the DWBP wiki with some thoughts about intrinsic vs extrinsic metadata, in response to my action #54 from last Friday's call and the initial discussion there.
>>>> >
>>>> > I've now added that section at
>>>> >
>>>> > https://www.w3.org/2013/dwbp/wiki/Guidance_on_the_Provision_of_Metadata#Intrinsic_vs_Extrinsic_Metadata
>>>> >
>>>> > Maybe it's not the best place for it - in which case, I'm happy for the editors to move it to a better location in the Wiki.
>>>> >
>>>> > It's not definitive either - more of a discussion about the kinds of metadata that are intrinsic to the data itself (irrespective of format or access mechanism) and other kinds of metadata that are extrinsic (e.g. depend on a particular format, access mechanism or licence).
>>>> >
>>>> > Please feel free to modify this and extend it.
>>>> >
>>>> > I hope that it's useful for the discussions that Bernadette and I were having last week, as well as the work Hadley is writing about alternative approaches to data catalogues.
>>>> >
>>>> > At least it might help us to ensure that we explore the various 'dimensions' of metadata that might be used by data consumers when searching for datasets or discovering related datasets. I have also included some ideas about capturing feedback about data usage (e.g. in applications, websites, mash-ups), including links to related datasets that add some valuable context.
>>>> >
>>>> > Feel free to develop this further if you think it is useful.
>>>> >
>>>> > Best wishes,
>>>> >
>>>> > - Mark
>>>> >
>>>> > CONFIDENTIALITY / DISCLAIMER: The contents of this e-mail are confidential and are not to be regarded as a contractual offer or acceptance from GS1 (registered in Belgium). If you are not the addressee, or if this has been copied or sent to you in error, you must not use data herein for any purpose, you must delete it, and should inform the sender. GS1 disclaims liability for accuracy or completeness, and opinions expressed are those of the author alone. GS1 may monitor communications. Third party rights acknowledged. (c) 2013.
>>>> >
>>>> > --
>>>> > Bernadette Farias Lóscio
>>>> > Centro de Informática
>>>> > Universidade Federal de Pernambuco - UFPE, Brazil
>>>> > ----------------------------------------------------------------------------
>>>> > <DWBP_metadata.jpg>
>>>>
>>>> --
>>>> Bernadette Farias Lóscio
>>>> Centro de Informática
>>>> Universidade Federal de Pernambuco - UFPE, Brazil
>>>> ----------------------------------------------------------------------------
>>>> <DWBP_metadata_v02.jpg><Extrinsic x Intrinsec.jpg>
>>>
>>> --
>>> Bernadette Farias Lóscio
>>> Centro de Informática
>>> Universidade Federal de Pernambuco - UFPE, Brazil
>>> ----------------------------------------------------------------------------
>>
>> --
>> . . . .. . .
>> . . . ..
>> . .. .
>
> --
> Bernadette Farias Lóscio
> Centro de Informática
> Universidade Federal de Pernambuco - UFPE, Brazil
> ----------------------------------------------------------------------------

--
. . . .. . .
. . . ..
. .. .
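
A minimal sketch of the model discussed in this thread, assuming hypothetical class and field names (they are illustrative only and are not DCAT terms). It treats scope and granularity as properties of the dataset itself and distributions as something that can be added later, so that publishing the same data in a new format leaves the scope unchanged. The station spacing, licence name and example.org URLs are made-up values.

```python
from dataclasses import dataclass, replace
from datetime import date, timedelta


@dataclass(frozen=True)
class Scope:
    """The 'bounding box' or 'envelope' of the data in space and time."""
    region: str
    start: date
    end: date


@dataclass(frozen=True)
class Granularity:
    """How densely the data samples the envelope described by the scope."""
    sampling_interval: timedelta
    station_spacing_km: float  # hypothetical measure of spatial density


@dataclass(frozen=True)
class Distribution:
    """One way of publishing the data, e.g. a file or an API."""
    media_type: str
    access_url: str


@dataclass(frozen=True)
class Dataset:
    title: str
    scope: Scope
    granularity: Granularity
    licence: str                 # may change over time
    distributions: tuple = ()    # may grow or shrink over time


def add_distribution(dataset: Dataset, dist: Distribution) -> Dataset:
    """Return a copy of the dataset with one more distribution attached."""
    return replace(dataset, distributions=dataset.distributions + (dist,))


# Weather observations for the United Kingdom, June 2014 (values are made up).
weather = Dataset(
    title="UK weather observations",
    scope=Scope("United Kingdom", date(2014, 6, 1), date(2014, 6, 30)),
    granularity=Granularity(timedelta(hours=1), station_spacing_km=40.0),
    licence="hypothetical-open-licence",
)

published = add_distribution(
    weather, Distribution("text/csv", "http://example.org/weather.csv"))
published = add_distribution(
    published, Distribution("application/json", "http://example.org/api/weather"))

# New formats and access methods do not change what the data is about.
assert published.scope == weather.scope
assert published.granularity == weather.granularity
```

Whether access or licence metadata belongs on the dataset or on each distribution is left open here, as it is in the discussion above; the sketch simply keeps the licence on the dataset.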
Attachments
- image/jpg attachment: DWBP_metadata_v03.jpg
Received on Monday, 7 July 2014 16:58:00 UTC