Re: The 5 stars path from Eric Stephan on 2015-03-22 (public-dwbp-wg@w3.org from March 2015)

From: Eric Stephan <ericphb@gmail.com>
Date: Sun, 22 Mar 2015 14:38:33 -0700
To: Phil Archer <phila@w3.org>
Cc: Laufer <laufer@globo.com>, Christophe Guéret <christophe.gueret@dans.knaw.nl>, Bernadette Farias Lóscio <bfl@cin.ufpe.br>, DWBP WG <public-dwbp-wg@w3.org>
Message-ID: <CAMFz4jhcaqQ=MH2=Brg+CT0=vZznKhYu3Etu=Y8ggyzT04kPXQ@mail.gmail.com>
Wow what a wonderful thread to read.  Thank you Phil!  Many many thanks for
this wonderful note of clarity!

>>if Eric and Annette can provide similar examples for NetCDF that would be
terrific (I'm out of my depth here).

Yes, I think we can show this quite easily.  Just off the top of my head.

NetCDF:
   - is an open format for storing multi-dimensional data streams [NETCDF]
   - can be annotated with self describing metadata (called attributes)
   - has existing conventions for representing different forms of data.
E.g. CF convention.
   - has a CF vocabulary [CFNAMES] for curated climate and forecasting
terminology.
   - In addition the climate community within the Earth System Grid (ESG)
has adopted fully documented protocols [CMIP5] to show how regional and
climate model datasets must be organized so that they can be inter-related
to support regional and global climate studies.
   - leverages existing ISO standards used in the geospatial, Dublin Core,
and metadata communities.
   - Finally, NASA JPL developed an ontology called SWEET [SWEET], and
previous research shows how the CF terms can be inter-related.
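
To make the "self describing metadata" point concrete, here is a rough
Python sketch. It is only an illustration: the dataset is modelled as plain
dictionaries rather than read with a NetCDF library, the variable names
(tas, pr) and the title are made up for the example, and treating
standard_name and units as the required per-variable attributes is an
assumption of the sketch, not a statement of what CF mandates. The attribute
names themselves (Conventions, standard_name, units, long_name) and the
standard name air_temperature come from the CF convention.

```python
# Illustrative only: CF-style metadata modelled as plain dictionaries.
# A real file would be read with a NetCDF library; none is used here.

# Assumption of this sketch: every data variable should carry these attributes.
REQUIRED_VARIABLE_ATTRS = ("standard_name", "units")

dataset = {
    "global_attrs": {
        "Conventions": "CF-1.6",
        "title": "Example surface air temperature (hypothetical)",
    },
    "variables": {
        "tas": {
            "dimensions": ("time", "lat", "lon"),
            "attrs": {
                "standard_name": "air_temperature",
                "units": "K",
                "long_name": "Near-surface air temperature",
            },
        },
        "pr": {
            "dimensions": ("time", "lat", "lon"),
            "attrs": {"long_name": "Precipitation"},  # deliberately incomplete
        },
    },
}


def missing_metadata(ds):
    """Map "<global>" or a variable name to the attributes it lacks."""
    problems = {}
    if "Conventions" not in ds["global_attrs"]:
        problems["<global>"] = ["Conventions"]
    for name, var in ds["variables"].items():
        missing = [a for a in REQUIRED_VARIABLE_ATTRS if a not in var["attrs"]]
        if missing:
            problems[name] = missing
    return problems


print(missing_metadata(dataset))
```

The check flags pr because it carries only a long_name; a consumer could run
something similar before trying to inter-relate datasets.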

I would submit that, in terms of open data, the climate community is
already at 5 stars even without the ontology.



Eric


References

[NETCDF] http://en.wikipedia.org/wiki/NetCDF
[CFNAMES]
http://cfconventions.org/Data/cf-standard-names/28/build/cf-standard-name-table.html
[CMIP5] http://cmip-pcmdi.llnl.gov/cmip5/
[SWEET] https://sweet.jpl.nasa.gov/

On Sun, Mar 22, 2015 at 10:45 AM, Phil Archer <phila@w3.org> wrote:

> We are in full agreement.
>
> One of my hopes for this WG is that we can indeed lead people to publish
> formats like CSV in the best way (i.e. with good quality metadata) without
> them feeling somehow inferior.
>
> If that leads us to define our own star rating system, I wouldn't mind.
> Something like:
>
> * It's available on the Web in an open format with a declared licence
> (anything less is all but useless).
>
> ** As level 1 with good quality discovery metadata (we might refer to the
> DCAT Application profile work as an example).
>
> *** All the above plus structural metadata in the relevant format (e.g.
> CSV+ for CSV, VoID for RDF etc).
>
> This doesn't include quality metrics (which it should), and contact
> details (which it should) - but they might be defined at level 2?
>
> Maybe a start anyway.
>
> Phil.
>
> On 22/03/2015 13:50, Laufer wrote:
>
>> I agree, Phil.
>>
>> What I want to reinforce is that it would be nice if we could make clear
>> in the document that 5 stars LD (or OD?) is not a scale of how well a
>> dataset is published on the web. We can have, for example, a "CSV dataset"
>> (3 stars) that is better published than a "LD dataset" (5 stars). Or,
>> maybe, we can avoid using the 5 stars when what we want to say is that a
>> dataset is being published in a CSV format.
>>
>> If we say that one dataset is 3 stars and another is 5 stars, people get
>> the idea that the 5-star one is better than the 3-star one (as in reviews
>> or hotels, for example).
>>
>> We probably will not define our own scale, but I hope that our set of BPs
>> could help people to publish "Well Published Data on the Web".
>>
>> Best Regards,
>> Laufer
>>
>> On Sunday, 22 March 2015, Christophe Guéret
>> <christophe.gueret@dans.knaw.nl> wrote:
>>
>>
>>> +1!
>>>
>>> Christophe
>>>
>>> --
>>> Sent with difficulties. Sorry for the brevity and typos...
>>> On 22 Mar 2015 08:47, "Phil Archer" <phila@w3.org> wrote:
>>>
>>>> I've just been reading through Friday's minutes and I see that this was
>>>> the hot topic of the day. As ever, I'm sorry I wasn't able to be there.
>>>>
>>>> Let me add my 2 cents.
>>>>
>>>> LD forms a small part of the available data on the Web. It would be
>>>> silly of us to push for everyone to convert their data into perfectly
>>>> linked 5 star data before they make it available publicly or behind a
>>>> pay-wall of some kind.
>>>>
>>>> What we *can* do IMO is:
>>>>
>>>> - Promote the publication of human readable metadata as Laufer has
>>>> described;
>>>>
>>>> - promote the publication of machine readable metadata and then show how
>>>> this can be (and is) done with RDF using DCAT as an example;
>>>>
>>>> - promote the publication of structural metadata which, for CSV at
>>>> least, we have a very clear route - use the CSV on the Web work;
>>>>
>>>> - if Eric and Annette can provide similar examples for NetCDF that would
>>>> be terrific (I'm out of my depth here).
>>>>
>>>> - We can leave it to the Spatial Data on the Web WG to handle spatial
>>>> stuff (as they are leaving some of their generic issues to this group).
>>>>
>>>> As an aside, the CSV WG has resolved its issues now and is expecting to
>>>> publish pretty much the stable version of its specs in the first week of
>>>> April.
>>>>
>>>> If you publish data in your favourite format + structural metadata in
>>>> whatever format goes with that (and the CSV WG is using JSON for its
>>>> metadata) then you are providing a route through which your users can
>>>> readily create 5 star data if they so wish. They may or may not use LD
>>>> themselves but the concept behind it is, I hope, clear enough to
>>>> readers?
>>>>
>>>> From what I've read of Friday and the list since then, I dare to hope
>>>> this is in line with the general mood of the WG?
>>>>
>>>> Phil.
>>>>
>>>>
>>>>
>>>> On 20/03/2015 18:09, Laufer wrote:
>>>>
>>>>> Thank you, Eric.
>>>>>
>>>>> Warm regards,
>>>>> Laufer
>>>>>
>>>>> 2015-03-20 12:31 GMT-03:00 Eric Stephan <ericphb@gmail.com>:
>>>>>
>>>>>> Laufer and Bernadette,
>>>>>>
>>>>>> I raised an issue relating to this, asking the question: can we use
>>>>>> 5 star as a metric and not a path?
>>>>>>
>>>>>> http://www.w3.org/2013/dwbp/track/issues/148
>>>>>>
>>>>>> Eric S.
>>>>>>
>>>>>> On Fri, Mar 20, 2015 at 7:54 AM, Bernadette Farias Lóscio
>>>>>> <bfl@cin.ufpe.br> wrote:
>>>>>>
>>>>>>> Hi Laufer,
>>>>>>>
>>>>>>> Thanks for the message! It is a very useful explanation!
>>>>>>>
>>>>>>> I fully agree with you: "In this dataset publishing I can see the idea
>>>>>>> of publishing metadata and using standard vocabularies, but is not a LD
>>>>>>> dataset."
>>>>>>>
>>>>>>> IMHO, we can use vocabularies to publish metadata, but we are not doing
>>>>>>> linked data, i.e., there are no links between resources.
>>>>>>>
>>>>>>> I also agree that "we should differentiate the idea of a Best Practice
>>>>>>> of a non LD dataset of the idea of an implicit Best Practice to go to a
>>>>>>> LD dataset, that is what the 5 stars scale says."
>>>>>>>
>>>>>>> If we have a BP whose implementation proposes the use of the RDF model
>>>>>>> to publish data, then we are moving towards the 5 stars. It is important
>>>>>>> to note that publishing data using the RDF model may be just one of the
>>>>>>> proposed approaches for implementation, i.e., we may show other ways of
>>>>>>> publishing data without using RDF.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Bernadette
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2015-03-20 11:32 GMT-03:00 Laufer <laufer@globo.com>:
>>>>>>>
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> I will start my comment using an example:
>>>>>>>>
>>>>>>>> Someone publishes a page where there are links to 2 files:
>>>>>>>> - a csv file with a dataset;
>>>>>>>> - a text file that explains the structure of the dataset, in natural
>>>>>>>>   language (metadata).
>>>>>>>>
>>>>>>>> In the page there is a lot of metadata provided in natural language,
>>>>>>>> for example, an overview of the dataset, license, organization,
>>>>>>>> version, creator, rights, etc...
>>>>>>>>
>>>>>>>> At the same time, the page has an embedded DCAT instance using RDFa
>>>>>>>> where there is info about the dataset, the distribution, etc.
>>>>>>>>
>>>>>>>> What I want to say is that we have here the metadata concept mixed
>>>>>>>> with semantic web concepts, and it is a way of publishing data that,
>>>>>>>> if all the things are well described, could be very useful to society.
>>>>>>>>
>>>>>>>> In this dataset publishing I can see the idea of publishing metadata
>>>>>>>> and using standard vocabularies, but is not a LD dataset.
>>>>>>>>
>>>>>>>> What I was discussing in the last meeting is: will we support in the
>>>>>>>> document the idea that the best way to publish is LD? I am not saying
>>>>>>>> that I am against the idea. I am favorable to LD. But we should
>>>>>>>> differentiate the idea of a Best Practice of a non LD dataset of the
>>>>>>>> idea of an implicit Best Practice to go to a LD dataset, that is what
>>>>>>>> the 5 stars scale says.
>>>>>>>>
>>>>>>>> Maybe this is too much care with the words, sorry about this.
>>>>>>>>
>>>>>>>> Best Regards,
>>>>>>>> Laufer
>>>>>>>>
>>>>>>>> --
>>>>>>>> .  .  .  .. .  .
>>>>>>>> .        .   . ..
>>>>>>>> .     ..       .
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Bernadette Farias Lóscio
>>>>>>> Centro de Informática
>>>>>>> Universidade Federal de Pernambuco - UFPE, Brazil
>>>>>>>
>>>>>>>
>>>>
>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>> --
>>>>
>>>>
>>>> Phil Archer
>>>> W3C Data Activity Lead
>>>> http://www.w3.org/2013/data/
>>>>
>>>> http://philarcher.org
>>>> +44 (0)7887 767755
>>>> @philarcher1
>>>>
>>>>
>>>>
>>
> --
>
>
> Phil Archer
> W3C Data Activity Lead
> http://www.w3.org/2013/data/
>
> http://philarcher.org
> +44 (0)7887 767755
> @philarcher1
>
Received on Sunday, 22 March 2015 21:39:02 UTC