Re: The 5 stars path from Christophe Guéret on 2015-03-26 (public-dwbp-wg@w3.org from March 2015)

From: Christophe Guéret <christophe.gueret@dans.knaw.nl>
Date: Wed, 25 Mar 2015 18:52:24 -0700
To: Steven Adler <adler1@us.ibm.com>
CC: Phil Archer <phila@w3.org>, Laufer <laufer@globo.com>, Bernadette Farias Lóscio <bfl@cin.ufpe.br>, DWBP WG <public-dwbp-wg@w3.org>, Eric Stephan <ericphb@gmail.com>
Message-ID: <CABP9CAH5cWHy9iT2jDh2v_EebGOGWLBx0wWwz5qfmbVELvrc2g@mail.gmail.com>
BTW, speaking about stars and feedback, we may want to have a look at the 5
star scheme for community engagement from Tim Davies:
http://www.opendataimpacts.net/engagement/

We could probably do something with it, if only by linking to it somewhere.

Cheers,
Christophe

--
Sent with difficulties. Sorry for the brevity and typos...
On 24 Mar 2015 07:18, "Steven Adler" <adler1@us.ibm.com> wrote:

> Rating a dataset is only valuable if records within the dataset have
> ratings whose sum or average validates the dataset rating.  That is, there
> has to be provenance to the ratings.
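
One way to read the point above is as a consistency check between a dataset-level rating and the ratings of its records. The following is only a minimal sketch under that reading: the function name `rating_is_supported` and the `tolerance` parameter are hypothetical and do not come from the thread, and using the average (rather than the sum) is an assumption.

```python
from statistics import mean


def rating_is_supported(dataset_rating, record_ratings, tolerance=0.5):
    """Check whether record-level ratings back up a dataset-level rating.

    Hypothetical helper: the dataset rating counts as supported when the
    mean of the record ratings lies within `tolerance` of it.
    """
    if not record_ratings:
        # No record-level ratings means there is no provenance at all.
        return False
    return abs(mean(record_ratings) - dataset_rating) <= tolerance
```

For example, a dataset rated 4 whose records are rated 4, 5, 3 and 4 would pass this check, while a dataset rated 5 whose records are rated 2, 3 and 2 would not.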
>
>
> Best Regards,
>
> Steve
>
> Motto: "Do First, Think, Do it Again"
>
> From: Bernadette Farias Lóscio <bfl@cin.ufpe.br>
> To: Eric Stephan <ericphb@gmail.com>
> Cc: Phil Archer <phila@w3.org>, Laufer <laufer@globo.com>, Christophe Guéret
> <christophe.gueret@dans.knaw.nl>, DWBP WG <public-dwbp-wg@w3.org>
> Date: 03/24/2015 10:11 AM
> Subject: Re: The 5 stars path
> ------------------------------
>
>
>
> Hi all,
>
> Thanks for the great discussion!
>
> I like the idea of having a star rating discussion, but we need to be
> aware that publishing data on the Web is more than just publishing data and
> metadata. It also concerns issues like data access and feedback.
>
> I've been thinking a lot about this rating system and it would be great to
> consider all aspects related to data on the Web (e.g. data format, metadata,
> identifiers, data access, feedback, versioning...), but I'm not sure if
> this is the best choice. Maybe we can have a rating system based just on
> data and metadata, which is similar to Phil's initial proposal.
>
> Cheers,
> Bernadette
>
> 2015-03-22 18:38 GMT-03:00 Eric Stephan <ericphb@gmail.com>:
>
>    Wow what a wonderful thread to read.  Thank you Phil!  Many many
>    thanks for this wonderful note of clarity!
>
>    >>if Eric and Annette can provide similar examples for NetCDF that
>    would be terrific (I'm out of my depth here).
>
>    Yes, I think we can show this quite easily.  Just off the top of my
>    head.
>
>    NetCDF:
>       - is an open format for storing multi-dimensional data streams
>    [NETCDF]
>       - can be annotated with self describing metadata (called attributes)
>       - has existing conventions for representing different forms of
>    data.  E.g. CF convention.
>       - has a CF vocabulary [CFNAMES] for curated climate and forecasting
>    terminology.
>       - In addition the climate community within the Earth System Grid
>    (ESG) has adopted fully documented protocols [CMIP5] to show how regional
>    and climate model datasets must be organized so that they can be
>    inter-related to support regional and global climate studies.
>       - leverages existing ISO standards used in the geospatial, Dublin
>    Core, and metadata communities.
>       - Finally an ontology was developed by NASA JPL called SWEET
>    [SWEET]; there is previous research showing how the CF terms can be
>    inter-related.
>
>    I would submit that even without the ontology in terms of open data,
>    the climate community is already at 5 star.
>
>
>
>    Eric
>
>
>    References
>
>    [NETCDF] http://en.wikipedia.org/wiki/NetCDF
>    [CFNAMES] http://cfconventions.org/Data/cf-standard-names/28/build/cf-standard-name-table.html
>    [CMIP5] http://cmip-pcmdi.llnl.gov/cmip5/
>    [SWEET] https://sweet.jpl.nasa.gov/
>
>
>    On Sun, Mar 22, 2015 at 10:45 AM, Phil Archer <phila@w3.org> wrote:
>       We are in full agreement.
>
>       One of my hopes for this WG is that we can indeed lead people to
>       publish formats like CSV in the best way (i.e. with good quality metadata)
>       without them feeling somehow inferior.
>
>       If that leads us to define our own star rating system, I wouldn't
>       mind. Something like:
>
>       * It's available on the Web in an open format with a declared
>       licence (anything less is all but useless).
>
>       ** As level 1 with good quality discovery metadata (we might refer
>       to the DCAT Application profile work as an example).
>
>       *** All the above plus structural metadata in the relevant format
>       (e.g. CSV+ for CSV, VoID for RDF etc).
>
>       This doesn't include quality metrics (which it should), and contact
>       details (which it should) - but they might be defined at level 2?
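
The three levels proposed above are cumulative, so they can be sketched as a small rubric. This is only an illustration of the proposal as written in this message: the `Dataset` fields and the `star_level` function are hypothetical names, and quality metrics and contact details are left out because the message does not settle where they belong.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of how a dataset is published."""
    on_web: bool
    open_format: bool
    declared_licence: bool
    discovery_metadata: bool = False   # e.g. a DCAT description
    structural_metadata: bool = False  # e.g. CSV metadata, VoID for RDF


def star_level(ds: Dataset) -> int:
    """Return 0 to 3 stars; each level requires all the levels below it."""
    if not (ds.on_web and ds.open_format and ds.declared_licence):
        return 0
    if not ds.discovery_metadata:
        return 1
    if not ds.structural_metadata:
        return 2
    return 3
```

Because the levels build on each other, a dataset with structural metadata but no discovery metadata stays at one star in this sketch.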
>
>       Maybe a start anyway.
>
>       Phil.
>
>       On 22/03/2015 13:50, Laufer wrote:
>          I agree, Phil.
>
>          What I want to reinforce is that it would be nice if we could make
>          clear in the document that 5 stars LD (or OD?) is not a scale of
>          how well a dataset is published on the Web. We can have, for
>          example, a "CSV dataset" (3 stars) that is better published than
>          an "LD dataset" (5 stars). Or, maybe, we can avoid using the 5
>          stars when what we want to say is that a dataset is being
>          published in a CSV format.
>
>          If we say that one dataset is 3 stars and another is 5 stars,
>          people get the idea that the 5 star one is better than the 3 star
>          one (as in reviews or hotels, for example).
>
>          We probably will not define our own scale, but I hope that our set
>          of BPs can help people to publish "Well Published Data on the
>          Web".
>
>          Best Regards,
>          Laufer
>
>          On Sunday, 22 March 2015, Christophe Guéret
>          <christophe.gueret@dans.knaw.nl> wrote:
>
>           +1!
>
>             Christophe
>
>             --
>             Sent with difficulties. Sorry for the brevity and typos...
>             On 22 Mar 2015 08:47, "Phil Archer" <phila@w3.org> wrote:
>                I've just been reading through Friday's minutes and I see
>                that this was the hot topic of the day. As ever, I'm sorry I
>                wasn't able to be there.
>
>                Let me add my 2 cents.
>
>                LD forms a small part of the available data on the Web. It
>                would be silly of us to push for everyone to convert their
>                data into perfectly linked 5 star data before they make it
>                available publicly or behind a pay-wall of some kind.
>
>                What we *can* do IMO is:
>
>                - Promote the publication of human readable metadata as
>                Laufer has described;
>
>                - promote the publication of machine readable metadata and
>                then show how this can be (and is) done with RDF using DCAT
>                as an example;
>
>                - promote the publication of structural metadata which, for
>                CSV at least, we have a very clear route - use the CSV on the
>                Web work;
>
>                - if Eric and Annette can provide similar examples for NetCDF
>                that would be terrific (I'm out of my depth here).
>
>                - We can leave it to the Spatial Data on the Web WG to handle
>                spatial stuff (as they are leaving some of their generic
>                issues to this group).
>
>                As an aside, the CSV WG has resolved its issues now and is
>                expecting to publish pretty much the stable version of its
>                specs in the first week of April.
>
>                If you publish data in your favourite format + structural
>                metadata in whatever format goes with that (and the CSV WG is
>                using JSON for its metadata) then you are providing a route
>                through which your users can readily create 5 star data if
>                they so wish. They may or may not use LD themselves but the
>                concept behind it is, I hope, clear enough to readers?
>
>                From what I've read of Friday and the list since then, I dare
>                to hope this is in line with the general mood of the WG?
>
>                Phil.
>
>
>
>                On 20/03/2015 18:09, Laufer wrote:
>                   Thank you, Eric.
>
>                   Warm regards,
>                   Laufer
>
>                   2015-03-20 12:31 GMT-03:00 Eric Stephan <ericphb@gmail.com>:
>                   Laufer and Bernadette,
>
>                   I raised an issue relating to this, asking the question:
>                   can we use 5 star as a metric and not a path?
>                   http://www.w3.org/2013/dwbp/track/issues/148
>
>                   Eric S.
>
>                   On Fri, Mar 20, 2015 at 7:54 AM, Bernadette Farias Lóscio
>                   <bfl@cin.ufpe.br> wrote:
>                   Hi Laufer,
>
>                   Thanks for the message! It is a very useful explanation!
>
>                   I fully agree with you: "In this dataset publishing I can
>                   see the idea of publishing metadata and using standard
>                   vocabularies, but is not a LD dataset."
>
>                   IMHO, we can use vocabularies to publish metadata, but we
>                   are not doing linked data, i.e., there are no links between
>                   resources.
>
>                   I also agree that "we should differentiate the idea of a
>                   Best Practice of a non LD dataset of the idea of an implicit
>                   Best Practice to go to a LD dataset, that is what the 5
>                   stars scale says.".
>
>                   If we have a BP whose implementation proposes the use of
>                   the RDF model to publish data, then we are moving towards
>                   the 5 stars. It is important to note that publishing data
>                   using the RDF model may be just one of the proposed
>                   approaches for implementation, i.e., we may show other ways
>                   of publishing data without using RDF.
>
>                   Cheers,
>                   Bernadette
>
>
>
>
>                   2015-03-20 11:32 GMT-03:00 Laufer <laufer@globo.com>:
>
>                   Hi all,
>
>                   I will start my comment using an example:
>
>                   Someone publishes a page where there are links to 2 files:
>                   - a csv file with a dataset;
>                   - a text file that explains the structure of the dataset,
>                   in natural language (metadata).
>
>                   In the page there is a lot of metadata provided in natural
>                   language, as for example, an overview of the dataset,
>                   license, organization, version, creator, rights, etc...
>
>                   At the same time, the page has an embedded DCAT instance
>                   using RDFa where there is info about the dataset, the
>                   distribution, etc.
>
>                   What I want to say is that we have here the metadata concept
>                   mixed with semantic web concepts, and it is a way of
>                   publishing data that, if all the things are well described,
>                   could be very useful to society.
>
>                   In this dataset publishing I can see the idea of publishing
>                   metadata and using standard vocabularies, but is not a LD
>                   dataset.
>
>                   What I was discussing in the last meeting is: will we
>                   support in the document the idea that the best way to
>                   publish is LD? I am not saying whether I am against the
>                   idea or not. I am favorable to LD. But we should
>                   differentiate the idea of a Best Practice of a non LD
>                   dataset of the idea of an implicit Best Practice to go to a
>                   LD dataset, that is what the 5 stars scale says.
>
>                   Maybe this is too much care with the words, sorry about
>                   this.
>
>                   Best Regards,
>                   Laufer
>
>                   --
>                   .  .  .  .. .  .
>                   .        .   . ..
>                   .     ..       .
>
>
>
>                   --
>                   Bernadette Farias Lóscio
>                   Centro de Informática
>                   Universidade Federal de Pernambuco - UFPE, Brazil
>
>
>                ----------------------------------------------------------------------------
>
>
>
>                --
>
>
>                Phil Archer
>                W3C Data Activity Lead
>                http://www.w3.org/2013/data/
>
>                http://philarcher.org
>                +44 (0)7887 767755
>                @philarcher1
>
>
>       --
>
>
>       Phil Archer
>       W3C Data Activity Lead
>       http://www.w3.org/2013/data/
>
>       http://philarcher.org
>       +44 (0)7887 767755
>       @philarcher1
>
>
>
>
> --
> Bernadette Farias Lóscio
> Centro de Informática
> Universidade Federal de Pernambuco - UFPE, Brazil
>
> ----------------------------------------------------------------------------
>
>
Received on Thursday, 26 March 2015 01:53:02 UTC