- From: Eric Stephan <ericphb@gmail.com>
- Date: Sun, 22 Mar 2015 14:38:33 -0700
- To: Phil Archer <phila@w3.org>
- Cc: Laufer <laufer@globo.com>, Christophe Guéret <christophe.gueret@dans.knaw.nl>, Bernadette Farias Lóscio <bfl@cin.ufpe.br>, DWBP WG <public-dwbp-wg@w3.org>
- Message-ID: <CAMFz4jhcaqQ=MH2=Brg+CT0=vZznKhYu3Etu=Y8ggyzT04kPXQ@mail.gmail.com>
Wow what a wonderful thread to read. Thank you Phil! Many, many thanks for this wonderful note of clarity!

>>if Eric and Annette can provide similar examples for NetCDF that would be terrific (I'm out of my depth here).

Yes, I think we can show this quite easily. Just off the top of my head, NetCDF:

- is an open format for storing multi-dimensional data streams [NETCDF]
- can be annotated with self-describing metadata (called attributes)
- has existing conventions for representing different forms of data, e.g. the CF convention.
- has a CF vocabulary [CFNAMES] for curated climate and forecasting terminology.
- In addition, the climate community within the Earth System Grid (ESG) has adopted fully documented protocols [CMIP5] to show how regional and climate model datasets must be organized so that they can be inter-related to support regional and global climate studies.
- Leverages existing ISO standards used in the geospatial, Dublin Core, and metadata communities.
- Finally, an ontology called SWEET [SWEET] was developed by NASA JPL, and there is previous research showing how the CF terms can be inter-related.

I would submit that, even without the ontology, in terms of open data the climate community is already at 5 star.

Eric

References
[NETCDF] http://en.wikipedia.org/wiki/NetCDF
[CFNAMES] http://cfconventions.org/Data/cf-standard-names/28/build/cf-standard-name-table.html
[CMIP5] http://cmip-pcmdi.llnl.gov/cmip5/
[SWEET] https://sweet.jpl.nasa.gov/

On Sun, Mar 22, 2015 at 10:45 AM, Phil Archer <phila@w3.org> wrote:

> We are in full agreement.
>
> One of my hopes for this WG is that we can indeed lead people to publish
> formats like CSV in the best way (i.e. with good quality metadata) without
> them feeling somehow inferior.
>
> If that leads us to define our own star rating system, I wouldn't mind.
> Something like:
>
> * It's available on the Web in an open format with a declared licence
> (anything less is all but useless).
>
> ** As level 1 with good quality discovery metadata (we might refer to the
> DCAT Application profile work as an example).
>
> *** All the above plus structural metadata in the relevant format (e.g.
> CSV+ for CSV, VoID for RDF etc).
>
> This doesn't include quality metrics (which it should), and contact
> details (which it should) - but they might be defined at level 2?
>
> Maybe a start anyway.
>
> Phil.
>
> On 22/03/2015 13:50, Laufer wrote:
>
>> I agree, Phil.
>>
>> What I want to reinforce is that it would be nice if we could make clear
>> in the document that 5 stars LD (or OD?) is not a scale of a dataset that
>> is well published in the web. We can have, for example, a "CSV dataset"
>> (3 stars) more well published than a "LD dataset" (5 stars). Or, maybe,
>> we can avoid using the 5 stars when what we want to say is that a dataset
>> is being published in a CSV format.
>>
>> If we say that one dataset is 3 stars and other is 5 stars, people have
>> the idea that the 5 one is better than the 3 one (as in reviews or
>> hotels, for example).
>>
>> We probably will not define our own scale but I hope that our set of BPs
>> could help people to publish a "Well Published Data on The Web".
>>
>> Best Regards,
>> Laufer
>>
>> On Sunday, 22 March 2015, Christophe Guéret
>> <christophe.gueret@dans.knaw.nl> wrote:
>>
>>> +1!
>>>
>>> Christophe
>>>
>>> --
>>> Sent with difficulties. Sorry for the brevity and typos...
>>> On 22 Mar 2015 08:47, "Phil Archer" <phila@w3.org> wrote:
>>>
>>>> I've just been reading through Friday's minutes and I see that this was
>>>> the hot topic of the day. As ever, I'm sorry I wasn't able to be there.
>>>>
>>>> Let me add my 2 cents.
>>>>
>>>> LD forms a small part of the available data on the Web.
>>>> It would be silly of us to push for everyone to convert their data into
>>>> perfectly linked 5 star data before they make it available publicly or
>>>> behind a pay-wall of some kind.
>>>>
>>>> What we *can* do IMO is:
>>>>
>>>> - Promote the publication of human readable metadata as Laufer has
>>>> described;
>>>>
>>>> - promote the publication of machine readable metadata and then show how
>>>> this can be (and is) done with RDF using DCAT as an example;
>>>>
>>>> - promote the publication of structural metadata which, for CSV at
>>>> least, we have a very clear route - use the CSV on the Web work;
>>>>
>>>> - if Eric and Annette can provide similar examples for NetCDF that would
>>>> be terrific (I'm out of my depth here).
>>>>
>>>> - We can leave it to the Spatial Data on the Web WG to handle spatial
>>>> stuff (as they are leaving some of their generic issues to this group).
>>>>
>>>> As an aside, the CSV WG has resolved its issues now and is expecting to
>>>> publish pretty much the stable version of its specs in the first week of
>>>> April.
>>>>
>>>> If you publish data in your favourite format + structural metadata in
>>>> whatever format goes with that (and the CSV WG is using JSON for its
>>>> metadata) then you are providing a route through which your users can
>>>> readily create 5 star data if they so wish. They may or may not use LD
>>>> themselves but the concept behind it is, I hope, clear enough to
>>>> readers?
>>>>
>>>> From what I've read of Friday and the list since then, I dare to hope
>>>> this is in line with the general mood of the WG?
>>>>
>>>> Phil.
>>>>
>>>> On 20/03/2015 18:09, Laufer wrote:
>>>>
>>>>> Thank you, Eric.
>>>>>
>>>>> Warm regards,
>>>>> Laufer
>>>>>
>>>>> 2015-03-20 12:31 GMT-03:00 Eric Stephan <ericphb@gmail.com>:
>>>>>
>>>>>> Laufer and Bernadette,
>>>>>>
>>>>>> I raised an issue relating to this asking the question can we use 5
>>>>>> star as a metric and not a path?
>>>>>>
>>>>>> http://www.w3.org/2013/dwbp/track/issues/148
>>>>>>
>>>>>> Eric S.
>>>>>>
>>>>>> On Fri, Mar 20, 2015 at 7:54 AM, Bernadette Farias Lóscio
>>>>>> <bfl@cin.ufpe.br> wrote:
>>>>>>
>>>>>>> Hi Laufer,
>>>>>>>
>>>>>>> Thanks for the message! It is a very useful explanation!
>>>>>>>
>>>>>>> I fully agree with you: "In this dataset publishing I can see the
>>>>>>> idea of publishing metadata and using standard vocabularies, but is
>>>>>>> not a LD dataset."
>>>>>>>
>>>>>>> IMHO, we can use vocabularies to publish metadata, but we are not
>>>>>>> doing linked data, i.e., there are no links between resources.
>>>>>>>
>>>>>>> I also agree that "we should differentiate the idea of a Best
>>>>>>> Practice of a non LD dataset of the idea of an implicit Best Practice
>>>>>>> to go to a LD dataset, that is what the 5 stars scale says.".
>>>>>>>
>>>>>>> If we have a BP whose implementation proposes the use of the RDF
>>>>>>> model to publish data, then we are moving towards the 5 stars. It is
>>>>>>> important to note that publishing data using the RDF model may be
>>>>>>> just one of the proposed approaches for implementation, i.e., we may
>>>>>>> show other ways of publishing data without using RDF.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Bernadette
>>>>>>>
>>>>>>> 2015-03-20 11:32 GMT-03:00 Laufer <laufer@globo.com>:
>>>>>>>
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> I will start my comment using an example:
>>>>>>>>
>>>>>>>> Someone publishes a page where there are links to 2 files:
>>>>>>>> a csv file with a dataset;
>>>>>>>> a text file that explains the structure of the dataset, in natural
>>>>>>>> language (metadata).
>>>>>>>>
>>>>>>>> In the page there is a lot of metadata provided in natural language,
>>>>>>>> as for example, an overview of the dataset, license, organization,
>>>>>>>> version, creator, rights, etc...
>>>>>>>>
>>>>>>>> At the same time, the page has an embedded DCAT instance using RDFa
>>>>>>>> where there is info about the dataset, the distribution, etc.
>>>>>>>>
>>>>>>>> What I want to say is that we have here the metadata concept mixed
>>>>>>>> with semantic web concepts, and it is a way of publishing data that,
>>>>>>>> if all the things are well described, could be very useful to
>>>>>>>> society.
>>>>>>>>
>>>>>>>> In this dataset publishing I can see the idea of publishing metadata
>>>>>>>> and using standard vocabularies, but is not a LD dataset.
>>>>>>>>
>>>>>>>> What I was discussing in the last meeting is: will we support in the
>>>>>>>> document the idea that the best way to publish is LD? I am not
>>>>>>>> saying whether I am against the idea or not. I am favorable to LD.
>>>>>>>> But we should differentiate the idea of a Best Practice of a non LD
>>>>>>>> dataset of the idea of an implicit Best Practice to go to a LD
>>>>>>>> dataset, that is what the 5 stars scale says.
>>>>>>>>
>>>>>>>> Maybe this is too much care with the words, sorry about this.
>>>>>>>>
>>>>>>>> Best Regards,
>>>>>>>> Laufer
>>>>>>>>
>>>>>>>> --
>>>>>>>> . . . .. . .
>>>>>>>> . . . ..
>>>>>>>> . .. .
>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Bernadette Farias Lóscio
>>>>>>> Centro de Informática
>>>>>>> Universidade Federal de Pernambuco - UFPE, Brazil
>>>>>>>
>>>>>>> ----------------------------------------------------------------------------
>>>>
>>>> --
>>>>
>>>> Phil Archer
>>>> W3C Data Activity Lead
>>>> http://www.w3.org/2013/data/
>>>>
>>>> http://philarcher.org
>>>> +44 (0)7887 767755
>>>> @philarcher1
>>>>
>
> --
>
> Phil Archer
> W3C Data Activity Lead
> http://www.w3.org/2013/data/
>
> http://philarcher.org
> +44 (0)7887 767755
> @philarcher1
>
Received on Sunday, 22 March 2015 21:39:02 UTC