- From: Eric Stephan <ericphb@gmail.com>
- Date: Fri, 27 Mar 2015 12:03:35 -0700
- To: Annette Greiner <amgreiner@lbl.gov>
- Cc: Steven Adler <adler1@us.ibm.com>, DWBP WG <public-dwbp-wg@w3.org>
- Message-ID: <CAMFz4ji9+BU6PmkMhR6k62LwGuvJbmHeweDne3+c0DS5NF_Kug@mail.gmail.com>
>> For that reason, I would avoid mention of file types in those BPs entirely.

+1

On Fri, Mar 27, 2015 at 11:57 AM, Annette Greiner <amgreiner@lbl.gov> wrote:

> Steve, I agree that we don't want BPs like the ones about metadata to be
> taken as applying only to data in certain file formats. I think the choice
> of file type for the data itself is orthogonal to our recommendations about
> metadata. For that reason, I would avoid mention of file types in those BPs
> entirely.
>
> On Mar 27, 2015, at 11:50 AM, Steven Adler <adler1@us.ibm.com> wrote:
>
> > I mean that a best practice applies even when you are doing things that
> > are less than perfect. For example:
> >
> > We recommend that published Open Data uses DCAT+ metadata. This should
> > apply to JSON, RDF, CSV and PDF, JPEG, AVI, or even to "ancient"
> > WordPerfect documents from the 1980s.
> >
> > I would not want us to say that our best practices only apply to W3C
> > blessed file types, because:
> >
> > 1. It ignores the reality of the way the rest of the world publishes
> > data (which, by the way, is exactly the issue the CSV WG is designed to
> > address, because W3C was rightly criticized before CSV for advocating
> > only its own standards)
> >
> > 2. It limits the audience who will care about what we write
> >
> > Best Regards,
> >
> > Steve
> >
> > Motto: "Do First, Think, Do it Again"
> >
> > Laufer ---03/27/2015 02:40:15 PM---Steve, I understand your
> > concerns and, for me, I think that when we say that there
> >
> > From: Laufer <laufer@globo.com>
> > To: Steven Adler/Somers/IBM@IBMUS
> > Cc: Christophe Guéret <christophe.gueret@dans.knaw.nl>, Bart van Leeuwen
> > <bart_van_leeuwen@netage.nl>, Makx Dekkers <mail@makxdekkers.com>, DWBP WG
> > <public-dwbp-wg@w3.org>
> > Date: 03/27/2015 02:40 PM
> > Subject: Re: NY Property Tax Explorer
> >
> > Steve,
> >
> > I understand your concerns and, for me, when we say that there are some
> > best practices, we are not telling people not to publish if they cannot
> > follow the best practices. If they don't have a choice, well, it is
> > better to publish in PDF. But it is not a best practice. It is a
> > practice better than no practice.
> >
> > As I was discussing in the thread on 5-star LOD (a scale of quality that
> > is often understood as the absolute scale of quality of Data Published
> > on The Web), the LOD scale is not the absolute scale of quality, but it
> > is one of them. Besides this scale, there are other quality axes that
> > could be enhanced, even using PDFs: for example, good metadata (about
> > licenses, SLAs, versions, update periods, etc.), good data, etc.
> >
> > So, IMHO, what we can say to someone who publishes in PDF and has no
> > other choice is that the quality of the publication could be enhanced in
> > different ways, by adding good metadata, for example. And when the PDF
> > can be replaced by another format, do it.
> >
> > Warm regards,
> > Laufer
> >
> > 2015-03-27 13:07 GMT-03:00 Steven Adler <adler1@us.ibm.com>:
> >
> > So, does our BP document only apply to data published in the future in
> > the file types we bless?
> >
> > Best Regards,
> >
> > Steve
> >
> > Motto: "Do First, Think, Do it Again"
> >
> > Christophe Guéret ---03/27/2015 11:40:10 AM---Hoi, We are not writing a
> > document that describes how people publish and consume
> >
> > From: Christophe Guéret <christophe.gueret@dans.knaw.nl>
> > To: Makx Dekkers <mail@makxdekkers.com>
> > Cc: Steven Adler/Somers/IBM@IBMUS, DWBP WG <public-dwbp-wg@w3.org>, Bart
> > van Leeuwen <bart_van_leeuwen@netage.nl>
> > Date: 03/27/2015 11:40 AM
> > Subject: RE: NY Property Tax Explorer
> >
> > Hi,
> >
> > We are not writing a document that describes how people publish and
> > consume open data; we are writing guidelines on how they can best do it.
> >
> > The concept of "best" is obviously subjective, but I hope we can at
> > least agree on some points.
> >
> > I was recently sitting with people who deal with crises. They need a lot
> > of data, and when asking for it they sometimes get a PDF with a picture
> > of a handwritten table in it. According to the publisher this is good
> > open data. Is it really so? The consumers spent a lot of time extracting
> > the data from it...
> >
> > Our document could help there by giving consumers something to support
> > their argument with the publisher, so that they hopefully get something
> > more usable.
> >
> > As with any best practices, there is no guarantee ours will be followed,
> > but having an officially endorsed way of publishing good open data
> > somewhere will surely be welcomed by many data publishers and consumers.
> >
> > Cheers,
> > Christophe
> >
> > --
> > Sent with difficulties. Sorry for the brevity and typos...
> >
> > On 27 Mar 2015 16:19, "Makx Dekkers" <mail@makxdekkers.com> wrote:
> >
> > Apologies for missing the call, again, today.
> >
> > In my mind, we really need to say what we mean by ‘best practice’. Do
> > we really think we can define one best practice, implying that all the
> > rest is ‘bad practice’? I don’t think so. What I would like to see is
> > ‘practice related to objectives’ and then try to determine what kinds of
> > behaviour make sense for what kinds of objectives.
> >
> > For example, certain forms of PDF are really good if you want to enable
> > out-loud reading of documents for the blind, but not so good for
> > extracting tabular information. If you want to make your tabular data
> > useful for applications, there are better ways to publish the data than
> > PDF.
> >
> > As I argued earlier for metadata best practices, I think the most useful
> > kind of advice would be something like: if you want to do A, then if you
> > publish data as X you will have the following advantages and
> > disadvantages, and you should really consider format Y to increase the
> > usefulness of your data.
> >
> > Makx.
> >
> > From: Steven Adler [mailto:adler1@us.ibm.com]
> > Sent: 27 March 2015 15:41
> > To: Bart van Leeuwen
> > Cc: DWBP WG
> > Subject: Re: NY Property Tax Explorer
> >
> > Bart,
> >
> > A PDF might not conform to your definition of a best practice, but NYC
> > is publishing tens of thousands of PDFs that describe property taxes,
> > hospitals, crime reports, and housing inspections.
> >
> > My point is that if we restrict our recommendations of best practices to
> > only what we define as the best file types, we are deliberately limiting
> > the relevance of our work in the real world.
> >
> > Best Regards,
> >
> > Steve
> >
> > Motto: "Do First, Think, Do it Again"
> >
> > Bart van Leeuwen ---03/27/2015 10:35:44 AM---I think we try to assemble
> > a 'best practice' with this working group. I sincerely hope you don't con
> >
> > From: Bart van Leeuwen <bart_van_leeuwen@netage.nl>
> > To: Steven Adler/Somers/IBM@IBMUS
> > Cc: "DWBP WG" <public-dwbp-wg@w3.org>
> > Date: 03/27/2015 10:35 AM
> > Subject: Re: NY Property Tax Explorer
> >
> > I think we are trying to assemble a 'best practice' with this working
> > group. I sincerely hope you don't consider data published in a PDF to
> > conform to this best practice.
> >
> > I'm not disputing that it is possible to get usable data from these
> > formats, but they were not intended to carry data in a machine-readable
> > way.
> >
> > Bart
> >
> > Steven Adler <adler1@us.ibm.com> wrote on 27-03-2015 15:09:32:
> >
> > > From: Steven Adler <adler1@us.ibm.com>
> > > To: "DWBP WG" <public-dwbp-wg@w3.org>
> > > Date: 27-03-2015 15:10
> > > Subject: NY Property Tax Explorer
> > >
> > > You may recall I submitted a use case about this example from NYC
> > > last year. The developer, Chris Wong, who works for Socrata, wrote
> > > a Ruby routine to scrape 1000 PDF files for property tax data to
> > > fill out this map app:
> > >
> > > http://www.w3.org/2013/dwbp/track/issues/56
> > >
> > > Chris is a self-taught developer, by no means a pro. I think this
> > > story demonstrates well that Data on the Web today is quite
> > > innovative and that PDF, JPG, AVI, MP3, and MP4 are commonly machine
> > > readable.
> > >
> > > Restricting our recommendations to only those file formats covered
> > > by W3C WGs (JSON, CSV, RDF, etc.) ignores the reality of how Open
> > > Data is published and used.
> > >
> > > Best Regards,
> > >
> > > Steve
> > >
> > > Motto: "Do First, Think, Do it Again"
Received on Friday, 27 March 2015 19:04:03 UTC