- From: Laufer <laufer@globo.com>
- Date: Fri, 27 Mar 2015 15:55:31 -0300
- To: Steven Adler <adler1@us.ibm.com>
- Cc: Bart van Leeuwen <bart_van_leeuwen@netage.nl>, Christophe Guéret <christophe.gueret@dans.knaw.nl>, Makx Dekkers <mail@makxdekkers.com>, DWBP WG <public-dwbp-wg@w3.org>
- Message-ID: <CA+pXJihj0QnnRMsvgrvV8XchYQ5cjxUgPodK7gZeiZdum4LSww@mail.gmail.com>
When I say that it is a best practice to provide metadata, I am saying that this applies to all kinds of data and formats. We do not have any best practice that says to publish data in a specific format, do we?

Best,
Laufer

2015-03-27 15:50 GMT-03:00 Steven Adler <adler1@us.ibm.com>:

> I mean that a best practice applies even when you are doing things that
> are less than perfect. For example:
>
> We recommend that published Open Data uses DCAT+ metadata. This should
> apply to JSON, RDF, CSV *and* PDF, JPEG, AVI, or even to "ancient"
> WordPerfect documents from the 1980s.
>
> I would not want us to say that our best practices only apply to
> W3C-blessed file types, because:
>
> 1. It ignores the reality of the way the rest of the world publishes data
>    (which, btw, is exactly the issue the CSV WG is designed to address,
>    because W3C was rightly criticized before CSV for advocating only its
>    own standards)
>
> 2. It limits the audience who will care about what we write
>
> Best Regards,
>
> Steve
>
> Motto: "Do First, Think, Do it Again"
>
> From: Laufer <laufer@globo.com>
> To: Steven Adler/Somers/IBM@IBMUS
> Cc: Christophe Guéret <christophe.gueret@dans.knaw.nl>, Bart van Leeuwen
>     <bart_van_leeuwen@netage.nl>, Makx Dekkers <mail@makxdekkers.com>,
>     DWBP WG <public-dwbp-wg@w3.org>
> Date: 03/27/2015 02:40 PM
> Subject: Re: NY Property Tax Explorer
> ------------------------------
>
> Steve,
>
> I understand your concerns and, for me, when we say that there are some
> best practices, we are not telling people not to publish if they cannot
> follow the best practices. If they don't have a choice, well, it is
> better to publish in PDF. But it is not a best practice. It is a
> practice better than no practice.
>
> As I was discussing in the thread on 5-star LOD (a scale of quality that
> is often understood as the absolute scale of quality of data published
> on the Web), the LOD scale is not the absolute scale of quality; it is
> one of several. Besides this scale, there are other quality axes that
> can be improved even with PDFs: for example, good metadata (about
> licenses, SLAs, versions, update periods, etc.), good data, etc.
>
> So, IMHO, what we can say to someone who publishes in PDF and has no
> other choice is that the quality of the publication can be enhanced in
> different ways, for example by adding good metadata. And when the PDF
> can be replaced by another format, do it.
>
> Regards,
> Laufer
>
> 2015-03-27 13:07 GMT-03:00 Steven Adler <adler1@us.ibm.com>:
>
> > So, does our BP document only apply to data published in the future in
> > the file types we bless?
> >
> > Best Regards,
> >
> > Steve
> >
> > Motto: "Do First, Think, Do it Again"
> >
> > From: Christophe Guéret <christophe.gueret@dans.knaw.nl>
> > To: Makx Dekkers <mail@makxdekkers.com>
> > Cc: Steven Adler/Somers/IBM@IBMUS, DWBP WG <public-dwbp-wg@w3.org>,
> >     Bart van Leeuwen <bart_van_leeuwen@netage.nl>
> > Date: 03/27/2015 11:40 AM
> > Subject: RE: NY Property Tax Explorer
> > ------------------------------
> >
> > Hi,
> >
> > We are not writing a document that describes how people publish and
> > consume open data; we are writing guidelines on how they can best do it.
> >
> > The concept of "best" is obviously subjective, but I hope we can at
> > least agree on some points.
> >
> > I was recently sitting with people dealing with crises. They need a
> > lot of data, and when asking for it they sometimes get a PDF with a
> > picture of a handwritten table in it. According to the publisher, this
> > is good open data. Is it really so? The consumers spent a lot of time
> > extracting the data from it...
> >
> > Our document could help there by giving consumers something to back up
> > their argument with the publisher and hopefully get something more
> > usable.
> >
> > As with any best practices, there is no guarantee ours will be
> > followed, but having an officially endorsed way of publishing good
> > open data will surely be welcomed by many data publishers and consumers.
> >
> > Cheers,
> > Christophe
> >
> > --
> > Sent with difficulties. Sorry for the brevity and typos...
> >
> > On 27 Mar 2015 16:19, "Makx Dekkers" <mail@makxdekkers.com> wrote:
> >
> > > Apologies for missing the call, again, today.
> > >
> > > In my mind, we really need to say what we mean by ‘best practice’.
> > > Do we really think we can define one best practice, implying that
> > > all the rest is ‘bad practice’? I don’t think so. What I would like
> > > to see is ‘practice related to objectives’, and then try to determine
> > > what kinds of behaviour make sense for what kinds of objectives.
> > >
> > > For example, certain forms of PDF are really good if you want to
> > > enable out-loud reading of documents for the blind, but not so good
> > > for extracting tabular information. If you want to make your tabular
> > > data useful for applications, there are better ways to publish the
> > > data than PDF.
> > >
> > > As I argued earlier for metadata best practices, I think the most
> > > useful kind of advice would be something like: if you want to do A,
> > > then if you publish data as X you will have the following advantages
> > > and disadvantages, and you should really consider format Y to
> > > increase the usefulness of your data.
> > >
> > > Makx.
> > >
> > > From: Steven Adler [mailto:adler1@us.ibm.com]
> > > Sent: 27 March 2015 15:41
> > > To: Bart van Leeuwen
> > > Cc: DWBP WG
> > > Subject: Re: NY Property Tax Explorer
> > >
> > > Bart,
> > >
> > > A PDF might not conform to your definition of a best practice, but
> > > NYC is publishing tens of thousands of PDFs that describe property
> > > taxes, hospitals, crime reports, and housing inspections.
> > >
> > > My point is that if we restrict our recommendations of best
> > > practices to only conform to what we define as the best file types,
> > > we are deliberately limiting the relevance of our work in the real
> > > world.
> > >
> > > Best Regards,
> > >
> > > Steve
> > >
> > > Motto: "Do First, Think, Do it Again"
> > >
> > > From: Bart van Leeuwen <bart_van_leeuwen@netage.nl>
> > > To: Steven Adler/Somers/IBM@IBMUS
> > > Cc: "DWBP WG" <public-dwbp-wg@w3.org>
> > > Date: 03/27/2015 10:35 AM
> > > Subject: Re: NY Property Tax Explorer
> > > ------------------------------
> > >
> > > I think we try to assemble a 'best practice' with this working
> > > group. I sincerely hope you don't consider data published in a PDF
> > > to conform to this best practice.
> > >
> > > I'm not disputing that it is possible to get usable data from these
> > > formats, but they were not intended to carry data in a
> > > machine-readable way.
> > >
> > > Bart
> > >
> > > Steven Adler <adler1@us.ibm.com> wrote on 27-03-2015 15:09:32:
> > >
> > > > From: Steven Adler <adler1@us.ibm.com>
> > > > To: "DWBP WG" <public-dwbp-wg@w3.org>
> > > > Date: 27-03-2015 15:10
> > > > Subject: NY Property Tax Explorer
> > > >
> > > > You may recall I submitted a use case about this example from NYC
> > > > last year. The developer, Chris Wong, who works for Socrata, wrote
> > > > a Ruby routine to scrape 1000 PDF files for property tax data to
> > > > fill out this map app:
> > > >
> > > > http://www.w3.org/2013/dwbp/track/issues/56
> > > >
> > > > Chris is a self-taught developer, by no means a pro. I think this
> > > > story demonstrates well that Data on the Web today is quite
> > > > innovative and that PDF, JPG, AVI, MP3, and MP4 are commonly
> > > > machine readable.
> > > >
> > > > Restricting our recommendations to only those file formats covered
> > > > by W3C WGs (JSON, CSV, RDF, etc.) ignores the reality of how Open
> > > > Data is published and used.
> > > >
> > > > Best Regards,
> > > >
> > > > Steve
> > > >
> > > > Motto: "Do First, Think, Do it Again"
Attachments
- image/gif attachment: graycol.gif
- image/gif attachment: ecblank.gif
Received on Friday, 27 March 2015 18:56:01 UTC