
Re: NY Property Tax Explorer

From: Eric Stephan <ericphb@gmail.com>
Date: Fri, 27 Mar 2015 12:03:35 -0700
Message-ID: <CAMFz4ji9+BU6PmkMhR6k62LwGuvJbmHeweDne3+c0DS5NF_Kug@mail.gmail.com>
To: Annette Greiner <amgreiner@lbl.gov>
Cc: Steven Adler <adler1@us.ibm.com>, DWBP WG <public-dwbp-wg@w3.org>
 >>For that reason, I would avoid mention of file types in those BPs
entirely.
+1

On Fri, Mar 27, 2015 at 11:57 AM, Annette Greiner <amgreiner@lbl.gov> wrote:

> Steve, I agree that we don't want BPs like the ones about metadata to be
> taken as applying only to data in certain file formats. I think the choice
> of file type for the data itself is orthogonal to our recommendations about
> metadata. For that reason, I would avoid mention of file types in those BPs
> entirely.
>
> On Mar 27, 2015, at 11:50 AM, Steven Adler <adler1@us.ibm.com> wrote:
>
> > I mean that a best practice applies even when you are doing things that
> are less than perfect.  For example:
> >
> > We recommend that published Open Data uses DCAT+ metadata.  This should
> apply to JSON, RDF, CSV and PDF, JPEG, AVI, or even to "ancient"
> WordPerfect documents from the 1980s.
> >
> > I would not want us to say that our best practices only apply to W3C
> blessed file types, because:
> >
> > 1.  It ignores the reality of the way the rest of the world publishes
> data (which btw, is exactly the issue the CSV WG is designed to address
> because W3C was rightly criticized before CSV for only advocating for its own
> standards)
> >
> > 2.  It limits the audience who will care about what we write
> >
> >
> >
> >
> > Best Regards,
> >
> > Steve
> >
> > Motto: "Do First, Think, Do it Again"
> >
> > From: Laufer <laufer@globo.com>
> > To: Steven Adler/Somers/IBM@IBMUS
> > Cc: Christophe Guéret <christophe.gueret@dans.knaw.nl>, Bart van Leeuwen
> <bart_van_leeuwen@netage.nl>, Makx Dekkers <mail@makxdekkers.com>, DWBP WG
> <public-dwbp-wg@w3.org>
> > Date: 03/27/2015 02:40 PM
> > Subject: Re: NY Property Tax Explorer
> >
> >
> >
> > Steve,
> >
> > I understand your concerns and, for me, when we say that there are some
> best practices, we are not telling people not to publish if they cannot
> follow the best practices. If they don't have a choice, well, it is better
> to publish in PDF. But it is not a best practice. It is a practice better
> than no practice.
> >
> > As I was discussing in the thread on 5-star LOD (a scale that is often
> understood as the absolute scale of quality for data published on the Web),
> the LOD scale is not the absolute scale of quality; it is one of several.
> Besides this scale, there are other quality axes that can be improved even
> when using PDFs: for example, good metadata (about licenses, SLAs, versions,
> update periods, etc.), good data, etc.
> >
> > So, IMHO, what we can say to someone who publishes in PDF and has no
> other choice is that the quality of the publication can be enhanced in
> different ways, for example by adding good metadata. And when the PDF can
> be replaced by another format, do so.
> >
> > Abraços,
> > Laufer
> >
> > 2015-03-27 13:07 GMT-03:00 Steven Adler <adler1@us.ibm.com>:
> > So, does our BP document only apply to data published in the future in
> the file types we bless?
> >
> >
> > Best Regards,
> >
> > Steve
> >
> > Motto: "Do First, Think, Do it Again"
> >
> > From: Christophe Guéret <christophe.gueret@dans.knaw.nl>
> > To: Makx Dekkers <mail@makxdekkers.com>
> > Cc: Steven Adler/Somers/IBM@IBMUS, DWBP WG <public-dwbp-wg@w3.org>, Bart
> van Leeuwen <bart_van_leeuwen@netage.nl>
> > Date: 03/27/2015 11:40 AM
> > Subject: RE: NY Property Tax Explorer
> >
> >
> >
> > Hoi,
> > We are not writing a document that describes how people publish and
> consume open data, we are writing guidelines on how they can best do it.
> >
> > The concept of "best" is obviously subjective but I hope we can at least
> agree on some points.
> >
> > I was recently sitting with people who deal with crises. They need a lot
> of data, and when they ask for it they sometimes get a PDF with a picture of
> a handwritten table in it. According to the publisher this is good open
> data. Is it really so? The consumers spent a lot of time extracting the
> data from it...
> >
> > Our document could help there by giving consumers something to support
> their case with the publisher and hopefully get something more usable.
> >
> > As with any set of best practices, there is no guarantee ours will be
> followed, but having an officially endorsed way of publishing good open
> data will surely be welcomed by many data publishers and consumers.
> >
> > Cheers,
> > Christophe
> >
> > --
> > Sent with difficulties. Sorry for the brevity and typos...
> >
> > On 27 Mar 2015 at 16:19, "Makx Dekkers" <mail@makxdekkers.com> wrote:
> >
> >
> > Apologies for missing the call, again, today.
> >
> > In my mind, we really need to say what we mean by ‘best practice’. Do
> we really think we can define one best practice, implying that all the rest
> is ‘bad practice’? I don’t think so. What I would like to see is ‘practice
> related to objectives’, and then try to determine what kinds of behaviour
> make sense for what kinds of objectives.
> >
> >
> > For example, certain forms of PDF are really good if you want to enable
> out-loud reading of documents for the blind, but not so good for extracting
> tabular information. If you want to make your tabular data useful for
> applications, there are better ways to publish the data than PDF.
> >
> >
> > As I earlier argued for metadata best practices, I think the most useful
> kind of advice should be something like: if you want to do A, then if you
> publish data as X you will have the following advantages and disadvantages,
> and you should really consider format Y to increase usefulness of your data.
> >
> >
> > Makx.
> >
> >
> >
> >
> > From: Steven Adler [mailto:adler1@us.ibm.com]
> > Sent: 27 March 2015 15:41
> > To: Bart van Leeuwen
> > Cc: DWBP WG
> > Subject: Re: NY Property Tax Explorer
> >
> >
> > Bart,
> >
> > A PDF might not conform to your definition of a best practice, but NYC
> is publishing tens of thousands of PDFs that describe property taxes,
> hospitals, crime reports, and housing inspections.
> >
> > My point is that if we restrict our recommendations of best practices to
> only conform to what we define as the best file types, we are deliberately
> limiting the relevance of our work in the real world.
> >
> >
> >
> >
> >
> > Best Regards,
> >
> > Steve
> >
> > Motto: "Do First, Think, Do it Again"
> >
> > From: Bart van Leeuwen <bart_van_leeuwen@netage.nl>
> > To: Steven Adler/Somers/IBM@IBMUS
> > Cc: "DWBP WG" <public-dwbp-wg@w3.org>
> > Date: 03/27/2015 10:35 AM
> > Subject: Re: NY Property Tax Explorer
> >
> >
> >
> > I think we try to assemble a 'best practice' with this working group.
> > I sincerely hope you don't consider data published in a PDF to conform
> to this best practice.
> >
> > I'm not disputing that it is possible to get usable data from these
> formats, but they were not intended to carry data in a machine-readable way.
> >
> > Bart
> >
> > Steven Adler <adler1@us.ibm.com> wrote on 27-03-2015 15:09:32:
> >
> > > From: Steven Adler <adler1@us.ibm.com>
> > > To: "DWBP WG" <public-dwbp-wg@w3.org>
> > > Date: 27-03-2015 15:10
> > > Subject: NY Property Tax Explorer
> > >
> > > You may recall I submitted a use case about this example from NYC
> > > last year.  The developer, Chris Wong, who works for Socrata, wrote
> > > a Ruby routine to scrape 1000 PDF files for property tax data to
> > > fill out this map app:
> > >
> > > http://www.w3.org/2013/dwbp/track/issues/56
> > >
> > > Chris is a self-taught developer, by no means a pro.  I think this
> > > story well demonstrates that Data on the Web today is quite
> > > innovative and PDF, JPG, AVI, MP3, and MP4 are commonly machine
> readable.
> > >
> > > Restricting our recommendations to only those file formats covered
> > > by W3C WGs (JSON, CSV, RDF, etc.) ignores the reality
> > > of how Open Data is published and used.
> > >
> > >
> > > Best Regards,
> > >
> > > Steve
> > >
> > > Motto: "Do First, Think, Do it Again"
> >
> >
> >
> >
> > --
> > .  .  .  .. .  .
> > .        .   . ..
> > .     ..       .
> >
>
>
>
Received on Friday, 27 March 2015 19:04:03 UTC
