
Re: NY Property Tax Explorer

From: Steven Adler <adler1@us.ibm.com>
Date: Fri, 27 Mar 2015 10:41:05 -0400
To: Bart van Leeuwen <bart_van_leeuwen@netage.nl>
Cc: "DWBP WG" <public-dwbp-wg@w3.org>
Message-ID: <OF01A35D34.C18196FE-ON85257E15.00506388-85257E15.0050AA9B@us.ibm.com>

Bart,

A PDF might not conform to your definition of a best practice, but NYC is
publishing tens of thousands of PDFs that describe property taxes,
hospitals, crime reports, and housing inspections.

My point is that if we restrict our best-practice recommendations to only
the file types we ourselves consider best, we are deliberately limiting
the relevance of our work in the real world.





Best Regards,

Steve

Motto: "Do First, Think, Do it Again"


From: Bart van Leeuwen <bart_van_leeuwen@netage.nl>
To: Steven Adler/Somers/IBM@IBMUS
Cc: "DWBP WG" <public-dwbp-wg@w3.org>
Date: 03/27/2015 10:35 AM
Subject: Re: NY Property Tax Explorer





I think we are trying to assemble a 'best practice' with this working group.
I sincerely hope you don't consider data published in a PDF to conform to
this best practice.

I'm not disputing that it is possible to get usable data from these formats,
but they were not intended to carry data in a machine-readable way.

Bart

Steven Adler <adler1@us.ibm.com> wrote on 27-03-2015 15:09:32:

> From: Steven Adler <adler1@us.ibm.com>
> To: "DWBP WG" <public-dwbp-wg@w3.org>
> Date: 27-03-2015 15:10
> Subject: NY Property Tax Explorer
>
> You may recall I submitted a use case about this example from NYC
> last year.  The developer, Chris Wong, who works for Socrata, wrote
> a Ruby routine to scrape 1000 PDF files for property tax data to
> fill out this map app:
>
> http://www.w3.org/2013/dwbp/track/issues/56
>
> Chris is a self-taught developer, by no means a pro.  I think this
> story well demonstrates that Data on the Web today is quite
> innovative and PDF, JPG, AVI, MP3, and MP4 are commonly machine readable.

>
> Restricting our recommendations to only those file formats covered
> by W3C WGs (JSON, CSV, RDF, etc.) ignores the reality
> of how Open Data is published and used.
>
>
> Best Regards,
>
> Steve
>
> Motto: "Do First, Think, Do it Again"






Received on Friday, 27 March 2015 14:41:45 UTC
