Re: NY Property Tax Explorer

From: Steven Adler <adler1@us.ibm.com>
Date: Fri, 27 Mar 2015 14:50:13 -0400
To: Laufer <laufer@globo.com>
Cc: Bart van Leeuwen <bart_van_leeuwen@netage.nl>, Christophe Guéret <christophe.gueret@dans.knaw.nl>, Makx Dekkers <mail@makxdekkers.com>, DWBP WG <public-dwbp-wg@w3.org>
Message-ID: <OF28612EBC.A166718B-ON85257E15.0066E0CB-85257E15.006779A5@us.ibm.com>
I mean that a best practice applies even when you are doing things that are
less than perfect.  For example:

We recommend that published Open Data uses DCAT+ metadata.  This should
apply to JSON, RDF, CSV and PDF, JPEG, AVI, or even to "ancient"
WordPerfect documents from the 1980s.

I would not want us to say that our best practices only apply to W3C
blessed file types, because:

1.  It ignores the reality of the way the rest of the world publishes data
(which, btw, is exactly the issue the CSV WG is designed to address, because
before CSV the W3C was rightly criticized for advocating only its own
standards)

2.  It limits the audience who will care about what we write




Best Regards,

Steve

Motto: "Do First, Think, Do it Again"


From: Laufer <laufer@globo.com>
To: Steven Adler/Somers/IBM@IBMUS
Cc: Christophe Guéret <christophe.gueret@dans.knaw.nl>, Bart van Leeuwen <bart_van_leeuwen@netage.nl>, Makx Dekkers <mail@makxdekkers.com>, DWBP WG <public-dwbp-wg@w3.org>
Date: 03/27/2015 02:40 PM
Subject: Re: NY Property Tax Explorer





Steve,

I understand your concerns. For me, when we say that there are some best
practices, we are not telling people not to publish if they cannot follow
them. If they don't have a choice, well, it is better to publish in PDF.
But it is not a best practice. It is a practice better than no practice.

As I was discussing in the thread on 5-star LOD (a scale of quality that is
often understood as the absolute scale of quality for Data Published on the
Web), the LOD scale is not the absolute scale of quality; it is only one of
them. Besides this scale, there are other quality axes that could be
improved, even when using PDFs: for example, good metadata (about licenses,
SLAs, versions, update periods, etc.), good data, and so on.

So, IMHO, what we can say to someone who publishes in PDF and has no other
choice is that the quality of the publication could be improved in
different ways, for example by adding good metadata. And when the PDF can
be replaced by another format, do so.

Warm regards,
Laufer

2015-03-27 13:07 GMT-03:00 Steven Adler <adler1@us.ibm.com>:
  So, does our BP document only apply to data published in the future in
  the file types we bless?


  Best Regards,

  Steve

  Motto: "Do First, Think, Do it Again"




                                                                           
                                                                           
  From: Christophe Guéret <christophe.gueret@dans.knaw.nl>
  To: Makx Dekkers <mail@makxdekkers.com>
  Cc: Steven Adler/Somers/IBM@IBMUS, DWBP WG <public-dwbp-wg@w3.org>,
      Bart van Leeuwen <bart_van_leeuwen@netage.nl>
  Date: 03/27/2015 11:40 AM
  Subject: RE: NY Property Tax Explorer
                                                                           





  Hi,


  We are not writing a document that describes how people publish and
  consume open data, we are writing guidelines on how they can best do it.


  The concept of "best" is obviously subjective but I hope we can at least
  agree on some points.


  I was recently sitting with people who deal with crises. They need a lot
  of data, and when asking for it they sometimes get a PDF with a picture of
  a handwritten table in it. According to the publisher this is good open
  data. Is it really so? The consumers spent a lot of time extracting the
  data from it...


  Our document could help there by giving consumers something to back up
  their case when arguing with the publisher, so that they hopefully get
  something more usable.


  As with any set of best practices, there is no guarantee ours will be
  followed, but having an officially endorsed way of publishing good open
  data somewhere will surely be welcomed by many data publishers and
  consumers.


  Cheers,
  Christophe


  --
  Sent with difficulties. Sorry for the brevity and typos...


  On 27 Mar 2015 at 16:19, "Makx Dekkers" <mail@makxdekkers.com> wrote:


        Apologies for missing the call, again, today.





        In my mind, we really need to say what we mean by ‘best
        practice’. Do we really think we can define one best practice,
        implying that all the rest is ‘bad practice’? I don’t think so.
        What I would like to see is ‘practice related to objectives’ and
        then try to determine what kinds of behaviour make sense for what
        kinds of objectives.





        For example, certain forms of PDF are really good if you want to
        enable out-loud reading of documents for the blind, but not so good
        for extracting tabular information. If you want to make your tabular
        data useful for applications, there are better ways to publish the
        data than PDF.





        As I argued earlier for metadata best practices, I think the most
        useful kind of advice should be something like: if you want to do
        A, then if you publish data as X you will have the following
        advantages and disadvantages, and you should really consider format
        Y to increase usefulness of your data.





        Makx.








        From: Steven Adler [mailto:adler1@us.ibm.com]
        Sent: 27 March 2015 15:41
        To: Bart van Leeuwen
        Cc: DWBP WG
        Subject: Re: NY Property Tax Explorer





        Bart,

        A PDF might not conform to your definition of a best practice, but
        NYC is publishing tens of thousands of PDFs that describe property
        taxes, hospitals, crime reports, and housing inspections.

        My point is that if we restrict our recommendations of best
        practices to only conform to what we define as the best file types,
        we are deliberately limiting the relevance of our work in the real
        world.





        Best Regards,

        Steve

        Motto: "Do First, Think, Do it Again"



                                                                           
                                                                           
        From: Bart van Leeuwen <bart_van_leeuwen@netage.nl>
        To: Steven Adler/Somers/IBM@IBMUS
        Cc: "DWBP WG" <public-dwbp-wg@w3.org>
        Date: 03/27/2015 10:35 AM
        Subject: Re: NY Property Tax Explorer
                                                                           








        I think we try to assemble a 'best practice' with this working
        group.
        I sincerely hope you don't consider data published in a PDF to
        conform to this best practice.

        I'm not disputing that it is possible to get usable data from these
        formats, but they were not intended to carry data in a
        machine-readable way.

        Bart

        Steven Adler <adler1@us.ibm.com> wrote on 27-03-2015 15:09:32:

        > From: Steven Adler <adler1@us.ibm.com>
        > To: "DWBP WG" <public-dwbp-wg@w3.org>
        > Date: 27-03-2015 15:10
        > Subject: NY Property Tax Explorer
        >
        > You may recall I submitted a use case about this example from NYC
        > last year.  The developer, Chris Wong, who works for Socrata, wrote
        > a Ruby routine to scrape 1000 PDF files for property tax data to
        > fill out this map app:
        >
        > http://www.w3.org/2013/dwbp/track/issues/56
        >
        > Chris is a self-taught developer, by no means a pro.  I think this
        > story well demonstrates that Data on the Web today is quite
        > innovative and PDF, JPG, AVI, MP3, and MP4 are commonly machine
        > readable.
        >
        > Restricting our recommendations to only those file formats covered
        > by W3C WGs (JSON, CSV, RDF, etc.) ignores the reality of how Open
        > Data is published and used.
        >
        >
        > Best Regards,
        >
        > Steve
        >
        > Motto: "Do First, Think, Do it Again"











Received on Friday, 27 March 2015 18:50:55 UTC