Re: audience for the BP doc

I agree with Laufer's comments.  The intended audience is Data Publishers,
who, by the way, are often also Data Consumers.  So it is more than
possible to address the former and include the latter without anyone
feeling left out.


Best Regards,

Steve

Motto: "Do First, Think, Do it Again"


From:    Laufer <laufer@globo.com>
To:      Annette Greiner <amgreiner@lbl.gov>
Cc:      Bernadette Farias Lóscio <bfl@cin.ufpe.br>, Makx Dekkers
         <mail@makxdekkers.com>, Eric Stephan <ericphb@gmail.com>,
         DWBP Public List <public-dwbp-wg@w3.org>
Date:    12/16/2014 06:48 PM
Subject: Re: audience for the BP doc


Well, I have shared my thoughts. I do not want to extend this discussion.

Cheers,
Laufer


On Tuesday, December 16, 2014, Annette Greiner <amgreiner@lbl.gov>
wrote:
  I think we do need comments from data consumers in developing the best
  practices. That is why I suggested that, in developing our use cases, we
  try to talk with people who had been consumers of the data described in
  those use cases. That’s a good way to identify issues to address. But
  that is our process for developing the BPs, not who the audience for the
  final document should be. Reviewers are often not members of the intended
  audience of a piece. If I wrote a children’s book about astronauts, I
  would want an astronaut to review it, but I wouldn’t then write the book
  with astronauts as the audience.
  -Annette
  --
  Annette Greiner
  NERSC Data and Analytics Services
  Lawrence Berkeley National Laboratory
  510-495-2935

  On Dec 16, 2014, at 12:14 PM, Laufer <laufer@globo.com> wrote:

        Just thinking...

        We are doing a document for data publishers that may, should or
        must have best practices that could be valuable for data consumers,
        but we think that this document is not for data consumers...

        So, data publishers know which things data consumers consider
        best practices for them... And we do not need comments from data
        consumers...

        Data consumers should (must?) assess the best practices...

        Best Regards,
        Laufer

        2014-12-16 17:43 GMT-02:00 Annette Greiner <amgreiner@lbl.gov>:
         I think the introduction is not suitable because it says that we
         are writing BPs for use by consumers of data, but none of our
         current BPs is written as a BP on which consumers (other than
         those who are re-publishing, and are therefore publishers) can
         take action. They do not address consumers as an audience.
         -Annette
         --
         Annette Greiner
         NERSC Data and Analytics Services
         Lawrence Berkeley National Laboratory
         510-495-2935

         On Dec 16, 2014, at 11:34 AM, Bernadette Farias Lóscio <
         bfl@cin.ufpe.br> wrote:

               Hi Annette,

               Thank you for your answer! My comments are inline.

                 I think we need to have non-normative material that
                 matches our normative material. This discussion started up
                 because we have a disconnect there.

               The first four sections of the document are non-normative
               and the idea is to use them to explain our context and to
               give definitions that are relevant for readers to understand
               the document. Maybe, instead of having a separate document
               we should try to improve these sections.

                 If we want to keep the introduction as is, we would need
                 to change the best practices we are developing, broadening
                 the scope considerably. I think it’s much less work to
                 make the introduction work for the content it’s meant to
                 introduce.

               Could you please explain why the introduction is not
               suitable for the BPs that will be developed? I'm sorry, but
               this is not clear to me.

               It is important to note that the BPs will be developed
               according to the challenges/requirements identified in the
               Use Cases Document [1].

                 I’d be happy to take a stab at rewriting if you like. My
                 feeling is that it doesn’t really need to change all that
                 much, because we do want to still mention the importance
                 of considering usage when you publish. (BTW, I think we
                 should be trying to get publishers to think of putting
                 data on the web as more than merely hosting files and
                 administering the data. In fact, we have a list of things
                 they should be thinking about: the best practices
                 document.)

               I agree with you! Data publishers have a really hard job
               making data available on the Web, and that's why the BP
               document is being proposed.

               kind regards,
               Bernadette


               [1] http://www.w3.org/TR/dwbp-ucr/


                 -Annette

                 --
                 Annette Greiner
                 NERSC Data and Analytics Services
                 Lawrence Berkeley National Laboratory
                 510-495-2935

                 On Dec 16, 2014, at 10:26 AM, Bernadette Farias Lóscio <
                 bfl@cin.ufpe.br> wrote:

                       Hi all,

                       Thanks for your comments!

                       I agree with Makx that it could be a good idea to
                       concentrate on the audience of data providers (data
                       publishers). However, if we do this then the whole
                       discourse that was built until now has to be changed
                       because we are always talking about data publication
                       and data usage. For example, the first sentence of
                       the abstract says: "This document provides best
                       practices related to the publication and usage of
                       data on the Web designed to help support a
                       self-sustaining ecosystem".

                       Moreover, the document is about "Data on the Web
                       Best Practices" and not only about "Publishing Data
                       on the Web Best Practices".

                       As proposed in the charter, the mission of our group
                       includes: "to develop the open data ecosystem,
                       facilitating better communication between developers
                       and publishers;". In this sense, I think that it is
                       also important to tell developers (or data consumers
                       in general) how they can interact with data
                       publishers, i.e., how they can provide feedback to
                       data publishers and also how they can provide
                       information that helps to find out how data has been
                       used.

                       However, before we decide if we're gonna abandon the
                       BP for data consumers, I think it is really
                       important to have an agreement about the role of
                       data publishers and data consumers.

                       In my view, a data consumer is someone who wants to
                       use data available on the Web to produce "something"
                       instead of just reading the data. For example, when
                       a developer uses raw data available on the Web to
                       develop an application, the developer plays the role
                       of a data consumer and not the role of a data
                       publisher.

                       Concerning data publishers, I agree with Eric that "
                       Publishers just focus on hosting and administering
                       their data on the web in an orderly way".

                       kind regards,
                       Bernadette


                       2014-12-16 8:36 GMT-03:00 Makx Dekkers <
                       mail@makxdekkers.com>:
                        Eric, Annette, all,

                        To me, it would make sense if we concentrated on
                        the audience of data providers, at least for now. I
                        think this is already a big order.

                        If we also want to cover best practices for the
                        re-users of data (developers, aggregators,
                        mix-and-matchers, brokers, whatever you want to
                        call them), we’ll be spreading a scarce resource
                        (ourselves) even thinner, and run the risk of
                        producing two sets of insufficient quality.

                        Let’s focus on the data providers first and then,
                        when we have a good set of best practices and still
                        have time left, turn our attention to the consumer
                        side of the picture.

                        Makx.


                        2014-12-16 6:29 GMT+01:00 Eric Stephan <
                        ericphb@gmail.com>:
                          Thanks Annette for sharing your thoughts on this
                          topic in the meeting last week and in this
                          email.  In your text the term consumers really
                          jumped out at me.  If consumers only has a
                          read-only connotation then I'd rather avoid this
                          term altogether.  In fact, consumers were never
                          mentioned originally as part of the working group
                          mission; instead the term "developer" was used.

                          Developers, to me, are technologists building
                          applications and devices that reuse published
                          data, including creating new data that can be
                          published, processing and modifying published
                          data, or strictly reading data in the life span
                          of a running application. Users rely on the tools
                          created by publishers and developers to edit
                          published data and provide feedback.  Publishers,
                          to me, just focus on hosting and administering
                          their data on the web in an orderly way.  Since
                          the original intent of the BP was to "facilitate
                          better communication between developers and
                          publishers", maybe there should be best
                          practices that target publishers and developers,
                          divided into two documents.

                          The closest analogy is off-the-shelf data
                          storage systems, for which two types of
                          documentation are written:
                          1) One for data administrators who manage the
                          data system
                          2) One for end users (developers) who write
                          applications that interact with the data system

                          Thanks,

                          Eric S


                          On Mon, Dec 15, 2014 at 1:08 PM, Annette
                          Greiner <amgreiner@lbl.gov> wrote:
                            Hi folks,
                            To pick up the discussion about our audience, I
                            want to set down what I see as our audience for
                            the current BP document. By audience I mean the
                            people we expect to actually sit down and read
                            it, not the people whose interests we need to
                            consider in creating it (those are what I call
                            stakeholders). It’s possible that we all agree
                            but are just thinking of the terms differently.

                            To my mind, our audience includes anyone
                            involved in making data available to consumers
                            on the web. That is publishing data. It
                            includes anyone who collects or collates the
                            data, organizes the data, creates web pages or
                            apps to share the data, re-publishes it in such
                            a way that others can re-use it, or makes
                            decisions relevant to how people do those
                            tasks. They could be developers, lawyers, CIOs,
                            researchers, archivists, designers, almost any
                            job title. What matters, though, is not their
                            job title but what actions they take with
                            respect to the data. The action of consuming it
                            is not what we have been discussing, it isn’t
                            represented in any of the current best
                            practices or in our scoping criteria, and it
                            isn’t called for in the charter’s requirement
                            to create a BP document. Thus far, we are not
                            targeting our BPs to people who are *only*
                            consuming the data and not republishing it.

                            I’ve already talked about the charter and the
                            existing BPs in a previous email, so I’ll just
                            address the scoping criteria here. The first
                            one, being unique to publishing on the web, is
                            obviously about publishing rather than
                            consuming. The second one, encouraging reuse,
                            is also about publishing, just in such a way
                            that someone else can make use of the data. The
                            charter mentions re-use in its mission in list
                            item 2, which calls on us to "provide
                            _guidance_to_publishers_ that will improve
                            consistency in the way data is managed, thus
                            promoting the re-use of data". If a consumer
                            wants to publish something that makes the data
                            truly re-usable, they must include the data
                            itself, which means that they are publishing
                            the data. The third criterion, testability,
                            simply deals with the mechanics of making sure
                            that one is successful in achieving the best
                            practices.

                            It might help to consider an example: your
                            organization publishes data about traffic in
                            Rio. It's made available through an API. A data
                            scientist in Lisbon is interested in the data
                            and makes a visualization based on it that she
                            posts on her blog. The data scientist does not
                            make the data available in any form other than
                            the visualization itself. She has not really
                            enriched your data, because the original data
                            still has no connection to the visualization.
                            She cannot take action on any of the best
                            practices we have identified thus far unless
                            she re-publishes it herself, as data.

                            Your organization could link to the
                            visualization, thereby enriching the data, but
                            the data scientist in Lisbon cannot force it to
                            do that. Our best practice around data
                            enrichment calls on publishers to consider
                            making that link or creating the visualization
                            themselves. If we were writing that same best
                            practice for a consumer audience, it would have
                            to say something like "you should enrich other
                            people's data". So, we would end up telling
                            data enrichers that they should enrich data,
                            which strikes me as tautological. One could go
                            into detail about how to make good
                            visualizations (use good labels, don’t rely on
                            color alone, provide a zero point in your
                            scales, etc.), but that seems to me out of
                            scope. (I teach an entire semester course on
                            visualization, so I could come up with lots of
                            best practices about it, but I don't think we
                            want to go there in the BP document we’ve been
                            working on.)

                            Now suppose the consumer in Lisbon would like
                            to provide feedback. If we, as the publisher,
                            have not provided a mechanism for them to do
                            so, they cannot provide it. Our best practice
                            is about making it possible to provide feedback
                            and then acting on the feedback to improve the
                            published data. A consumer has a role here, but
                            again, there is little point to telling a
                            consumer who wants to give feedback that they
                            should give feedback. I certainly wouldn’t
                            expect a data consumer to wade through a long
                            list of publisher-oriented best practices to be
                            told that they should give feedback whenever
                            they are so inclined.

                            I would support the idea of putting together a
                            separate list of best practices for data
                            consumers if we can think of a way to scope it
                            that works.

                            -Annette


                            --
                            Annette Greiner
                            NERSC Data and Analytics Services
                            Lawrence Berkeley National Laboratory
                            510-495-2935




                        --
                        --------------------------------------------------------------------------------

                        Makx Dekkers
                        mail@makxdekkers.com
                        --------------------------------------------------------------------------------


                       --
                       Bernadette Farias Lóscio
                       Centro de Informática
                       Universidade Federal de Pernambuco - UFPE, Brazil
                       ----------------------------------------------------------------------------



               --
               Bernadette Farias Lóscio
               Centro de Informática
               Universidade Federal de Pernambuco - UFPE, Brazil
               ----------------------------------------------------------------------------







Received on Thursday, 18 December 2014 10:58:59 UTC