
RE: audience for the BP doc

From: Steven Adler <adler1@us.ibm.com>
Date: Thu, 8 Jan 2015 12:11:47 -0500
To: "Makx Dekkers" <mail@makxdekkers.com>
Cc: "'Annette Greiner'" <amgreiner@lbl.gov>, "'Bernadette Farias Lóscio'" <bfl@cin.ufpe.br>, "'Eric Stephan'" <ericphb@gmail.com>, "'Laufer'" <laufer@globo.com>, "'DWBP Public List'" <public-dwbp-wg@w3.org>
Message-ID: <OF80DDF317.A45D3630-ON85257DC7.005E40AF-85257DC7.005E7685@us.ibm.com>
This would be true if publishers were not also users and users were not
also publishers.  That era existed from 1492-1992.  But since
the commercialization of the internet, the invention of social networking,
and the growing importance of citizen journalism, the distinction between
publisher and user is academic.

Every publisher is a user and every user is a publisher.

Best Regards,


Motto: "Do First, Think, Do it Again"

From: "Makx Dekkers" <mail@makxdekkers.com>
To: "'Laufer'" <laufer@globo.com>, "'Bernadette Farias Lóscio'" <bfl@cin.ufpe.br>
Cc: "'Eric Stephan'" <ericphb@gmail.com>, "'Annette Greiner'" <amgreiner@lbl.gov>, "'DWBP Public List'" <public-dwbp-wg@w3.org>
Date: 12/16/2014 02:37 PM
Subject: RE: audience for the BP doc

Let me try and explain a bit better why I think it makes sense to focus on
the data publishers first and then maybe later on data consumers.

First of all, getting best practices for data publishers is already
difficult enough. As Annette argued earlier, even feedback is something
that publishers need to think about; after all, they need to provide the
feedback channels. Let’s do that first and not dilute our focus trying to
do too much at the same time.

Secondly, a re-user may not be very interested in reading about best
practices related to accessing and re-using data in general; a re-user will
be primarily interested in how the specific data in which he or she is
interested is published by a particular set of publishers – and if these
publishers follow best practice, it’s likely that the re-use of the data is
made easier for the re-user. For example, if the publisher provides a
feedback channel, in line with best practice, re-users will use that
channel if there’s something wrong with the data. We don’t have to tell
them to do that.

As an anecdote, at the recent Share-PSI workshop in Lisbon, there was an
entrepreneur who made clear that no-one should try to tell him what to do
or how to do it: he said that he and other developers were smart enough to
figure out how to do things the way that makes sense to them. After all,
they’re the innovators – in that sense, they may not be comparable to the
“developers who write applications that interact with the data system” that
Eric mentioned.

So, I am just afraid that best practices for developers are not going to be
read by many people. Should we then put time and energy into work that
might not reach its audience?


From: Laufer [mailto:laufer@globo.com]
Sent: Tuesday, December 16, 2014 7:57 PM
To: Bernadette Farias Lóscio
Cc: Makx Dekkers; Eric Stephan; Annette Greiner; DWBP Public List
Subject: Re: audience for the BP doc

Hi, All,
It is easy to see that it is difficult to define exactly the audience of the
BP document. We have BPs that are related to the process of better
communicating to a consumer (a developer, a final user, etc.) the data in
the Dataset (structure, etc.). We have things related to the use of the
Dataset (license, etc.). Another set of BPs is related to how to keep
these data alive on the Web (persistence, preservation, etc.). Others are
related to enhancing the quality of data (quality, usage, etc.). And so on.
We have a lot of players around a Dataset: people who have a relation to
some set of "Published Data". We can call all of them "data publishers". No
problem. But this does not mean that they will all have the
same competences. Even if we call all of them "data publishers", IMHO
we need to talk about these different types of professionals
related to data publishing. In some sense, it is similar to building a set
of Web Pages. We have a team of different professionals, with different
competences, working to publish those pages.

2014-12-16 16:26 GMT-02:00 Bernadette Farias Lóscio <bfl@cin.ufpe.br>:
 Hi all,

 Thanks for your comments!

 I agree with Makx that it could be a good idea to concentrate on the
 audience of data providers (data publishers). However, if we do this then
 the whole discourse that was built until now has to be changed because we
 are always talking about data publication and data usage. For example, the
 first sentence of the abstract says: "This document provides best
 practices related to the publication and usage of data on the Web designed
 to help support a self-sustaining ecosystem".

 Moreover, the document is about "Data on the Web Best Practices" and not
 only about "Publishing Data on the Web Best Practices".

 As proposed in the charter, the mission of our group includes: "to develop
 the open data ecosystem, facilitating better communication between
 developers and publishers;". In this sense, I think that it is also
 important to tell developers (or data consumers in general) how they can
 interact with data publishers, i.e., how they can provide feedback to data
 publishers and also how they can provide information that helps to find
 out how data has been used.

 However, before we decide whether to abandon the BPs for data
 consumers, I think it is really important to have an agreement about the
 role of data publishers and data consumers.

 In my view, a data consumer is someone who wants to use data
 available on the Web to produce "something" instead of just reading the
 data. For example, when a developer uses raw data available on the Web to
 develop an application, then the developer plays the role of a data
 consumer and not the role of a data publisher.

 Concerning data publishers, I agree with Eric that "Publishers just focus
 on hosting and administering their data on the web in an orderly way".

 kind regards,

 2014-12-16 8:36 GMT-03:00 Makx Dekkers <mail@makxdekkers.com>:
  Eric, Annette, all,

  To me, it would make sense if we concentrated on the audience of data
  providers, at least for now. I think this is already a big order.

  If we also want to cover best practices for the re-users of data
  (developers, aggregators, mix-and-matchers, brokers, whatever you want to
  call them), we’ll be spreading a scarce resource (ourselves) even
  thinner, and run the risk of producing two sets of insufficient quality.

  Let’s focus on the data providers first and then, when we have a good set
  of best practices and still have time left, turn our attention to the
  consumer side of the picture.


  2014-12-16 6:29 GMT+01:00 Eric Stephan <ericphb@gmail.com>:
  Thanks Annette for sharing your thoughts on this topic in the meeting
  last week and in this email.  In your text the term consumers really
  jumped out at me.  If consumers only has a read-only connotation then I'd
  rather avoid this term altogether.  Actually, consumers was never
  mentioned originally as part of the working group mission; instead
  the term "developer" was used.

  Developers, to me, are technologists building applications and devices
  that reuse published data, including creating new data that can be
  published, processing and modifying published data, or strictly reading
  data in the life span of a running application. Users rely on the tools
  created by publishers and developers to edit published data and provide
  feedback.  Publishers, to me, just focus on hosting and administering their
  data on the web in an orderly way.  Since the original intent of BP was
  to "facilitate better communication between developers and publishers",
  maybe there should be best practices that target publishers and
  developers, divided into two documents.

  The closest analogy is off-the-shelf data storage systems, for which two
  types of documentation are written:
  1) For data administrators who manage the data system
  2) For end users (developers) who write applications that interact with the
  data system


  Eric S

  On Mon, Dec 15, 2014 at 1:08 PM, Annette Greiner <amgreiner@lbl.gov> wrote:
   Hi folks,
   To pick up the discussion about our audience, I want to set down what I
   see as our audience for the current BP document. By audience I mean the
   people we expect to actually sit down and read it, not the people whose
   interests we need to consider in creating it (those are what I call
   stakeholders). It’s possible that we all agree but are just thinking of
   the terms differently.

   To my mind, our audience includes anyone involved in making data
   available to consumers on the web. That is publishing data. It includes
   anyone who collects or collates the data, organizes the data, creates
   web pages or apps to share the data, re-publishes it in such a way that
   others can re-use it, or makes decisions relevant to how people do those
   tasks. They could be developers, lawyers, CIOs, researchers, archivists,
   designers, almost any job title. What matters, though, is not their job
   title but what actions they take with respect to the data. The action of
   consuming it is not what we have been discussing, it isn’t represented
   in any of the current best practices or in our scoping criteria, and it
   isn’t called for in the charter’s requirement to create a BP document.
   Thus far, we are not targeting our BPs to people who are *only*
   consuming the data and not republishing it.

   I’ve already talked about the charter and the existing BPs in a previous
   email, so I’ll just address the scoping criteria here. The first one,
   being unique to publishing on the web, is obviously about publishing
   rather than consuming. The second one, encouraging reuse, is also about
   publishing, just in such a way that someone else can make use of the
   data. The charter mentions re-use in its mission in list item 2, which
   calls on us to "provide _guidance_to_publishers_ that will improve
   consistency in the way data is managed, thus promoting the re-use of
   data". If a consumer wants to publish something that makes the data
   truly re-usable, they must include the data itself, which means that
   they are publishing the data. The third criterion, testability, simply
   deals with the mechanics of making sure that one is successful in
   achieving the best practices.

   It might help to consider an example: your organization publishes data
   about traffic in Rio. It's made available through an API. A data
   scientist in Lisbon is interested in the data and makes a visualization
   based on it that she posts on her blog. The data scientist does not make
   the data available in any form other than the visualization itself. She
   has not really enriched your data, because the original data still has
   no connection to the visualization. She cannot take action on any of the
   best practices we have identified thus far unless she re-publishes it
   herself, as data.

   Your organization could link to the visualization, thereby enriching the
   data, but the data scientist in Lisbon cannot force it to do that. Our
   best practice around data enrichment calls on publishers to consider
   making that link or creating the visualization themselves. If we were
   writing that same best practice for a consumer audience, it would have
   to say something like "you should enrich other people's data". So, we
   would end up telling data enrichers that they should enrich data, which
   strikes me as tautological. One could go into detail about how to make
   good visualizations (use good labels, don’t rely on color alone, provide
   a zero point in your scales, etc.), but that seems to me out of scope.
   (I teach an entire semester course on visualization, so I could come up
   with lots of best practices about it, but I don't think we want to go
   there in the BP document we’ve been working on.)

   Now suppose the consumer in Lisbon would like to provide feedback. If
   we, as the publisher, have not provided a mechanism for them to do so,
   they cannot provide it. Our best practice is about making it possible to
   provide feedback and then acting on the feedback to improve the
   published data. A consumer has a role here, but again, there is little
   point to telling a consumer who wants to give feedback that they should
   give feedback. I certainly wouldn’t expect a data consumer to wade
   through a long list of publisher-oriented best practices to be told that
   they should give feedback whenever they are so inclined.

   I would support the idea of putting together a separate list of best
   practices for data consumers if we can think of a way to scope it that


   Annette Greiner
   NERSC Data and Analytics Services
   Lawrence Berkeley National Laboratory


  Makx Dekkers

 Bernadette Farias Lóscio
 Centro de Informática
 Universidade Federal de Pernambuco - UFPE, Brazil


Received on Thursday, 8 January 2015 17:12:31 UTC
